	---
	license: mit
	tags:
	- model_hub_mixin
	- pytorch_model_hub_mixin
	pipeline_tag: robotics
	library_name: pytorch
	---

	This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
	- Library: https://huggingface.co/phython96/ROCKET-1
	- Docs: [More Information Needed]
	- Paper: https://huggingface.co/papers/2410.17856
	- Github: https://github.com/CraftJarvis/ROCKET-1
	- Project: https://craftjarvis.github.io/ROCKET-1

	## Usage
	```python
	import torch

	from rocket.arm.models import ROCKET1
	from rocket.stark_tech.env_interface import MinecraftWrapper

	model = ROCKET1.from_pretrained("phython96/ROCKET-1").to("cuda")
	memory = None
	input = {
	    # torch.rand does not support uint8, so sample a random uint8 image with randint
	    "img": torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8),
	    "segment": {
	        "obj_id": torch.tensor(6),  # specify the interaction type
	        "obj_mask": torch.zeros(224, 224, dtype=torch.uint8),  # highlight the regions of interest
	    },
	}
	agent_action, memory = model.get_action(input, memory, first=None, input_shape="*")
	env_action = MinecraftWrapper.agent_action_to_env(agent_action)

	# --------------------- the output --------------------- #
	# agent_action = {'buttons': tensor([1], device='cuda:0'), 'camera': tensor([54], device='cuda:0')}
	# env_action = {'attack': array(0), 'back': array(0), 'forward': array(0), 'jump': array(0), 'left': array(0), 'right': array(0), 'sneak': array(0), 'sprint': array(0), 'use': array(0), 'drop': array(0), 'inventory': array(0), 'hotbar.1': array(0), 'hotbar.2': array(0), 'hotbar.3': array(0), 'hotbar.4': array(0), 'hotbar.5': array(0), 'hotbar.6': array(0), 'hotbar.7': array(0), 'hotbar.8': array(0), 'hotbar.9': array(0), 'camera': array([-0.61539427, 10. ])}
	```
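
The `obj_mask` above is all zeros, so nothing is highlighted. As a minimal sketch of how a region of interest could be marked, the hypothetical helper below builds a 224×224 grid with ones inside a bounding box. Both `box_mask` and the convention that a value of 1 marks the highlighted region are assumptions for illustration, not part of the ROCKET-1 code base.

```python
from typing import List, Tuple


def box_mask(
    box: Tuple[int, int, int, int], height: int = 224, width: int = 224
) -> List[List[int]]:
    """Return a height x width grid holding 1 inside `box` and 0 elsewhere.

    `box` is (top, left, bottom, right) in pixels; bottom and right are exclusive.
    Coordinates outside the image are clamped to its bounds.
    Assumption: 1 marks the region of interest (not confirmed by the model card).
    """
    top, left, bottom, right = box
    top, bottom = max(0, top), min(height, bottom)
    left, right = max(0, left), min(width, right)
    return [
        [1 if top <= r < bottom and left <= c < right else 0 for c in range(width)]
        for r in range(height)
    ]


mask = box_mask((80, 96, 144, 160))
print(sum(map(sum, mask)))  # 64 rows x 64 columns = 4096 highlighted pixels
```

With PyTorch installed, `torch.tensor(mask, dtype=torch.uint8)` would give a tensor with the same shape and dtype as the `obj_mask` in the usage example.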

	## Interaction Details

	The table below lists some interaction types and their `obj_id` values:

	| Interaction | obj_id | Function |
	| --- | --- | --- |
	| Hunt | 0 | Approach the target animal, then kill it. |
	| Mine | 2 | Approach and mine the target object. |
	| Interact | 3 | Approach and right-click the target object. |
	| Craft | 4 | Move the cursor to the item and click on it. |
	| Switch | 5 | Highlight an item in the hotbar, then switch to holding it. |
	| Approach | 6 | Approach the target object. |
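
To avoid hard-coding bare integers, the `obj_id` values listed above can be wrapped in an enum. This is a minimal sketch: the `Interaction` class and the `obj_id_for` helper are hypothetical names introduced here, and only the numeric values come from the table.

```python
from enum import IntEnum


class Interaction(IntEnum):
    """Interaction types and their obj_id values, copied from the table."""

    HUNT = 0
    MINE = 2
    INTERACT = 3
    CRAFT = 4
    SWITCH = 5
    APPROACH = 6


def obj_id_for(name: str) -> int:
    """Return the obj_id for an interaction name (case-insensitive)."""
    try:
        return int(Interaction[name.strip().upper()])
    except KeyError:
        valid = ", ".join(member.name.title() for member in Interaction)
        raise ValueError(
            f"unknown interaction {name!r}; expected one of: {valid}"
        ) from None


print(obj_id_for("Approach"))  # 6
```

The resulting integer could then be passed as `torch.tensor(obj_id_for("Approach"))` in the `segment` dictionary from the usage example.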

	## Play ROCKET-1 with Gradio
	Click the image below to learn how to play ROCKET-1 with Gradio.
	[![](rocket/assets/gradio.png)](https://www.youtube.com/embed/qXLWw81p-Y0)

	```sh
	cd rocket/arm
	python eval_rocket.py --port 8110 --sam-path "/path/to/sam2-ckpt-directory"
	```


	## Citing ROCKET-1
	If you use ROCKET-1 in your research, please use the following BibTeX entry.

	```bibtex
	@article{cai2024rocket,
	  title={ROCKET-1: Master Open-World Interaction with Visual-Temporal Context Prompting},
	  author={Cai, Shaofei and Wang, Zihao and Lian, Kewei and Mu, Zhancun and Ma, Xiaojian and Liu, Anji and Liang, Yitao},
	  journal={arXiv preprint arXiv:2410.17856},
	  year={2024}
	}
	```