kmo-nvidia committed · Commit 9e9aff7 · verified · 1 Parent(s): 587721f

Upload README.md with huggingface_hub

Files changed (1): README.md (+98 -6)
README.md CHANGED
---
license: other
license_name: nvidia-open-model-license
license_link: https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
language:
- en
---

# Model Card for PointWorld

## Description
PointWorld is an action-conditioned 3D world model for robotic manipulation.
Pre-trained on 500 hours of in-the-wild 3D interactions, PointWorld predicts environment dynamics from one or more RGB-D captures and robot actions, using a unified state-action representation based on 3D point flows.

This model card covers the pretrained checkpoints released in the PointWorld checkpoint package.

## License/Terms of Use
[NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/)

## Deployment Geography
Global

## Use Case
Given one or a few RGB-D observations and a sequence of robot actions, PointWorld predicts environment dynamics as 3D point flows under its unified state-action representation.
PointWorld is intended for research and development in robotics, computer vision, and world modeling.

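The prediction pattern described above can be sketched as an autoregressive rollout: predict a per-point flow for each action, then advance the 3D state. Everything below (function names, point count, action dimension, the stand-in model) is a hypothetical illustration of the interaction pattern, not the released API:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
NUM_POINTS = 4096   # 3D points lifted from the RGB-D observation
ACTION_DIM = 7      # e.g., 6-DoF end-effector delta + gripper (assumed)

def predict_point_flow(points, action, rng):
    """Stand-in for the world model: maps current 3D points plus an
    action to per-point 3D displacements (the 'point flow')."""
    # A real model would be a learned network; a tiny random
    # displacement keeps this sketch runnable.
    return 0.01 * rng.standard_normal(points.shape)

def rollout(points, actions, rng):
    """Autoregressively apply predicted flows to the 3D point state."""
    trajectory = [points]
    for action in actions:
        flow = predict_point_flow(points, action, rng)
        points = points + flow          # advance the state by the flow
        trajectory.append(points)
    return np.stack(trajectory)          # (T + 1, NUM_POINTS, 3)

rng = np.random.default_rng(0)
points0 = rng.standard_normal((NUM_POINTS, 3))
actions = rng.standard_normal((5, ACTION_DIM))
traj = rollout(points0, actions, rng)
print(traj.shape)  # (6, 4096, 3)
```

The point here is only the control flow: one flow prediction per action, with the predicted points fed back in as the next state.
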
## Release Date
- Paper: 01/07/2026 ([arXiv:2601.03782](https://arxiv.org/abs/2601.03782))
- Checkpoint release: TBD

## Reference(s)
- [Project Website](https://point-world.github.io/)
- [Paper](https://arxiv.org/abs/2601.03782)
- [Code](https://github.com/NVlabs/PointWorld)

## Model Architecture
**Architecture Type:** Transformer
**Network Architecture:** Point Transformer V3

## Input
**Input Type(s):** RGB-D Images, Robot Actions
**Input Format(s):** RGB image, depth image, action/state tensors
**Other Properties Related to Input:** Resolution is `320x180` for RGB/depth images.

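A minimal sketch of packing one observation at the stated `320x180` resolution. The dtypes, normalization, and action layout here are assumptions for illustration, not the released preprocessing pipeline:

```python
import numpy as np

# Model card lists resolution as 320x180 (width x height).
H, W = 180, 320

def pack_observation(rgb, depth, action):
    """Validate and bundle one RGB-D frame plus an action vector."""
    assert rgb.shape == (H, W, 3), "RGB must be 180x320x3"
    assert depth.shape == (H, W), "depth must be 180x320"
    return {
        "rgb": rgb.astype(np.float32) / 255.0,   # [0, 1] scaling (assumed)
        "depth": depth.astype(np.float32),       # metric depth (assumed)
        "action": action.astype(np.float32),
    }

rgb = np.zeros((H, W, 3), dtype=np.uint8)
depth = np.ones((H, W), dtype=np.float32)
action = np.zeros(7, dtype=np.float32)           # action dim is assumed
obs = pack_observation(rgb, depth, action)
print(obs["rgb"].shape, obs["depth"].shape)      # (180, 320, 3) (180, 320)
```
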
## Output
**Output Type(s):** 3D point flows
**Output Format:** 3D point trajectories

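Point flows are per-step 3D displacements; accumulating them onto the initial points yields the point trajectories listed as the output format. A minimal numpy sketch with assumed shapes (the released output spec may differ):

```python
import numpy as np

# Assumed layout: flows are (T, N, 3) per-step displacements for
# N points over T prediction steps.
rng = np.random.default_rng(1)
T, N = 4, 1024
points0 = rng.standard_normal((N, 3))
flows = 0.05 * rng.standard_normal((T, N, 3))

# Trajectory = initial points plus the running sum of flows: (T + 1, N, 3).
trajectory = np.concatenate(
    [points0[None], points0[None] + np.cumsum(flows, axis=0)], axis=0
)
print(trajectory.shape)  # (5, 1024, 3)
```
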
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA’s hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.

## Software Integration

### Runtime Engine(s)
- PyTorch

### Supported Hardware Microarchitecture Compatibility
- NVIDIA Ampere
- NVIDIA Hopper

### Preferred Operating System(s)
- Linux

## Model Version(s)
v1.0

## Training, Testing, and Evaluation Datasets

We perform training, testing, and evaluation on the DROID and BEHAVIOR datasets with custom 3D annotations.

### DROID

**Link**: https://droid-dataset.github.io/

**Data Collection Method**: Manual

**Labeling Method**: N/A (no labels)

**Properties**: We use a subset of the DROID dataset, filtered by the quality of our custom 3D annotations.

### BEHAVIOR

**Link**: https://behavior.stanford.edu/

**Data Collection Method**: Manual

**Labeling Method**: N/A (no labels)

**Properties**: We use a subset of the BEHAVIOR dataset, filtered by interaction quality.

## Inference
**Acceleration Engine:** PyTorch
**Test Hardware:** NVIDIA RTX 4090, NVIDIA H100, NVIDIA A100

## Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).