insomnia7 committed
Commit bcc6605 · verified · 1 Parent(s): 98d00e1

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +1 -0
  2. README.md +166 -0
  3. __init__.py +33 -0
  4. __pycache__/__init__.cpython-310.pyc +0 -0
  5. __pycache__/__init__.cpython-311.pyc +0 -0
  6. __pycache__/adaptor_base.cpython-310.pyc +0 -0
  7. __pycache__/adaptor_base.cpython-311.pyc +0 -0
  8. __pycache__/adaptor_generic.cpython-310.pyc +0 -0
  9. __pycache__/adaptor_generic.cpython-311.pyc +0 -0
  10. __pycache__/adaptor_mlp.cpython-310.pyc +0 -0
  11. __pycache__/adaptor_mlp.cpython-311.pyc +0 -0
  12. __pycache__/adaptor_registry.cpython-310.pyc +0 -0
  13. __pycache__/adaptor_registry.cpython-311.pyc +0 -0
  14. __pycache__/cls_token.cpython-310.pyc +0 -0
  15. __pycache__/cls_token.cpython-311.pyc +0 -0
  16. __pycache__/common.cpython-310.pyc +0 -0
  17. __pycache__/common.cpython-311.pyc +0 -0
  18. __pycache__/configuration_vectorllm.cpython-310.pyc +0 -0
  19. __pycache__/configuration_vectorllm.cpython-311.pyc +0 -0
  20. __pycache__/dinov2_arch.cpython-310.pyc +0 -0
  21. __pycache__/dinov2_arch.cpython-311.pyc +0 -0
  22. __pycache__/dual_hybrid_vit.cpython-310.pyc +0 -0
  23. __pycache__/dual_hybrid_vit.cpython-311.pyc +0 -0
  24. __pycache__/enable_cpe_support.cpython-310.pyc +0 -0
  25. __pycache__/enable_cpe_support.cpython-311.pyc +0 -0
  26. __pycache__/enable_spectral_reparam.cpython-310.pyc +0 -0
  27. __pycache__/enable_spectral_reparam.cpython-311.pyc +0 -0
  28. __pycache__/eradio_model.cpython-310.pyc +0 -0
  29. __pycache__/eradio_model.cpython-311.pyc +0 -0
  30. __pycache__/extra_models.cpython-310.pyc +0 -0
  31. __pycache__/extra_models.cpython-311.pyc +0 -0
  32. __pycache__/extra_timm_models.cpython-310.pyc +0 -0
  33. __pycache__/extra_timm_models.cpython-311.pyc +0 -0
  34. __pycache__/feature_normalizer.cpython-310.pyc +0 -0
  35. __pycache__/feature_normalizer.cpython-311.pyc +0 -0
  36. __pycache__/forward_intermediates.cpython-310.pyc +0 -0
  37. __pycache__/forward_intermediates.cpython-311.pyc +0 -0
  38. __pycache__/hf_model.cpython-310.pyc +0 -0
  39. __pycache__/hf_model.cpython-311.pyc +0 -0
  40. __pycache__/image_processing_vectorllm.cpython-310.pyc +0 -0
  41. __pycache__/image_processing_vectorllm.cpython-311.pyc +0 -0
  42. __pycache__/input_conditioner.cpython-310.pyc +0 -0
  43. __pycache__/input_conditioner.cpython-311.pyc +0 -0
  44. __pycache__/modeling_vectorllm.cpython-310.pyc +0 -0
  45. __pycache__/modeling_vectorllm.cpython-311.pyc +0 -0
  46. __pycache__/open_clip_adaptor.cpython-310.pyc +0 -0
  47. __pycache__/open_clip_adaptor.cpython-311.pyc +0 -0
  48. __pycache__/processing_vectorllm.cpython-310.pyc +0 -0
  49. __pycache__/processing_vectorllm.cpython-311.pyc +0 -0
  50. __pycache__/radio_model.cpython-310.pyc +0 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,166 @@
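+ # VectorLLM HF 0407
+
+ This is the current Hugging Face export directory template, ready for distribution and direct use. It already includes:
+
+ - The C-RADIO vision tower implementation and weight-mapping logic
+ - Custom VectorLLM `AutoModel` / `AutoProcessor` classes
+ - Locally loadable processor / image processor / modeling code
+ - A `bfloat16` inference configuration
+
+ Running on a GPU is currently recommended. Recommended environment:
+
+ ```bash
+ /home/zhangtao/env/xtuner/bin/python
+ ```
+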
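+ ## Directory Contents
+
+ - `model.safetensors`: merged main model weights
+ - `config.json`: main configuration
+ - `generation_config.json`: generation configuration
+ - `preprocessor_config.json`: image preprocessing configuration
+ - `test_hf.py`: example script for single-image inference and visualization
+ - `gradio_bbox_demo.py`: Gradio script for drawing boxes on the full image, cropping on the backend, and mapping results back to the full image
+ - `conversion_report.json`: results of the alignment check against xtuner
+ - `radio_bundle/`: bundled C-RADIO implementation
+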
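+ ## Design Goals
+
+ - Loadable through `AutoModel` and `AutoProcessor`
+ - No dependency on custom code in an external Hugging Face cache
+ - No need for `trust_remote_code=True`
+ - Intended for input images in which a single object has already been cropped; no second crop is applied
+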
+ ## Quick Start
+
+ ### 1. Command-Line Inference
+
+ The recommended approach is to use the bundled [test_hf.py](/home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py) directly:
+
+ ```bash
+ /home/zhangtao/env/xtuner/bin/python \
+ /home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py \
+ /home/zhangtao/hf_model/vectorllm_hf_0407 \
+ /path/to/your_image.png \
+ --save-dir /tmp/vectorllm_hf_demo
+ ```
+
+ Output:
+
+ - `overlay.png`: visualization with the polygon overlaid
+ - `report.json`: text output, grid coordinates, and the polygon mapped back to image coordinates
+
+ ### 2. Loading from Python
+
+ You must first add the parent directory of the model directory to `sys.path`, then import the package itself so that local registration runs:
+
+ ```python
+ import sys
+ from pathlib import Path
+
+ import torch
+ from PIL import Image
+ from transformers import AutoModel, AutoProcessor, GenerationConfig
+
+ model_path = Path("/home/zhangtao/hf_model/vectorllm_hf_0407")
+ sys.path.insert(0, str(model_path.parent))
+ import vectorllm_hf_0407  # noqa: F401
+
+ from vectorllm_hf_0407.test_hf import DEFAULT_RAW_PROMPT, decode_generated_text, get_stop_criteria
+
+ model = AutoModel.from_pretrained(
+     model_path,
+     trust_remote_code=False,
+     torch_dtype=torch.bfloat16,
+ ).cuda().eval()
+ processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=False)
+ tokenizer = processor.tokenizer
+
+ image = Image.open("/path/to/your_image.png").convert("RGB")
+ model_inputs = processor(text=[DEFAULT_RAW_PROMPT], images=[image], return_tensors="pt")
+ model_inputs = {
+     key: value.to(model.device) if torch.is_tensor(value) else value
+     for key, value in model_inputs.items()
+ }
+
+ stop_criteria = get_stop_criteria(tokenizer, ["<|im_end|>", "<|endoftext|>"])
+ output = model.generate(
+     **model_inputs,
+     generation_config=GenerationConfig(
+         max_new_tokens=640,
+         do_sample=False,
+         eos_token_id=tokenizer.eos_token_id,
+         pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
+         temperature=0.0,
+         top_k=1,
+     ),
+     bos_token_id=tokenizer.bos_token_id,
+     stopping_criteria=stop_criteria,
+     output_hidden_states=False,
+     return_dict_in_generate=True,
+     do_sample=False,
+     temperature=0.0,
+     top_k=1,
+ )
+
+ text = decode_generated_text(output, model_inputs, tokenizer)
+ print(text)
+ ```
+
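+ ### 3. Gradio Interface
+
+ If the input is a full image and you want to draw bboxes on it by hand, then have the backend expand and crop each box before sending it to the model, launch the demo directly:
+
+ ```bash
+ /home/zhangtao/env/xtuner/bin/python \
+ /home/zhangtao/hf_model/vectorllm_hf_0407/gradio_bbox_demo.py \
+ --model-path /home/zhangtao/hf_model/vectorllm_hf_0407 \
+ --server-name 0.0.0.0 \
+ --server-port 7861
+ ```
+
+ Features:
+
+ - The frontend lets you drag one or more bboxes directly on the full image
+ - The backend crops with an adjustable expansion ratio of `1.0-1.3`
+ - The cropped image is sent to the VectorLLM HF model for inference
+ - Results are mapped back to the full image, and the page shows the full-image overlay, crop previews, and structured JSON
+
+ Steps:
+
+ 1. After opening the page, upload the full image first.
+ 2. Drag bboxes on the canvas on the left; you can draw several in a row.
+ 3. Set `Prompt Target` to `Building` or `Object`.
+ 4. `BBox Expand Ratio` controls the backend expansion ratio. `1.0` crops to the original box only, `1.15` or `1.2` is usually more stable, and `1.3` leaves more context around the target.
+ 5. Click `Run` to run inference; click `Clear` to clear the current image and boxes.
+
+ Page outputs:
+
+ - `Full-Image Overlay`: full-image visualization that overlays the original bbox, the expanded bbox, and the polygon mapped back to the full image
+ - `Expanded Crop Preview`: the crop for each bbox with its local polygon, useful for quickly checking whether the crop is reasonable
+ - `Model Text Output`: the raw text output of the model, useful for debugging the `<xN><yN>` sequence
+ - `Structured Result`: structured JSON containing the original box, the expanded box, the crop size, the grid coordinates, and the polygon in full-image coordinates
+
+ Additional notes:
+
+ - The script listens on port `7861` by default; change it with `--server-port`
+ - To stop the service after it starts, press `Ctrl+C` in the terminal
+ - The first launch loads the large model, so it takes noticeably longer than subsequent inference
+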
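Reviewer's sketch (not part of the commit): the expand, crop, and remap flow described in the Gradio section can be illustrated as follows. This is a minimal sketch, assuming the box is scaled about its center and clamped to the image bounds, and that the remap is a plain shift by the crop origin. `expand_bbox` and `crop_to_full` are hypothetical names; `gradio_bbox_demo.py` may implement these steps differently.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def expand_bbox(bbox: Box, ratio: float, img_w: int, img_h: int) -> Box:
    """Scale a box about its center by `ratio`, then clamp it to the image."""
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) * ratio / 2
    half_h = (y1 - y0) * ratio / 2
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(img_w), cx + half_w),
        min(float(img_h), cy + half_h),
    )


def crop_to_full(points: List[Tuple[float, float]], crop_box: Box) -> List[Tuple[float, float]]:
    """Shift crop-local points by the crop origin to get full-image coordinates."""
    ox, oy = crop_box[0], crop_box[1]
    return [(x + ox, y + oy) for x, y in points]


# Assumed example values: a 20x20 box in a 100x100 image.
expanded = expand_bbox((10, 10, 30, 30), 1.5, 100, 100)
full_points = crop_to_full([(1.0, 2.0)], expanded)
```

With a ratio of `1.0` the box is returned unchanged (apart from clamping), which matches the README's description of cropping to the original box only.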
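+ ## Input Requirements
+
+ - The input should be a cropped image of a single target
+ - The current default prompt targets building outline extraction
+ - The model outputs a discrete sequence of polygon points in `<xN><yN>` format
+
+ ## Results
+
+ - Grid coordinates range over `0~127` by default
+ - [test_hf.py](/home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py) automatically maps the discrete points back to original image coordinates and visualizes them
+ - [gradio_bbox_demo.py](/home/zhangtao/hf_model/vectorllm_hf_0407/gradio_bbox_demo.py) restores the discrete points inside the crop to full-image coordinates before overlaying them
+ - In current GPU tests, the reloaded HF output is essentially aligned with xtuner; very slight differences in the discrete points may occur
+
+ ## Notes
+
+ - When calling from another working directory, do not omit `sys.path.insert(...)` and `import vectorllm_hf_0407`
+ - If you only want to run a quick demo, prefer [test_hf.py](/home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py)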
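Reviewer's sketch (not part of the commit): the README describes the model output as a sequence of `<xN><yN>` tokens on a `0~127` grid that is mapped back to image coordinates. The snippet below is a minimal sketch of that idea, assuming each point is written as an adjacent `<xN><yN>` pair and that the mapping is a linear scaling from grid index to pixel position. `parse_grid_points` and `grid_to_pixels` are hypothetical names, and the actual convention in `test_hf.py` (for example cell centers versus corners) may differ.

```python
import re
from typing import List, Tuple

# Assumed token shape: adjacent "<xN><yN>" pairs, N being a non-negative integer.
POINT_RE = re.compile(r"<x(\d+)><y(\d+)>")


def parse_grid_points(text: str) -> List[Tuple[int, int]]:
    """Extract (x, y) grid indices from the model's raw text output."""
    return [(int(x), int(y)) for x, y in POINT_RE.findall(text)]


def grid_to_pixels(
    points: List[Tuple[int, int]],
    width: int,
    height: int,
    grid_size: int = 128,  # grid indices run from 0 to grid_size - 1
) -> List[Tuple[float, float]]:
    """Linearly map grid indices to pixel coordinates (assumed convention)."""
    return [
        (x * (width - 1) / (grid_size - 1), y * (height - 1) / (grid_size - 1))
        for x, y in points
    ]


# Assumed example output for a 256x512 crop.
grid_points = parse_grid_points("<x0><y0><x127><y0><x127><y127>")
pixel_polygon = grid_to_pixels(grid_points, width=256, height=512)
```

Under this assumed convention, grid index `0` lands on the first pixel and index `127` on the last pixel of each axis.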
__init__.py ADDED
@@ -0,0 +1,33 @@
+ from transformers import AutoConfig, AutoModel, AutoModelForCausalLM
+
+ from .configuration_vectorllm import VectorLLMConfig
+ from .hf_model import RADIOConfig, RADIOModel
+ from .image_processing_vectorllm import VectorLLMImageProcessor
+ from .modeling_vectorllm import VectorLLMForCausalLM
+ from .processing_vectorllm import VectorLLMProcessor
+
+
+ def _safe_register(register_fn, *args):
+     try:
+         register_fn(*args)
+     except ValueError:
+         pass
+
+
+ def bootstrap_local_registry():
+     _safe_register(AutoConfig.register, VectorLLMConfig.model_type, VectorLLMConfig)
+     _safe_register(AutoModel.register, VectorLLMConfig, VectorLLMForCausalLM)
+     _safe_register(AutoModelForCausalLM.register, VectorLLMConfig, VectorLLMForCausalLM)
+
+
+ bootstrap_local_registry()
+
+ __all__ = [
+     "RADIOConfig",
+     "RADIOModel",
+     "VectorLLMConfig",
+     "VectorLLMForCausalLM",
+     "VectorLLMImageProcessor",
+     "VectorLLMProcessor",
+     "bootstrap_local_registry",
+ ]
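Reviewer's sketch (not part of the commit): `_safe_register` makes registration idempotent, so importing the package more than once does not fail when a key is already registered. The snippet below illustrates the pattern without depending on `transformers`. `ToyRegistry` is a hypothetical stand-in that, like the Auto* registries, is assumed to raise `ValueError` on a duplicate key.

```python
class ToyRegistry:
    """Hypothetical registry that rejects duplicate keys with ValueError."""

    def __init__(self):
        self._items = {}

    def register(self, key, value):
        if key in self._items:
            raise ValueError(f"'{key}' is already registered")
        self._items[key] = value

    def get(self, key):
        return self._items[key]


def safe_register(register_fn, *args):
    """Call register_fn and ignore the ValueError raised for duplicates."""
    try:
        register_fn(*args)
    except ValueError:
        pass


registry = ToyRegistry()
safe_register(registry.register, "vectorllm", "first")
safe_register(registry.register, "vectorllm", "second")  # silently ignored
```

One trade-off of this design is that the first registration wins and every `ValueError` is swallowed, so a registration that fails for a reason other than duplication would also pass silently.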
__pycache__/__init__.cpython-310.pyc ADDED
Binary file (1.03 kB)
__pycache__/__init__.cpython-311.pyc ADDED
Binary file (1.56 kB)
__pycache__/adaptor_base.cpython-310.pyc ADDED
Binary file (1.43 kB)
__pycache__/adaptor_base.cpython-311.pyc ADDED
Binary file (2.14 kB)
__pycache__/adaptor_generic.cpython-310.pyc ADDED
Binary file (2.17 kB)
__pycache__/adaptor_generic.cpython-311.pyc ADDED
Binary file (4.34 kB)
__pycache__/adaptor_mlp.cpython-310.pyc ADDED
Binary file (4.94 kB)
__pycache__/adaptor_mlp.cpython-311.pyc ADDED
Binary file (9.25 kB)
__pycache__/adaptor_registry.cpython-310.pyc ADDED
Binary file (1.44 kB)
__pycache__/adaptor_registry.cpython-311.pyc ADDED
Binary file (2.03 kB)
__pycache__/cls_token.cpython-310.pyc ADDED
Binary file (1.55 kB)
__pycache__/cls_token.cpython-311.pyc ADDED
Binary file (2.51 kB)
__pycache__/common.cpython-310.pyc ADDED
Binary file (2.3 kB)
__pycache__/common.cpython-311.pyc ADDED
Binary file (3.11 kB)
__pycache__/configuration_vectorllm.cpython-310.pyc ADDED
Binary file (3.43 kB)
__pycache__/configuration_vectorllm.cpython-311.pyc ADDED
Binary file (5.65 kB)
__pycache__/dinov2_arch.cpython-310.pyc ADDED
Binary file (29.8 kB)
__pycache__/dinov2_arch.cpython-311.pyc ADDED
Binary file (54.4 kB)
__pycache__/dual_hybrid_vit.cpython-310.pyc ADDED
Binary file (6.56 kB)
__pycache__/dual_hybrid_vit.cpython-311.pyc ADDED
Binary file (13.4 kB)
__pycache__/enable_cpe_support.cpython-310.pyc ADDED
Binary file (4.52 kB)
__pycache__/enable_cpe_support.cpython-311.pyc ADDED
Binary file (7.87 kB)
__pycache__/enable_spectral_reparam.cpython-310.pyc ADDED
Binary file (8.99 kB)
__pycache__/enable_spectral_reparam.cpython-311.pyc ADDED
Binary file (17.1 kB)
__pycache__/eradio_model.cpython-310.pyc ADDED
Binary file (39.4 kB)
__pycache__/eradio_model.cpython-311.pyc ADDED
Binary file (73.7 kB)
__pycache__/extra_models.cpython-310.pyc ADDED
Binary file (7.07 kB)
__pycache__/extra_models.cpython-311.pyc ADDED
Binary file (11.6 kB)
__pycache__/extra_timm_models.cpython-310.pyc ADDED
Binary file (7.52 kB)
__pycache__/extra_timm_models.cpython-311.pyc ADDED
Binary file (12.4 kB)
__pycache__/feature_normalizer.cpython-310.pyc ADDED
Binary file (4.8 kB)
__pycache__/feature_normalizer.cpython-311.pyc ADDED
Binary file (8.78 kB)
__pycache__/forward_intermediates.cpython-310.pyc ADDED
Binary file (4.38 kB)
__pycache__/forward_intermediates.cpython-311.pyc ADDED
Binary file (7.09 kB)
__pycache__/hf_model.cpython-310.pyc ADDED
Binary file (6.44 kB)
__pycache__/hf_model.cpython-311.pyc ADDED
Binary file (10.3 kB)
__pycache__/image_processing_vectorllm.cpython-310.pyc ADDED
Binary file (3.9 kB)
__pycache__/image_processing_vectorllm.cpython-311.pyc ADDED
Binary file (6.33 kB)
__pycache__/input_conditioner.cpython-310.pyc ADDED
Binary file (1.52 kB)
__pycache__/input_conditioner.cpython-311.pyc ADDED
Binary file (2.46 kB)
__pycache__/modeling_vectorllm.cpython-310.pyc ADDED
Binary file (8.18 kB)
__pycache__/modeling_vectorllm.cpython-311.pyc ADDED
Binary file (15.3 kB)
__pycache__/open_clip_adaptor.cpython-310.pyc ADDED
Binary file (1.49 kB)
__pycache__/open_clip_adaptor.cpython-311.pyc ADDED
Binary file (2.31 kB)
__pycache__/processing_vectorllm.cpython-310.pyc ADDED
Binary file (4.22 kB)
__pycache__/processing_vectorllm.cpython-311.pyc ADDED
Binary file (6.91 kB)
__pycache__/radio_model.cpython-310.pyc ADDED
Binary file (10.5 kB)