insomnia7 committed 7 days ago

Commit

bcc6605

verified

1 Parent(s): 98d00e1

Upload folder using huggingface_hub


This view is limited to 50 files because it contains too many changes.

Files changed (50)

.gitattributes +1 -0
README.md +166 -0
__init__.py +33 -0
__pycache__/__init__.cpython-310.pyc +0 -0
__pycache__/__init__.cpython-311.pyc +0 -0
__pycache__/adaptor_base.cpython-310.pyc +0 -0
__pycache__/adaptor_base.cpython-311.pyc +0 -0
__pycache__/adaptor_generic.cpython-310.pyc +0 -0
__pycache__/adaptor_generic.cpython-311.pyc +0 -0
__pycache__/adaptor_mlp.cpython-310.pyc +0 -0
__pycache__/adaptor_mlp.cpython-311.pyc +0 -0
__pycache__/adaptor_registry.cpython-310.pyc +0 -0
__pycache__/adaptor_registry.cpython-311.pyc +0 -0
__pycache__/cls_token.cpython-310.pyc +0 -0
__pycache__/cls_token.cpython-311.pyc +0 -0
__pycache__/common.cpython-310.pyc +0 -0
__pycache__/common.cpython-311.pyc +0 -0
__pycache__/configuration_vectorllm.cpython-310.pyc +0 -0
__pycache__/configuration_vectorllm.cpython-311.pyc +0 -0
__pycache__/dinov2_arch.cpython-310.pyc +0 -0
__pycache__/dinov2_arch.cpython-311.pyc +0 -0
__pycache__/dual_hybrid_vit.cpython-310.pyc +0 -0
__pycache__/dual_hybrid_vit.cpython-311.pyc +0 -0
__pycache__/enable_cpe_support.cpython-310.pyc +0 -0
__pycache__/enable_cpe_support.cpython-311.pyc +0 -0
__pycache__/enable_spectral_reparam.cpython-310.pyc +0 -0
__pycache__/enable_spectral_reparam.cpython-311.pyc +0 -0
__pycache__/eradio_model.cpython-310.pyc +0 -0
__pycache__/eradio_model.cpython-311.pyc +0 -0
__pycache__/extra_models.cpython-310.pyc +0 -0
__pycache__/extra_models.cpython-311.pyc +0 -0
__pycache__/extra_timm_models.cpython-310.pyc +0 -0
__pycache__/extra_timm_models.cpython-311.pyc +0 -0
__pycache__/feature_normalizer.cpython-310.pyc +0 -0
__pycache__/feature_normalizer.cpython-311.pyc +0 -0
__pycache__/forward_intermediates.cpython-310.pyc +0 -0
__pycache__/forward_intermediates.cpython-311.pyc +0 -0
__pycache__/hf_model.cpython-310.pyc +0 -0
__pycache__/hf_model.cpython-311.pyc +0 -0
__pycache__/image_processing_vectorllm.cpython-310.pyc +0 -0
__pycache__/image_processing_vectorllm.cpython-311.pyc +0 -0
__pycache__/input_conditioner.cpython-310.pyc +0 -0
__pycache__/input_conditioner.cpython-311.pyc +0 -0
__pycache__/modeling_vectorllm.cpython-310.pyc +0 -0
__pycache__/modeling_vectorllm.cpython-311.pyc +0 -0
__pycache__/open_clip_adaptor.cpython-310.pyc +0 -0
__pycache__/open_clip_adaptor.cpython-311.pyc +0 -0
__pycache__/processing_vectorllm.cpython-310.pyc +0 -0
__pycache__/processing_vectorllm.cpython-311.pyc +0 -0
__pycache__/radio_model.cpython-310.pyc +0 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,166 @@

+# VectorLLM HF 0407
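+This is the Hugging Face export directory template, ready to distribute and use as-is. It already includes:
+- The C-RADIO vision tower implementation and weight-mapping logic
+- Custom VectorLLM `AutoModel` / `AutoProcessor` classes
+- Locally loadable processor / image processor / modeling code
+- A `bfloat16` inference configuration
+Running on a GPU is currently recommended. Recommended environment: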
+```bash
+/home/zhangtao/env/xtuner/bin/python
+```
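+## Directory Layout
+- `model.safetensors`: merged main model weights
+- `config.json`: main configuration
+- `generation_config.json`: generation configuration
+- `preprocessor_config.json`: image preprocessing configuration
+- `test_hf.py`: example script for single-image inference and visualization
+- `gradio_bbox_demo.py`: Gradio script for drawing boxes on a full image, cropping on the backend, and mapping results back to the full image
+- `conversion_report.json`: results of the alignment check against xtuner
+- `radio_bundle/`: bundled C-RADIO implementation
+## Design Goals
+- Loadable through `AutoModel` and `AutoProcessor`
+- No dependency on custom code in an external Hugging Face cache
+- Does not require `trust_remote_code=True`
+- Intended for input images in which a single object has already been cropped; no second crop is applied
+## Quick Start
+### 1. Command-Line Inference
+We recommend using the bundled [test_hf.py](/home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py) directly: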
+```bash
+/home/zhangtao/env/xtuner/bin/python \
+  /home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py \
+  /home/zhangtao/hf_model/vectorllm_hf_0407 \
+  /path/to/your_image.png \
+  --save-dir /tmp/vectorllm_hf_demo
+```
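+Outputs:
+- `overlay.png`: visualization with the polygon overlaid
+- `report.json`: text output, grid coordinates, and the polygon mapped back to image coordinates
+### 2. Loading from Python
+You must first add the parent directory of the model directory to `sys.path`, then import the package itself to complete local registration: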
+```python
+import sys
+from pathlib import Path
+import torch
+from PIL import Image
+from transformers import AutoModel, AutoProcessor, GenerationConfig
+model_path = Path("/home/zhangtao/hf_model/vectorllm_hf_0407")
+sys.path.insert(0, str(model_path.parent))
+import vectorllm_hf_0407  # noqa: F401
+from vectorllm_hf_0407.test_hf import DEFAULT_RAW_PROMPT, decode_generated_text, get_stop_criteria
+model = AutoModel.from_pretrained(
+    model_path,
+    trust_remote_code=False,
+    torch_dtype=torch.bfloat16,
+).cuda().eval()
+processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=False)
+tokenizer = processor.tokenizer
+image = Image.open("/path/to/your_image.png").convert("RGB")
+model_inputs = processor(text=[DEFAULT_RAW_PROMPT], images=[image], return_tensors="pt")
+model_inputs = {
+    key: value.to(model.device) if torch.is_tensor(value) else value
+    for key, value in model_inputs.items()
+}
+stop_criteria = get_stop_criteria(tokenizer, ["<|im_end|>", "<|endoftext|>"])
+output = model.generate(
+    **model_inputs,
+    generation_config=GenerationConfig(
+        max_new_tokens=640,
+        do_sample=False,
+        eos_token_id=tokenizer.eos_token_id,
+        pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
+        temperature=0.0,
+        top_k=1,
+    ),
+    bos_token_id=tokenizer.bos_token_id,
+    stopping_criteria=stop_criteria,
+    output_hidden_states=False,
+    return_dict_in_generate=True,
+    do_sample=False,
+    temperature=0.0,
+    top_k=1,
+)
+text = decode_generated_text(output, model_inputs, tokenizer)
+print(text)
+```
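+### 3. Gradio Interface
+If the input is a full image and you want to draw bboxes on it by hand, with the backend expanding and cropping each box before passing it to the model, launch: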
+```bash
+/home/zhangtao/env/xtuner/bin/python \
+  /home/zhangtao/hf_model/vectorllm_hf_0407/gradio_bbox_demo.py \
+  --model-path /home/zhangtao/hf_model/vectorllm_hf_0407 \
+  --server-name 0.0.0.0 \
+  --server-port 7861
+```
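+Features:
+- The frontend lets you drag one or more bboxes directly on the full image
+- The backend crops with an adjustable expansion ratio of `1.0-1.3`
+- Each crop is passed to the VectorLLM HF model for inference
+- Results are mapped back to the full image and shown as a full-image overlay, crop previews, and structured JSON
+Usage:
+1. Open the page and upload the full image.
+2. Drag bboxes on the canvas on the left; you can draw several in a row.
+3. Set `Prompt Target` to `Building` or `Object`.
+4. `BBox Expand Ratio` controls the backend expansion ratio. `1.0` crops to the original box only, `1.15` or `1.2` is usually more stable, and `1.3` leaves more context around the target.
+5. Click `Run` to run inference; click `Clear` to clear the current image and boxes.
+Page outputs:
+- `Full-Image Overlay`: full-image visualization that overlays the original bbox, the expanded bbox, and the mapped-back polygon
+- `Expanded Crop Preview`: the crop for each bbox with its local polygon, useful for quickly checking whether the crop is reasonable
+- `Model Text Output`: the raw model output text, useful for debugging the `<xN><yN>` sequence
+- `Structured Result`: structured JSON containing the original box, expanded box, crop size, grid coordinates, and the polygon in full-image coordinates
+Additional notes:
+- The script listens on port `7861` by default; change it with `--server-port`
+- To stop the service after it starts, press `Ctrl+C` in the terminal
+- The first launch loads the large model, so it takes noticeably longer than subsequent inference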
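+## Input Requirements
+- The input should be a cropped image of a single target
+- The current default prompt targets building outline extraction
+- The model output is a discrete sequence of polygon points in `<xN><yN>` format
+## Results
+- Grid coordinates range over `0~127` by default
+- [test_hf.py](/home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py) automatically maps the discrete points back to original image coordinates and visualizes them
+- [gradio_bbox_demo.py](/home/zhangtao/hf_model/vectorllm_hf_0407/gradio_bbox_demo.py) restores the discrete points inside the crop region to full-image coordinates before overlaying them
+- In current GPU tests, the reloaded HF output is essentially aligned with xtuner; very slight differences in the discrete points are possible
+## Notes
+- When calling from another working directory, do not omit `sys.path.insert(...)` and `import vectorllm_hf_0407`
+- To just run a quick demo, prefer [test_hf.py](/home/zhangtao/hf_model/vectorllm_hf_0407/test_hf.py)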
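
Reviewer note: the README describes a pipeline of parsing `<xN><yN>` tokens, rescaling a 0 to 127 grid to pixel coordinates, expanding a bbox by a ratio, and shifting crop-local points back into the full image. The sketch below illustrates one plausible reading of those steps. It is not taken from `test_hf.py` or `gradio_bbox_demo.py`, which are not shown in this view; all function names are hypothetical, and both the linear rescaling (grid 127 maps to the last pixel) and the centre-based expansion are assumptions.

```python
import re

GRID_SIZE = 128  # the README states grid coordinates default to 0~127

# Assumed token layout: an x token immediately followed by a y token.
TOKEN_RE = re.compile(r"<x(\d+)><y(\d+)>")


def parse_grid_points(text):
    """Extract (x, y) integer grid pairs from a string of <xN><yN> tokens."""
    return [(int(x), int(y)) for x, y in TOKEN_RE.findall(text)]


def grid_to_pixels(points, width, height, grid_size=GRID_SIZE):
    """Linearly rescale grid points so grid 0 -> pixel 0 and grid max -> last pixel.

    Assumption: the real scripts may use a different convention (e.g. width / grid_size).
    """
    return [
        (x * (width - 1) / (grid_size - 1), y * (height - 1) / (grid_size - 1))
        for x, y in points
    ]


def expand_bbox(bbox, ratio, image_width, image_height):
    """Scale an (x0, y0, x1, y1) bbox about its centre by `ratio`, clamped to the image."""
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w, half_h = (x1 - x0) * ratio / 2, (y1 - y0) * ratio / 2
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(image_width), cx + half_w),
        min(float(image_height), cy + half_h),
    )


def crop_to_full(points, crop_box):
    """Shift crop-local pixel points by the crop's top-left corner."""
    x0, y0 = crop_box[0], crop_box[1]
    return [(x + x0, y + y0) for x, y in points]
```

Under these assumptions, a full round trip would call `expand_bbox`, crop the image to that box, run the model on the crop, then chain `parse_grid_points`, `grid_to_pixels` (with the crop's width and height), and `crop_to_full`.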

__init__.py ADDED

	@@ -0,0 +1,33 @@

+from transformers import AutoConfig, AutoModel, AutoModelForCausalLM
+from .configuration_vectorllm import VectorLLMConfig
+from .hf_model import RADIOConfig, RADIOModel
+from .image_processing_vectorllm import VectorLLMImageProcessor
+from .modeling_vectorllm import VectorLLMForCausalLM
+from .processing_vectorllm import VectorLLMProcessor
+def _safe_register(register_fn, *args):
+    try:
+        register_fn(*args)
+    except ValueError:
+        pass
+def bootstrap_local_registry():
+    _safe_register(AutoConfig.register, VectorLLMConfig.model_type, VectorLLMConfig)
+    _safe_register(AutoModel.register, VectorLLMConfig, VectorLLMForCausalLM)
+    _safe_register(AutoModelForCausalLM.register, VectorLLMConfig, VectorLLMForCausalLM)
+bootstrap_local_registry()
+__all__ = [
+    "RADIOConfig",
+    "RADIOModel",
+    "VectorLLMConfig",
+    "VectorLLMForCausalLM",
+    "VectorLLMImageProcessor",
+    "VectorLLMProcessor",
+    "bootstrap_local_registry",
+]
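
Reviewer note: `_safe_register` above swallows `ValueError`, which appears intended to make repeated imports of the package harmless when a registration already exists. The sketch below reproduces that pattern against a toy registry rather than the real `transformers` Auto classes; `ToyRegistry` and its behaviour are hypothetical stand-ins used only to show the control flow.

```python
class ToyRegistry:
    """Hypothetical registry that rejects duplicate keys, standing in for an Auto class."""

    def __init__(self):
        self._items = {}

    def register(self, key, value):
        if key in self._items:
            raise ValueError(f"{key!r} is already registered")
        self._items[key] = value

    def get(self, key):
        return self._items[key]


def safe_register(register_fn, *args):
    """Call register_fn(*args), ignoring a ValueError raised for a duplicate entry."""
    try:
        register_fn(*args)
    except ValueError:
        pass


registry = ToyRegistry()
safe_register(registry.register, "vectorllm", "first")
safe_register(registry.register, "vectorllm", "second")  # duplicate, silently ignored
```

One trade-off of this design is that the first registration always wins: a genuinely conflicting class registered under the same key is ignored just as silently as a harmless re-import, so logging the swallowed exception may be worth considering.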

__pycache__/__init__.cpython-310.pyc ADDED

Binary file (1.03 kB).

__pycache__/__init__.cpython-311.pyc ADDED

Binary file (1.56 kB).

__pycache__/adaptor_base.cpython-310.pyc ADDED

Binary file (1.43 kB).

__pycache__/adaptor_base.cpython-311.pyc ADDED

Binary file (2.14 kB).

__pycache__/adaptor_generic.cpython-310.pyc ADDED

Binary file (2.17 kB).

__pycache__/adaptor_generic.cpython-311.pyc ADDED

Binary file (4.34 kB).

__pycache__/adaptor_mlp.cpython-310.pyc ADDED

Binary file (4.94 kB).

__pycache__/adaptor_mlp.cpython-311.pyc ADDED

Binary file (9.25 kB).

__pycache__/adaptor_registry.cpython-310.pyc ADDED

Binary file (1.44 kB).

__pycache__/adaptor_registry.cpython-311.pyc ADDED

Binary file (2.03 kB).

__pycache__/cls_token.cpython-310.pyc ADDED

Binary file (1.55 kB).

__pycache__/cls_token.cpython-311.pyc ADDED

Binary file (2.51 kB).

__pycache__/common.cpython-310.pyc ADDED

Binary file (2.3 kB).

__pycache__/common.cpython-311.pyc ADDED

Binary file (3.11 kB).

__pycache__/configuration_vectorllm.cpython-310.pyc ADDED

Binary file (3.43 kB).

__pycache__/configuration_vectorllm.cpython-311.pyc ADDED

Binary file (5.65 kB).

__pycache__/dinov2_arch.cpython-310.pyc ADDED

Binary file (29.8 kB).

__pycache__/dinov2_arch.cpython-311.pyc ADDED

Binary file (54.4 kB).

__pycache__/dual_hybrid_vit.cpython-310.pyc ADDED

Binary file (6.56 kB).

__pycache__/dual_hybrid_vit.cpython-311.pyc ADDED

Binary file (13.4 kB).

__pycache__/enable_cpe_support.cpython-310.pyc ADDED

Binary file (4.52 kB).

__pycache__/enable_cpe_support.cpython-311.pyc ADDED

Binary file (7.87 kB).

__pycache__/enable_spectral_reparam.cpython-310.pyc ADDED

Binary file (8.99 kB).

__pycache__/enable_spectral_reparam.cpython-311.pyc ADDED

Binary file (17.1 kB).

__pycache__/eradio_model.cpython-310.pyc ADDED

Binary file (39.4 kB).

__pycache__/eradio_model.cpython-311.pyc ADDED

Binary file (73.7 kB).

__pycache__/extra_models.cpython-310.pyc ADDED

Binary file (7.07 kB).

__pycache__/extra_models.cpython-311.pyc ADDED

Binary file (11.6 kB).

__pycache__/extra_timm_models.cpython-310.pyc ADDED

Binary file (7.52 kB).

__pycache__/extra_timm_models.cpython-311.pyc ADDED

Binary file (12.4 kB).

__pycache__/feature_normalizer.cpython-310.pyc ADDED

Binary file (4.8 kB).

__pycache__/feature_normalizer.cpython-311.pyc ADDED

Binary file (8.78 kB).

__pycache__/forward_intermediates.cpython-310.pyc ADDED

Binary file (4.38 kB).

__pycache__/forward_intermediates.cpython-311.pyc ADDED

Binary file (7.09 kB).

__pycache__/hf_model.cpython-310.pyc ADDED

Binary file (6.44 kB).

__pycache__/hf_model.cpython-311.pyc ADDED

Binary file (10.3 kB).

__pycache__/image_processing_vectorllm.cpython-310.pyc ADDED

Binary file (3.9 kB).

__pycache__/image_processing_vectorllm.cpython-311.pyc ADDED

Binary file (6.33 kB).

__pycache__/input_conditioner.cpython-310.pyc ADDED

Binary file (1.52 kB).

__pycache__/input_conditioner.cpython-311.pyc ADDED

Binary file (2.46 kB).

__pycache__/modeling_vectorllm.cpython-310.pyc ADDED

Binary file (8.18 kB).

__pycache__/modeling_vectorllm.cpython-311.pyc ADDED

Binary file (15.3 kB).

__pycache__/open_clip_adaptor.cpython-310.pyc ADDED

Binary file (1.49 kB).

__pycache__/open_clip_adaptor.cpython-311.pyc ADDED

Binary file (2.31 kB).

__pycache__/processing_vectorllm.cpython-310.pyc ADDED

Binary file (4.22 kB).

__pycache__/processing_vectorllm.cpython-311.pyc ADDED

Binary file (6.91 kB).

__pycache__/radio_model.cpython-310.pyc ADDED

Binary file (10.5 kB).