# Earthmind-R1
EarthMind-4B fine-tuned with GRPO (Group Relative Policy Optimization) for geospatial visual question answering.
## Model Details
- **Base Model**: EarthMind-4B (InternVL-based architecture)
- **Training Method**: GRPO with LoRA adapters
- **Training Data**: Geospatial instruction dataset
- **Output Format**: Chain-of-thought reasoning with `<think>` and `<answer>` tags
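
Because the reasoning and the final answer are wrapped in separate tags, downstream code usually needs to split them apart. The following is a minimal sketch, assuming the response is a plain string and each tag pair appears at most once; `parse_response` is a hypothetical helper, not part of the model's API.

```python
import re


def parse_response(text):
    """Split a response into (thinking, answer).

    Either element is None when its tag pair is missing.
    """
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else None,
    )


# Hypothetical response text, used only to illustrate the format
example = "<think>Runways and taxiways are visible.</think> <answer>An airport</answer>"
thinking, answer = parse_response(example)
print(answer)
```

Returning `None` for a missing tag pair, rather than raising, lets the caller decide how to handle responses that ignore the format instruction.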
## Usage
```python
import torch
from transformers import AutoModel, AutoTokenizer
from PIL import Image
model = AutoModel.from_pretrained(
"aadex/Earthmind-R1",
trust_remote_code=True,
torch_dtype=torch.bfloat16,
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("aadex/Earthmind-R1", trust_remote_code=True)
# Prepare for generation
model.preparing_for_generation(tokenizer=tokenizer, max_new_tokens=512, torch_dtype=torch.bfloat16)
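# Load your image
image = Image.open("your_image.jpg").convert("RGB")
# Create prompt
question = "Describe what you see in this satellite image."
prompt = f"""User: <image>
{question} First output the thinking process in <think> </think> tags and then output the final answer in <answer> </answer> tags.
Assistant:"""
# Preprocess the image. Assumption: InternVL-style input (a single 448x448 tile,
# ImageNet mean/std); check the model's remote code for the exact preprocessing.
import numpy as np
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
pixels = (np.asarray(image.resize((448, 448)), dtype=np.float32) / 255.0 - mean) / std
pixel_values = torch.from_numpy(pixels).permute(2, 0, 1).unsqueeze(0).to(model.device, torch.bfloat16)
# Assumption: InternVL-style generation settings passed as a plain dict
generation_config = dict(max_new_tokens=512, do_sample=False)
# Generate with the model's chat method. `prompt` above is the full text for manual
# generation; `model.chat` is assumed to apply its own chat template to `question`.
response = model.chat(tokenizer, pixel_values, question, generation_config)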
print(response)
```
## Training
Trained using GRPO with:
- LoRA rank: 16
- LoRA alpha: 32
- Learning rate: 5e-6
- Epochs: 3
- Reward functions: accuracy, format
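
The card does not spell out how the two rewards are computed. The sketch below shows one common way such rewards are defined in GRPO-style training, together with the group-relative advantage that gives GRPO its name. The function names, the exact-match accuracy criterion, the population standard deviation, and the example completions are all assumptions for illustration, not the training code used for this model.

```python
import re
from statistics import mean, pstdev

# A completion is well formed if it is exactly one <think> block followed by one <answer> block
FORMAT_RE = re.compile(r"\s*<think>.*?</think>\s*<answer>.*?</answer>\s*", re.DOTALL)
ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def format_reward(completion):
    """1.0 if the completion follows the <think>/<answer> layout, else 0.0."""
    return 1.0 if FORMAT_RE.fullmatch(completion) else 0.0


def accuracy_reward(completion, reference):
    """1.0 if the tagged answer matches the reference (case-insensitive), else 0.0."""
    match = ANSWER_RE.search(completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip().lower() == reference.strip().lower() else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group of completions for a single prompt whose reference answer is "airport"
completions = [
    "<think>Long paved strips.</think><answer>airport</answer>",
    "<think>Many boats.</think><answer>harbor</answer>",
    "airport",
]
rewards = [accuracy_reward(c, "airport") + format_reward(c) for c in completions]
advantages = group_relative_advantages(rewards)
print(rewards, advantages)
```

Completions that score above the group mean receive a positive advantage and those below it a negative one, so no separate value model is needed to provide a baseline.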
## License
Please refer to the base EarthMind-4B model license.