	# Example to Finetune PLM on New Data

	We provide a step-by-step walkthrough for finetuning PLM on a custom dataset based on the high-level instructions in [training.md](training.md). For this example, we will finetune PLM-8B on a specific domain ([Radiology images](https://huggingface.co/datasets/unsloth/Radiology_mini)) and compare model performance before and after finetuning.

	### Setup
	Install required packages:
	```bash
	pip install datasets tqdm
	```


	### 1. Download dataset and prepare for training

```python
import json
import os

from datasets import load_dataset
from tqdm import tqdm


def convert_to_training_jsonl(dataset, split):
    """Save the images of one split and write a JSONL annotation file for it."""
    out_dir = "apps/plm/dummy_datasets/Radiology_mini"
    os.makedirs(f"{out_dir}/images", exist_ok=True)

    parsed_data = []
    for entry in tqdm(dataset[split]):
        # Save the image (single quotes inside the f-string keep this valid before Python 3.12)
        image_name = f"{entry['image_id']}.png"
        entry["image"].save(f"{out_dir}/images/{image_name}")

        # Create the training conversation template
        conversations = [
            {"from": "human", "value": "You are an expert radiographer. Describe accurately what you see in this image."},
            {"from": "assistant", "value": entry["caption"]},
        ]

        parsed_data.append({
            "image": image_name,
            "conversations": conversations,
        })

    # Write the JSONL file used for training / evaluation
    with open(f"{out_dir}/{split}.jsonl", "w") as f:
        for entry in parsed_data:
            f.write(json.dumps(entry) + "\n")


dataset = load_dataset("unsloth/Radiology_mini")
convert_to_training_jsonl(dataset, "train")
convert_to_training_jsonl(dataset, "test")
```

	After running this code, the training data will be ready for use with the codebase:
	```
	apps/plm/dummy_datasets/Radiology_mini
	├── train.jsonl
	├── test.jsonl
	├── images
│   ├── ROCOv2_2023_test_000022.png
│   ├── ROCOv2_2023_train_059888.png
│   ├── ...
	```

Each JSONL file contains one example per line in the required training format.
	```
	# train.jsonl
	{"image": "ROCOv2_2023_train_054311.png", "conversations": [{"from": "human", "value": "You are an expert radiographer. Describe accurately what you see in this image."}, {"from": "assistant", "value": "Panoramic radiography shows an osteolytic lesion in the right posterior maxilla with resorption of the floor of the maxillary sinus (arrows)."}]}
	{"image": "ROCOv2_2023_train_058916.png", "conversations": [{"from": "human", "value": "You are an expert radiographer. Describe accurately what you see in this image."}, {"from": "assistant", "value": "ERCP showing distal CBD compression. ERCP - endoscopic retrograde cholangiopancreatography; CBD - common bile duct"}]}
	...
	```
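Before training, it can help to sanity-check the generated files. The following is a minimal sketch, not part of the PLM codebase: the helper name `validate_jsonl` is hypothetical, and the role check assumes the single-turn human/assistant layout produced by the conversion script above.

```python
import json
from pathlib import Path


def validate_jsonl(jsonl_path, image_dir):
    """Return a list of human-readable problems found in a training JSONL file."""
    problems = []
    image_dir = Path(image_dir)
    with open(jsonl_path) as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                entry = json.loads(line)
            except json.JSONDecodeError as err:
                problems.append(f"line {line_no}: invalid JSON ({err.msg})")
                continue
            if not isinstance(entry, dict):
                problems.append(f"line {line_no}: expected a JSON object")
                continue
            # The "image" field is a file name relative to the image directory
            image = entry.get("image")
            if not image or not (image_dir / image).is_file():
                problems.append(f"line {line_no}: missing image {image!r}")
            # Assumption: one human turn followed by one assistant turn
            turns = entry.get("conversations") or []
            roles = [t.get("from") if isinstance(t, dict) else None for t in turns]
            if roles != ["human", "assistant"]:
                problems.append(f"line {line_no}: unexpected roles {roles}")
    return problems
```

Calling `validate_jsonl("apps/plm/dummy_datasets/Radiology_mini/train.jsonl", "apps/plm/dummy_datasets/Radiology_mini/images")` should return an empty list if the conversion succeeded.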


	### 2. Add dataset config to configs/datasets.yaml
	Point to the newly created data in [configs/datasets.yaml](../configs/datasets.yaml) by adding these lines at the bottom.
```yaml
radiology_finetune:
    annotation: apps/plm/dummy_datasets/Radiology_mini/train.jsonl
    root_dir: apps/plm/dummy_datasets/Radiology_mini/images
```

	### 3. Copy and modify the provided finetuning config
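The stage 3 configs in [configs/stage_3](../configs/stage_3) can be used to further finetune PLM. Copy the 8B config as a starting point:
```bash
# Create the target directory if it does not exist yet
mkdir -p apps/plm/configs/finetune
cp apps/plm/configs/stage_3/plm_8b.yaml apps/plm/configs/finetune/plm_8b_custom.yaml
```

Then modify the fields below in the copied config.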
```yaml
# Set the path to save checkpoints to
dump_dir: checkpoints/finetune_example/

# Total number of training iterations
steps: 500

# Pointer to the previously created datamix. Ideally, you would incorporate the new data
# into a larger datamix, but for now we finetune only on this data
data:
    datamix: radiology_finetune:1

# Pointer to the initial model weights
checkpoint:
    init_ckpt_path: facebook/Perception-LM-8B
```

Other parameters, such as the learning rate and batch size, can also be changed. See the comments in [configs/stage_3/plm_8b.yaml](../configs/stage_3/plm_8b.yaml) for details.
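When choosing `steps`, it can be useful to estimate how many passes over the data a given budget implies. The sketch below is a rough back-of-the-envelope calculation, not something taken from the PLM codebase. It assumes each training step consumes one global batch of whole examples (no sequence packing), and the batch size and dataset size in the example call are hypothetical placeholders; check your config and dataset for the real values.

```python
import math


def epochs_covered(steps, per_gpu_batch_size, num_gpus, num_examples, grad_acc_steps=1):
    """Approximate number of passes over the dataset for a given step budget."""
    global_batch = per_gpu_batch_size * num_gpus * grad_acc_steps
    return steps * global_batch / num_examples


def steps_for_epochs(target_epochs, per_gpu_batch_size, num_gpus, num_examples, grad_acc_steps=1):
    """Smallest number of steps that covers at least `target_epochs` passes."""
    global_batch = per_gpu_batch_size * num_gpus * grad_acc_steps
    return math.ceil(target_epochs * num_examples / global_batch)


# Hypothetical values: 8 GPUs, 2 examples per GPU, 2000 training examples
print(epochs_covered(steps=500, per_gpu_batch_size=2, num_gpus=8, num_examples=2000))  # prints 4.0
```

A budget that amounts to several epochs on a small dataset is one reason to expect overfitting, which is why mixing the new data into a larger datamix is recommended.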

	### 4. Finetune the model
	Finetune a model on a single node. For multi-node training, refer to the main [training.md](training.md) doc.
```bash
torchrun --nproc-per-node 8 -m apps.plm.train \
    config=apps/plm/configs/finetune/plm_8b_custom.yaml
```

	This will start training and save checkpoints, logs and configs in the previously specified `dump_dir`.
	```
	checkpoints/finetune_example/
	├── checkpoints
│   └── 0000000500
│       ├── __0_0.distcp
│       ├── __1_0.distcp
│       ├── ...
│       ├── params.json
│       ├── train_state_00000.json
│       ├── train_state_00001.json
│       ├── ...
	├── config.yaml
	├── metrics.jsonl
	└── train.log
	```
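To monitor training, `metrics.jsonl` can be inspected with a few lines of Python. This is a minimal sketch that assumes the file holds one JSON object per line, as the extension suggests. The metric key names are not documented here, so the helpers take the key as a parameter, and `available_keys` lists whatever was actually logged.

```python
import json


def load_metrics(path):
    """Read a JSONL metrics file into a list of dicts, skipping blank lines."""
    records = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def available_keys(records):
    """Return the sorted set of keys that appear in any record."""
    keys = set()
    for record in records:
        keys.update(record)
    return sorted(keys)


def metric_series(records, key):
    """Return the values logged under `key`, skipping records that lack it."""
    return [record[key] for record in records if key in record]
```

For example, `available_keys(load_metrics("checkpoints/finetune_example/metrics.jsonl"))` shows which metrics exist, and `metric_series` then extracts one of them for plotting.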

	### 5. Consolidate the checkpoint
Models trained with FSDP are saved as sharded checkpoints (the `.distcp` files above). Before inference, consolidate the weights into a single `consolidated.pth`.
	```bash
	python apps/plm/consolidate.py --ckpt checkpoints/finetune_example/checkpoints/0000000500/
	```

	### 6. Test and compare model generation
	Use the provided generate helper script to compare the base model (before finetuning) to the finetuned version on an unseen test image from the same dataset.

	```bash
python apps/plm/generate.py \
    --ckpt facebook/Perception-LM-8B \
    --media_type image \
    --media_path apps/plm/dummy_datasets/Radiology_mini/images/ROCOv2_2023_test_000022.png \
    --question 'You are an expert radiographer. Describe accurately what you see in this image.'

	# Generation:
	# The image is a medical scan of a person's abdomen, likely an MRI or CT scan. The scan shows the internal organs of the abdomen, including the liver, stomach, and intestines. The liver is located on the left side of the image, and it appears to be slightly enlarged. The stomach is located in the center of the image, and it appears to be normal in size. The intestines are located on the right side of the image, and they appear to be normal in size and shape. There are no visible abnormalities or tumors in the image. The scan is in black and white, with the organs appearing in shades of gray. The background of the image is black, which helps to highlight the details of the organs. Overall, the image suggests that the person's abdominal organs are healthy and normal.
	```


	```bash
python apps/plm/generate.py \
    --ckpt checkpoints/finetune_example/checkpoints/0000000500/ \
    --media_type image \
    --media_path apps/plm/dummy_datasets/Radiology_mini/images/ROCOv2_2023_test_000022.png \
    --question 'You are an expert radiographer. Describe accurately what you see in this image.'

	# Generation:
	# CT scan of the abdomen demonstrating a large liver metastasis (yellow arrow) in segment VII.
	```

Comparing the two, the finetuned model produces a concise description that follows the style of the training set. Note that we use the same prompt as in training because the dataset is small and the model has likely overfit to it. For robust training, include the new data in a larger data mix (e.g., our provided [SFT blend](../configs/stage_3/plm_8b.yaml)).


	### Wrap up
From here, the model is trained and ready for evaluation. The [generation script](../generate.py) can be modified to evaluate the model directly on the radiology image captioning task (test set) using captioning metrics (e.g., CIDEr). Alternatively, if trained with a larger SFT blend, it can be used for domain-specific QA (e.g., [VQA-Radiology](https://huggingface.co/datasets/flaviagiammarino/vqa-rad)).
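As a quick sanity check before setting up a full captioning metric, generated captions can be scored against the test captions with a simple token-overlap F1. Note that this is a deliberately simplified stand-in and not CIDEr: it ignores n-grams and TF-IDF weighting. The function names are hypothetical, and the sketch assumes you have already collected parallel lists of generated and reference captions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    """Unigram-overlap F1 between a generated caption and a reference caption."""
    pred, ref = tokenize(prediction), tokenize(reference)
    if not pred or not ref:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_token_f1(predictions, references):
    """Average token F1 over parallel lists of predictions and references."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    scores = [token_f1(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)
```

Running `mean_token_f1` for both the base and the finetuned model on the same test captions gives a rough before/after comparison. For reported results, an established captioning metric implementation is preferable.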