	# Scripts Directory

	This directory contains utility scripts for the Pico training framework.

	## generate_data.py

	A script to automatically generate `data.json` from training log files for the dashboard.

	### What it does

	This script parses log files from the `runs/` directory and extracts:
	- Training metrics: Loss, learning rate, and inf/NaN counts at each step
	- Evaluation results: Paloma evaluation metrics
	- Model configuration: Architecture parameters (d_model, n_layers, etc.)

	### Usage

	```bash
	# Generate data.json from the default runs directory
	python scripts/generate_data.py

	# Specify custom runs directory
	python scripts/generate_data.py --runs-dir /path/to/runs

	# Specify custom output file
	python scripts/generate_data.py --output /path/to/output.json
	```
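
For orientation, the two flags above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the script's actual source: the default values `runs` and `data.json` and the helper name `build_parser` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser exposing the two documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(
        description="Generate data.json from training log files."
    )
    # Assumed default: a "runs" directory relative to the working directory.
    parser.add_argument("--runs-dir", default="runs",
                        help="directory containing one subdirectory per run")
    # Assumed default: "data.json" in the working directory.
    parser.add_argument("--output", default="data.json",
                        help="path of the JSON file to write")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args(["--runs-dir", "/path/to/runs"])
print(args.runs_dir, args.output)
```

Note that `argparse` exposes `--runs-dir` as the attribute `runs_dir`, with the hyphen converted to an underscore.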

	### How it works

	1. Scans runs directory: Looks for subdirectories containing training runs
	2. Finds log files: Locates `.log` files in each run's `logs/` subdirectory
	3. Parses log content: Uses regex patterns to extract structured data
	4. Generates JSON: Creates a structured JSON file for the dashboard
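
Steps 1 and 2 above can be sketched with `pathlib`. The function name `find_log_files` is hypothetical, and the real script may discover files differently; the sketch only assumes the `runs/<run>/logs/*.log` layout described here.

```python
from pathlib import Path


def find_log_files(runs_dir):
    """Map each run name to the sorted .log files in its logs/ subdirectory."""
    found = {}
    root = Path(runs_dir)
    if not root.is_dir():
        return found
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        logs_dir = run_dir / "logs"
        if not logs_dir.is_dir():
            continue  # a run without a logs/ folder is skipped
        logs = sorted(logs_dir.glob("*.log"))
        if logs:
            found[run_dir.name] = logs
    return found
```

Runs without a `logs/` subdirectory, or with no `.log` files in it, are simply left out of the result.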

	### Log Format Requirements

	The script expects training metric entries in the following format:

	```
	2025-08-29 02:09:12 - pico-train - INFO - Step 500 -- 🔄 Training Metrics
	2025-08-29 02:09:12 - pico-train - INFO - ├── Loss: 10.8854
	2025-08-29 02:09:12 - pico-train - INFO - ├── Learning Rate: 3.13e-06
	2025-08-29 02:09:12 - pico-train - INFO - └── Inf/NaN count: 0
	```
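
A minimal sketch of how such a block could be parsed with regular expressions. The patterns and the name `parse_training_metrics` are illustrative assumptions, not the script's actual regexes; the output keys follow the `training_metrics` entries of `data.json`.

```python
import re

# Any "Step N -- <title>" header line; the title decides the block type.
HEADER_RE = re.compile(r"Step (\d+) -- (.*)")
FIELD_RES = {
    "loss": (re.compile(r"Loss: (\S+)"), float),
    "learning_rate": (re.compile(r"Learning Rate: (\S+)"), float),
    "inf_nan_count": (re.compile(r"Inf/NaN count: (\d+)"), int),
}


def parse_training_metrics(lines):
    """Collect one dict per 'Training Metrics' block found in the log lines."""
    records = []
    current = None
    for line in lines:
        header = HEADER_RE.search(line)
        if header:
            # Only training blocks open a record; other headers close it.
            if "Training Metrics" in header.group(2):
                current = {"step": int(header.group(1))}
                records.append(current)
            else:
                current = None
            continue
        if current is None:
            continue
        for key, (pattern, convert) in FIELD_RES.items():
            match = pattern.search(line)
            if match:
                current[key] = convert(match.group(1))
    return records


sample = """\
2025-08-29 02:09:12 - pico-train - INFO - Step 500 -- 🔄 Training Metrics
2025-08-29 02:09:12 - pico-train - INFO - ├── Loss: 10.8854
2025-08-29 02:09:12 - pico-train - INFO - ├── Learning Rate: 3.13e-06
2025-08-29 02:09:12 - pico-train - INFO - └── Inf/NaN count: 0
""".splitlines()

print(parse_training_metrics(sample))
# [{'step': 500, 'loss': 10.8854, 'learning_rate': 3.13e-06, 'inf_nan_count': 0}]
```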

	Evaluation results are expected in this format:

	```
	2025-08-29 02:15:26 - pico-train - INFO - Step 1000 -- 📊 Evaluation Results
	2025-08-29 02:15:26 - pico-train - INFO - └── paloma: 7.125172406420199e+27
	```
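
Evaluation blocks can be read the same way. The sketch below assumes every metric appears as a `name: value` line under the header, as `paloma` does above; the function name and patterns are hypothetical.

```python
import re

EVAL_HEADER_RE = re.compile(r"Step (\d+) -- .*Evaluation Results")
ANY_HEADER_RE = re.compile(r"Step \d+ -- ")
# A tree-drawing prefix followed by "name: value" (assumed metric layout).
METRIC_RE = re.compile(r"[├└]── (\w+): (\S+)")


def parse_evaluation_results(lines):
    """Collect one dict per 'Evaluation Results' block found in the log lines."""
    results = []
    current = None
    for line in lines:
        header = EVAL_HEADER_RE.search(line)
        if header:
            current = {"step": int(header.group(1))}
            results.append(current)
        elif ANY_HEADER_RE.search(line):
            current = None  # a different block type started
        elif current is not None:
            metric = METRIC_RE.search(line)
            if metric:
                current[metric.group(1)] = float(metric.group(2))
    return results


sample = [
    "2025-08-29 02:15:26 - pico-train - INFO - Step 1000 -- 📊 Evaluation Results",
    "2025-08-29 02:15:26 - pico-train - INFO - └── paloma: 7.125172406420199e+27",
]
print(parse_evaluation_results(sample))
# [{'step': 1000, 'paloma': 7.125172406420199e+27}]
```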

	### Output Format

	The generated `data.json` has this structure:

	```json
	{
	  "runs": [
	    {
	      "run_name": "model-name",
	      "log_file": "log_filename.log",
	      "training_metrics": [
	        {
	          "step": 0,
	          "loss": 10.9914,
	          "learning_rate": 0.0,
	          "inf_nan_count": 0
	        }
	      ],
	      "evaluation_results": [
	        {
	          "step": 1000,
	          "paloma": 59434.76600609756
	        }
	      ],
	      "config": {
	        "d_model": 96,
	        "n_layers": 12,
	        "max_seq_len": 2048,
	        "vocab_size": 50304,
	        "lr": 0.0003,
	        "max_steps": 200000,
	        "batch_size": 8
	      }
	    }
	  ],
	  "summary": {
	    "total_runs": 1,
	    "run_names": ["model-name"]
	  }
	}
	```
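
To sanity-check a generated file against this structure, something like the following could be used. It is a sketch: `check_data` is a hypothetical helper, and the rule that `summary` must agree with the `runs` list is an assumption drawn from the example above.

```python
import json

RUN_KEYS = {"run_name", "log_file", "training_metrics",
            "evaluation_results", "config"}


def check_data(data):
    """Return a list of problems; an empty list means the shape matches."""
    problems = []
    runs = data.get("runs", [])
    for i, run in enumerate(runs):
        missing = RUN_KEYS - run.keys()
        if missing:
            problems.append(f"runs[{i}] is missing {sorted(missing)}")
    summary = data.get("summary", {})
    # Assumption: the summary is derived from the runs list.
    if summary.get("total_runs") != len(runs):
        problems.append("summary.total_runs does not match len(runs)")
    names = sorted(r.get("run_name", "") for r in runs)
    if sorted(summary.get("run_names", [])) != names:
        problems.append("summary.run_names does not match the runs list")
    return problems


example = json.loads("""
{
  "runs": [{
    "run_name": "model-name",
    "log_file": "log_filename.log",
    "training_metrics": [
      {"step": 0, "loss": 10.9914, "learning_rate": 0.0, "inf_nan_count": 0}
    ],
    "evaluation_results": [{"step": 1000, "paloma": 59434.76600609756}],
    "config": {"d_model": 96, "n_layers": 12}
  }],
  "summary": {"total_runs": 1, "run_names": ["model-name"]}
}
""")
print(check_data(example))  # []
```

For a real file, replace the inline string with `json.load(open(path, encoding="utf-8"))`.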

	### When to use

	- After training: Generate updated dashboard data
	- Adding new runs: Include new training sessions in the dashboard
	- Debugging: Verify log parsing is working correctly
	- Dashboard setup: Initial setup of the training metrics dashboard

	### Troubleshooting

	If the script doesn't find any data:
	1. Check that log files exist in `runs/*/logs/`
	2. Verify log format matches the expected pattern
	3. Ensure log files contain training metrics entries
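	4. Check that the log files are readable and correctly encoded (the expected lines contain emoji and box-drawing characters, so UTF-8 is the likely encoding)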
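
The checks above can be run in one pass with a small diagnostic like the one below. This is a sketch under stated assumptions: `diagnose` is a hypothetical helper, logs are assumed to be UTF-8, and an entry is counted by looking for the `Training Metrics` header text.

```python
from pathlib import Path


def diagnose(runs_dir="runs"):
    """Return one status line per log file found under <runs_dir>/*/logs/."""
    log_paths = sorted(Path(runs_dir).glob("*/logs/*.log"))
    if not log_paths:
        return [f"no .log files under {runs_dir}/*/logs/"]
    report = []
    for path in log_paths:
        try:
            # Assumption: logs are UTF-8 (they contain emoji).
            text = path.read_text(encoding="utf-8")
        except (OSError, UnicodeDecodeError) as exc:
            report.append(f"{path.name}: unreadable ({type(exc).__name__})")
            continue
        count = sum("Training Metrics" in line for line in text.splitlines())
        report.append(f"{path.name}: {count} training metric entries")
    return report


if __name__ == "__main__":
    for line in diagnose():
        print(line)
```

A count of `0` for a file points to a format mismatch (items 2 and 3), while `unreadable` points to permissions or encoding (item 4).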