---
license: other
license_name: hippocratic-license
license_link: >-
  https://firstdonoharm.dev/version/3/0/cl-eco-extr-ffd-law-media-mil-my-soc-sv-tal-usta.html
datasets:
- BASH-Lab/OpenSQA
language:
- en
base_model:
- lmsys/vicuna-7b-v1.5
pipeline_tag: question-answering
---

# LLaSA-7B

LLaSA-7B is a large language and sensor assistant that interprets raw IMU data to answer questions about human activities.

## Abstract

Wearable systems can recognize activities from IMU data but often fail to explain their underlying causes or contextual significance. To address this limitation, we introduce two large-scale resources: SensorCap, comprising 35,960 IMU–caption pairs, and OpenSQA, with 199,701 question–answer pairs designed for causal and explanatory reasoning. OpenSQA includes a curated tuning split (Tune-OpenSQA) optimized for scientific accuracy, narrative clarity, and diagnostic insight. Leveraging these datasets, we develop LLaSA (Large Language and Sensor Assistant), a family of compact sensor-aware language models (7B and 13B) that generate interpretable, context-rich responses to open-ended questions grounded in raw IMU data. LLaSA outperforms commercial LLMs, including GPT-3.5 and GPT-4o-mini, on benchmark and real-world tasks, demonstrating the effectiveness of domain supervision and model alignment for sensor reasoning.

### Model Summary

- **Developed by:** BASH Lab, WPI
- **Model type:** sensor-text-to-text
- **Language(s) (NLP):** English
- **Finetuned from model:** lmsys/vicuna-7b-v1.5

### Model Sources

- **Repository:** https://github.com/BASHLab/LLaSA
- **Paper:** https://arxiv.org/abs/2406.14498
- **Project Website:** https://bashlab.github.io/llasa_project/

### Usage

```bash
git clone https://github.com/BASHLab/LLaSA.git
cd LLaSA/LLaSA
pip install -e .
hf download BASH-Lab/LLaSA-7B --local-dir LLaSA-7B
```
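
If you prefer to stay in Python, the weights can also be fetched with `huggingface_hub` instead of the CLI. This is a minimal sketch; downloading into a local `LLaSA-7B` directory is an assumption chosen to match the `model_path` used in the example below.

```python
from huggingface_hub import snapshot_download

# Download the model repository into ./LLaSA-7B so the inference
# example below can point model_path at a local directory.
snapshot_download(repo_id="BASH-Lab/LLaSA-7B", local_dir="LLaSA-7B")
```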

You can run any of the inference scripts (zero-shot classification or question answering) by following the scripts in the `eval` subdirectory of the LLaSA GitHub repository, or run a single sample as follows.

```python
from llava.eval.run_llava import eval_model
from llava.mm_utils import get_model_name_from_path

sensor_reading = "imu.npy"  # 20 Hz, 6 s IMU window (shape: (120, 6))
prompt = "Narrate this activity by analyzing the data."
model_path = "LLaSA-7B"  # local directory created by the download step above

# eval_model expects an argparse-style namespace; the sensor reading is
# passed through the image_file field.
args = type('Args', (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": prompt,
    "conv_mode": None,
    "image_file": sensor_reading,
    "sep": ",",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 300
})()

llasa_answer = eval_model(args)
print(llasa_answer)
```
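
The example reads the sensor window from a `.npy` file. Below is a minimal sketch of preparing one, assuming a 6-axis IMU whose three accelerometer and three gyroscope channels are concatenated column-wise; the channel order here is an assumption, so check the preprocessing scripts in the repository for the exact format.

```python
import numpy as np

# Hypothetical stand-in for real sensor data: 120 samples (20 Hz) of
# 3-axis accelerometer and 3-axis gyroscope readings.
accel = np.random.randn(120, 3).astype(np.float32)
gyro = np.random.randn(120, 3).astype(np.float32)

window = np.concatenate([accel, gyro], axis=1)  # shape (120, 6)
np.save("imu.npy", window)
```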

## Citation

**BibTeX:**

```bibtex
@article{imran2024llasa,
  title={LLaSA: A Sensor-Aware LLM for Natural Language Reasoning of Human Activity from IMU Data},
  author={Imran, Sheikh Asif and Khan, Mohammad Nur Hossain and Biswas, Subrata and Islam, Bashima},
  journal={arXiv preprint arXiv:2406.14498},
  year={2024}
}
```