# Official models of "MoChat: Joints-Grouped Spatio-Temporal Grounding LLM for Multi-Turn Motion Comprehension and Description"

## Overview
MoChat is a Multimodal Large Language Model (MLLM) for human motion understanding with precise spatio-temporal grounding. Unlike conventional motion analysis systems, MoChat integrates:
- **Motion Understanding**: Performs fundamental motion comprehension and summarization.
- **Spatial Limb Grounding**: Accurately locates the body parts involved in described movements.
- **Temporal Action Grounding**: Precisely identifies the time boundaries corresponding to specific motion descriptions (illustrated in the sketch below).
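A minimal sketch of what a multi-turn exchange covering these three capabilities might look like. The prompts, responses, and frame numbers here are illustrative assumptions, not outputs of the released models; see the codebase linked below for the actual inference interface:

```python
# Illustrative only: prompts, responses, and frame indices are assumed
# for exposition; they are not produced by the released MoChat models.
dialogue = [
    # Motion understanding: summarize the whole sequence.
    ("Describe the motion.",
     "The person raises both arms overhead, then squats down."),
    # Spatial limb grounding: name the body parts performing a sub-action.
    ("Which limbs perform the raising motion?",
     "Both arms (the left and right upper limbs)."),
    # Temporal action grounding: locate the frame span of a sub-action.
    ("When does the squat occur?",
     "Approximately frames 45-90 of the sequence."),
]

for question, answer in dialogue:
    print(f"User: {question}\nMoChat: {answer}\n")
```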

## Models
We provide the following trained models for download (a download sketch follows this list):
- **[Joints-Grouped Skeleton Encoder](https://huggingface.co/CSUBioGroup/MoChat/blob/main/JGSE_epoch120)** for motion sequence representation.
- Two variants of the motion comprehension model:
  - [MoChat](https://huggingface.co/CSUBioGroup/MoChat/tree/main/MoChat): Base model.
  - [MoChat-R](https://huggingface.co/CSUBioGroup/MoChat/tree/main/MoChat-R): Extended model with a regression head.
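A minimal sketch of fetching these checkpoints programmatically with `huggingface_hub` (the repository id comes from the links above; loading the weights into the model architecture is handled by the codebase linked below):

```python
# Sketch: download the released checkpoints from the Hugging Face Hub.
from huggingface_hub import hf_hub_download, snapshot_download

# Joints-Grouped Skeleton Encoder checkpoint (a single file).
jgse_path = hf_hub_download(
    repo_id="CSUBioGroup/MoChat",
    filename="JGSE_epoch120",
)

# MoChat and MoChat-R model directories.
mochat_dir = snapshot_download(
    repo_id="CSUBioGroup/MoChat",
    allow_patterns=["MoChat/*"],
)
mochat_r_dir = snapshot_download(
    repo_id="CSUBioGroup/MoChat",
    allow_patterns=["MoChat-R/*"],
)

print(jgse_path, mochat_dir, mochat_r_dir)
```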

## Resources
- **Codebase**: [GitHub](https://github.com/CSUBioGroup/MoChat)
- **Paper**: [arXiv](https://arxiv.org/abs/2410.11404)