OpenCausaLab/CauGym

CauGym is a language model specialized for causal inference. It was trained with GRPO (Group Relative Policy Optimization) using the VERL framework (https://github.com/verl-project/verl).
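For readers unfamiliar with GRPO, the core idea is that each sampled answer is scored relative to the other answers drawn for the same prompt, rather than against a learned value function. The snippet below is a minimal sketch of that group-relative normalization only. The function name, the `eps` constant, the use of the population standard deviation, and the 0/1 correctness reward are illustrative assumptions; none of this is taken from VERL's code or from the training setup of this model.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and spread of its own group.

    `rewards` holds one scalar reward per answer sampled for the same prompt.
    `eps` (an assumed constant) avoids division by zero when all rewards match.
    """
    if not rewards:
        return []
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical group: four sampled answers to one causal question,
# rewarded 1.0 when the final answer is correct and 0.0 otherwise.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct answers in the group receive a positive advantage and incorrect ones a negative advantage. When every answer in a group gets the same reward, all advantages are zero and that group contributes no learning signal.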

Model Details

  • Developed by: OpenCausaLab
  • Model type: LLM
  • Language(s) (NLP): English

Model Sources

Evaluation

We evaluated this model on the CALM benchmark and the CauGym benchmark. The evaluation metric is accuracy, reported separately for each causal query type.

Benchmark             ATE    CDE    ETT    NDE    NIE    PN     PS
CALM                  0.990  0.994  0.900  0.940  0.930  0.928  0.866
CauGym-rephrased      0.948  0.982  0.856  0.890  0.888  0.778  0.816
CauGym-ommitted       0.935  0.963  0.837  0.934  0.838  0.900  0.907
CauGym-deconfounding  0.976  0.986  0.854  0.572  0.872  0.952  0.848
CauGym-redundant      0.972  0.966  0.918  0.850  0.888  0.934  0.910
CauGym-insufficient   0.884  0.902  0.686  0.696  0.958  0.940  0.954
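
Per-type accuracy of the kind reported in the table is the fraction of questions of each query type that the model answers correctly. The following is a minimal sketch of that computation. The record layout `(query_type, predicted, gold)`, the function name, and the sample records are hypothetical, and exact-match comparison is an assumption; this is not the benchmarks' official scoring code.

```python
from collections import defaultdict


def accuracy_by_type(records):
    """Return {query_type: accuracy} for (query_type, predicted, gold) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for query_type, predicted, gold in records:
        total[query_type] += 1
        # Exact-match scoring is an assumption made for this sketch.
        correct[query_type] += int(predicted == gold)
    return {t: correct[t] / total[t] for t in total}


# Hypothetical records; the labels reuse the table's column names.
records = [
    ("ATE", "yes", "yes"),
    ("ATE", "no", "yes"),
    ("PN", "0.25", "0.25"),
    ("PN", "0.40", "0.40"),
]
scores = accuracy_by_type(records)
```

Keeping the counts per query type makes it easy to spot a single weak column that an overall average would hide.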

Citation

@misc{chen2026posttrainingtransformllmscausal,
      title={Can Post-Training Transform LLMs into Causal Reasoners?}, 
      author={Junqi Chen and Sirui Chen and Chaochao Lu},
      year={2026},
      eprint={2602.06337},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2602.06337}, 
}

Model size: 15B params (Safetensors)
Tensor type: F32