---
title: README
emoji: 📚
colorFrom: blue
colorTo: pink
sdk: static
pinned: false
---
# Knowledge Tracing with Math Solutions
## Motivation
Knowledge Tracing (KT) is a core research task that models the evolution of a learner’s knowledge state based on their problem-solving history.
This capability is essential for **Intelligent Tutoring Systems (ITS)** to provide adaptive feedback and personalized guidance.
Traditional KT research has primarily relied on student–item interactions in the form of binary correctness (1/0).
While deep learning-based models such as **DKT, SAINT, and AKT** have brought notable improvements,
they still face **limitations in transferability and generalization** across datasets.

---
## Challenges
KT continues to face long-standing issues:
- **Cold start problem**
- **Lack of interpretability**

Recent approaches have introduced natural language as a new modality:
- **LKT**: models questions as natural language prompts to mitigate cold start
- **EFKT**: applies cognitive frameworks to enhance interpretability
- **LBMKT**: uses LLM encoders to summarize a learner’s knowledge state in natural language

These works suggest the potential of natural language to overcome KT limitations, but their performance gains remain modest.

---
## Related Progress in Programming Education
KT for programming education has seen larger gains by leveraging **richer interaction data** such as:
- Students’ code submissions
- Textual questions

Recent studies integrating these signals into KT architectures have shown significant improvements.
For example, an **ACL 2025 paper** showed that incorporating students' question texts yielded **state-of-the-art performance** in programming education KT.

---
## Advances in LLMs
Reinforcement learning and alignment techniques have enabled recent LLMs to produce more **systematic and consistent step-by-step reasoning**:
- **Math-Shepherd**: leveraged verifiable reward signals → substantial gains on GSM8K and MATH
- **PRM-Guided GFlowNets**: improved reasoning trace quality and diversity → better generalization on unseen datasets
---
## Our Approach
Building on these developments, our project integrates **LLM-generated step-by-step math solutions** into KT inputs.
This provides **richer interaction signals** beyond simple correctness.
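
To make the contrast with binary correctness concrete, the sketch below shows one way an interaction record could carry a generated solution alongside the 1/0 label. It is only an illustration: the project does not specify an input format, so the `Interaction` class, its field names, the `[STEP]` separator, and both helper functions are hypothetical, and how the resulting strings would be encoded by a KT model is left open.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Interaction:
    """One student-item interaction (hypothetical schema, for illustration only)."""
    question_id: int
    correct: int  # binary correctness: 1 = correct, 0 = incorrect
    # LLM-generated step-by-step solution; the list-of-strings format is an assumption
    solution_steps: List[str] = field(default_factory=list)


def to_binary_input(history: List[Interaction]) -> List[Tuple[int, int]]:
    """Traditional KT input: (question_id, correct) pairs only."""
    return [(it.question_id, it.correct) for it in history]


def to_text_input(history: List[Interaction], sep: str = " [STEP] ") -> List[str]:
    """Enriched input: one string per interaction, with solution steps appended."""
    rows = []
    for it in history:
        steps = sep.join(it.solution_steps) if it.solution_steps else "(no solution)"
        rows.append(f"Q{it.question_id} | correct={it.correct} | {steps}")
    return rows


history = [
    Interaction(101, 1, ["Subtract 3 from both sides: 2x = 8", "Divide by 2: x = 4"]),
    Interaction(102, 0),
]
print(to_binary_input(history))
print(to_text_input(history))
```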

**Hypothesis:**
Modeling student–item interactions with synthesized solutions can break through the current performance ceiling of KT models.

---
## Research Question
> Can incorporating LLM-generated mathematical solutions into KT inputs
> push Knowledge Tracing beyond its existing limitations?