| title: README | |
| emoji: 📚 | |
| colorFrom: blue | |
| colorTo: pink | |
| sdk: static | |
| pinned: false | |
| # Knowledge Tracing with Math Solutions | |
| ## Motivation | |
| Knowledge Tracing (KT) is a core research task that models the evolution of a learner’s knowledge state based on their problem-solving history. | |
| This capability is essential for **Intelligent Tutoring Systems (ITS)** to provide adaptive feedback and personalized guidance. | |
| Traditional KT research has primarily relied on student–item interactions in the form of binary correctness (1/0). | |
| While deep learning-based models such as **DKT, SAINT, and AKT** have brought notable improvements, | |
| they still face **limitations in transferability and generalization** across datasets. | |
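To make the binary-correctness input format concrete, here is a minimal sketch of how an interaction history is commonly flattened into token indices for DKT-style models. The function name `encode_interactions` and the toy history are illustrative assumptions, not part of this project's code.

```python
from typing import List, Tuple


def encode_interactions(history: List[Tuple[int, int]], num_items: int) -> List[int]:
    """Map each (item_id, correct) pair to one index in [0, 2 * num_items).

    Indices below num_items mean "answered incorrectly"; indices at or above
    num_items mean "answered correctly". This is a common DKT-style encoding,
    shown here only as an illustration.
    """
    encoded = []
    for item_id, correct in history:
        if not 0 <= item_id < num_items:
            raise ValueError(f"item_id {item_id} out of range")
        if correct not in (0, 1):
            raise ValueError(f"correct must be 0 or 1, got {correct}")
        encoded.append(item_id + correct * num_items)
    return encoded


# Hypothetical learner: item 0 correct, item 2 incorrect, then item 2 correct.
history = [(0, 1), (2, 0), (2, 1)]
print(encode_interactions(history, num_items=3))  # [3, 2, 5]
```

Note that the only signals available to the model in this format are the item identity and a single correctness bit.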
| --- | |
| ## Challenges | |
| KT continues to face long-standing issues: | |
| - **Cold start problem** | |
| - **Lack of interpretability** | |
| Recent approaches have introduced natural language as a new modality: | |
| - **LKT**: models questions as natural language prompts to mitigate cold start | |
| - **EFKT**: applies cognitive frameworks to enhance interpretability | |
| - **LBMKT**: uses LLM encoders to summarize a learner’s knowledge state in natural language | |
| These works suggest the potential of natural language to overcome KT limitations, but their performance gains remain modest. | |
| --- | |
| ## Related Progress in Programming Education | |
| KT for programming education has seen stronger improvements by leveraging **richer interaction data** such as: | |
| - Students’ code submissions | |
| - Textual questions | |
| Recent studies that integrate these signals into KT architectures report significant improvements. | |
| For example, an **ACL 2025 paper** showed that using student question texts as input yielded **state-of-the-art performance** on KT for programming education. | |
| --- | |
| ## Advances in LLMs | |
| Recent LLMs produce more **systematic and consistent step-by-step reasoning**, helped by reinforcement learning and alignment methods: | |
| - **Math-Shepherd**: trained a process reward model on automatically constructed step-level supervision → substantial gains on GSM8K and MATH | |
| - **PRM-Guided GFlowNets**: improved reasoning trace quality and diversity → better generalization on unseen datasets | |
| --- | |
| ## Our Approach | |
| Building on these developments, our project integrates **LLM-generated step-by-step math solutions** into KT inputs. | |
| This provides **richer interaction signals** beyond simple correctness. | |
| **Hypothesis:** | |
| Modeling student–item interactions with synthesized solutions can break through the current performance ceiling of KT models. | |
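One way to picture the enriched input is sketched below: each interaction carries the question text and a list of solution steps alongside the correctness label. Everything here is a hypothetical illustration. The `Interaction` fields, the `attach_solutions` and `to_model_input` helpers, the serialization format, and the `stub_solver` that stands in for an LLM call are all assumptions, not the project's actual pipeline.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Interaction:
    item_id: int
    question: str
    correct: int  # 1 = correct, 0 = incorrect
    solution_steps: List[str] = field(default_factory=list)


def attach_solutions(
    history: List[Interaction], solver: Callable[[str], List[str]]
) -> List[Interaction]:
    """Return new interactions with solver-generated steps attached.

    `solver` is a placeholder for an LLM call that returns solution steps.
    """
    return [
        Interaction(i.item_id, i.question, i.correct, solver(i.question))
        for i in history
    ]


def to_model_input(interaction: Interaction) -> str:
    """Serialize one interaction as text (an assumed format, for illustration)."""
    steps = " ".join(
        f"Step {n}: {s}" for n, s in enumerate(interaction.solution_steps, start=1)
    )
    label = "correct" if interaction.correct == 1 else "incorrect"
    return f"Question: {interaction.question} Solution: {steps} Response: {label}"


def stub_solver(question: str) -> List[str]:
    # Hard-coded steps standing in for an LLM-generated solution.
    return ["Subtract 3 from both sides.", "Divide both sides by 2."]


history = [Interaction(item_id=7, question="Solve 2x + 3 = 11.", correct=0)]
enriched = attach_solutions(history, stub_solver)
print(to_model_input(enriched[0]))
```

Compared with a bare correctness bit, the serialized text exposes which reasoning steps an item requires, which is the kind of richer signal the hypothesis relies on.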
| --- | |
| ## Research Question | |
| > Can incorporating LLM-generated mathematical solutions into KT inputs | |
| > push Knowledge Tracing beyond its existing limitations? | |