---
title: README
emoji: 📚
colorFrom: blue
colorTo: pink
sdk: static
pinned: false
---

# Knowledge Tracing with Math Solutions

## Motivation

Knowledge Tracing (KT) is a core research task that models the evolution of a learner’s knowledge state based on their problem-solving history. This capability is essential for **Intelligent Tutoring Systems (ITS)** to provide adaptive feedback and personalized guidance.

Traditional KT research has primarily relied on student–item interactions in the form of binary correctness (1/0). While deep learning-based models such as **DKT, SAINT, and AKT** have brought notable improvements, they still face **limitations in transferability and generalization** across datasets.

---

## Challenges

KT continues to face long-standing issues:

- **Cold start problem**
- **Lack of interpretability**

Recent approaches have introduced natural language as a new modality:

- **LKT**: models questions as natural language prompts to mitigate cold start
- **EFKT**: applies cognitive frameworks to enhance interpretability
- **LBMKT**: uses LLM encoders to summarize a learner’s knowledge state in natural language

These works suggest the potential of natural language to overcome KT limitations, but their performance gains remain modest.

---

## Related Progress in Programming Education

Programming education has seen stronger improvements by leveraging **richer interaction data** such as:

- Students’ code submissions
- Textual questions

Recent studies integrating these signals into KT architectures have shown significant improvements. For example, an **ACL 2025 paper** demonstrated that student question texts yielded **state-of-the-art performance** in programming education KT.


---

## Advances in LLMs

Recent LLMs have enabled more **systematic and consistent step-by-step reasoning** through reinforcement learning and alignment:

- **Math-Shepherd**: leveraged verifiable reward signals → substantial gains on GSM8K and MATH
- **PRM-Guided GFlowNets**: improved reasoning trace quality and diversity → better generalization on unseen datasets

---

## Our Approach

Building on these developments, our project integrates **LLM-generated step-by-step math solutions** into KT inputs. This provides **richer interaction signals** beyond simple correctness.

**Hypothesis:** Modeling student–item interactions with synthesized solutions can break through the current performance ceiling of KT models.

---

## Research Question

> Can incorporating LLM-generated mathematical solutions into KT inputs
> push Knowledge Tracing beyond its existing limitations?
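
---

## Illustrative Input Format (Sketch)

To make the contrast between binary-correctness inputs and solution-enriched inputs concrete, the following is a minimal sketch only. The `Interaction` schema, the `binary_view` and `enriched_view` helpers, and the text template are all assumptions made for illustration; they are not the project's actual data format or model. The solution steps are hand-written placeholders standing in for LLM-generated output.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Interaction:
    """One student-item interaction (hypothetical schema for illustration)."""
    question_id: str
    question_text: str
    correct: int  # 1 = correct, 0 = incorrect
    # Placeholder for LLM-generated step-by-step solution text.
    solution_steps: List[str] = field(default_factory=list)


def binary_view(history: List[Interaction]) -> List[Tuple[str, int]]:
    """Traditional KT input: (item id, correctness) pairs only."""
    return [(x.question_id, x.correct) for x in history]


def enriched_view(history: List[Interaction]) -> List[str]:
    """Enriched KT input: question, numbered solution steps, and outcome as text."""
    rows = []
    for x in history:
        steps = " ".join(
            f"Step {i}: {s}" for i, s in enumerate(x.solution_steps, start=1)
        )
        outcome = "correct" if x.correct == 1 else "incorrect"
        rows.append(
            f"Question: {x.question_text} Solution: {steps} Outcome: {outcome}"
        )
    return rows


# Toy history with hand-written steps (assumed data, not from any dataset).
history = [
    Interaction(
        "q1",
        "Solve 2x + 3 = 11.",
        1,
        ["Subtract 3 from both sides: 2x = 8.", "Divide by 2: x = 4."],
    ),
    Interaction(
        "q2",
        "Compute 15% of 80.",
        0,
        ["Convert 15% to 0.15.", "Multiply: 0.15 * 80 = 12."],
    ),
]

print(binary_view(history))
for row in enriched_view(history):
    print(row)
```

In this sketch, `binary_view` corresponds to the 1/0 interaction signal used by traditional KT models, while `enriched_view` shows one possible way to serialize the same history with solution text so that a language-model encoder could consume it. How the text is actually encoded and fused with the KT architecture is left open here.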