| title: Graduation Project-v1.2 | |
| emoji: π | |
| colorFrom: indigo | |
| colorTo: blue | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| # AI-Powered Graduation Project Recommendation System | |
| ## Overview | |
| This project implements an intelligent AI-powered recommendation and semantic similarity platform for graduation projects using: | |
| * Natural Language Processing (NLP) | |
| * Semantic Search | |
| * Vector Embeddings | |
| * Hybrid Ranking Systems | |
| * Large Language Models (LLMs) | |
| The system helps students: | |
| * discover unique graduation project ideas | |
| * avoid duplicate projects | |
| * analyze originality | |
| * generate intelligent project features | |
| * receive context-aware recommendations through an AI chatbot | |
| --- | |
| # System Pipeline | |
| ## 1. Data Preprocessing | |
| * Text normalization | |
| * Duplicate removal | |
| * Smart content merging | |
| * Technical keyword extraction | |
| * Feature engineering | |
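As a minimal sketch of the first two steps, assuming each project is a dict with hypothetical `title` and `description` fields (the actual dataset schema is not shown here), text normalization and duplicate removal could look like this:

```python
import re
import unicodedata


def normalize_text(text: str) -> str:
    """Lowercase, fold Unicode variants, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    # Replace punctuation with spaces but keep characters used in tech terms (c++, c#, node.js).
    text = re.sub(r"[^\w\s+#.-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def drop_duplicates(projects: list[dict]) -> list[dict]:
    """Keep the first project seen for each normalized title + description."""
    seen = set()
    unique = []
    for project in projects:
        key = normalize_text(project["title"] + " " + project["description"])
        if key not in seen:
            seen.add(key)
            unique.append(project)
    return unique
```

Comparing normalized text only catches exact duplicates after cleanup; near-duplicates with different wording are the job of the semantic stages later in the pipeline.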
| ## 2. Feature Extraction | |
| * KeyBERT-based keyword extraction | |
| * Automatic technical term detection | |
| * Semantic feature generation | |
| ## 3. Embedding Generation | |
| * SentenceTransformer embeddings | |
| * Normalized vector representations | |
| * Semantic encoding of projects | |
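Normalization matters because the dot product of two unit-length vectors equals their cosine similarity, which lets an inner-product index rank by cosine. The sketch below illustrates this with toy 2-D vectors standing in for real sentence embeddings; the helper names are assumptions, not part of this project's code.

```python
import math


def l2_normalize(vector: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return vector if norm == 0 else [x / norm for x in vector]


def dot(a: list[float], b: list[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy vectors standing in for real sentence embeddings.
a = l2_normalize([3.0, 4.0])
b = l2_normalize([6.0, 8.0])   # same direction as a, different length
c = l2_normalize([-4.0, 3.0])  # perpendicular to a

same_direction = dot(a, b)   # close to 1.0: length no longer matters
perpendicular = dot(a, c)    # close to 0.0: unrelated directions
```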
| ## 4. Semantic Retrieval | |
| * FAISS vector indexing | |
| * Nearest-neighbor semantic search | |
| * Fast project similarity lookup | |
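Conceptually, an exact inner-product index returns the stored vectors with the highest dot product against the query. The pure-Python sketch below shows that computation on toy, already-normalized vectors. It illustrates the idea only and is not the project's FAISS code; `search` is a hypothetical helper.

```python
import heapq


def search(index: list[list[float]], query: list[float], k: int) -> list[tuple[int, float]]:
    """Return (row_id, score) for the k rows with the largest inner product."""
    scores = [
        (row_id, sum(x * y for x, y in zip(row, query)))
        for row_id, row in enumerate(index)
    ]
    return heapq.nlargest(k, scores, key=lambda pair: pair[1])


# Toy unit vectors standing in for normalized project embeddings.
index = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.6, 0.8],
]
hits = search(index, query=[0.8, 0.6], k=2)
```

A vector index library performs the same ranking far more efficiently on large collections; the brute-force loop here is only practical for small datasets.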
| ## 5. Hybrid Ranking | |
| The final ranking combines: | |
| * Semantic similarity | |
| * Feature similarity | |
| * Coverage ratio | |
| * Confidence estimation | |
| * Originality analysis | |
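One common way to combine such signals is a weighted sum. The sketch below assumes each signal is already scaled to [0, 1] and combines only the first three; the weights, the `Candidate` fields, and the `hybrid_score` name are illustrative assumptions, not the values or code used by this project.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    semantic: float  # embedding similarity to the query, in [0, 1]
    feature: float   # feature/keyword similarity, in [0, 1]
    coverage: float  # share of query features found in the project, in [0, 1]


# Hypothetical weights; they sum to 1 so the score stays in [0, 1].
WEIGHTS = {"semantic": 0.6, "feature": 0.3, "coverage": 0.1}


def hybrid_score(c: Candidate) -> float:
    """Weighted sum of the three signals."""
    return (
        WEIGHTS["semantic"] * c.semantic
        + WEIGHTS["feature"] * c.feature
        + WEIGHTS["coverage"] * c.coverage
    )


def rerank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort candidates from highest to lowest hybrid score."""
    return sorted(candidates, key=hybrid_score, reverse=True)
```

With these weights, a candidate that is slightly less similar semantically can still rank first if its features match the query much better.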
| ## 6. AI Recommendation Engine | |
| * Context-aware project generation | |
| * Feature recommendation | |
| * Novelty checking | |
| * Conversational chatbot assistance | |
| --- | |
| # AI & NLP Technologies Used | |
| ## Machine Learning & NLP | |
| * SentenceTransformers | |
| * KeyBERT | |
| * Scikit-learn | |
| * SciPy | |
| * FAISS | |
| ## LLM Integration | |
| * Google Gemini API | |
| * Ollama | |
| * Mistral | |
| ## Backend & Infrastructure | |
| * FastAPI | |
| * Pandas | |
| * NumPy | |
| * Python | |
| --- | |
| # Project Architecture | |
| ```text | |
| User Query | |
| ↓ | |
| Intent Classification | |
| ↓ | |
| Context Builder | |
| ↓ | |
| Feature Extraction | |
| ↓ | |
| Embedding Generation | |
| ↓ | |
| FAISS Semantic Search | |
| ↓ | |
| Hybrid Ranking Engine | |
| ↓ | |
| Originality & Duplicate Analysis | |
| ↓ | |
| AI Recommendation Response | |
| ``` | |
| --- | |
| # Similarity Engine Workflow | |
| ```text | |
| Raw Dataset | |
| ↓ | |
| Preprocessing | |
| ↓ | |
| Feature Extraction | |
| ↓ | |
| Sentence Embeddings | |
| ↓ | |
| FAISS Indexing | |
| ↓ | |
| Semantic Retrieval | |
| ↓ | |
| Feature Similarity Matching | |
| ↓ | |
| Hybrid Re-ranking | |
| ↓ | |
| Final Recommendation | |
| ``` | |
| --- | |
| # Features | |
| ## AI Chatbot | |
| * Context-aware conversations | |
| * Intent classification | |
| * Domain-specific recommendations | |
| * Memory-aware responses | |
| ## Semantic Similarity Search | |
| * Embedding-based retrieval | |
| * Semantic duplicate detection | |
| * Vector search with FAISS | |
| ## Hybrid Recommendation System | |
| * Multi-stage ranking pipeline | |
| * Feature-level semantic comparison | |
| * Adaptive scoring strategy | |
| ## Originality Detection | |
| * Duplicate risk analysis | |
| * Originality scoring | |
| * Similarity confidence estimation | |
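A simple way to derive such outputs, assuming similarity scores in [0, 1], is to treat originality as one minus the highest similarity to any existing project and to bucket that similarity into risk levels. The function names and thresholds below are hypothetical placeholders, not this project's actual rules.

```python
def originality_score(similarities: list[float]) -> float:
    """1.0 means nothing similar exists; 0.0 means an exact match was found."""
    return 1.0 - max(similarities, default=0.0)


def duplicate_risk(similarities: list[float], high: float = 0.85, medium: float = 0.65) -> str:
    """Classify risk from the closest match; thresholds are illustrative."""
    closest = max(similarities, default=0.0)
    if closest >= high:
        return "high"
    if closest >= medium:
        return "medium"
    return "low"
```

Basing both values on the single closest match keeps the result easy to explain: one very similar project is enough to flag an idea as a likely duplicate.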
| ## Intelligent Feature Generation | |
| * AI-generated project features | |
| * Novelty-aware generation | |
| * Domain-aware recommendations | |
| --- | |
| # Evaluation | |
| The system includes: | |
| * Self-retrieval evaluation | |
| * Real-query testing | |
| * Hybrid ranking validation | |
| * Confidence scoring | |
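Self-retrieval evaluation is commonly understood as querying the index with each stored item and checking that the item retrieves itself first. Assuming that reading applies here, a sketch with toy vectors (the function names are hypothetical):

```python
def top1(index: list[list[float]], query: list[float]) -> int:
    """Return the row id with the largest inner product against the query."""
    scores = [sum(x * y for x, y in zip(row, query)) for row in index]
    return max(range(len(index)), key=scores.__getitem__)


def self_retrieval_accuracy(index: list[list[float]]) -> float:
    """Fraction of rows that rank themselves first when used as the query."""
    hits = sum(1 for row_id, row in enumerate(index) if top1(index, row) == row_id)
    return hits / len(index)


# Toy unit vectors standing in for normalized project embeddings.
index = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
accuracy = self_retrieval_accuracy(index)
```

A score below 1.0 on this check usually points to exact duplicates in the dataset or to a mismatch between how items were indexed and how queries are encoded.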
| ### Evaluation Metrics | |
| * Semantic Similarity Score | |
| * Hybrid Score | |
| * Originality Score | |
| * Confidence Score | |
| * Duplicate Risk Classification | |
| --- | |
| # Project Structure | |
| ```text | |
| GRADUATION_PROJECT/ | |
| │ | |
| ├── api/ # FastAPI backend | |
| │ | |
| ├── Data/ | |
| │   ├── raw/ # Original dataset | |
| │   └── processed/ # Cleaned dataset | |
| │ | |
| ├── models/ # FAISS index & metadata | |
| │ | |
| ├── Notebooks/ | |
| │   └── TEST.ipynb # Training & evaluation notebook | |
| │ | |
| ├── src/ | |
| │   ├── recommendation_engine/ # Chatbot & recommendation logic | |
| │   └── similarity_model/ # Semantic search engine | |
| │ | |
| ├── requirements.txt | |
| ├── README.md | |
| └── .gitignore | |
| ``` | |
| --- | |
| # Recommendation Engine Modules | |
| ## recommendation_engine/ | |
| Contains: | |
| * Chatbot engine | |
| * Intent classification | |
| * Prompt building | |
| * Idea generation | |
| * Feature generation | |
| * Memory management | |
| * Novelty checking | |
| * Response formatting | |
| --- | |
| # Similarity Model Modules | |
| ## similarity_model/ | |
| Contains: | |
| * Semantic search | |
| * Embedding engine | |
| * Hybrid ranker | |
| * Feature similarity engine | |
| * Preprocessing pipeline | |
| * Evaluation framework | |
| --- | |
| # Installation | |
| ## 1. Clone Repository | |
| ```bash | |
| git clone https://github.com/YOUR_USERNAME/YOUR_REPOSITORY.git | |
| cd YOUR_REPOSITORY | |
| ``` | |
| --- | |
| ## 2. Create Virtual Environment | |
| ### Windows | |
| ```bash | |
| python -m venv .venv | |
| .venv\Scripts\activate | |
| ``` | |
| ### Linux / Mac | |
| ```bash | |
| python3 -m venv .venv | |
| source .venv/bin/activate | |
| ``` | |
| --- | |
| ## 3. Install Dependencies | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| --- | |
| # Environment Variables | |
| Create a `.env` file: | |
| ```env | |
| GEMINI_API_KEY=your_api_key_here | |
| ``` | |
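How the application loads this value is not shown here. As a minimal sketch, assuming the variable has already been exported into the process environment (for example by a `.env` loader or by the hosting platform's secrets), it could be read and validated at startup like this; `get_gemini_api_key` is a hypothetical helper name:

```python
import os


def get_gemini_api_key() -> str:
    """Read GEMINI_API_KEY from the environment and fail early if it is missing."""
    key = os.environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; add it to .env or the environment")
    return key
```

Failing at startup with a clear message is usually easier to debug than an authentication error raised later by the first LLM request.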
| --- | |
| # Running The Project | |
| ## Run FastAPI Server | |
| ```bash | |
| uvicorn api.main:app --reload | |
| ``` | |
| --- | |
| ## Run Notebook | |
| ```bash | |
| jupyter notebook | |
| ``` | |
| Open: | |
| ```text | |
| Notebooks/TEST.ipynb | |
| ``` | |
| --- | |
| # Example Query | |
| ## Input | |
| ```text | |
| AI-based smart library recommendation platform | |
| ``` | |
| ## Output | |
| * Similar graduation projects | |
| * Semantic similarity scores | |
| * Originality analysis | |
| * Duplicate risk estimation | |
| * Recommended features | |
| --- | |
| # Future Improvements | |
| * Full RAG integration | |
| * Multi-agent orchestration | |
| * GPU acceleration | |
| * Advanced evaluation metrics | |
| * Real-time deployment | |
| * Database persistence | |
| * Frontend dashboard | |
| --- | |
| # Research Areas Covered | |
| * Natural Language Processing (NLP) | |
| * Semantic Search | |
| * Recommendation Systems | |
| * Vector Databases | |
| * Conversational AI | |
| * Information Retrieval | |
| * Hybrid Ranking Systems | |
| * Large Language Models (LLMs) | |
| --- | |
| # Author | |
| Yossef Assem | |
| --- | |
| # License | |
| This project is for educational and research purposes. | |