| --- |
| license: apache-2.0 |
| datasets: |
| - openai/gdpval |
| - Agent-Ark/Toucan-1.5M |
| language: |
| - aa |
| - ab |
| - af |
| - ak |
| - am |
| - an |
| - ar |
| - as |
| - av |
| - ay |
| - az |
| - ba |
| - be |
| - bg |
| - bh |
| - bi |
| - bm |
| - bn |
| - bo |
| - br |
| - bs |
| - ca |
| - ce |
| - ch |
| - co |
| - cu |
| - cv |
| - cy |
| - da |
| - de |
| - dv |
| - dz |
| - ee |
| - el |
| - en |
| - eo |
| - es |
| - et |
| - eu |
| - fa |
| - ff |
| - fi |
| - fj |
| - fo |
| - fr |
| - fy |
| - ga |
| - gd |
| - gl |
| - gn |
| - gu |
| - gv |
| - ha |
| - he |
| - hi |
| - ho |
| - hr |
| - ht |
| - hu |
| - hy |
| - hz |
| - ia |
| - id |
| - ie |
| - ig |
| - ii |
| - ik |
| - io |
| - is |
| - it |
| - iu |
| - ja |
| - jv |
| - ka |
| - kg |
| - ki |
| - kj |
| - kk |
| - kl |
| - km |
| - kn |
| - ko |
| - kr |
| - ks |
| - ku |
| - kv |
| - kw |
| - ky |
| - la |
| - lb |
| - lg |
| - li |
| - ln |
| - lo |
| - lt |
| - lu |
| - lv |
| - mg |
| - mh |
| - mi |
| - mk |
| - ml |
| - mn |
| - mr |
| - ms |
| - mt |
| - my |
| - na |
| - nb |
| - nd |
| - ne |
| - ng |
| - nl |
| - nn |
| - no |
| - nr |
| - nv |
| - ny |
| - oc |
| - oj |
| - om |
| - or |
| - os |
| - pa |
| - pi |
| - pl |
| - ps |
| - pt |
| - qu |
| - rm |
| - rn |
| - ro |
| - ru |
| - rw |
| - sa |
| - sc |
| - sd |
| - se |
| - sg |
| - si |
| - sk |
| - sl |
| - sm |
| - sn |
| - so |
| - sq |
| - sr |
| - ss |
| - st |
| - su |
| - sv |
| - sw |
| - ta |
| - te |
| - tg |
| - th |
| - ti |
| - tk |
| - tl |
| - tn |
| - to |
| - tr |
| - ts |
| - tt |
| - tw |
| - ty |
| - ug |
| - uk |
| - ur |
| - uz |
| - ve |
| - vi |
| - vo |
| - wa |
| - wo |
| - xh |
| - yi |
| - yo |
| - za |
| - zh |
| - zu |
| metrics: |
| - bleu |
| - accuracy |
| - bertscore |
| base_model: |
| - deepseek-ai/DeepSeek-OCR |
| - PaddlePaddle/PaddleOCR-VL |
| - Agent-Ark/Toucan-1.5M |
| --- |
| |
| # 🌟 Land of Light AI — Global Smart Tourism & Marketing Assistant |
|
|
| ## Overview |
| Land of Light AI is a multilingual, fully integrated **tourism assistant and marketing AI** designed to: |
|
|
| - Provide personalized travel recommendations |
| - Engage users across **WhatsApp, Telegram, Instagram, Facebook Messenger, and TikTok** |
| - Analyze user behavior and generate marketing campaigns |
| - Display insights and KPIs on a **dashboard** |
| - Support **every language listed by ISO 639-1 code** in the metadata above |
|
|
| --- |
|
|
| ## Key Features |
|
|
| 1. **Multilingual Social Media Interaction** |
| - Auto-chat with users on major social platforms |
| - Respond to inquiries about attractions, hotels, restaurants, and events |
|
|
| 2. **Personalized Marketing** |
| - Send location-based offers and promotions |
| - Schedule and automate campaigns |
| - Tailor recommendations to user preferences |
|
|
| 3. **Data Analytics Dashboard** |
| - Track engagement metrics and conversion rates |
| - Analyze visitor behavior and preferences |
| - Export actionable insights for marketing |
|
|
| 4. **Multilingual Support** |
| - Covers the languages listed by ISO 639-1 code in the metadata above |
| - Automatic detection of user language and context |
|
|
| 5. **Integrated AI Core** |
| - Transformer-based LLM with OCR and text reasoning |
| - Fine-tuned on tourism and marketing datasets |
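
As a rough illustration of the Multilingual Support feature above, the sketch below resolves a locale tag reported by a messaging platform (for example `ar-SA`) to one of the ISO 639-1 codes in the metadata, falling back to English when the tag is not recognized. The helper name `resolve_language`, the abbreviated `SUPPORTED` set, and the English fallback are assumptions made for illustration; the model card does not describe how language routing is actually implemented.

```python
# Hypothetical helper: map a platform locale tag to a supported ISO 639-1 code.
# SUPPORTED is abbreviated here; the full list lives in the card metadata.
SUPPORTED = {"ar", "en", "fr", "es", "de", "zh", "ja", "tr", "ur", "id"}
DEFAULT_LANGUAGE = "en"  # assumed fallback, not stated in the model card


def resolve_language(locale_tag: str) -> str:
    """Return the two-letter language code for tags such as 'ar-SA' or 'en_US'."""
    if not locale_tag:
        return DEFAULT_LANGUAGE
    # Keep only the primary language subtag and normalize its case.
    primary = locale_tag.replace("_", "-").split("-")[0].strip().lower()
    return primary if primary in SUPPORTED else DEFAULT_LANGUAGE


print(resolve_language("ar-SA"))  # ar
print(resolve_language("EN_us"))  # en
print(resolve_language("xx"))     # en (unsupported tag falls back)
```

Resolving the language before prompting keeps the fallback behavior explicit, so an unexpected tag never reaches the model unhandled.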
|
|
| --- |
|
|
| ## Technical Details |
|
|
| - **Developed by:** Hamzah Zaher Alasmri |
| - **License:** Apache-2.0 |
| - **Base Models:** DeepSeek-OCR, PaddleOCR-VL, Toucan-1.5M |
| - **Frameworks:** PyTorch, Transformers, LangChain, FastAPI |
| - **Frontend:** Web dashboard, social media API integrations |
| - **Database:** PostgreSQL + Pinecone vector store |
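
The card names a Pinecone vector store but does not describe how it is queried. The sketch below only illustrates the general idea behind vector retrieval, ranking stored items by cosine similarity to a query embedding, in plain Python. It does not use the Pinecone client, and the three-dimensional embeddings and item names are invented for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item_id: cosine_similarity(query, index[item_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy embeddings (hypothetical); a real system would use model-generated vectors.
index = {
    "museum": [0.9, 0.1, 0.0],
    "beach_resort": [0.1, 0.9, 0.1],
    "heritage_site": [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], index))  # ['museum', 'heritage_site']
```

A hosted vector store performs the same ranking with approximate nearest-neighbor search so that it scales beyond what a linear scan can handle.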
|
|
| ### Training Data |
| - Tourist attractions, events, and user interaction datasets |
| - Arabic-English bilingual datasets |
| - Social media conversation samples for marketing |
|
|
| ### Training Procedure |
| - Fine-tuned with the AdamW optimizer |
| - Mixed precision (bf16 / fp16) |
| - Preprocessing: tokenization, normalization, entity tagging |
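
The preprocessing steps are listed without detail, so the following is only a minimal sketch of what normalization and tokenization for Arabic-English text might look like using the standard library. The function names, the choice of NFKC, and the decision to strip Arabic diacritics and tatweel are assumptions, not the author's documented pipeline. Entity tagging is left out because it would require a trained tagger or a gazetteer.

```python
import re
import unicodedata

# Arabic short-vowel diacritics (U+064B..U+0652) and tatweel (U+0640).
_ARABIC_MARKS = re.compile("[\u064B-\u0652\u0640]")
_WHITESPACE = re.compile(r"\s+")
_TOKEN = re.compile(r"\w+")


def normalize(text: str) -> str:
    """Apply Unicode NFKC, drop Arabic diacritics, collapse spaces, casefold."""
    text = unicodedata.normalize("NFKC", text)
    text = _ARABIC_MARKS.sub("", text)
    text = _WHITESPACE.sub(" ", text).strip()
    return text.casefold()


def tokenize(text: str) -> list[str]:
    """Split normalized text into word tokens, discarding punctuation."""
    return _TOKEN.findall(normalize(text))


print(tokenize("  Visit   RIYADH today! "))  # ['visit', 'riyadh', 'today']
```

A model fine-tuned with a subword tokenizer would still run its own tokenizer afterwards; a step like this only cleans the raw text first.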
|
|
| ### Evaluation Metrics |
| - **BLEU:** 0.92 |
| - **Accuracy:** 94% |
| - **BERTScore:** 0.87 |
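
The card does not say how these scores were computed or on which test set. For readers unfamiliar with the metrics, the sketch below shows the standard definitions of exact-match accuracy and an unsmoothed, single-reference sentence-level BLEU on whitespace tokens. It is a simplified stand-in rather than the evaluation code behind the numbers above, and BERTScore is omitted because it needs a pretrained encoder.

```python
import math
from collections import Counter


def accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of predictions that exactly match their label."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def ngrams(tokens: list[str], n: int) -> Counter:
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed single-reference BLEU on whitespace tokens, in [0, 1]."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty discourages candidates shorter than the reference.
    brevity = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


print(accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"]))  # 0.75
```

Reported BLEU scores are normally corpus-level and depend on tokenization, so values from this sketch are not directly comparable with the figure above.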
|
|
| --- |
|
|
| ## Example Usage |
|
|
| ```python |
| from transformers import AutoModelForCausalLM, AutoTokenizer |
| import torch |
| |
| # Use a GPU when one is available, otherwise fall back to CPU |
| device = "cuda" if torch.cuda.is_available() else "cpu" |
| |
| model_name = "HamzahZaher/Land-of-Light-AI" |
| tokenizer = AutoTokenizer.from_pretrained(model_name) |
| model = AutoModelForCausalLM.from_pretrained(model_name).to(device) |
| |
| prompt = "Suggest personalized travel offers for a family visiting Riyadh." |
| inputs = tokenizer(prompt, return_tensors="pt").to(device) |
| |
| # max_new_tokens caps the generated reply; the prompt length is not counted |
| outputs = model.generate(**inputs, max_new_tokens=150) |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) |
| ``` |
|
|
| --- |
|
|
| ## Citation |
|
|
| **BibTeX:** |
|
| ```bibtex |
| @misc{alasmri2025landoflightai, |
|   author = {Hamzah Zaher Alasmri}, |
|   title = {Land of Light AI: A Multilingual Tourism \& Marketing Assistant for Saudi Arabia}, |
|   year = {2025}, |
|   howpublished = {Hugging Face Model Hub}, |
|   license = {Apache-2.0} |
| } |
| ``` |
|
| **APA:** |
|
| Alasmri, H. Z. (2025). *Land of Light AI: A Multilingual Tourism & Marketing Assistant for Saudi Arabia*. Hugging Face Model Hub. |
|
|
| --- |
|
|
| ## Environmental Impact |
|
|
| - **Estimated emissions:** ~86 kg CO₂ |
| - **Hardware:** 8× A100 GPUs |
| - **Training time:** ~110 hours |
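
As a quick sanity check on the Environmental Impact figures reported in this card (~86 kg CO₂, 8× A100 GPUs, ~110 hours), the arithmetic below derives the total GPU-hours and the emissions per GPU-hour. The last two lines additionally assume an average draw of 0.4 kW per GPU, which is a guess based on the A100 SXM board power limit and not a figure from the card, to estimate the implied grid carbon intensity.

```python
# Figures stated in the Environmental Impact section of this card.
NUM_GPUS = 8
TRAINING_HOURS = 110
EMISSIONS_KG = 86

# Assumption: average draw of 0.4 kW per GPU; the card reports no measured power.
ASSUMED_KW_PER_GPU = 0.4

gpu_hours = NUM_GPUS * TRAINING_HOURS            # total GPU-hours
kg_per_gpu_hour = EMISSIONS_KG / gpu_hours       # emissions per GPU-hour
energy_kwh = gpu_hours * ASSUMED_KW_PER_GPU      # energy under the assumption
implied_kg_per_kwh = EMISSIONS_KG / energy_kwh   # implied grid intensity

print(f"GPU-hours: {gpu_hours}")
print(f"Emissions per GPU-hour: {kg_per_gpu_hour:.3f} kg CO2")
print(f"Implied grid intensity: {implied_kg_per_kwh:.3f} kg CO2/kWh")
```

Under these assumptions the reported total corresponds to 880 GPU-hours and roughly 0.1 kg CO₂ per GPU-hour; the implied intensity changes proportionally if the real power draw differs from the assumed 0.4 kW.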