
Multi-Language Chat Agent - Developer Guide

Architecture Overview

The Multi-Language Chat Agent is built using a modular architecture with the following key components:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │   WebSocket     │    │   Chat Agent    │
│   (HTML/JS)     │◄──►│   Handler       │◄──►│   Service       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                        │
                                ▼                        ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Session       │    │   Language      │    │   Groq LLM      │
│   Manager       │    │   Context       │    │   Client        │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                    ┌─────────────────┐
                    │   Chat History  │
                    │   Manager       │
                    └─────────────────┘
                                │
                    ┌─────────────────┐    ┌─────────────────┐
                    │   Redis Cache   │    │   PostgreSQL    │
                    │                 │    │   Database      │
                    └─────────────────┘    └─────────────────┘

Core Components

1. Chat Agent Service (chat_agent/services/chat_agent.py)

The main orchestrator that coordinates all chat operations.

Key Methods:

  • process_message(): Main message processing pipeline
  • switch_language(): Handle language context switching
  • stream_response(): Real-time response streaming

Usage Example:

from chat_agent.services.chat_agent import ChatAgent

# Initialize chat agent
chat_agent = ChatAgent()

# Process a message
response = chat_agent.process_message(
    session_id="session-123",
    message="How do I create a Python list?",
    language="python"
)
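
As a rough illustration of what a `process_message()` pipeline of this shape might do internally, here is a minimal sketch. The class names `SketchChatAgent`, `StubHistory`, and `StubLLM`, and the order of the steps, are assumptions made for illustration and are not taken from the project's source; only the method names mirror the ones listed in this guide.

```python
class StubHistory:
    """Hypothetical in-memory stand-in for ChatHistoryManager."""

    def __init__(self):
        self.messages = {}

    def store_message(self, session_id, role, content, language):
        self.messages.setdefault(session_id, []).append(
            {"role": role, "content": content, "language": language}
        )

    def get_recent_history(self, session_id, limit=10):
        return self.messages.get(session_id, [])[-limit:]


class StubLLM:
    """Hypothetical stand-in for GroqClient; echoes instead of calling an API."""

    def generate_response(self, prompt, chat_history, language_context):
        return f"[{language_context}] {len(chat_history)} prior message(s): {prompt}"


class SketchChatAgent:
    """Assumed order: read context, call the LLM, then persist both turns."""

    def __init__(self, groq_client, chat_history_manager):
        self.groq_client = groq_client
        self.chat_history_manager = chat_history_manager

    def process_message(self, session_id, message, language):
        # 1. Load recent context for the session
        history = self.chat_history_manager.get_recent_history(session_id)
        # 2. Ask the LLM for a reply in the requested language context
        reply = self.groq_client.generate_response(
            prompt=message, chat_history=history, language_context=language
        )
        # 3. Persist the user turn and the assistant turn
        self.chat_history_manager.store_message(session_id, "user", message, language)
        self.chat_history_manager.store_message(session_id, "assistant", reply, language)
        return reply


agent = SketchChatAgent(StubLLM(), StubHistory())
first = agent.process_message("session-123", "How do I create a list?", "python")
second = agent.process_message("session-123", "And a dict?", "python")
```

Passing the collaborators into the constructor keeps the orchestrator easy to test with mocks, which matches the dependency-injection style used in the unit test example in the Testing section.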

2. Session Manager (chat_agent/services/session_manager.py)

Manages user sessions and chat state.

Key Methods:

  • create_session(): Create new chat session
  • get_session(): Retrieve session information
  • cleanup_inactive_sessions(): Remove expired sessions

Usage Example:

from chat_agent.services.session_manager import SessionManager

session_manager = SessionManager()

# Create new session
session = session_manager.create_session(
    user_id="user-123",
    language="python"
)

# Get session info
session_info = session_manager.get_session(session['session_id'])
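
To make the idea behind `cleanup_inactive_sessions()` concrete, the following sketch expires sessions whose last activity is older than a timeout. `InMemorySessionStore`, the `last_activity` field, and the injectable `clock` parameter are hypothetical; the real Session Manager presumably persists sessions in Redis or the database rather than in a dictionary.

```python
import time
import uuid


class InMemorySessionStore:
    """Hypothetical session store used only to illustrate expiry logic."""

    def __init__(self, timeout_seconds=3600, clock=time.time):
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self.sessions = {}

    def create_session(self, user_id, language):
        session_id = str(uuid.uuid4())
        self.sessions[session_id] = {
            "session_id": session_id,
            "user_id": user_id,
            "language": language,
            "last_activity": self.clock(),
        }
        return self.sessions[session_id]

    def get_session(self, session_id):
        session = self.sessions.get(session_id)
        if session is not None:
            session["last_activity"] = self.clock()  # reads count as activity
        return session

    def cleanup_inactive_sessions(self):
        """Delete sessions idle for longer than the timeout; return how many."""
        cutoff = self.clock() - self.timeout_seconds
        expired = [
            sid for sid, s in self.sessions.items() if s["last_activity"] < cutoff
        ]
        for sid in expired:
            del self.sessions[sid]
        return len(expired)


# Usage with a fake clock: one session is created at t=0, another at t=4000.
now = [0.0]
store = InMemorySessionStore(timeout_seconds=3600, clock=lambda: now[0])
old = store.create_session("user-123", "python")
now[0] = 4000.0
fresh = store.create_session("user-456", "javascript")
removed = store.cleanup_inactive_sessions()
```

The 3600-second default mirrors the `SESSION_TIMEOUT` value shown under Environment Configuration.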

3. Language Context Manager (chat_agent/services/language_context.py)

Handles programming language context and switching.

Key Methods:

  • set_language(): Set current language for session
  • get_language(): Get current language
  • get_language_prompt_template(): Get language-specific prompts

Usage Example:

from chat_agent.services.language_context import LanguageContextManager

lang_manager = LanguageContextManager()

# Set language context
lang_manager.set_language("session-123", "javascript")

# Get current language
current_lang = lang_manager.get_language("session-123")

# Get prompt template
template = lang_manager.get_language_prompt_template("python")
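
One plausible way to back these three methods is a per-session mapping plus a table of templates. In the sketch below the class name `SketchLanguageContext`, the template wording, and the choice of `python` as the default language are all assumptions, not the project's actual prompts or defaults.

```python
# Hypothetical templates; the project's real prompt text is not shown in this guide.
PROMPT_TEMPLATES = {
    "python": "You are a helpful Python tutor. Answer with idiomatic Python examples.",
    "javascript": "You are a helpful JavaScript tutor. Answer with modern JavaScript examples.",
}
DEFAULT_LANGUAGE = "python"  # assumed default


class SketchLanguageContext:
    """Tracks the active language per session and resolves prompt templates."""

    def __init__(self):
        self._language_by_session = {}

    def set_language(self, session_id, language):
        normalized = language.strip().lower()
        if normalized not in PROMPT_TEMPLATES:
            raise ValueError(f"Unsupported language: {language!r}")
        self._language_by_session[session_id] = normalized

    def get_language(self, session_id):
        # Sessions that never switched fall back to the default language
        return self._language_by_session.get(session_id, DEFAULT_LANGUAGE)

    def get_language_prompt_template(self, language):
        return PROMPT_TEMPLATES[language.strip().lower()]


context = SketchLanguageContext()
context.set_language("session-123", " JavaScript ")
active = context.get_language("session-123")
untouched = context.get_language("session-999")
```

Normalizing the language name on write means later lookups never have to worry about case or stray whitespace.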

4. Chat History Manager (chat_agent/services/chat_history.py)

Manages persistent and cached chat history.

Key Methods:

  • store_message(): Store message in DB and cache
  • get_recent_history(): Get recent messages for context
  • get_full_history(): Get complete conversation history

Usage Example:

from chat_agent.services.chat_history import ChatHistoryManager

history_manager = ChatHistoryManager()

# Store a message
message_id = history_manager.store_message(
    session_id="session-123",
    role="user",
    content="What is Python?",
    language="python"
)

# Get recent history
recent = history_manager.get_recent_history("session-123", limit=10)
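
The "DB and cache" behaviour described above is commonly implemented as a cache-aside pattern: writes go to the database and invalidate the cached window, reads try the cache first. The sketch below uses a list and a dictionary as stand-ins for PostgreSQL and Redis; `SketchHistoryStore` and its `db_reads` counter are hypothetical and exist only to show the control flow.

```python
class SketchHistoryStore:
    """Cache-aside history store with in-memory stand-ins for the DB and Redis."""

    def __init__(self):
        self.db = []        # stands in for the messages table
        self.cache = {}     # stands in for Redis: session_id -> cached window
        self.db_reads = 0   # counts cache misses, for illustration only

    def store_message(self, session_id, role, content, language):
        message_id = len(self.db) + 1
        self.db.append({
            "id": message_id,
            "session_id": session_id,
            "role": role,
            "content": content,
            "language": language,
        })
        # Invalidate so the next read rebuilds the window from the database
        self.cache.pop(session_id, None)
        return message_id

    def get_recent_history(self, session_id, limit=10):
        if limit <= 0:
            return []
        cached = self.cache.get(session_id)
        if cached is not None and cached["limit"] >= limit:
            return cached["messages"][-limit:]
        # Cache miss: read from the database and repopulate the cache
        self.db_reads += 1
        rows = [m for m in self.db if m["session_id"] == session_id][-limit:]
        self.cache[session_id] = {"limit": limit, "messages": rows}
        return rows


history = SketchHistoryStore()
for content in ("first", "second", "third"):
    history.store_message("session-123", "user", content, "python")
recent_two = history.get_recent_history("session-123", limit=2)
```

Invalidating on write is the simplest consistent choice; appending to the cached window instead would save a database read at the cost of more bookkeeping.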

5. Groq Client (chat_agent/services/groq_client.py)

Handles integration with the Groq API through LangChain.

Key Methods:

  • generate_response(): Generate LLM response
  • stream_response(): Stream response generation
  • handle_api_errors(): Error handling and fallbacks

Usage Example:

from chat_agent.services.groq_client import GroqClient

groq_client = GroqClient(api_key="your-api-key")

# Generate response
response = groq_client.generate_response(
    prompt="Explain Python functions",
    chat_history=recent_messages,
    language_context="python"
)
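
The guide lists `handle_api_errors()` for error handling and fallbacks without showing its behaviour. A common approach is retrying transient failures with exponential backoff and returning a fallback message when every attempt fails. The sketch below illustrates that approach only: `TransientAPIError`, `call_with_retries`, and the fallback text are hypothetical names, not part of the Groq SDK or this project.

```python
import time


class TransientAPIError(Exception):
    """Hypothetical error type for failures that are worth retrying."""


def call_with_retries(
    call,
    max_attempts=3,
    base_delay=0.5,
    sleep=time.sleep,
    fallback="Sorry, the assistant is temporarily unavailable.",
):
    """Run call(); retry transient errors with exponential backoff, then fall back."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except TransientAPIError:
            if attempt == max_attempts:
                break
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return fallback


# Usage: a call that fails twice before succeeding; sleeps are recorded, not waited.
attempts = {"count": 0}
delays = []


def flaky_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientAPIError("rate limited")
    return "Explained: Python functions"


result = call_with_retries(flaky_call, sleep=delays.append)
```

Only errors known to be transient should be retried; authentication or validation failures are better surfaced immediately.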

Development Setup

Prerequisites

  • Python 3.8+
  • PostgreSQL (for production) or SQLite (for development)
  • Redis (for caching and session management)
  • Groq API key

Installation

  1. Clone the repository:
git clone <repository-url>
cd multi-language-chat-agent
  2. Create virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies:
pip install -r requirements.txt
  4. Set up environment variables:
cp .env.example .env
# Edit .env with your configuration
  5. Initialize database:
python init_db.py
  6. Run the application:
python app.py

Environment Configuration

Required Environment Variables:

# Groq API Configuration
GROQ_API_KEY=your-groq-api-key-here
GROQ_MODEL=mixtral-8x7b-32768

# Database Configuration
DATABASE_URL=postgresql://user:password@localhost/chatdb
# Or for SQLite: DATABASE_URL=sqlite:///instance/chat_agent.db

# Redis Configuration
REDIS_URL=redis://localhost:6379/0

# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development

Optional Configuration:

# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_PER_MINUTE=30

# Session Management
SESSION_TIMEOUT=3600  # 1 hour in seconds
CLEANUP_INTERVAL=300  # 5 minutes

# Logging
LOG_LEVEL=INFO
LOG_FILE=logs/chat_agent.log
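
A small loader that validates these variables at startup can turn a missing key into an immediate, readable error. The following is a sketch under assumptions: the `Settings` dataclass and `load_settings` function are hypothetical, and the defaults for the optional values simply reuse the example values shown above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    database_url: str
    redis_url: str
    secret_key: str
    rate_limit_enabled: bool
    rate_limit_per_minute: int
    session_timeout: int
    log_level: str


REQUIRED = ("GROQ_API_KEY", "DATABASE_URL", "REDIS_URL", "SECRET_KEY")


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ), failing fast."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        groq_api_key=env["GROQ_API_KEY"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        secret_key=env["SECRET_KEY"],
        # Optional values; defaults are assumptions taken from the examples above
        rate_limit_enabled=env.get("RATE_LIMIT_ENABLED", "true").lower() == "true",
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "30")),
        session_timeout=int(env.get("SESSION_TIMEOUT", "3600")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


example_env = {
    "GROQ_API_KEY": "your-groq-api-key-here",
    "DATABASE_URL": "sqlite:///instance/chat_agent.db",
    "REDIS_URL": "redis://localhost:6379/0",
    "SECRET_KEY": "your-secret-key-here",
    "RATE_LIMIT_PER_MINUTE": "10",
}
settings = load_settings(example_env)
```

Accepting a plain mapping instead of reading `os.environ` directly keeps the loader easy to exercise in tests.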

Testing

Running Tests

All Tests:

pytest

Specific Test Categories:

# Unit tests
pytest tests/unit/

# Integration tests
pytest tests/integration/

# End-to-end tests
pytest tests/e2e/

# Performance tests
pytest tests/performance/

With Coverage:

pytest --cov=chat_agent --cov-report=html

Test Structure

tests/
├── unit/                 # Unit tests for individual components
│   ├── test_chat_agent.py
│   ├── test_session_manager.py
│   └── test_language_context.py
├── integration/          # Integration tests
│   ├── test_chat_api.py
│   └── test_websocket_integration.py
├── e2e/                  # End-to-end workflow tests
│   └── test_complete_chat_workflow.py
└── performance/          # Load and performance tests
    └── test_load_testing.py

Writing Tests

Unit Test Example:

import pytest
from unittest.mock import Mock, patch
from chat_agent.services.chat_agent import ChatAgent

class TestChatAgent:
    @pytest.fixture
    def mock_dependencies(self):
        return {
            'groq_client': Mock(),
            'session_manager': Mock(),
            'language_context_manager': Mock(),
            'chat_history_manager': Mock()
        }
    
    def test_process_message_success(self, mock_dependencies):
        # Arrange
        chat_agent = ChatAgent(**mock_dependencies)
        mock_dependencies['groq_client'].generate_response.return_value = "Test response"
        
        # Act
        result = chat_agent.process_message("session-123", "Test message", "python")
        
        # Assert
        assert result == "Test response"
        mock_dependencies['groq_client'].generate_response.assert_called_once()

Integration Test Example:

import pytest
from chat_agent.services.chat_agent import ChatAgent

class TestChatIntegration:
    @pytest.fixture
    def integrated_system(self):
        # Set up real components with test configuration
        return ChatAgent()
    
    def test_complete_chat_flow(self, integrated_system):
        # Test complete workflow with real components
        session_id = "test-session"
        response = integrated_system.process_message(
            session_id, "What is Python?", "python"
        )
        assert response is not None
        assert len(response) > 0

API Development

Adding New Endpoints

  1. Create route in chat_agent/api/chat_routes.py:
@chat_bp.route('/sessions/<session_id>/export', methods=['GET'])
@require_auth
@rate_limit(per_minute=10)
def export_chat_history(session_id):
    """Export chat history for a session."""
    try:
        # Validate session ownership
        session = session_manager.get_session(session_id)
        if not session or session['user_id'] != g.user_id:
            return jsonify({'error': 'Session not found'}), 404
        
        # Get full history
        history = chat_history_manager.get_full_history(session_id)
        
        return jsonify({
            'session_id': session_id,
            'messages': history,
            'exported_at': datetime.utcnow().isoformat()
        })
        
    except Exception as e:
        logger.error(f"Export error: {e}")
        return jsonify({'error': 'Export failed'}), 500
  2. Add tests for the new endpoint:
def test_export_chat_history(self, client, auth_headers):
    # Create session and messages
    session_response = client.post('/api/v1/chat/sessions', 
                                 headers=auth_headers,
                                 json={'language': 'python'})
    session_id = session_response.json['session_id']
    
    # Test export
    response = client.get(f'/api/v1/chat/sessions/{session_id}/export',
                         headers=auth_headers)
    
    assert response.status_code == 200
    assert 'messages' in response.json
  3. Update API documentation in chat_agent/api/README.md
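
The route in step 1 uses a `@rate_limit(per_minute=10)` decorator whose implementation is not shown in this guide. As one possible approach, a per-key sliding window can decide whether a request is allowed. `SlidingWindowLimiter` below is a hypothetical, framework-free sketch of that idea, not the project's decorator; a production version would more likely keep its counters in Redis so that all workers share them.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most per_minute calls per key within any 60-second window."""

    def __init__(self, per_minute, clock=time.monotonic):
        self.per_minute = per_minute
        self.clock = clock  # injectable for testing
        self.calls = {}     # key -> deque of call timestamps

    def allow(self, key):
        now = self.clock()
        window = self.calls.setdefault(key, deque())
        # Drop timestamps that have left the 60-second window
        while window and now - window[0] >= 60:
            window.popleft()
        if len(window) >= self.per_minute:
            return False
        window.append(now)
        return True


# Usage with a fake clock: two calls allowed, the third rejected, then a reset.
fake_now = [0.0]
limiter = SlidingWindowLimiter(per_minute=2, clock=lambda: fake_now[0])
decisions = [limiter.allow("user-123") for _ in range(3)]
fake_now[0] = 61.0
after_window = limiter.allow("user-123")
```

Inside a route decorator, the key would typically be the authenticated user ID or client IP, and a rejected call would return HTTP 429.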

WebSocket Event Handling

Adding New WebSocket Events:

# In chat_agent/websocket/chat_websocket.py

@socketio.on('custom_event')
def handle_custom_event(data):
    """Handle custom WebSocket event."""
    try:
        session_id = data.get('session_id')
        
        # Validate session
        if not session_manager.get_session(session_id):
            emit('error', {'error': 'Invalid session'})
            return
        
        # Process custom logic
        result = process_custom_logic(data)
        
        # Emit response
        emit('custom_response', {
            'session_id': session_id,
            'result': result,
            'timestamp': datetime.utcnow().isoformat()
        })
        
    except Exception as e:
        logger.error(f"Custom event error: {e}")
        emit('error', {'error': 'Processing failed'})

Database Management

Schema Migrations

Creating Migrations:

# migrations/003_add_new_feature.py
def upgrade(connection):
    """Add new feature to database."""
    connection.execute("""
        ALTER TABLE messages 
        ADD COLUMN sentiment_score FLOAT DEFAULT 0.0
    """)
    
    connection.execute("""
        CREATE INDEX idx_messages_sentiment 
        ON messages(sentiment_score)
    """)

def downgrade(connection):
    """Remove new feature from database."""
    connection.execute("DROP INDEX idx_messages_sentiment")
    connection.execute("ALTER TABLE messages DROP COLUMN sentiment_score")

Running Migrations:

python migrations/migrate.py
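
The contents of `migrations/migrate.py` are not shown here. A typical runner records applied versions in a bookkeeping table and applies only the pending `upgrade` functions in order. The sketch below demonstrates that idea against an in-memory SQLite database; the `schema_migrations` table, the `run_migrations` function, and the two sample migrations are assumptions for illustration.

```python
import sqlite3


def migration_001(connection):
    """Hypothetical first migration: create the messages table."""
    connection.execute(
        "CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT)"
    )


def migration_002(connection):
    """Mirrors the upgrade() example above, minus the index."""
    connection.execute(
        "ALTER TABLE messages ADD COLUMN sentiment_score FLOAT DEFAULT 0.0"
    )


MIGRATIONS = [(1, migration_001), (2, migration_002)]


def run_migrations(connection, migrations=MIGRATIONS):
    """Apply pending migrations in version order; return the versions applied."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version INTEGER PRIMARY KEY)"
    )
    applied = {
        row[0] for row in connection.execute("SELECT version FROM schema_migrations")
    }
    newly_applied = []
    for version, upgrade in sorted(migrations, key=lambda item: item[0]):
        if version in applied:
            continue
        upgrade(connection)
        connection.execute(
            "INSERT INTO schema_migrations (version) VALUES (?)", (version,)
        )
        connection.commit()
        newly_applied.append(version)
    return newly_applied


connection = sqlite3.connect(":memory:")
first_run = run_migrations(connection)
second_run = run_migrations(connection)
columns = [row[1] for row in connection.execute("PRAGMA table_info(messages)")]
```

Recording each version right after its upgrade makes the runner safe to execute repeatedly, for example on every deployment.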

Database Optimization

Indexing Strategy:

-- Session-based queries
CREATE INDEX idx_messages_session_timestamp ON messages(session_id, timestamp);

-- User-based queries  
CREATE INDEX idx_sessions_user_active ON chat_sessions(user_id, is_active);

-- Language-based queries
CREATE INDEX idx_messages_language ON messages(language);

-- Full-text search (PostgreSQL)
CREATE INDEX idx_messages_content_fts ON messages USING gin(to_tsvector('english', content));

Performance Optimization

Caching Strategy

Redis Caching:

import redis
import json
from datetime import timedelta

class CacheManager:
    def __init__(self, redis_url):
        self.redis_client = redis.from_url(redis_url)
    
    def cache_response(self, key, response, ttl=3600):
        """Cache LLM response."""
        self.redis_client.setex(
            key, 
            ttl, 
            json.dumps(response)
        )
    
    def get_cached_response(self, key):
        """Get cached response."""
        cached = self.redis_client.get(key)
        return json.loads(cached) if cached else None
    
    def cache_chat_history(self, session_id, messages):
        """Cache recent chat history."""
        key = f"history:{session_id}"
        self.redis_client.setex(
            key,
            1800,  # 30 minutes
            json.dumps(messages)
        )
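
`cache_response()` takes a `key` but the guide does not say how keys are built. One reasonable option is to hash the normalized inputs that determine the answer. The helper below is a hypothetical sketch; the `response:` prefix and the decision to include the model name are assumptions.

```python
import hashlib
import json


def make_response_cache_key(prompt, language, model="mixtral-8x7b-32768"):
    """Derive a stable cache key from the inputs that determine the response."""
    normalized_language = language.strip().lower()
    payload = json.dumps(
        {
            "prompt": prompt.strip(),
            "language": normalized_language,
            "model": model,
        },
        sort_keys=True,  # stable field order gives a stable hash
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"response:{normalized_language}:{digest}"


key_a = make_response_cache_key("What is a list?", "python")
key_b = make_response_cache_key("  What is a list?  ", "Python")
key_c = make_response_cache_key("What is a list?", "javascript")
```

Keys like these only make sense for responses that do not depend on earlier turns; if chat history influences the answer, it would need to be part of the hashed payload or the response should not be cached at all.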

Application-Level Caching:

from functools import lru_cache

class LanguageContextManager:
    @lru_cache(maxsize=128)
    def get_language_prompt_template(self, language):
        """Cache prompt templates in memory."""
        return self._load_prompt_template(language)
    
    @lru_cache(maxsize=64)
    def get_supported_languages(self):
        """Cache supported languages list."""
        return self._load_supported_languages()

Database Connection Pooling

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Configure connection pool
engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,
    pool_recycle=3600
)

Monitoring and Logging

Structured Logging

import logging
import json
from datetime import datetime

class StructuredLogger:
    def __init__(self, name):
        self.logger = logging.getLogger(name)
    
    def log_chat_interaction(self, session_id, user_message, response, language):
        """Log chat interaction with structured data."""
        log_data = {
            'event': 'chat_interaction',
            'session_id': session_id,
            'language': language,
            'user_message_length': len(user_message),
            'response_length': len(response),
            'timestamp': datetime.utcnow().isoformat()
        }
        
        self.logger.info(json.dumps(log_data))
    
    def log_error(self, error, context=None):
        """Log error with context."""
        log_data = {
            'event': 'error',
            'error_type': type(error).__name__,
            'error_message': str(error),
            'context': context or {},
            'timestamp': datetime.utcnow().isoformat()
        }
        
        self.logger.error(json.dumps(log_data))

Health Checks

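from datetime import datetime

from flask import Blueprint, jsonify
from sqlalchemy import text

# db, redis_client and groq_client are assumed to be created elsewhere in the app
health_bp = Blueprint('health', __name__)

@health_bp.route('/health')
def health_check():
    """Comprehensive health check."""
    health_status = {
        'status': 'healthy',
        'timestamp': datetime.utcnow().isoformat(),
        'services': {}
    }
    
    # Check database
    try:
        # SQLAlchemy 2.x requires raw SQL strings to be wrapped in text()
        db.session.execute(text('SELECT 1'))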
        health_status['services']['database'] = 'healthy'
    except Exception as e:
        health_status['services']['database'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'
    
    # Check Redis
    try:
        redis_client.ping()
        health_status['services']['redis'] = 'healthy'
    except Exception as e:
        health_status['services']['redis'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'
    
    # Check Groq API
    try:
        # Simple API test
        groq_client.test_connection()
        health_status['services']['groq_api'] = 'healthy'
    except Exception as e:
        health_status['services']['groq_api'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'
    
    status_code = 200 if health_status['status'] == 'healthy' else 503
    return jsonify(health_status), status_code

Deployment

Docker Configuration

Dockerfile:

FROM python:3.9-slim

WORKDIR /app

# Install system dependencies (curl is required by the HEALTHCHECK below)
RUN apt-get update && apt-get install -y \
    gcc \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user
RUN useradd --create-home --shell /bin/bash app
USER app

# Expose port
EXPOSE 5000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:5000/health || exit 1

# Start application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "app:app"]

docker-compose.yml:

version: '3.8'

services:
  chat-agent:
    build: .
    ports:
      - "5000:5000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/chatdb
      - REDIS_URL=redis://redis:6379/0
      - GROQ_API_KEY=${GROQ_API_KEY}
    depends_on:
      - db
      - redis
    volumes:
      - ./logs:/app/logs

  db:
    image: postgres:13
    environment:
      - POSTGRES_DB=chatdb
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:

Production Considerations

Security:

  • Use environment variables for sensitive configuration
  • Implement proper authentication and authorization
  • Enable HTTPS/TLS encryption
  • Regular security updates and vulnerability scanning

Scalability:

  • Horizontal scaling with load balancers
  • Database read replicas for heavy read workloads
  • Redis clustering for high availability
  • CDN for static assets

Monitoring:

  • Application performance monitoring (APM)
  • Log aggregation and analysis
  • Metrics collection and alerting
  • Health check endpoints

Contributing

Code Style

Python Code Style:

  • Follow PEP 8 guidelines
  • Use type hints where appropriate
  • Maximum line length: 88 characters (Black formatter)
  • Use meaningful variable and function names

Example:

from typing import Any, Dict, Optional
from datetime import datetime

def process_chat_message(
    session_id: str,
    message: str,
    language: str,
    metadata: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Process a chat message and return response.
    
    Args:
        session_id: Unique session identifier
        message: User's chat message
        language: Programming language context
        metadata: Optional message metadata
    
    Returns:
        Dictionary containing response and metadata
    
    Raises:
        ValueError: If session_id is invalid
        APIError: If LLM API call fails
    """
    if not session_id:
        raise ValueError("Session ID is required")
    
    # Implementation here; response_text is a placeholder for the generated reply
    response_text = ""
    return {
        'response': response_text,
        'timestamp': datetime.utcnow().isoformat(),
        'language': language
    }

Pull Request Process

  1. Fork the repository
  2. Create feature branch: git checkout -b feature/new-feature
  3. Make changes with tests
  4. Run test suite: pytest
  5. Update documentation
  6. Submit pull request

Code Review Checklist

  • Code follows style guidelines
  • Tests are included and passing
  • Documentation is updated
  • No security vulnerabilities
  • Performance impact considered
  • Backward compatibility maintained

This developer guide provides comprehensive information for contributing to and extending the Multi-Language Chat Agent. For specific implementation details, refer to the source code and inline documentation.