# Multi-Language Chat Agent - Developer Guide

## Architecture Overview

The Multi-Language Chat Agent is built using a modular architecture with the following key components:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │   WebSocket     │    │   Chat Agent    │
│   (HTML/JS)     │◄──►│   Handler       │◄──►│   Service       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                      │
                                ▼                      ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Session       │    │   Language      │    │   Groq LLM      │
│   Manager       │    │   Context       │    │   Client        │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │   Chat History  │
                       │   Manager       │
                       └─────────────────┘
                                │
            ┌─────────────────┐    ┌─────────────────┐
            │   Redis Cache   │    │   PostgreSQL    │
            │                 │    │   Database      │
            └─────────────────┘    └─────────────────┘
```

## Core Components

### 1. Chat Agent Service (`chat_agent/services/chat_agent.py`)

The main orchestrator that coordinates all chat operations.

**Key Methods:**
- `process_message()`: Main message processing pipeline
- `switch_language()`: Handle language context switching
- `stream_response()`: Real-time response streaming

**Usage Example:**
```python
from chat_agent.services.chat_agent import ChatAgent

# Initialize chat agent
chat_agent = ChatAgent()

# Process a message
response = chat_agent.process_message(
    session_id="session-123",
    message="How do I create a Python list?",
    language="python"
)
```

### 2. Session Manager (`chat_agent/services/session_manager.py`)

Manages user sessions and chat state.

**Key Methods:**
- `create_session()`: Create new chat session
- `get_session()`: Retrieve session information
- `cleanup_inactive_sessions()`: Remove expired sessions

**Usage Example:**
```python
from chat_agent.services.session_manager import SessionManager

session_manager = SessionManager()

# Create new session
session = session_manager.create_session(
    user_id="user-123",
    language="python"
)

# Get session info
session_info = session_manager.get_session(session['session_id'])
```

### 3. Language Context Manager (`chat_agent/services/language_context.py`)

Handles programming language context and switching.

**Key Methods:**
- `set_language()`: Set current language for session
- `get_language()`: Get current language
- `get_language_prompt_template()`: Get language-specific prompts

**Usage Example:**
```python
from chat_agent.services.language_context import LanguageContextManager

lang_manager = LanguageContextManager()

# Set language context
lang_manager.set_language("session-123", "javascript")

# Get current language
current_lang = lang_manager.get_language("session-123")

# Get prompt template
template = lang_manager.get_language_prompt_template("python")
```

### 4. Chat History Manager (`chat_agent/services/chat_history.py`)

Manages persistent and cached chat history.

**Key Methods:**
- `store_message()`: Store message in DB and cache
- `get_recent_history()`: Get recent messages for context
- `get_full_history()`: Get complete conversation history

**Usage Example:**
```python
from chat_agent.services.chat_history import ChatHistoryManager

history_manager = ChatHistoryManager()

# Store a message
message_id = history_manager.store_message(
    session_id="session-123",
    role="user",
    content="What is Python?",
    language="python"
)

# Get recent history
recent = history_manager.get_recent_history("session-123", limit=10)
```

### 5. Groq Client (`chat_agent/services/groq_client.py`)

Handles integration with Groq LangChain API.
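A client that wraps a remote LLM usually needs some retry and fallback behavior around the network call. The following is a minimal sketch of that idea, not the project's actual implementation: `generate_with_retry`, `LLMUnavailableError`, and the `call_llm` callable are hypothetical names introduced for illustration, and no real Groq or LangChain API is called.

```python
import time
from typing import Callable


class LLMUnavailableError(Exception):
    """Hypothetical error standing in for a failed remote LLM call."""


def generate_with_retry(
    call_llm: Callable[[str], str],
    prompt: str,
    max_attempts: int = 3,
    base_delay: float = 0.5,
    fallback: str = "Sorry, I could not generate a response. Please try again.",
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call the LLM, retrying with exponential backoff, then fall back."""
    for attempt in range(max_attempts):
        try:
            return call_llm(prompt)
        except LLMUnavailableError:
            if attempt < max_attempts - 1:
                # Delay doubles after each failed attempt: 0.5s, 1s, 2s, ...
                sleep(base_delay * (2 ** attempt))
    # Every attempt failed: return a safe message instead of raising.
    return fallback
```

Passing `sleep` as a parameter keeps the backoff testable without real waiting, and returning a fallback string means the chat pipeline can still reply to the user when the upstream API is down.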
**Key Methods:**
- `generate_response()`: Generate LLM response
- `stream_response()`: Stream response generation
- `handle_api_errors()`: Error handling and fallbacks

**Usage Example:**
```python
from chat_agent.services.groq_client import GroqClient

groq_client = GroqClient(api_key="your-api-key")

# recent_messages: prior messages for this session,
# e.g. the result of ChatHistoryManager.get_recent_history()
response = groq_client.generate_response(
    prompt="Explain Python functions",
    chat_history=recent_messages,
    language_context="python"
)
```

## Development Setup

### Prerequisites

- Python 3.8+
- PostgreSQL (for production) or SQLite (for development)
- Redis (for caching and session management)
- Groq API key

### Installation

1. **Clone the repository:**
   ```bash
   git clone <repository-url>
   cd multi-language-chat-agent
   ```

2. **Create virtual environment:**
   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

4. **Set up environment variables:**
   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   ```

5. **Initialize database:**
   ```bash
   python init_db.py
   ```

6. **Run the application:**
   ```bash
   python app.py
   ```

### Environment Configuration

**Required Environment Variables:**
```bash
# Groq API Configuration
GROQ_API_KEY=your-groq-api-key-here
GROQ_MODEL=mixtral-8x7b-32768

# Database Configuration
DATABASE_URL=postgresql://user:password@localhost/chatdb
# Or for SQLite: DATABASE_URL=sqlite:///instance/chat_agent.db

# Redis Configuration
REDIS_URL=redis://localhost:6379/0

# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development
```

**Optional Configuration:**
```bash
# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_PER_MINUTE=30

# Session Management
SESSION_TIMEOUT=3600  # 1 hour in seconds
CLEANUP_INTERVAL=300  # 5 minutes

# Logging
LOG_LEVEL=INFO
LOG_FILE=logs/chat_agent.log
```

## Testing

### Running Tests

**All Tests:**
```bash
pytest
```

**Specific Test Categories:**
```bash
# Unit tests
pytest tests/unit/

# Integration tests
pytest tests/integration/

# End-to-end tests
pytest tests/e2e/

# Performance tests
pytest tests/performance/
```

**With Coverage:**
```bash
pytest --cov=chat_agent --cov-report=html
```

### Test Structure

```
tests/
├── unit/                    # Unit tests for individual components
│   ├── test_chat_agent.py
│   ├── test_session_manager.py
│   └── test_language_context.py
├── integration/             # Integration tests
│   ├── test_chat_api.py
│   └── test_websocket_integration.py
├── e2e/                     # End-to-end workflow tests
│   └── test_complete_chat_workflow.py
└── performance/             # Load and performance tests
    └── test_load_testing.py
```

### Writing Tests

**Unit Test Example:**
```python
import pytest
from unittest.mock import Mock, patch
from chat_agent.services.chat_agent import ChatAgent

class TestChatAgent:
    @pytest.fixture
    def mock_dependencies(self):
        return {
            'groq_client': Mock(),
            'session_manager': Mock(),
            'language_context_manager': Mock(),
            'chat_history_manager': Mock()
        }

    def test_process_message_success(self, mock_dependencies):
        # Arrange
        chat_agent = ChatAgent(**mock_dependencies)
        mock_dependencies['groq_client'].generate_response.return_value = "Test response"

        # Act
        result = chat_agent.process_message("session-123", "Test message", "python")

        # Assert
        assert result == "Test response"
        mock_dependencies['groq_client'].generate_response.assert_called_once()
```

**Integration Test Example:**
```python
import pytest
from chat_agent.services.chat_agent import ChatAgent

class TestChatIntegration:
    @pytest.fixture
    def integrated_system(self):
        # Set up real components with test configuration
        return ChatAgent()

    def test_complete_chat_flow(self, integrated_system):
        # Test complete workflow with real components
        session_id = "test-session"
        response = integrated_system.process_message(
            session_id, "What is Python?", "python"
        )

        assert response is not None
        assert len(response) > 0
```

## API Development

### Adding New Endpoints

1. **Create route in `chat_agent/api/chat_routes.py`:**
   ```python
   @chat_bp.route('/sessions/<session_id>/export', methods=['GET'])
   @require_auth
   @rate_limit(per_minute=10)
   def export_chat_history(session_id):
       """Export chat history for a session."""
       try:
           # Validate session ownership
           session = session_manager.get_session(session_id)
           if not session or session['user_id'] != g.user_id:
               return jsonify({'error': 'Session not found'}), 404

           # Get full history
           history = chat_history_manager.get_full_history(session_id)

           return jsonify({
               'session_id': session_id,
               'messages': history,
               'exported_at': datetime.utcnow().isoformat()
           })
       except Exception as e:
           logger.error(f"Export error: {e}")
           return jsonify({'error': 'Export failed'}), 500
   ```

2. **Add tests for the new endpoint:**
   ```python
   def test_export_chat_history(self, client, auth_headers):
       # Create session and messages
       session_response = client.post('/api/v1/chat/sessions',
                                      headers=auth_headers,
                                      json={'language': 'python'})
       session_id = session_response.json['session_id']

       # Test export
       response = client.get(f'/api/v1/chat/sessions/{session_id}/export',
                             headers=auth_headers)

       assert response.status_code == 200
       assert 'messages' in response.json
   ```

3. **Update API documentation in `chat_agent/api/README.md`**

### WebSocket Event Handling

**Adding New WebSocket Events:**
```python
# In chat_agent/websocket/chat_websocket.py

@socketio.on('custom_event')
def handle_custom_event(data):
    """Handle custom WebSocket event."""
    try:
        session_id = data.get('session_id')

        # Validate session
        if not session_manager.get_session(session_id):
            emit('error', {'error': 'Invalid session'})
            return

        # Process custom logic
        result = process_custom_logic(data)

        # Emit response
        emit('custom_response', {
            'session_id': session_id,
            'result': result,
            'timestamp': datetime.utcnow().isoformat()
        })
    except Exception as e:
        logger.error(f"Custom event error: {e}")
        emit('error', {'error': 'Processing failed'})
```

## Database Management

### Schema Migrations

**Creating Migrations:**
```python
# migrations/003_add_new_feature.py

def upgrade(connection):
    """Add new feature to database."""
    connection.execute("""
        ALTER TABLE messages
        ADD COLUMN sentiment_score FLOAT DEFAULT 0.0
    """)

    connection.execute("""
        CREATE INDEX idx_messages_sentiment
        ON messages(sentiment_score)
    """)

def downgrade(connection):
    """Remove new feature from database."""
    connection.execute("DROP INDEX idx_messages_sentiment")
    connection.execute("ALTER TABLE messages DROP COLUMN sentiment_score")
```

**Running Migrations:**
```bash
python migrations/migrate.py
```

### Database Optimization

**Indexing Strategy:**
```sql
-- Session-based queries
CREATE INDEX idx_messages_session_timestamp ON messages(session_id, timestamp);

-- User-based queries
CREATE INDEX idx_sessions_user_active ON chat_sessions(user_id, is_active);

-- Language-based queries
CREATE INDEX idx_messages_language ON messages(language);

-- Full-text search (PostgreSQL)
CREATE INDEX idx_messages_content_fts ON messages USING gin(to_tsvector('english', content));
```

## Performance Optimization

### Caching Strategy

**Redis Caching:**
```python
import redis
import json
from datetime import timedelta

class CacheManager:
    def __init__(self, redis_url):
        self.redis_client = redis.from_url(redis_url)

    def cache_response(self, key, response, ttl=3600):
        """Cache LLM response."""
        self.redis_client.setex(
            key,
            ttl,
            json.dumps(response)
        )

    def get_cached_response(self, key):
        """Get cached response."""
        cached = self.redis_client.get(key)
        return json.loads(cached) if cached else None

    def cache_chat_history(self, session_id, messages):
        """Cache recent chat history."""
        key = f"history:{session_id}"
        self.redis_client.setex(
            key,
            1800,  # 30 minutes
            json.dumps(messages)
        )
```

**Application-Level Caching:**
```python
from functools import lru_cache

class LanguageContextManager:
    # Note: lru_cache on a method includes `self` in the cache key and keeps
    # the instance alive, so this suits a long-lived, shared manager.
    @lru_cache(maxsize=128)
    def get_language_prompt_template(self, language):
        """Cache prompt templates in memory."""
        return self._load_prompt_template(language)

    @lru_cache(maxsize=64)
    def get_supported_languages(self):
        """Cache supported languages list."""
        return self._load_supported_languages()
```

### Database Connection Pooling

```python
import os

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Read the connection string from the environment (see Environment Configuration)
DATABASE_URL = os.environ["DATABASE_URL"]

# Configure connection pool
engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,
    pool_recycle=3600
)
```

## Monitoring and Logging

### Structured Logging

```python
import logging
import json
from datetime import datetime

class StructuredLogger:
    def __init__(self, name):
        self.logger = logging.getLogger(name)

    def log_chat_interaction(self, session_id, user_message, response, language):
        """Log chat interaction with structured data."""
        log_data = {
            'event': 'chat_interaction',
            'session_id': session_id,
            'language': language,
            'user_message_length': len(user_message),
            'response_length': len(response),
            'timestamp': datetime.utcnow().isoformat()
        }
        self.logger.info(json.dumps(log_data))

    def log_error(self, error, context=None):
        """Log error with context."""
        log_data = {
            'event': 'error',
            'error_type': type(error).__name__,
            'error_message': str(error),
            'context': context or {},
            'timestamp': datetime.utcnow().isoformat()
        }
        self.logger.error(json.dumps(log_data))
```

### Health Checks

```python
from datetime import datetime

from flask import Blueprint, jsonify
from sqlalchemy import text

# `db`, `redis_client`, and `groq_client` are the application's shared instances.
health_bp = Blueprint('health', __name__)

@health_bp.route('/health')
def health_check():
    """Comprehensive health check."""
    health_status = {
        'status': 'healthy',
        'timestamp': datetime.utcnow().isoformat(),
        'services': {}
    }

    # Check database (raw SQL must be wrapped in text() on SQLAlchemy 2.x)
    try:
        db.session.execute(text('SELECT 1'))
        health_status['services']['database'] = 'healthy'
    except Exception as e:
        health_status['services']['database'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'

    # Check Redis
    try:
        redis_client.ping()
        health_status['services']['redis'] = 'healthy'
    except Exception as e:
        health_status['services']['redis'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'

    # Check Groq API
    try:
        # Simple API test
        groq_client.test_connection()
        health_status['services']['groq_api'] = 'healthy'
    except Exception as e:
        health_status['services']['groq_api'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'

    status_code = 200 if health_status['status'] == 'healthy' else 503
    return jsonify(health_status), status_code
```

## Deployment

### Docker Configuration

**Dockerfile:**
```dockerfile
FROM python:3.9-slim

WORKDIR /app

# Install system dependencies (curl is required by the HEALTHCHECK below)
RUN apt-get update && apt-get install -y \
    gcc \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user
RUN useradd --create-home --shell /bin/bash app
USER app

# Expose port
EXPOSE 5000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:5000/health || exit 1

# Start application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "app:app"]
```

**docker-compose.yml:**
```yaml
version: '3.8'

services:
  chat-agent:
    build: .
    ports:
      - "5000:5000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/chatdb
      - REDIS_URL=redis://redis:6379/0
      - GROQ_API_KEY=${GROQ_API_KEY}
    depends_on:
      - db
      - redis
    volumes:
      - ./logs:/app/logs

  db:
    image: postgres:13
    environment:
      - POSTGRES_DB=chatdb
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:
```

### Production Considerations

**Security:**
- Use environment variables for sensitive configuration
- Implement proper authentication and authorization
- Enable HTTPS/TLS encryption
- Regular security updates and vulnerability scanning

**Scalability:**
- Horizontal scaling with load balancers
- Database read replicas for heavy read workloads
- Redis clustering for high availability
- CDN for static assets

**Monitoring:**
- Application performance monitoring (APM)
- Log aggregation and analysis
- Metrics collection and alerting
- Health check endpoints

## Contributing

### Code Style

**Python Code Style:**
- Follow PEP 8 guidelines
- Use type hints where appropriate
- Maximum line length: 88 characters (Black formatter)
- Use meaningful variable and function names

**Example:**
```python
from typing import Any, Dict, List, Optional
from datetime import datetime

def process_chat_message(
    session_id: str,
    message: str,
    language: str,
    metadata: Optional[Dict] = None
) -> Dict[str, Any]:
    """
    Process a chat message and return response.

    Args:
        session_id: Unique session identifier
        message: User's chat message
        language: Programming language context
        metadata: Optional message metadata

    Returns:
        Dictionary containing response and metadata

    Raises:
        ValueError: If session_id is invalid
        APIError: If LLM API call fails
    """
    if not session_id:
        raise ValueError("Session ID is required")

    # Implementation here
    response_text = ""  # Placeholder: replace with the generated reply

    return {
        'response': response_text,
        'timestamp': datetime.utcnow().isoformat(),
        'language': language
    }
```

### Pull Request Process

1. **Fork the repository**
2. **Create feature branch:** `git checkout -b feature/new-feature`
3. **Make changes with tests**
4. **Run test suite:** `pytest`
5. **Update documentation**
6. **Submit pull request**

### Code Review Checklist

- [ ] Code follows style guidelines
- [ ] Tests are included and passing
- [ ] Documentation is updated
- [ ] No security vulnerabilities
- [ ] Performance impact considered
- [ ] Backward compatibility maintained

---

This developer guide provides comprehensive information for contributing to and extending the Multi-Language Chat Agent. For specific implementation details, refer to the source code and inline documentation.