Multi-Language Chat Agent - Developer Guide

Architecture Overview

The Multi-Language Chat Agent is built using a modular architecture with the following key components:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │   WebSocket     │    │   Chat Agent    │
│   (HTML/JS)     │◄──►│   Handler       │◄──►│   Service       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                        │
                                ▼                        ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Session       │    │   Language      │    │   Groq LLM      │
│   Manager       │    │   Context       │    │   Client        │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                    ┌─────────────────┐
                    │   Chat History  │
                    │   Manager       │
                    └─────────────────┘
                                │
                    ┌─────────────────┐    ┌─────────────────┐
                    │   Redis Cache   │    │   PostgreSQL    │
                    │                 │    │   Database      │
                    └─────────────────┘    └─────────────────┘

Core Components

1. Chat Agent Service (`chat_agent/services/chat_agent.py`)

The main orchestrator that coordinates all chat operations.

Key Methods:

process_message(): Main message processing pipeline
switch_language(): Handle language context switching
stream_response(): Real-time response streaming

Usage Example:

from chat_agent.services.chat_agent import ChatAgent

# Initialize chat agent
chat_agent = ChatAgent()

# Process a message
response = chat_agent.process_message(
    session_id="session-123",
    message="How do I create a Python list?",
    language="python"
)

2. Session Manager (`chat_agent/services/session_manager.py`)

Manages user sessions and chat state.

Key Methods:

create_session(): Create new chat session
get_session(): Retrieve session information
cleanup_inactive_sessions(): Remove expired sessions

Usage Example:

from chat_agent.services.session_manager import SessionManager

session_manager = SessionManager()

# Create new session
session = session_manager.create_session(
    user_id="user-123",
    language="python"
)

# Get session info
session_info = session_manager.get_session(session['session_id'])
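
How `cleanup_inactive_sessions()` decides what to remove is not shown in this guide. The following is a minimal in-memory sketch, assuming each session records a last-activity timestamp and expires after a fixed timeout. `InMemorySessionStore`, the `last_active` field, and the `now` parameter are hypothetical names used for illustration; the real `SessionManager` may store sessions in Redis or the database instead.

```python
import time
import uuid
from typing import Dict, List, Optional


class InMemorySessionStore:
    """Hypothetical stand-in for SessionManager, used only to illustrate expiry."""

    def __init__(self, timeout_seconds: int = 3600) -> None:
        self.timeout_seconds = timeout_seconds
        self._sessions: Dict[str, dict] = {}

    def create_session(
        self, user_id: str, language: str, now: Optional[float] = None
    ) -> dict:
        now = time.time() if now is None else now
        session = {
            "session_id": str(uuid.uuid4()),
            "user_id": user_id,
            "language": language,
            "last_active": now,  # assumed field; refreshed on each activity
        }
        self._sessions[session["session_id"]] = session
        return session

    def get_session(self, session_id: str) -> Optional[dict]:
        return self._sessions.get(session_id)

    def cleanup_inactive_sessions(self, now: Optional[float] = None) -> List[str]:
        """Remove sessions idle longer than the timeout and return their IDs."""
        now = time.time() if now is None else now
        expired = [
            sid
            for sid, session in self._sessions.items()
            if now - session["last_active"] > self.timeout_seconds
        ]
        for sid in expired:
            del self._sessions[sid]
        return expired
```

Passing `now` explicitly keeps the expiry logic testable without sleeping; a production version would typically call this from a periodic job.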

3. Language Context Manager (`chat_agent/services/language_context.py`)

Handles programming language context and switching.

Key Methods:

set_language(): Set current language for session
get_language(): Get current language
get_language_prompt_template(): Get language-specific prompts

Usage Example:

from chat_agent.services.language_context import LanguageContextManager

lang_manager = LanguageContextManager()

# Set language context
lang_manager.set_language("session-123", "javascript")

# Get current language
current_lang = lang_manager.get_language("session-123")

# Get prompt template
template = lang_manager.get_language_prompt_template("python")
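
The guide does not show what a language-specific prompt template looks like. As a rough sketch, assuming templates are plain strings with a placeholder for the user's question, lookup with a generic fallback could work as follows. `PROMPT_TEMPLATES`, `DEFAULT_TEMPLATE`, `render_prompt`, and the template wording are all hypothetical.

```python
from string import Template

# Hypothetical templates; the real project may load these from files or config.
PROMPT_TEMPLATES = {
    "python": Template("You are a helpful Python tutor. Answer: $question"),
    "javascript": Template("You are a helpful JavaScript tutor. Answer: $question"),
}

# Generic fallback for languages without a dedicated template.
DEFAULT_TEMPLATE = Template("You are a helpful $language tutor. Answer: $question")


def render_prompt(language: str, question: str) -> str:
    """Pick the template for a language (case-insensitive) and fill it in."""
    template = PROMPT_TEMPLATES.get(language.lower(), DEFAULT_TEMPLATE)
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return template.safe_substitute(language=language, question=question)
```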

4. Chat History Manager (`chat_agent/services/chat_history.py`)

Manages persistent and cached chat history.

Key Methods:

store_message(): Store message in DB and cache
get_recent_history(): Get recent messages for context
get_full_history(): Get complete conversation history

Usage Example:

from chat_agent.services.chat_history import ChatHistoryManager

history_manager = ChatHistoryManager()

# Store a message
message_id = history_manager.store_message(
    session_id="session-123",
    role="user",
    content="What is Python?",
    language="python"
)

# Get recent history
recent = history_manager.get_recent_history("session-123", limit=10)
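
To illustrate the split between a bounded recent-history cache and a complete persistent history, here is a small in-memory sketch. It assumes the cache keeps only the last few messages per session while the store keeps everything. `RecentHistoryBuffer` and `max_cached` are hypothetical; in the real service Redis and the database would take the place of the two dictionaries.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List


class RecentHistoryBuffer:
    """Hypothetical illustration of cached recent history plus full history."""

    def __init__(self, max_cached: int = 20) -> None:
        # Stands in for the database: keeps every message.
        self._all: Dict[str, List[dict]] = defaultdict(list)
        # Stands in for the cache: keeps only the newest max_cached messages.
        self._recent: Dict[str, Deque[dict]] = defaultdict(
            lambda: deque(maxlen=max_cached)
        )

    def store_message(self, session_id: str, role: str, content: str) -> int:
        """Write to both stores and return the message's index in the session."""
        message = {"role": role, "content": content}
        self._all[session_id].append(message)
        self._recent[session_id].append(message)
        return len(self._all[session_id]) - 1

    def get_recent_history(self, session_id: str, limit: int = 10) -> List[dict]:
        """Return up to `limit` of the newest cached messages, oldest first."""
        if limit <= 0:
            return []
        return list(self._recent[session_id])[-limit:]

    def get_full_history(self, session_id: str) -> List[dict]:
        return list(self._all[session_id])
```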

5. Groq Client (`chat_agent/services/groq_client.py`)

Handles integration with the Groq LLM API through LangChain.

Key Methods:

generate_response(): Generate LLM response
stream_response(): Stream response generation
handle_api_errors(): Error handling and fallbacks

Usage Example:

from chat_agent.services.groq_client import GroqClient

groq_client = GroqClient(api_key="your-api-key")

# Generate response
response = groq_client.generate_response(
    prompt="Explain Python functions",
    chat_history=recent_messages,
    language_context="python"
)
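
The behaviour of `handle_api_errors()` is not spelled out here. One common pattern for "error handling and fallbacks" is to retry transient failures with exponential backoff and return a fallback message when all attempts fail. The sketch below shows that pattern in isolation; `call_with_retry`, its parameters, and the choice of exception types are assumptions, not the project's actual implementation.

```python
import time
from typing import Callable, Tuple, Type


def call_with_retry(
    func: Callable[[], str],
    retries: int = 3,
    base_delay: float = 0.5,
    fallback: str = "Sorry, the assistant is temporarily unavailable.",
    retry_on: Tuple[Type[Exception], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call func, retrying transient errors with exponential backoff."""
    for attempt in range(retries):
        try:
            return func()
        except retry_on:
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts.
            if attempt < retries - 1:
                sleep(base_delay * (2 ** attempt))
    # Every attempt failed: degrade gracefully instead of raising.
    return fallback
```

A caller would wrap the LLM request, for example `call_with_retry(lambda: groq_client.generate_response(...))`, after mapping the client library's own exception types into `retry_on`.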

Development Setup

Prerequisites

Python 3.8+
PostgreSQL (for production) or SQLite (for development)
Redis (for caching and session management)
Groq API key

Installation

Clone the repository:

git clone <repository-url>
cd multi-language-chat-agent

Create virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Set up environment variables:

cp .env.example .env
# Edit .env with your configuration

Initialize database:

python init_db.py

Run the application:

python app.py

Environment Configuration

Required Environment Variables:

# Groq API Configuration
GROQ_API_KEY=your-groq-api-key-here
GROQ_MODEL=mixtral-8x7b-32768

# Database Configuration
DATABASE_URL=postgresql://user:password@localhost/chatdb
# Or for SQLite: DATABASE_URL=sqlite:///instance/chat_agent.db

# Redis Configuration
REDIS_URL=redis://localhost:6379/0

# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development

Optional Configuration:

# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_PER_MINUTE=30

# Session Management
SESSION_TIMEOUT=3600  # 1 hour in seconds
CLEANUP_INTERVAL=300  # 5 minutes

# Logging
LOG_LEVEL=INFO
LOG_FILE=logs/chat_agent.log
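
The variables above can be read into a typed settings object at startup so that a missing required value fails fast. This is a sketch only: the `Settings` class and `load_settings` function are hypothetical, and the defaults for the optional values simply mirror the example values listed above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

REQUIRED_KEYS = ("GROQ_API_KEY", "GROQ_MODEL", "DATABASE_URL", "REDIS_URL", "SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    groq_model: str
    database_url: str
    redis_url: str
    secret_key: str
    rate_limit_enabled: bool
    rate_limit_per_minute: int
    session_timeout: int
    log_level: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")

    enabled = env.get("RATE_LIMIT_ENABLED", "true").strip().lower()
    return Settings(
        groq_api_key=env["GROQ_API_KEY"],
        groq_model=env["GROQ_MODEL"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        secret_key=env["SECRET_KEY"],
        rate_limit_enabled=enabled in {"1", "true", "yes", "on"},
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "30")),
        session_timeout=int(env.get("SESSION_TIMEOUT", "3600")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```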

Testing

Running Tests

All Tests:

pytest

Specific Test Categories:

# Unit tests
pytest tests/unit/

# Integration tests
pytest tests/integration/

# End-to-end tests
pytest tests/e2e/

# Performance tests
pytest tests/performance/

With Coverage:

pytest --cov=chat_agent --cov-report=html

Test Structure

tests/
├── unit/                 # Unit tests for individual components
│   ├── test_chat_agent.py
│   ├── test_session_manager.py
│   └── test_language_context.py
├── integration/          # Integration tests
│   ├── test_chat_api.py
│   └── test_websocket_integration.py
├── e2e/                  # End-to-end workflow tests
│   └── test_complete_chat_workflow.py
└── performance/          # Load and performance tests
    └── test_load_testing.py

Writing Tests

Unit Test Example:

import pytest
from unittest.mock import Mock
from chat_agent.services.chat_agent import ChatAgent

class TestChatAgent:
    @pytest.fixture
    def mock_dependencies(self):
        return {
            'groq_client': Mock(),
            'session_manager': Mock(),
            'language_context_manager': Mock(),
            'chat_history_manager': Mock()
        }
    
    def test_process_message_success(self, mock_dependencies):
        # Arrange
        chat_agent = ChatAgent(**mock_dependencies)
        mock_dependencies['groq_client'].generate_response.return_value = "Test response"
        
        # Act
        result = chat_agent.process_message("session-123", "Test message", "python")
        
        # Assert
        assert result == "Test response"
        mock_dependencies['groq_client'].generate_response.assert_called_once()

Integration Test Example:

import pytest
from chat_agent.services.chat_agent import ChatAgent

class TestChatIntegration:
    @pytest.fixture
    def integrated_system(self):
        # Set up real components with test configuration
        return ChatAgent()
    
    def test_complete_chat_flow(self, integrated_system):
        # Test complete workflow with real components
        session_id = "test-session"
        response = integrated_system.process_message(
            session_id, "What is Python?", "python"
        )
        assert response is not None
        assert len(response) > 0

API Development

Adding New Endpoints

Create route in chat_agent/api/chat_routes.py:

@chat_bp.route('/sessions/<session_id>/export', methods=['GET'])
@require_auth
@rate_limit(per_minute=10)
def export_chat_history(session_id):
    """Export chat history for a session."""
    try:
        # Validate session ownership
        session = session_manager.get_session(session_id)
        if not session or session['user_id'] != g.user_id:
            return jsonify({'error': 'Session not found'}), 404
        
        # Get full history
        history = chat_history_manager.get_full_history(session_id)
        
        return jsonify({
            'session_id': session_id,
            'messages': history,
            'exported_at': datetime.utcnow().isoformat()
        })
        
    except Exception as e:
        logger.error(f"Export error: {e}")
        return jsonify({'error': 'Export failed'}), 500

Add tests for the new endpoint:

def test_export_chat_history(self, client, auth_headers):
    # Create session and messages
    session_response = client.post('/api/v1/chat/sessions', 
                                 headers=auth_headers,
                                 json={'language': 'python'})
    session_id = session_response.json['session_id']
    
    # Test export
    response = client.get(f'/api/v1/chat/sessions/{session_id}/export',
                         headers=auth_headers)
    
    assert response.status_code == 200
    assert 'messages' in response.json

Update API documentation in chat_agent/api/README.md
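
The export route above relies on a `@rate_limit(per_minute=10)` decorator whose implementation is not shown in this guide. A sliding-window limiter is one way such a decorator could be backed. The sketch below is framework-free and keeps state in memory; `SlidingWindowLimiter` and its method names are hypothetical, and a multi-process deployment would need shared storage such as Redis instead.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `per_minute` calls per key within a rolling window."""

    def __init__(self, per_minute: int, window_seconds: float = 60.0) -> None:
        self.per_minute = per_minute
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Record a call for `key` and report whether it is within the limit."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.per_minute:
            return False
        hits.append(now)
        return True
```

Inside a Flask decorator, the key would typically be the authenticated user ID or client IP, and a rejected call would return HTTP 429.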

WebSocket Event Handling

Adding New WebSocket Events:

# In chat_agent/websocket/chat_websocket.py

@socketio.on('custom_event')
def handle_custom_event(data):
    """Handle custom WebSocket event."""
    try:
        session_id = data.get('session_id')
        
        # Validate session
        if not session_manager.get_session(session_id):
            emit('error', {'error': 'Invalid session'})
            return
        
        # Process custom logic
        result = process_custom_logic(data)
        
        # Emit response
        emit('custom_response', {
            'session_id': session_id,
            'result': result,
            'timestamp': datetime.utcnow().isoformat()
        })
        
    except Exception as e:
        logger.error(f"Custom event error: {e}")
        emit('error', {'error': 'Processing failed'})

Database Management

Schema Migrations

Creating Migrations:

# migrations/003_add_new_feature.py
def upgrade(connection):
    """Add new feature to database."""
    connection.execute("""
        ALTER TABLE messages 
        ADD COLUMN sentiment_score FLOAT DEFAULT 0.0
    """)
    
    connection.execute("""
        CREATE INDEX idx_messages_sentiment 
        ON messages(sentiment_score)
    """)

def downgrade(connection):
    """Remove new feature from database."""
    connection.execute("DROP INDEX idx_messages_sentiment")
    connection.execute("ALTER TABLE messages DROP COLUMN sentiment_score")

Running Migrations:

python migrations/migrate.py
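
The contents of `migrations/migrate.py` are not shown here. As an illustration of what a runner generally has to do (apply each migration once, in order, and remember which ones ran), here is a self-contained sketch using SQLite. The `schema_migrations` table, `run_migrations`, and `add_sentiment_score` are hypothetical names; the project's actual runner may differ.

```python
import sqlite3
from typing import Callable, List, Tuple

# A migration is a (name, upgrade function) pair; names sort in apply order.
Migration = Tuple[str, Callable[[sqlite3.Connection], None]]


def add_sentiment_score(connection: sqlite3.Connection) -> None:
    """Example upgrade step mirroring the migration shown above."""
    connection.execute(
        "ALTER TABLE messages ADD COLUMN sentiment_score FLOAT DEFAULT 0.0"
    )


def run_migrations(
    connection: sqlite3.Connection, migrations: List[Migration]
) -> List[str]:
    """Apply migrations that have not run yet and return their names."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    applied = {
        row[0] for row in connection.execute("SELECT name FROM schema_migrations")
    }
    newly_applied = []
    for name, upgrade in sorted(migrations, key=lambda m: m[0]):
        if name in applied:
            continue
        upgrade(connection)
        connection.execute(
            "INSERT INTO schema_migrations (name) VALUES (?)", (name,)
        )
        newly_applied.append(name)
    connection.commit()
    return newly_applied
```

Running the function twice with the same list applies nothing the second time, which is what makes the migration command safe to repeat.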

Database Optimization

Indexing Strategy:

-- Session-based queries
CREATE INDEX idx_messages_session_timestamp ON messages(session_id, timestamp);

-- User-based queries  
CREATE INDEX idx_sessions_user_active ON chat_sessions(user_id, is_active);

-- Language-based queries
CREATE INDEX idx_messages_language ON messages(language);

-- Full-text search (PostgreSQL)
CREATE INDEX idx_messages_content_fts ON messages USING gin(to_tsvector('english', content));

Performance Optimization

Caching Strategy

Redis Caching:

import json
import redis

class CacheManager:
    def __init__(self, redis_url):
        self.redis_client = redis.from_url(redis_url)
    
    def cache_response(self, key, response, ttl=3600):
        """Cache LLM response."""
        self.redis_client.setex(
            key, 
            ttl, 
            json.dumps(response)
        )
    
    def get_cached_response(self, key):
        """Get cached response."""
        cached = self.redis_client.get(key)
        return json.loads(cached) if cached else None
    
    def cache_chat_history(self, session_id, messages):
        """Cache recent chat history."""
        key = f"history:{session_id}"
        self.redis_client.setex(
            key,
            1800,  # 30 minutes
            json.dumps(messages)
        )
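
`cache_response()` takes a `key`, but the guide does not say how keys are derived. One reasonable approach, sketched below, is to hash the normalized prompt together with the language and model so that equivalent requests share a cache entry. `make_response_cache_key` and the `response:` prefix are hypothetical choices, not the project's confirmed scheme.

```python
import hashlib
import json
from typing import Optional


def make_response_cache_key(
    prompt: str, language: str, model: Optional[str] = None
) -> str:
    """Derive a stable cache key from the inputs that determine a response."""
    normalized_language = language.lower()
    payload = json.dumps(
        {"prompt": prompt.strip(), "language": normalized_language, "model": model},
        sort_keys=True,  # stable field order gives a stable hash
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"response:{normalized_language}:{digest}"
```

Caching by prompt alone ignores chat history, so this kind of key only suits context-free questions; history-dependent responses would need the relevant history included in the payload.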

Application-Level Caching:

from functools import lru_cache

class LanguageContextManager:
    @lru_cache(maxsize=128)
    def get_language_prompt_template(self, language):
        """Cache prompt templates in memory."""
        return self._load_prompt_template(language)
    
    @lru_cache(maxsize=64)
    def get_supported_languages(self):
        """Cache supported languages list."""
        return self._load_supported_languages()

Database Connection Pooling

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Configure connection pool
engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=10,
    max_overflow=20,
    pool_pre_ping=True,
    pool_recycle=3600
)

Monitoring and Logging

Structured Logging

import logging
import json
from datetime import datetime

class StructuredLogger:
    def __init__(self, name):
        self.logger = logging.getLogger(name)
    
    def log_chat_interaction(self, session_id, user_message, response, language):
        """Log chat interaction with structured data."""
        log_data = {
            'event': 'chat_interaction',
            'session_id': session_id,
            'language': language,
            'user_message_length': len(user_message),
            'response_length': len(response),
            'timestamp': datetime.utcnow().isoformat()
        }
        
        self.logger.info(json.dumps(log_data))
    
    def log_error(self, error, context=None):
        """Log error with context."""
        log_data = {
            'event': 'error',
            'error_type': type(error).__name__,
            'error_message': str(error),
            'context': context or {},
            'timestamp': datetime.utcnow().isoformat()
        }
        
        self.logger.error(json.dumps(log_data))
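
A variant of the approach above moves JSON serialization into a `logging.Formatter`, so that every record on a logger is emitted as JSON, not just those written through `StructuredLogger`. With this variant callers log plain messages rather than pre-serialized JSON strings. `JsonFormatter` and `configure_json_logging` are hypothetical names for this sketch.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def configure_json_logging(name: str, level: str = "INFO") -> logging.Logger:
    """Attach a JSON stream handler to the named logger exactly once."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    already_configured = any(
        isinstance(handler.formatter, JsonFormatter) for handler in logger.handlers
    )
    if not already_configured:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger
```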

Health Checks

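from datetime import datetime

from flask import Blueprint, jsonify
from sqlalchemy import text

# `db`, `redis_client`, and `groq_client` are assumed to be the application's
# shared SQLAlchemy, Redis, and Groq client instances, imported from wherever
# the app defines them.

health_bp = Blueprint('health', __name__)

@health_bp.route('/health')
def health_check():
    """Comprehensive health check."""
    health_status = {
        'status': 'healthy',
        'timestamp': datetime.utcnow().isoformat(),
        'services': {}
    }
    
    # Check database
    try:
        # SQLAlchemy 2.x requires raw SQL strings to be wrapped in text()
        db.session.execute(text('SELECT 1'))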
        health_status['services']['database'] = 'healthy'
    except Exception as e:
        health_status['services']['database'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'
    
    # Check Redis
    try:
        redis_client.ping()
        health_status['services']['redis'] = 'healthy'
    except Exception as e:
        health_status['services']['redis'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'
    
    # Check Groq API
    try:
        # Simple API test
        groq_client.test_connection()
        health_status['services']['groq_api'] = 'healthy'
    except Exception as e:
        health_status['services']['groq_api'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'
    
    status_code = 200 if health_status['status'] == 'healthy' else 503
    return jsonify(health_status), status_code
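
The three try/except blocks above repeat the same pattern, which can be factored into a small helper that runs named checks and aggregates the result. This is a sketch under the assumption that each check is a zero-argument callable that raises on failure; `run_health_checks` is a hypothetical name.

```python
from typing import Callable, Dict, Tuple


def run_health_checks(checks: Dict[str, Callable[[], None]]) -> Tuple[dict, int]:
    """Run each check and return (response body, HTTP status code)."""
    services = {}
    for name, check in checks.items():
        try:
            check()
            services[name] = "healthy"
        except Exception as exc:  # any failure marks the service unhealthy
            services[name] = f"unhealthy: {exc}"
    healthy = all(status == "healthy" for status in services.values())
    body = {"status": "healthy" if healthy else "unhealthy", "services": services}
    return body, 200 if healthy else 503
```

The route would then pass callables such as `redis_client.ping` and return `jsonify(body), status_code`. Note that echoing exception text in a public endpoint can leak internal details, so a production version may prefer to log the error and return only the status.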

Deployment

Docker Configuration

Dockerfile:

FROM python:3.9-slim

WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user
RUN useradd --create-home --shell /bin/bash app
USER app

# Expose port
EXPOSE 5000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:5000/health || exit 1

# Start application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "app:app"]

docker-compose.yml:

version: '3.8'

services:
  chat-agent:
    build: .
    ports:
      - "5000:5000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/chatdb
      - REDIS_URL=redis://redis:6379/0
      - GROQ_API_KEY=${GROQ_API_KEY}
    depends_on:
      - db
      - redis
    volumes:
      - ./logs:/app/logs

  db:
    image: postgres:13
    environment:
      - POSTGRES_DB=chatdb
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:

Production Considerations

Security:

Use environment variables for sensitive configuration
Implement proper authentication and authorization
Enable HTTPS/TLS encryption
Regular security updates and vulnerability scanning

Scalability:

Horizontal scaling with load balancers
Database read replicas for heavy read workloads
Redis clustering for high availability
CDN for static assets

Monitoring:

Application performance monitoring (APM)
Log aggregation and analysis
Metrics collection and alerting
Health check endpoints

Contributing

Code Style

Python Code Style:

Follow PEP 8 guidelines
Use type hints where appropriate
Maximum line length: 88 characters (Black formatter)
Use meaningful variable and function names

Example:

from datetime import datetime
from typing import Any, Dict, Optional

def process_chat_message(
    session_id: str,
    message: str,
    language: str,
    metadata: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """
    Process a chat message and return response.
    
    Args:
        session_id: Unique session identifier
        message: User's chat message
        language: Programming language context
        metadata: Optional message metadata
    
    Returns:
        Dictionary containing response and metadata
    
    Raises:
        ValueError: If session_id is invalid
        APIError: If LLM API call fails
    """
    if not session_id:
        raise ValueError("Session ID is required")
    
    # Implementation here; the empty string is a placeholder so the example runs
    response_text = ""
    return {
        'response': response_text,
        'timestamp': datetime.utcnow().isoformat(),
        'language': language
    }

Pull Request Process

Fork the repository
Create feature branch: git checkout -b feature/new-feature
Make changes with tests
Run test suite: pytest
Update documentation
Submit pull request

Code Review Checklist

Code follows style guidelines
Tests are included and passing
Documentation is updated
No security vulnerabilities
Performance impact considered
Backward compatibility maintained

This developer guide provides comprehensive information for contributing to and extending the Multi-Language Chat Agent. For specific implementation details, refer to the source code and inline documentation.
