# Multi-Language Chat Agent - Developer Guide

## Architecture Overview

The Multi-Language Chat Agent is built using a modular architecture with the following key components:

```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │   WebSocket     │    │   Chat Agent    │
│   (HTML/JS)     │◄──►│   Handler       │◄──►│   Service       │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │                        │
                                ▼                        ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Session       │    │   Language      │    │   Groq LLM      │
│   Manager       │    │   Context       │    │   Client        │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                    ┌─────────────────┐
                    │   Chat History  │
                    │   Manager       │
                    └─────────────────┘
                                │
                    ┌─────────────────┐    ┌─────────────────┐
                    │   Redis Cache   │    │   PostgreSQL    │
                    │                 │    │   Database      │
                    └─────────────────┘    └─────────────────┘
```

## Core Components

### 1. Chat Agent Service (`chat_agent/services/chat_agent.py`)

The main orchestrator that coordinates all chat operations.

**Key Methods:**
- `process_message()`: Main message processing pipeline
- `switch_language()`: Handle language context switching
- `stream_response()`: Real-time response streaming

**Usage Example:**
```python
from chat_agent.services.chat_agent import ChatAgent

# Initialize chat agent
chat_agent = ChatAgent()

# Process a message
response = chat_agent.process_message(
    session_id="session-123",
    message="How do I create a Python list?",
    language="python"
)
```
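
The steps inside `process_message()` are not spelled out in this guide. As a rough sketch only, assuming the collaborators described in this section, the orchestration might look like the following. `StubHistory`, `StubLLM` and `SketchChatAgent` are hypothetical stand-ins written for illustration, not the project's real classes, and the actual order of steps may differ.

```python
from typing import Dict, List


class StubHistory:
    """Hypothetical in-memory stand-in for ChatHistoryManager."""

    def __init__(self) -> None:
        self._messages: Dict[str, List[dict]] = {}

    def store_message(self, session_id, role, content, language):
        self._messages.setdefault(session_id, []).append(
            {"role": role, "content": content, "language": language}
        )

    def get_recent_history(self, session_id, limit=10):
        return self._messages.get(session_id, [])[-limit:]


class StubLLM:
    """Hypothetical stand-in for GroqClient; echoes instead of calling an API."""

    def generate_response(self, prompt, chat_history, language_context):
        return f"[{language_context}] reply to: {prompt}"


class SketchChatAgent:
    """Illustrative orchestration only; the real ChatAgent may differ."""

    def __init__(self, groq_client, chat_history_manager):
        self.groq_client = groq_client
        self.chat_history_manager = chat_history_manager

    def process_message(self, session_id, message, language):
        # 1. Load recent context before storing the new message.
        history = self.chat_history_manager.get_recent_history(session_id)
        # 2. Persist the user message.
        self.chat_history_manager.store_message(session_id, "user", message, language)
        # 3. Ask the LLM, passing history and language context.
        reply = self.groq_client.generate_response(
            prompt=message, chat_history=history, language_context=language
        )
        # 4. Persist the assistant reply so later turns can see it.
        self.chat_history_manager.store_message(session_id, "assistant", reply, language)
        return reply


agent = SketchChatAgent(StubLLM(), StubHistory())
reply = agent.process_message("session-123", "How do I create a Python list?", "python")
```

Passing the collaborators into the constructor, as the stubs are here, is also what makes the unit-test style with mocks possible.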

### 2. Session Manager (`chat_agent/services/session_manager.py`)

Manages user sessions and chat state.

**Key Methods:**
- `create_session()`: Create new chat session
- `get_session()`: Retrieve session information
- `cleanup_inactive_sessions()`: Remove expired sessions

**Usage Example:**
```python
from chat_agent.services.session_manager import SessionManager

session_manager = SessionManager()

# Create new session
session = session_manager.create_session(
    user_id="user-123",
    language="python"
)

# Get session info
session_info = session_manager.get_session(session['session_id'])
```
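
To make the expiry behaviour of `cleanup_inactive_sessions()` concrete, here is a minimal in-memory sketch. It assumes a session counts as inactive once it has been idle longer than a timeout (3600 seconds here, matching the `SESSION_TIMEOUT` example value in this guide). `InMemorySessionStore`, the `last_active` field and the `now` parameter are hypothetical; the real `SessionManager` may store and expire sessions differently.

```python
import time
import uuid
from typing import Dict, Optional


class InMemorySessionStore:
    """Hypothetical in-memory store used only to illustrate expiry logic."""

    def __init__(self, timeout_seconds: int = 3600) -> None:
        self.timeout_seconds = timeout_seconds
        self._sessions: Dict[str, dict] = {}

    def create_session(self, user_id: str, language: str,
                       now: Optional[float] = None) -> dict:
        now = time.time() if now is None else now
        session = {
            "session_id": str(uuid.uuid4()),
            "user_id": user_id,
            "language": language,
            "last_active": now,
        }
        self._sessions[session["session_id"]] = session
        return session

    def get_session(self, session_id: str) -> Optional[dict]:
        return self._sessions.get(session_id)

    def cleanup_inactive_sessions(self, now: Optional[float] = None) -> int:
        """Delete sessions idle longer than the timeout; return how many were removed."""
        now = time.time() if now is None else now
        expired = [
            sid for sid, s in self._sessions.items()
            if now - s["last_active"] > self.timeout_seconds
        ]
        for sid in expired:
            del self._sessions[sid]
        return len(expired)


store = InMemorySessionStore(timeout_seconds=3600)
old = store.create_session("user-1", "python", now=0)
fresh = store.create_session("user-2", "javascript", now=3000)
removed = store.cleanup_inactive_sessions(now=4000)  # only `old` has been idle > 3600 s
```

Injecting `now` keeps the expiry logic testable without sleeping.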

### 3. Language Context Manager (`chat_agent/services/language_context.py`)

Handles programming language context and switching.

**Key Methods:**
- `set_language()`: Set current language for session
- `get_language()`: Get current language
- `get_language_prompt_template()`: Get language-specific prompts

**Usage Example:**
```python
from chat_agent.services.language_context import LanguageContextManager

lang_manager = LanguageContextManager()

# Set language context
lang_manager.set_language("session-123", "javascript")

# Get current language
current_lang = lang_manager.get_language("session-123")

# Get prompt template
template = lang_manager.get_language_prompt_template("python")
```

### 4. Chat History Manager (`chat_agent/services/chat_history.py`)

Manages persistent and cached chat history.

**Key Methods:**
- `store_message()`: Store message in DB and cache
- `get_recent_history()`: Get recent messages for context
- `get_full_history()`: Get complete conversation history

**Usage Example:**
```python
from chat_agent.services.chat_history import ChatHistoryManager

history_manager = ChatHistoryManager()

# Store a message
message_id = history_manager.store_message(
    session_id="session-123",
    role="user",
    content="What is Python?",
    language="python"
)

# Get recent history
recent = history_manager.get_recent_history("session-123", limit=10)
```

### 5. Groq Client (`chat_agent/services/groq_client.py`)

Handles integration with the Groq API through LangChain.

**Key Methods:**
- `generate_response()`: Generate LLM response
- `stream_response()`: Stream response generation
- `handle_api_errors()`: Error handling and fallbacks

**Usage Example:**
```python
from chat_agent.services.groq_client import GroqClient

groq_client = GroqClient(api_key="your-api-key")

# Recent messages for context, e.g. from ChatHistoryManager.get_recent_history()
recent_messages = []

# Generate response
response = groq_client.generate_response(
    prompt="Explain Python functions",
    chat_history=recent_messages,
    language_context="python"
)
```
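
The guide lists `handle_api_errors()` for "error handling and fallbacks" without showing the strategy. One common approach, sketched here purely as an assumption about what such handling could look like, is to retry transient failures with exponential backoff and return a fallback message once the retries are exhausted. `TransientAPIError`, `call_with_retry` and the fallback text are hypothetical names, not part of `GroqClient`.

```python
import time
from typing import Callable


class TransientAPIError(Exception):
    """Hypothetical error type standing in for a retryable API failure."""


def call_with_retry(
    call: Callable[[], str],
    retries: int = 3,
    base_delay: float = 0.5,
    fallback: str = "Sorry, the assistant is temporarily unavailable.",
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try `call` up to `retries` times with exponential backoff, then fall back."""
    for attempt in range(retries):
        try:
            return call()
        except TransientAPIError:
            # Back off before the next attempt, but not after the last one.
            if attempt < retries - 1:
                sleep(base_delay * (2 ** attempt))
    return fallback


attempts = {"count": 0}


def flaky_call() -> str:
    """Fails twice, then succeeds, to exercise the retry path."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientAPIError("rate limited")
    return "Functions are reusable blocks of code."


delays = []
result = call_with_retry(flaky_call, sleep=delays.append)  # records delays instead of sleeping
```

Only errors known to be transient should be retried; authentication or validation failures are better surfaced immediately.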

## Development Setup

### Prerequisites

- Python 3.8+
- PostgreSQL (for production) or SQLite (for development)
- Redis (for caching and session management)
- Groq API key

### Installation

1. **Clone the repository:**
```bash
git clone <repository-url>
cd multi-language-chat-agent
```

2. **Create virtual environment:**
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. **Install dependencies:**
```bash
pip install -r requirements.txt
```

4. **Set up environment variables:**
```bash
cp .env.example .env
# Edit .env with your configuration
```

5. **Initialize database:**
```bash
python init_db.py
```

6. **Run the application:**
```bash
python app.py
```

### Environment Configuration

**Required Environment Variables:**
```bash
# Groq API Configuration
GROQ_API_KEY=your-groq-api-key-here
GROQ_MODEL=mixtral-8x7b-32768

# Database Configuration
DATABASE_URL=postgresql://user:password@localhost/chatdb
# Or for SQLite: DATABASE_URL=sqlite:///instance/chat_agent.db

# Redis Configuration
REDIS_URL=redis://localhost:6379/0

# Flask Configuration
SECRET_KEY=your-secret-key-here
FLASK_ENV=development
```

**Optional Configuration:**
```bash
# Rate Limiting
RATE_LIMIT_ENABLED=true
RATE_LIMIT_PER_MINUTE=30

# Session Management
SESSION_TIMEOUT=3600  # 1 hour in seconds
CLEANUP_INTERVAL=300  # 5 minutes

# Logging
LOG_LEVEL=INFO
LOG_FILE=logs/chat_agent.log
```
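
How the application reads these variables is not shown here. A minimal sketch of one way to load them, failing fast when a required variable is missing, could look like this. The `Settings` dataclass and `load_settings` function are hypothetical, and treating the example values above as defaults for the optional variables is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_KEYS = ("GROQ_API_KEY", "DATABASE_URL", "REDIS_URL", "SECRET_KEY")


@dataclass(frozen=True)
class Settings:
    """Hypothetical typed view of the environment variables listed above."""
    groq_api_key: str
    database_url: str
    redis_url: str
    secret_key: str
    rate_limit_enabled: bool
    rate_limit_per_minute: int
    session_timeout: int
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment mapping; raise if required keys are missing."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return Settings(
        groq_api_key=env["GROQ_API_KEY"],
        database_url=env["DATABASE_URL"],
        redis_url=env["REDIS_URL"],
        secret_key=env["SECRET_KEY"],
        # Optional values: defaults mirror the examples above (an assumption).
        rate_limit_enabled=env.get("RATE_LIMIT_ENABLED", "true").lower() == "true",
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "30")),
        session_timeout=int(env.get("SESSION_TIMEOUT", "3600")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )


settings = load_settings({
    "GROQ_API_KEY": "test-key",
    "DATABASE_URL": "sqlite:///instance/chat_agent.db",
    "REDIS_URL": "redis://localhost:6379/0",
    "SECRET_KEY": "dev-secret",
    "SESSION_TIMEOUT": "600",
})
```

Accepting any mapping rather than reading `os.environ` directly lets tests supply a plain dictionary.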

## Testing

### Running Tests

**All Tests:**
```bash
pytest
```

**Specific Test Categories:**
```bash
# Unit tests
pytest tests/unit/

# Integration tests
pytest tests/integration/

# End-to-end tests
pytest tests/e2e/

# Performance tests
pytest tests/performance/
```

**With Coverage:**
```bash
pytest --cov=chat_agent --cov-report=html
```

### Test Structure

```
tests/
├── unit/                 # Unit tests for individual components
│   ├── test_chat_agent.py
│   ├── test_session_manager.py
│   └── test_language_context.py
├── integration/          # Integration tests
│   ├── test_chat_api.py
│   └── test_websocket_integration.py
├── e2e/                  # End-to-end workflow tests
│   └── test_complete_chat_workflow.py
└── performance/          # Load and performance tests
    └── test_load_testing.py
```

### Writing Tests

**Unit Test Example:**
```python
import pytest
from unittest.mock import Mock, patch
from chat_agent.services.chat_agent import ChatAgent


class TestChatAgent:
    @pytest.fixture
    def mock_dependencies(self):
        return {
            'groq_client': Mock(),
            'session_manager': Mock(),
            'language_context_manager': Mock(),
            'chat_history_manager': Mock()
        }

    def test_process_message_success(self, mock_dependencies):
        # Arrange
        chat_agent = ChatAgent(**mock_dependencies)
        mock_dependencies['groq_client'].generate_response.return_value = "Test response"

        # Act
        result = chat_agent.process_message("session-123", "Test message", "python")

        # Assert
        assert result == "Test response"
        mock_dependencies['groq_client'].generate_response.assert_called_once()
```

**Integration Test Example:**
```python
import pytest
from chat_agent.services.chat_agent import ChatAgent


class TestChatIntegration:
    @pytest.fixture
    def integrated_system(self):
        # Set up real components with test configuration
        return ChatAgent()

    def test_complete_chat_flow(self, integrated_system):
        # Test complete workflow with real components
        session_id = "test-session"
        response = integrated_system.process_message(
            session_id, "What is Python?", "python"
        )
        assert response is not None
        assert len(response) > 0
```

## API Development

### Adding New Endpoints

1. **Create route in `chat_agent/api/chat_routes.py`:**
```python
from datetime import datetime

from flask import g, jsonify

# chat_bp, require_auth, rate_limit, session_manager, chat_history_manager
# and logger are assumed to be defined elsewhere in this module.


@chat_bp.route('/sessions/<session_id>/export', methods=['GET'])
@require_auth
@rate_limit(per_minute=10)
def export_chat_history(session_id):
    """Export chat history for a session."""
    try:
        # Validate session ownership; return 404 either way so that
        # other users' session IDs are not revealed
        session = session_manager.get_session(session_id)
        if not session or session['user_id'] != g.user_id:
            return jsonify({'error': 'Session not found'}), 404

        # Get full history
        history = chat_history_manager.get_full_history(session_id)

        return jsonify({
            'session_id': session_id,
            'messages': history,
            'exported_at': datetime.utcnow().isoformat()
        })

    except Exception as e:
        logger.error(f"Export error: {e}")
        return jsonify({'error': 'Export failed'}), 500
```

2. **Add tests for the new endpoint:**
```python
def test_export_chat_history(self, client, auth_headers):
    # Create session and messages
    session_response = client.post('/api/v1/chat/sessions',
                                   headers=auth_headers,
                                   json={'language': 'python'})
    session_id = session_response.json['session_id']

    # Test export
    response = client.get(f'/api/v1/chat/sessions/{session_id}/export',
                          headers=auth_headers)

    assert response.status_code == 200
    assert 'messages' in response.json
```

3. **Update API documentation in `chat_agent/api/README.md`**
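
The `@rate_limit(per_minute=10)` decorator used in the route example is not defined in this guide. As an illustration of the logic such a decorator could wrap, here is a minimal in-memory sliding-window limiter. `SlidingWindowLimiter` and its `clock` parameter are hypothetical; a multi-worker deployment would likely need a shared store such as Redis rather than process-local state.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `per_minute` calls per key within any 60-second window."""

    def __init__(self, per_minute: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.per_minute = per_minute
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have left the 60-second window.
        while hits and now - hits[0] >= 60:
            hits.popleft()
        if len(hits) >= self.per_minute:
            return False
        hits.append(now)
        return True


fake_now = [0.0]  # mutable fake clock so the example runs without waiting
limiter = SlidingWindowLimiter(per_minute=2, clock=lambda: fake_now[0])
first = limiter.allow("user-123")
second = limiter.allow("user-123")
third = limiter.allow("user-123")         # third call inside the window is rejected
fake_now[0] = 61.0
after_window = limiter.allow("user-123")  # earlier hits have expired
```

A route decorator built on this would call `allow()` with the user or session identifier and return HTTP 429 when it yields `False`.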



### WebSocket Event Handling

**Adding New WebSocket Events:**
```python
# In chat_agent/websocket/chat_websocket.py
from datetime import datetime

from flask_socketio import emit

# socketio, session_manager, process_custom_logic and logger are assumed
# to be defined elsewhere in this module.


@socketio.on('custom_event')
def handle_custom_event(data):
    """Handle custom WebSocket event."""
    try:
        session_id = data.get('session_id')

        # Validate session
        if not session_manager.get_session(session_id):
            emit('error', {'error': 'Invalid session'})
            return

        # Process custom logic
        result = process_custom_logic(data)

        # Emit response
        emit('custom_response', {
            'session_id': session_id,
            'result': result,
            'timestamp': datetime.utcnow().isoformat()
        })

    except Exception as e:
        logger.error(f"Custom event error: {e}")
        emit('error', {'error': 'Processing failed'})
```



## Database Management

### Schema Migrations

**Creating Migrations:**
```python
# migrations/003_add_new_feature.py
def upgrade(connection):
    """Add new feature to database."""
    connection.execute("""
        ALTER TABLE messages
        ADD COLUMN sentiment_score FLOAT DEFAULT 0.0
    """)

    connection.execute("""
        CREATE INDEX idx_messages_sentiment
        ON messages(sentiment_score)
    """)


def downgrade(connection):
    """Remove new feature from database."""
    connection.execute("DROP INDEX idx_messages_sentiment")
    connection.execute("ALTER TABLE messages DROP COLUMN sentiment_score")
```

**Running Migrations:**
```bash
python migrations/migrate.py
```
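
The contents of `migrations/migrate.py` are not shown in this guide. The sketch below illustrates one way such a runner could work: it records applied versions in a bookkeeping table and applies only the migrations that are still pending. The `schema_migrations` table, the `run_migrations` function and the use of `sqlite3` are assumptions made for the example, not a description of the project's actual script.

```python
import sqlite3
from typing import Callable, List, Tuple

# A migration is a (version, upgrade function) pair.
Migration = Tuple[str, Callable[[sqlite3.Connection], None]]


def add_sentiment_score(connection: sqlite3.Connection) -> None:
    """Same column change as the upgrade() example above."""
    connection.execute(
        "ALTER TABLE messages ADD COLUMN sentiment_score FLOAT DEFAULT 0.0"
    )


def run_migrations(connection: sqlite3.Connection,
                   migrations: List[Migration]) -> List[str]:
    """Apply migrations not yet recorded in schema_migrations, in list order."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (version TEXT PRIMARY KEY)"
    )
    applied = {
        row[0] for row in connection.execute("SELECT version FROM schema_migrations")
    }
    newly_applied = []
    for version, upgrade in migrations:
        if version in applied:
            continue  # already applied on a previous run
        upgrade(connection)
        connection.execute(
            "INSERT INTO schema_migrations (version) VALUES (?)", (version,)
        )
        connection.commit()
        newly_applied.append(version)
    return newly_applied


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, content TEXT)")
migrations = [("003_add_new_feature", add_sentiment_score)]
first_run = run_migrations(conn, migrations)
second_run = run_migrations(conn, migrations)  # nothing left to apply
```

Recording each version as it is applied makes the runner safe to execute repeatedly, for example on every deployment.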

### Database Optimization

**Indexing Strategy:**
```sql
-- Session-based queries
CREATE INDEX idx_messages_session_timestamp ON messages(session_id, timestamp);

-- User-based queries
CREATE INDEX idx_sessions_user_active ON chat_sessions(user_id, is_active);

-- Language-based queries
CREATE INDEX idx_messages_language ON messages(language);

-- Full-text search (PostgreSQL)
CREATE INDEX idx_messages_content_fts ON messages USING gin(to_tsvector('english', content));
```

## Performance Optimization

### Caching Strategy

**Redis Caching:**
```python
import redis
import json
from datetime import timedelta


class CacheManager:
    def __init__(self, redis_url):
        self.redis_client = redis.from_url(redis_url)

    def cache_response(self, key, response, ttl=3600):
        """Cache LLM response."""
        self.redis_client.setex(
            key,
            ttl,
            json.dumps(response)
        )

    def get_cached_response(self, key):
        """Get cached response."""
        cached = self.redis_client.get(key)
        return json.loads(cached) if cached else None

    def cache_chat_history(self, session_id, messages):
        """Cache recent chat history."""
        key = f"history:{session_id}"
        self.redis_client.setex(
            key,
            1800,  # 30 minutes
            json.dumps(messages)
        )
```
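
`cache_response()` takes a `key` but the guide does not say how that key is built. One option, shown here as a sketch under that assumption, is to hash the request inputs so that identical prompts map to the same fixed-length key. `make_response_cache_key` and the `response:` prefix are hypothetical names.

```python
import hashlib
import json


def make_response_cache_key(language: str, prompt: str, model: str = "default") -> str:
    """Derive a stable, fixed-length cache key from the request inputs."""
    payload = json.dumps(
        {"language": language, "model": model, "prompt": prompt.strip()},
        sort_keys=True,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return f"response:{language}:{digest}"


key_a = make_response_cache_key("python", "Explain Python functions")
key_b = make_response_cache_key("python", "  Explain Python functions  ")
key_c = make_response_cache_key("javascript", "Explain Python functions")
# key_a and key_b match (whitespace is stripped); key_c differs by language.
```

A key built only from the prompt ignores conversation history, so this kind of caching suits context-free questions better than mid-conversation follow-ups.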

**Application-Level Caching:**
```python
from functools import lru_cache


class LanguageContextManager:
    # Note: lru_cache on a method includes `self` in the cache key and keeps
    # the instance alive, so this suits a single long-lived manager.
    @lru_cache(maxsize=128)
    def get_language_prompt_template(self, language):
        """Cache prompt templates in memory."""
        return self._load_prompt_template(language)

    @lru_cache(maxsize=64)
    def get_supported_languages(self):
        """Cache supported languages list."""
        return self._load_supported_languages()
```

### Database Connection Pooling

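```python
import os

from sqlalchemy import create_engine
from sqlalchemy.pool import QueuePool

# Read the connection string from the environment (see Environment Configuration)
DATABASE_URL = os.environ["DATABASE_URL"]

# Configure connection pool
engine = create_engine(
    DATABASE_URL,
    poolclass=QueuePool,
    pool_size=10,        # connections kept open in the pool
    max_overflow=20,     # extra connections allowed under load
    pool_pre_ping=True,  # test connections before use to drop stale ones
    pool_recycle=3600    # recycle connections older than one hour
)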
```

## Monitoring and Logging

### Structured Logging

```python
import logging
import json
from datetime import datetime


class StructuredLogger:
    def __init__(self, name):
        self.logger = logging.getLogger(name)

    def log_chat_interaction(self, session_id, user_message, response, language):
        """Log chat interaction with structured data."""
        log_data = {
            'event': 'chat_interaction',
            'session_id': session_id,
            'language': language,
            'user_message_length': len(user_message),
            'response_length': len(response),
            'timestamp': datetime.utcnow().isoformat()
        }

        self.logger.info(json.dumps(log_data))

    def log_error(self, error, context=None):
        """Log error with context."""
        log_data = {
            'event': 'error',
            'error_type': type(error).__name__,
            'error_message': str(error),
            'context': context or {},
            'timestamp': datetime.utcnow().isoformat()
        }

        self.logger.error(json.dumps(log_data))
```

### Health Checks

```python
from datetime import datetime

from flask import Blueprint, jsonify
from sqlalchemy import text

# db, redis_client and groq_client are assumed to be the application's
# shared database, Redis and Groq client instances.

health_bp = Blueprint('health', __name__)


@health_bp.route('/health')
def health_check():
    """Comprehensive health check."""
    health_status = {
        'status': 'healthy',
        'timestamp': datetime.utcnow().isoformat(),
        'services': {}
    }

    # Check database (text() is required for raw SQL in SQLAlchemy 2.x)
    try:
        db.session.execute(text('SELECT 1'))
        health_status['services']['database'] = 'healthy'
    except Exception as e:
        health_status['services']['database'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'

    # Check Redis
    try:
        redis_client.ping()
        health_status['services']['redis'] = 'healthy'
    except Exception as e:
        health_status['services']['redis'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'

    # Check Groq API
    try:
        # Simple API test
        groq_client.test_connection()
        health_status['services']['groq_api'] = 'healthy'
    except Exception as e:
        health_status['services']['groq_api'] = f'unhealthy: {e}'
        health_status['status'] = 'unhealthy'

    status_code = 200 if health_status['status'] == 'healthy' else 503
    return jsonify(health_status), status_code
```
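
The three checks above share the same try/except shape. As an optional refactoring sketch, the pattern can be driven from a table of named callables, which also makes it testable without Flask or live services. `run_health_checks` and the stand-in check functions are hypothetical.

```python
from typing import Callable, Dict, Tuple


def run_health_checks(checks: Dict[str, Callable[[], object]]) -> Tuple[dict, int]:
    """Run each named check; any exception marks that service and the app unhealthy."""
    report = {"status": "healthy", "services": {}}
    for name, check in checks.items():
        try:
            check()
            report["services"][name] = "healthy"
        except Exception as exc:
            report["services"][name] = f"unhealthy: {exc}"
            report["status"] = "unhealthy"
    status_code = 200 if report["status"] == "healthy" else 503
    return report, status_code


def failing_redis_ping() -> None:
    """Stand-in for redis_client.ping() when Redis is down."""
    raise ConnectionError("connection refused")


report, status_code = run_health_checks({
    "database": lambda: None,     # stand-in for a successful SELECT 1
    "redis": failing_redis_ping,
})
```

Note that returning raw exception text, as both versions do, can expose internal details; a public endpoint may prefer to log the error and return only the status.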

## Deployment

### Docker Configuration

**Dockerfile:**
```dockerfile
FROM python:3.9-slim

WORKDIR /app

# Install system dependencies (curl is required by the HEALTHCHECK below
# and is not included in the slim base image)
RUN apt-get update && apt-get install -y \
    gcc \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY . .

# Create non-root user
RUN useradd --create-home --shell /bin/bash app
USER app

# Expose port
EXPOSE 5000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:5000/health || exit 1

# Start application
CMD ["gunicorn", "--bind", "0.0.0.0:5000", "--workers", "4", "app:app"]
```

**docker-compose.yml:**
```yaml
version: '3.8'

services:
  chat-agent:
    build: .
    ports:
      - "5000:5000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/chatdb
      - REDIS_URL=redis://redis:6379/0
      - GROQ_API_KEY=${GROQ_API_KEY}
    depends_on:
      - db
      - redis
    volumes:
      - ./logs:/app/logs

  db:
    image: postgres:13
    environment:
      - POSTGRES_DB=chatdb
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:6-alpine
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:
```

### Production Considerations

**Security:**
- Use environment variables for sensitive configuration
- Implement proper authentication and authorization
- Enable HTTPS/TLS encryption
- Regular security updates and vulnerability scanning

**Scalability:**
- Horizontal scaling with load balancers
- Database read replicas for heavy read workloads
- Redis clustering for high availability
- CDN for static assets

**Monitoring:**
- Application performance monitoring (APM)
- Log aggregation and analysis
- Metrics collection and alerting
- Health check endpoints

## Contributing

### Code Style

**Python Code Style:**
- Follow PEP 8 guidelines
- Use type hints where appropriate
- Maximum line length: 88 characters (Black formatter)
- Use meaningful variable and function names

**Example:**
```python
from datetime import datetime
from typing import Any, Dict, Optional


def process_chat_message(
    session_id: str,
    message: str,
    language: str,
    metadata: Optional[Dict] = None
) -> Dict[str, Any]:
    """
    Process a chat message and return response.

    Args:
        session_id: Unique session identifier
        message: User's chat message
        language: Programming language context
        metadata: Optional message metadata

    Returns:
        Dictionary containing response and metadata

    Raises:
        ValueError: If session_id is invalid
        APIError: If LLM API call fails
    """
    if not session_id:
        raise ValueError("Session ID is required")

    # Implementation here; placeholder value so the example runs
    response_text = ""

    return {
        'response': response_text,
        'timestamp': datetime.utcnow().isoformat(),
        'language': language
    }
```

### Pull Request Process

1. **Fork the repository**
2. **Create feature branch:** `git checkout -b feature/new-feature`
3. **Make changes with tests**
4. **Run test suite:** `pytest`
5. **Update documentation**
6. **Submit pull request**

### Code Review Checklist

- [ ] Code follows style guidelines
- [ ] Tests are included and passing
- [ ] Documentation is updated
- [ ] No security vulnerabilities
- [ ] Performance impact considered
- [ ] Backward compatibility maintained

---

This developer guide provides comprehensive information for contributing to and extending the Multi-Language Chat Agent. For specific implementation details, refer to the source code and inline documentation.