# Technical Architecture
Technical documentation for Job Apply AI, a React SaaS application backed by a Flask REST API.
## Overview
The application uses a **monolithic architecture** with:
- **Frontend**: React 18 + TypeScript + Vite
- **Backend**: Flask REST API
- **Communication**: JSON over HTTP
- **State**: Zustand (frontend), Session (backend)
- **Styling**: Tailwind CSS + Framer Motion
## Application Flow
```
┌────────────────────────────────────────────────────────────┐
│                        USER BROWSER                        │
├────────────────────────────────────────────────────────────┤
│  React App (Port 3000)                                     │
│  ├── HomePage (landing page)                               │
│  ├── WorkflowPage (3-step wizard)                          │
│  ├── JobListPage (job browsing + selection)                │
│  └── SettingsModal (configuration)                         │
│                                                            │
│  State: Zustand Store                                      │
│  ├── jobs: Job[]                                           │
│  ├── cvTemplate: CVTemplate                                │
│  ├── tailoringMode: 'local' | 'api'                        │
│  └── notifications: Toast[]                                │
└────────────────────────────────────────────────────────────┘
                      ↓ (REST API Calls)
                      ↓ (JSON requests/responses)
                      ↓
┌────────────────────────────────────────────────────────────┐
│                  FLASK SERVER (Port 5050)                  │
├────────────────────────────────────────────────────────────┤
│  Route Handlers                                            │
│  ├── GET  /api/health                                      │
│  ├── GET  /api/config                                      │
│  ├── POST /api/upload-cv                                   │
│  ├── POST /api/search                                      │
│  ├── GET  /api/jobs                                        │
│  ├── POST /api/generate-cv/<id>                            │
│  ├── POST /api/generate-all-cvs                            │
│  └── GET  /api/download/<filename>                         │
│                                                            │
│  Business Logic                                            │
│  ├── LinkedInScraper (job collection)                      │
│  ├── CVAnalyzer (skill extraction)                         │
│  └── CVModifier (CV customization)                         │
│                                                            │
│  Data Storage                                              │
│  └── .runtime/ (local filesystem)                          │
│      ├── uploads/cv_templates/                             │
│      ├── uploads/cvs/generated_cvs/                        │
│      ├── uploads/jobs/excel_exports/                       │
│      └── uploads/session_state/json_files/                 │
└────────────────────────────────────────────────────────────┘
```
## Component Hierarchy
```
App
├── HomePage
│   ├── Header
│   ├── HeroSection
│   ├── FeaturesGrid
│   └── Footer
│
├── WorkflowPage
│   ├── Header
│   ├── ProgressSteps
│   ├── CVUpload (step 1)
│   ├── JobSearch (step 2)
│   └── ReviewSection (step 3)
│
├── JobListPage
│   ├── JobListHeader
│   ├── SelectionControls
│   ├── JobCard (repeated)
│   │   ├── JobHeader
│   │   ├── SkillBadges
│   │   ├── ExpandedDetails
│   │   └── GenerateButton
│   └── BatchProgress (floating)
│
├── SettingsModal
│   ├── TailoringModeSelector
│   ├── LLMProviderSelector
│   └── AdvancedOptions
│
└── Toast (notification)
```
## Data Models
### Frontend (TypeScript)
```typescript
interface Job {
  id: string;
  title: string;
  company: string;
  link: string;
  source: 'LinkedIn' | 'Indeed' | 'Other';
  posted_days_ago: number;
  description: string;
  matched_skills: string[];
  matched_categories: Record<string, string[]>;
  salary?: string;
  location?: string;
}

interface CVTemplate {
  filename: string;
  uploadedAt: string;
  size: number;
  content?: ArrayBuffer;
}

interface GeneratedCV {
  jobId: string;
  jobTitle: string;
  company: string;
  filename: string;
  url: string;
  generatedAt: string;
  status: 'success' | 'failed';
  error?: string;
}

interface AppState {
  jobs: Job[];
  cvTemplate: CVTemplate | null;
  tailoringMode: 'local' | 'api';
  isSearching: boolean;
  isGenerating: boolean;
  selectedJobIds: Set<string>;
  batchProgress: BatchProgress;
  notification: Toast | null;
  // ... setters and methods
}
```
### Backend (Python/JSON)
```python
# Job response from /api/search
{
  "jobs": [
    {
      "id": "job_0",
      "title": "Senior React Developer",
      "company": "Tech Corp",
      "link": "https://...",
      "source": "LinkedIn",
      "posted_days_ago": 3,
      "description": "Full job description...",
      "location": "San Francisco, CA",
      "matched_skills": ["React", "TypeScript", "REST API"],
      "matched_categories": {
        "Frameworks & Libraries": ["React"],
        "Programming Languages": ["TypeScript"],
        "Tools & Platforms": ["REST API"]
      }
    }
  ],
  "excel_file": "linkedin_jobs_2024-01-15_1705353600.xlsx"
}

# CV generation response from /api/generate-cv/<id>
{
  "success": true,
  "filename": "CV_20240115_120530_TechCorp_ReactDeveloper.docx",
  "job_title": "Senior React Developer",
  "company": "Tech Corp",
  "message": "CV generated successfully"
}
```
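
As an illustration of how the backend could mirror the frontend `Job` interface, here is a minimal sketch using a dataclass. The `Job` class and `parse_search_response` helper are hypothetical names, not part of the documented codebase, and the link in the sample is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Job:
    """Server-side mirror of the frontend Job interface (hypothetical helper)."""
    id: str
    title: str
    company: str
    link: str
    source: str
    posted_days_ago: int
    description: str
    matched_skills: List[str] = field(default_factory=list)
    matched_categories: Dict[str, List[str]] = field(default_factory=dict)
    salary: Optional[str] = None    # optional, as in the TypeScript interface
    location: Optional[str] = None  # optional, as in the TypeScript interface


def parse_search_response(payload: Dict[str, Any]) -> List[Job]:
    """Turn the "jobs" array of a /api/search response into Job objects."""
    return [Job(**raw) for raw in payload.get("jobs", [])]


sample = {
    "jobs": [{
        "id": "job_0",
        "title": "Senior React Developer",
        "company": "Tech Corp",
        "link": "https://example.com/job_0",  # placeholder URL
        "source": "LinkedIn",
        "posted_days_ago": 3,
        "description": "Full job description...",
        "location": "San Francisco, CA",
        "matched_skills": ["React", "TypeScript", "REST API"],
    }],
}
jobs = parse_search_response(sample)
print(jobs[0].title, jobs[0].matched_skills)
```

Note that `Job(**raw)` raises `TypeError` on unknown keys, which is a deliberate choice in this sketch: it surfaces drift between the two models early.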
## Request/Response Examples
### Upload CV
**Request:**
```
POST /api/upload-cv
Content-Type: multipart/form-data
file: (binary .docx file)
```
**Response:**
```json
{
  "success": true,
  "filename": "resume.docx",
  "message": "CV template uploaded successfully"
}
```
### Search Jobs
**Request:**
```json
POST /api/search
{
  "keyword": "React Developer",
  "location": "San Francisco",
  "max_jobs": 10,
  "max_days_old": 14,
  "tailoring_mode": "local"
}
```
**Response:**
```json
{
  "success": true,
  "jobs": [/* job objects */],
  "excel_file": "linkedin_jobs_2024-01-15_1234567890.xlsx",
  "message": "Found 10 jobs"
}
```
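
The search request could be validated before any scraping starts. The following is a sketch only: `validate_search_request` is a hypothetical helper, the defaults are assumptions copied from the example request, and every error message except the first is invented for illustration.

```python
from typing import Any, Dict, Optional, Tuple

DEFAULT_MAX_JOBS = 10      # assumption: mirrors the example request
DEFAULT_MAX_DAYS_OLD = 14  # assumption: mirrors the example request


def validate_search_request(
    payload: Dict[str, Any],
) -> Tuple[Optional[Dict[str, Any]], Optional[str]]:
    """Return (cleaned_request, None) on success or (None, error) on failure."""
    keyword = str(payload.get("keyword") or "").strip()
    location = str(payload.get("location") or "").strip()
    if not keyword or not location:
        return None, "Keyword and location are required"

    mode = payload.get("tailoring_mode", "local")
    if mode not in ("local", "api"):
        return None, "tailoring_mode must be 'local' or 'api'"

    try:
        max_jobs = int(payload.get("max_jobs", DEFAULT_MAX_JOBS))
        max_days_old = int(payload.get("max_days_old", DEFAULT_MAX_DAYS_OLD))
    except (TypeError, ValueError):
        return None, "max_jobs and max_days_old must be integers"
    if max_jobs < 1 or max_days_old < 0:
        return None, "max_jobs must be >= 1 and max_days_old must be >= 0"

    cleaned = {
        "keyword": keyword,
        "location": location,
        "max_jobs": max_jobs,
        "max_days_old": max_days_old,
        "tailoring_mode": mode,
    }
    return cleaned, None


request_body, error = validate_search_request(
    {"keyword": "React Developer", "location": "San Francisco"}
)
print(request_body, error)
```

A route handler would return the error string with a 400 status when the second element is not `None`.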
### Generate All CVs
**Request:**
```json
POST /api/generate-all-cvs
{
  "job_indices": [0, 2, 5]  // null for all
}
```
**Response:**
```json
{
  "success": true,
  "successful": [
    {
      "job_index": 0,
      "filename": "CV_..._TechCorp_ReactDev.docx",
      "job_title": "Senior React Developer",
      "company": "Tech Corp"
    }
  ],
  "failed": [
    {
      "job_index": 2,
      "error": "File processing error",
      "job_title": "Mid-level Engineer"
    }
  ],
  "zip_filename": "CVs_20240115_120530.zip",
  "total_generated": 2,
  "total_failed": 1
}
```
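
One way to assemble a response of this shape from per-job results is sketched below. `summarize_batch` is a hypothetical helper. It assumes that a result with an `error` key counts as failed, that `success` means at least one CV was generated, and it leaves out `zip_filename` because archive creation is not described here.

```python
from typing import Any, Dict, List


def summarize_batch(results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Split per-job results into successful/failed lists and count them."""
    successful = [r for r in results if not r.get("error")]
    failed = [r for r in results if r.get("error")]
    return {
        "success": len(successful) > 0,  # assumption: partial success still counts
        "successful": successful,
        "failed": failed,
        "total_generated": len(successful),
        "total_failed": len(failed),
    }


results = [
    {
        "job_index": 0,
        "filename": "CV_..._TechCorp_ReactDev.docx",
        "job_title": "Senior React Developer",
        "company": "Tech Corp",
    },
    {
        "job_index": 2,
        "error": "File processing error",
        "job_title": "Mid-level Engineer",
    },
]
summary = summarize_batch(results)
print(summary["total_generated"], summary["total_failed"])
```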
## State Management Flow
### Zustand Store Pattern
```typescript
import { create } from 'zustand';
import { persist } from 'zustand/middleware';

// 1. Define the store with state, setters, and actions
export const useJobStore = create<AppState>()(
  persist(
    (set, get) => ({
      jobs: [],
      selectedJobIds: new Set<string>(),
      setJobs: (jobs) => set({ jobs }),
      addJob: (job) => set((state) => ({
        jobs: [...state.jobs, job]
      })),
      toggleJobSelection: (jobId) => set((state) => {
        // Copy the Set so subscribers see a new reference
        const selectedJobIds = new Set(state.selectedJobIds);
        if (!selectedJobIds.delete(jobId)) selectedJobIds.add(jobId);
        return { selectedJobIds };
      }),
      // ... more methods
    }),
    {
      name: 'job-apply-store', // localStorage key
      partialize: (state) => ({ // what to persist
        tailoringMode: state.tailoringMode,
        llmProvider: state.llmProvider,
        // Don't persist large data
      })
    }
  )
);

// 2. Use in components
const MyComponent = () => {
  const { jobs, setJobs, isSearching } = useJobStore();
  // Component re-renders when state changes
};

// 3. Update state
await handleSearch(); // calls setJobs()
// Component automatically re-renders
```
### Session Flow
```
1. Browser → Upload CV → Flask saves to .runtime/uploads/
2. Browser → Search → Flask queries LinkedIn
3. Flask → Processes results → Saves to .runtime/uploads/session_state/{uuid}.json
4. Browser ← Get jobs from session
5. Browser → Generate CV → Flask reads job from session state
6. Flask → Modifies document → Saves to .runtime/uploads/cvs/
7. Browser ← Download CV
```
## File System Structure
```
.runtime/
├── uploads/
│   ├── cv_template_1705353600.docx
│   │
│   ├── cvs/
│   │   ├── CV_20240115_120530_TechCorp_ReactDeveloper.docx
│   │   ├── CV_20240115_120531_DataCorp_Engineer.docx
│   │   └── CVs_20240115_120530.zip
│   │
│   ├── jobs/
│   │   └── linkedin_jobs_2024-01-15_1705353600.xlsx
│   │
│   └── session_state/
│       └── c8f9d2e1-4b3c-5a7f-8e9c-2d4f6a8b9c1d.json
```
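
The generated CV names above appear to follow a `CV_<date>_<time>_<company>_<title>.docx` pattern. A sketch of a builder for that pattern follows; `build_cv_filename` is a hypothetical helper, and stripping every non-alphanumeric character is an assumption about how company and title are normalized.

```python
import re
from datetime import datetime


def _compact(text: str) -> str:
    """Keep only ASCII letters and digits, e.g. "Tech Corp" -> "TechCorp"."""
    return re.sub(r"[^A-Za-z0-9]", "", text)


def build_cv_filename(company: str, job_title: str, now: datetime) -> str:
    """Build a timestamped, filesystem-safe name for a generated CV."""
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return f"CV_{stamp}_{_compact(company)}_{_compact(job_title)}.docx"


name = build_cv_filename(
    "Tech Corp", "React Developer", datetime(2024, 1, 15, 12, 5, 30)
)
print(name)  # CV_20240115_120530_TechCorp_ReactDeveloper.docx
```

Because separators and dots are stripped from both inputs, the result cannot contain path components, whatever the scraped company name looks like.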
## API Error Handling
### Client-Side
```typescript
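try {
  const result = await jobsAPI.searchJobs(filters);
  setJobs(result.jobs);
} catch (error) {
  // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
  setNotification({
    type: 'error',
    message: error instanceof Error ? error.message : String(error)
  });
}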
```
### Server-Side
```python
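from flask import jsonify, request

# `app`, `scraper`, and `logger` are created elsewhere in the server module
@app.route('/api/search', methods=['POST'])
def api_search_jobs():
    try:
        # Validate input
        data = request.get_json(silent=True) or {}
        keyword = data.get('keyword')
        # The status code is the second element of the returned tuple, not an argument to jsonify()
        if not keyword: return jsonify({...}), 400
        # Process
        jobs = scraper.scrape_job_listings(...)
        # Return
        return jsonify({'success': True, 'jobs': jobs})
    except Exception as e:
        logger.error(str(e))
        return jsonify({'success': False, 'error': str(e)}), 500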
```
### Error Types
| Type | HTTP Code | User Message |
|------|----------|--------------|
| Missing required field | 400 | "Keyword and location are required" |
| Invalid file type | 400 | "Only .docx files are supported" |
| File not found | 404 | "CV template not found" |
| Server error | 500 | "An error occurred. Please try again" |
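
The table can be expressed as a small exception hierarchy, so each error type carries its own status code and user message. This is a sketch under the assumption that handlers raise typed exceptions; the class names and `to_error_response` are hypothetical.

```python
from typing import Any, Dict, Tuple


class ApiError(Exception):
    """Base error; unknown failures fall back to these values."""
    status = 500
    user_message = "An error occurred. Please try again"


class MissingFieldError(ApiError):
    status = 400
    user_message = "Keyword and location are required"


class InvalidFileTypeError(ApiError):
    status = 400
    user_message = "Only .docx files are supported"


class TemplateNotFoundError(ApiError):
    status = 404
    user_message = "CV template not found"


def to_error_response(exc: Exception) -> Tuple[Dict[str, Any], int]:
    """Map any exception to a (JSON body, HTTP status) pair."""
    if isinstance(exc, ApiError):
        return {"success": False, "error": exc.user_message}, exc.status
    # Unexpected errors get the generic message so internals are not exposed
    return {"success": False, "error": ApiError.user_message}, ApiError.status


body, status = to_error_response(TemplateNotFoundError())
print(status, body["error"])
```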
## Performance Considerations
### Frontend
- **Code Splitting**: Route-based chunk splitting via Vite
- **Lazy Loading**: Components load on demand
- **Memoization**: React.memo for expensive components
- **Debouncing**: Search input debounced
- **CSS**: Tailwind purges unused styles
### Backend
- **Batch Operations**: Process multiple CVs in single request
- **Session Caching**: Job data cached in .runtime/
- **Connection Pooling**: Selenium reuses browser window
- **Async**: Non-blocking operations where possible
### Network
- **JSON**: Compact data format vs XML/Form
- **Compression**: Gzip enabled by default
- **Caching**: ETag headers for static assets
- **Streaming**: Files streamed for download
## Security Measures
### Input Validation
```python
# File type check
if not file.filename.endswith('.docx'):
    return error
# Size limit
if file.size > 10_000_000:
    return error
# Path traversal prevention
if '..' in filename or '/' in filename:
    return error
```
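
The three checks above can be combined into one function that works on the filename and the raw bytes, independent of the web framework. This is a sketch: `validate_upload` is a hypothetical helper, the backslash check and the case-insensitive extension match are additions, and two of the error messages are invented for illustration.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 10_000_000  # same limit as the size check above


def validate_upload(filename: str, data: bytes) -> Optional[str]:
    """Return an error message, or None when the upload passes every check."""
    # Path traversal prevention: reject separators and parent references
    if not filename or ".." in filename or "/" in filename or "\\" in filename:
        return "Invalid filename"
    # File type check (case-insensitive, so "Resume.DOCX" is accepted)
    if not filename.lower().endswith(".docx"):
        return "Only .docx files are supported"
    # Size limit
    if len(data) > MAX_UPLOAD_BYTES:
        return "File is larger than 10 MB"
    return None


print(validate_upload("resume.docx", b"example bytes"))
print(validate_upload("../resume.docx", b"example bytes"))
```

An extension check alone does not prove the content is a Word document, so a production handler would likely inspect the file contents as well.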
### Session Security
```python
# Unique session key per run
app.config['SESSION_COOKIE_NAME'] = f"job_apply_ai_{int(time.time())}"
# Secure path storage
_session_state_path(state_id)  # Safe path construction
# CORS configuration (a wildcard origin accepts requests from any site; restrict it in production)
CORS(app, resources={r"/api/*": {"origins": "*"}})
```
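
The body of `_session_state_path` is not shown here. One plausible approach to safe path construction, sketched below with a hypothetical `session_state_path` function, is to accept only canonical UUID strings, so that a client-supplied id can never contain path separators.

```python
import uuid
from pathlib import Path


def session_state_path(root: Path, state_id: str) -> Path:
    """Build the session file path, rejecting anything that is not a canonical UUID."""
    try:
        canonical = str(uuid.UUID(state_id))
    except (AttributeError, TypeError, ValueError):
        raise ValueError("invalid session id") from None
    # uuid.UUID() also accepts braces, "urn:uuid:" prefixes, and uppercase;
    # requiring an exact match keeps the accepted format narrow.
    if canonical != state_id:
        raise ValueError("invalid session id")
    return root / "uploads" / "session_state" / f"{canonical}.json"


path = session_state_path(
    Path(".runtime"), "c8f9d2e1-4b3c-5a7f-8e9c-2d4f6a8b9c1d"
)
print(path.name)

try:
    session_state_path(Path(".runtime"), "../../etc/passwd")
except ValueError as exc:
    print("rejected:", exc)
```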
### Data Protection
- No passwords stored
- No PII logged
- Uploaded files deleted after processing (optional)
- Session files cleaned up
## Scaling Architecture
### Horizontal Scaling
```
User → Load Balancer
        ├── Flask Server 1 → Shared Session Store (Redis)
        ├── Flask Server 2 → Shared File Storage (S3/NFS)
        └── Flask Server 3
```
### Vertical Scaling
```
Single Machine
├── Increase Worker Processes
├── Add RAM for larger job batches
├── SSD for faster CV processing
└── Dedicated GPU for future AI tasks
```
## Deployment Targets
### Development
- Vite dev server on :3000
- Flask dev server on :5050
- Hot reload enabled
### Staging
- Docker containers
- Cloud platform (AWS/GCP/Azure)
- Full testing suite
### Production
- Built React app on :3000 (or CDN)
- Gunicorn server on :5050
- Database for sessions
- Cloud storage for files
## Future Architectural Improvements
### Phase 2
- [ ] Authentication/Authorization system
- [ ] Database (PostgreSQL) for persistent storage
- [ ] Redis for session caching and queue
- [ ] Celery for async job processing
- [ ] WebSocket for real-time progress
### Phase 3
- [ ] Microservices architecture
- [ ] Kubernetes orchestration
- [ ] Message queue (RabbitMQ)
- [ ] API rate limiting
- [ ] Advanced caching
### Phase 4
- [ ] GraphQL API option
- [ ] Event streaming (Kafka)
- [ ] Service mesh (Istio)
- [ ] Distributed tracing
- [ ] Advanced analytics
## Technology Decision Rationale
| Choice | Why |
|--------|-----|
| React | Large ecosystem, component reusability, strong community |
| Zustand | Simple, no boilerplate compared to Redux |
| Tailwind | Fast development, consistent design system |
| Framer Motion | Smooth animations, good performance, gentle learning curve |
| Vite | Fast builds, excellent DX, modern tooling |
| Flask | Lightweight, Pythonic, good for API + rendering |
| pandas | Data processing, Excel export |
| Selenium | Web automation, JS-heavy site support |
| spaCy | NLP, good performance, pre-trained models |
## Monitoring & Observability
### Logging
```python
logger.info("Job search initiated")
logger.warning("CV generation slow")
logger.error("Scraping failed")
```
### Metrics to Track
- Page load times
- API response times
- CV generation duration
- Success/failure rates
- File upload sizes
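
Durations such as CV generation time can be captured with a small timing context manager built on the standard `logging` module. This is a sketch: the `timed` helper, the logger name, and the 5-second slow threshold are assumptions.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("job_apply_ai")  # assumed logger name
SLOW_THRESHOLD_SECONDS = 5.0                # assumed threshold for a warning


@contextmanager
def timed(operation: str, slow_after: float = SLOW_THRESHOLD_SECONDS):
    """Log how long the wrapped block took; warn if slow, error if it raised."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        logger.error("%s failed after %.2fs", operation, time.perf_counter() - start)
        raise
    else:
        elapsed = time.perf_counter() - start
        level = logging.WARNING if elapsed > slow_after else logging.INFO
        logger.log(level, "%s took %.2fs", operation, elapsed)


logging.basicConfig(level=logging.INFO)
with timed("CV generation"):
    time.sleep(0.01)  # stand-in for the real work
```

The same wrapper fits around API handlers and scraping calls, which covers the response-time and success/failure metrics in the list above.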
### Debugging
```
Browser DevTools → Network → Check API responses
Browser Console → Check React errors
Flask Terminal → Check server logs
Browser Storage → Check localStorage/state
```
---
This architecture provides a solid foundation for growth and future enhancements while maintaining simplicity and ease of development.