Pujan Neupane committed on
Commit
72b7684
·
unverified ·
2 Parent(s): 2609d89 1e91f18

Merge pull request #21 from cyberalertnepal/PujanDev

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .env-example +0 -0
  2. .gitignore +6 -0
  3. Dockerfile +10 -1
  4. Procfile +0 -0
  5. README.md +149 -6
  6. __init__.py +0 -0
  7. app.py +38 -13
  8. config.py +0 -0
  9. docs/api_endpoints.md +31 -14
  10. docs/deployment.md +3 -0
  11. docs/detector/ELA.md +65 -0
  12. docs/detector/fft.md +136 -0
  13. docs/detector/meta.md +20 -0
  14. docs/detector/note-for-backend.md +94 -0
  15. docs/features/image_classifier.md +31 -0
  16. docs/features/nepali_text_classifier.md +30 -0
  17. docs/features/text_classifier.md +30 -0
  18. docs/functions.md +10 -1
  19. docs/nestjs_integration.md +1 -0
  20. docs/security.md +1 -0
  21. docs/setup.md +1 -0
  22. docs/status_code.md +68 -0
  23. docs/structure.md +51 -31
  24. features/image_classifier/__init__.py +0 -0
  25. features/image_classifier/controller.py +16 -0
  26. features/image_classifier/inferencer.py +42 -0
  27. features/image_classifier/model_loader.py +58 -0
  28. features/image_classifier/preprocess.py +26 -0
  29. features/image_classifier/routes.py +26 -0
  30. features/image_edit_detector/controller.py +49 -0
  31. features/image_edit_detector/detectors/ela.py +32 -0
  32. features/image_edit_detector/detectors/fft.py +40 -0
  33. features/image_edit_detector/detectors/metadata.py +82 -0
  34. features/image_edit_detector/preprocess.py +9 -0
  35. features/image_edit_detector/routes.py +53 -0
  36. features/nepali_text_classifier/__init__.py +0 -0
  37. features/nepali_text_classifier/controller.py +0 -1
  38. features/nepali_text_classifier/inferencer.py +0 -0
  39. features/nepali_text_classifier/model_loader.py +1 -1
  40. features/nepali_text_classifier/preprocess.py +6 -8
  41. features/nepali_text_classifier/routes.py +0 -0
  42. features/text_classifier/__init__.py +0 -0
  43. features/text_classifier/controller.py +0 -0
  44. features/text_classifier/inferencer.py +0 -0
  45. features/text_classifier/model_loader.py +1 -1
  46. features/text_classifier/preprocess.py +0 -0
  47. features/text_classifier/routes.py +0 -0
  48. license.md +20 -0
  49. readme.md +0 -35
  50. requirements.txt +7 -0
.env-example CHANGED
File without changes
.gitignore CHANGED
@@ -60,3 +60,9 @@ models/.gitattributes #<-- This line can stay if you only want to ignore that f
60
 
61
  todo.md
62
  np_text_model
 
 
 
 
 
 
 
60
 
61
  todo.md
62
  np_text_model
63
+ IMG_Models
64
+ notebooks
65
+ # Ignore model and tokenizer files
66
+ np_text_model/classifier/sentencepiece.bpe.model
67
+ np_text_model/classifier/tokenizer.json
68
+
Dockerfile CHANGED
@@ -1,12 +1,19 @@
1
  # Read the doc: https://huggingface.co/docs/hub/spaces-sdks-docker
2
  # you will also find guides on how best to write your Dockerfile
3
 
4
- FROM python:3.9
5
 
 
6
  RUN useradd -m -u 1000 user
 
 
 
 
 
7
  USER user
8
  ENV PATH="/home/user/.local/bin:$PATH"
9
 
 
10
  WORKDIR /app
11
 
12
  COPY --chown=user ./requirements.txt requirements.txt
@@ -14,4 +21,6 @@ RUN pip install --no-cache-dir --upgrade -r requirements.txt
14
  RUN python -m spacy download en_core_web_sm || echo "Failed to download model"
15
 
16
  COPY --chown=user . /app
 
17
  CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
 
 
1
  # Read the doc: https://huggingface.co/docs/hub/spaces-sdks-docker
2
  # you will also find guides on how best to write your Dockerfile
3
 
4
+ FROM python:3.10
5
 
6
+ # Create user first
7
  RUN useradd -m -u 1000 user
8
+
9
+ # Install system dependencies (requires root)
10
+ RUN apt-get update && apt-get install -y libgl1
11
+
12
+ # Switch to non-root user
13
  USER user
14
  ENV PATH="/home/user/.local/bin:$PATH"
15
 
16
+ # Add TensorFlow environment variables to reduce logging noise
17
  WORKDIR /app
18
 
19
  COPY --chown=user ./requirements.txt requirements.txt
 
21
  RUN python -m spacy download en_core_web_sm || echo "Failed to download model"
22
 
23
  COPY --chown=user . /app
24
+
25
  CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
26
+
Procfile CHANGED
File without changes
README.md CHANGED
@@ -1,9 +1,152 @@
1
  ---
2
- title: Ai-Checker
3
- emoji: 🚀
4
- colorFrom: yellow
5
- colorTo: blue
6
- sdk: docker
7
- pinned: false
8
  ---
9
1
+ # AI-Contain-Checker
2
+
3
+ A modular AI content detection system with support for **image classification**, **image edit detection**, **Nepali text classification**, and **general text classification**. Built for performance and extensibility, it is ideal for detecting AI-generated content in both visual and textual forms.
4
+
5
+
6
+ ## 🌟 Features
7
+
8
+ ### 🖼️ Image Classifier
9
+
10
+ * **Purpose**: Classifies whether an image is AI-generated or a real-life photo.
11
+ * **Model**: Fine-tuned **InceptionV3** CNN.
12
+ * **Dataset**: Custom curated dataset with **\~79,950 images** for binary classification.
13
+ * **Location**: [`features/image_classifier`](features/image_classifier)
14
+ * **Docs**: [`docs/features/image_classifier.md`](docs/features/image_classifier.md)
15
+
16
+ ### 🖌️ Image Edit Detector
17
+
18
+ * **Purpose**: Detects image tampering or post-processing.
19
+ * **Techniques Used**:
20
+
21
+ * **Error Level Analysis (ELA)**: Visualizes compression artifacts.
22
+ * **Fast Fourier Transform (FFT)**: Detects unnatural frequency patterns.
23
+ * **Location**: [`features/image_edit_detector`](features/image_edit_detector)
24
+ * **Docs**:
25
+
26
+ * [ELA](docs/detector/ELA.md)
27
+ * [FFT](docs/detector/fft.md)
28
+ * [Metadata Analysis](docs/detector/meta.md)
29
+ * [Backend Notes](docs/detector/note-for-backend.md)
30
+
31
+ ### 📝 Nepali Text Classifier
32
+
33
+ * **Purpose**: Determines if Nepali text content is AI-generated or written by a human.
34
+ * **Model**: Based on `XLMRClassifier` fine-tuned on Nepali language data.
35
+ * **Dataset**: Scraped dataset of **\~18,000** Nepali texts.
36
+ * **Location**: [`features/nepali_text_classifier`](features/nepali_text_classifier)
37
+ * **Docs**: [`docs/features/nepali_text_classifier.md`](docs/features/nepali_text_classifier.md)
38
+
39
+ ### 🌐 English Text Classifier
40
+
41
+ * **Purpose**: Detects if English text is AI-generated or human-written.
42
+ * **Pipeline**:
43
+
44
+ * Uses **GPT2 tokenizer** for input preprocessing.
45
+ * Custom binary classifier to differentiate between AI and human-written content.
46
+ * **Location**: [`features/text_classifier`](features/text_classifier)
47
+ * **Docs**: [`docs/features/text_classifier.md`](docs/features/text_classifier.md)
48
+
49
  ---
50
+
51
+ ## 🗂️ Project Structure
52
+
53
+ ```bash
+ AI-Checker/
+ │
+ ├── app.py                  # Main FastAPI entry point
+ ├── config.py               # Configuration settings
+ ├── Dockerfile              # Docker build script
+ ├── Procfile                # Deployment file for Heroku or similar
+ ├── requirements.txt        # Python dependencies
+ ├── README.md               # You are here 📘
+ │
+ ├── features/               # Core detection modules
+ │   ├── image_classifier/
+ │   ├── image_edit_detector/
+ │   ├── nepali_text_classifier/
+ │   └── text_classifier/
+ │
+ ├── docs/                   # Internal and API documentation
+ │   ├── api_endpoints.md
+ │   ├── deployment.md
+ │   ├── detector/
+ │   │   ├── ELA.md
+ │   │   ├── fft.md
+ │   │   ├── meta.md
+ │   │   └── note-for-backend.md
+ │   ├── functions.md
+ │   ├── nestjs_integration.md
+ │   ├── security.md
+ │   ├── setup.md
+ │   └── structure.md
+ │
+ ├── IMG_Models/             # Saved image classifier model(s)
+ │   └── latest-my_cnn_model.h5
+ │
+ ├── notebooks/              # Experimental and debug notebooks
+ ├── static/                 # Static assets if needed
+ └── test.md                 # Test notes
+ ```
90
+
91
  ---
92
 
93
+ ## 📚 Documentation Links
94
+
95
+ * [API Endpoints](docs/api_endpoints.md)
96
+ * [Deployment Guide](docs/deployment.md)
97
+ * [Detector Documentation](docs/detector/)
98
+
99
+ * [Error Level Analysis (ELA)](docs/detector/ELA.md)
100
+ * [Fast Fourier Transform (FFT)](docs/detector/fft.md)
101
+ * [Metadata Analysis](docs/detector/meta.md)
102
+ * [Backend Notes](docs/detector/note-for-backend.md)
103
+ * [Functions Overview](docs/functions.md)
104
+ * [NestJS Integration Guide](docs/nestjs_integration.md)
105
+ * [Security Details](docs/security.md)
106
+ * [Setup Instructions](docs/setup.md)
107
+ * [Project Structure](docs/structure.md)
108
+
109
+ ---
110
+
111
+ ## 🚀 Usage
112
+
113
+ 1. **Install dependencies**
114
+
115
+ ```bash
116
+ pip install -r requirements.txt
117
+ ```
118
+
119
+ 2. **Run the API**
120
+
121
+ ```bash
122
+ uvicorn app:app --reload
123
+ ```
124
+
125
+ 3. **Build Docker (optional)**
126
+
127
+ ```bash
128
+ docker build -t ai-contain-checker .
129
+ docker run -p 7860:7860 ai-contain-checker  # the Dockerfile CMD serves on port 7860
130
+ ```
131
+
132
+ ---
133
+
134
+ ## 🔐 Security & Integration
135
+
136
+ * **Token Authentication** and **IP Whitelisting** supported.
137
+ * NestJS integration guide: [`docs/nestjs_integration.md`](docs/nestjs_integration.md)
138
+ * Rate limiting handled using `slowapi`.
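The token check mentioned above can be illustrated with a small standard-library sketch. It is not the project's actual authentication code: the helper name `is_authorized_sketch` is hypothetical, and it only assumes the `Authorization: Bearer <SECRET_TOKEN>` header format shown in the API docs.

```python
import hmac
from typing import Optional


def is_authorized_sketch(authorization_header: Optional[str], secret_token: str) -> bool:
    """Return True only for a header of the form 'Bearer <secret_token>'.

    Hypothetical helper for illustration; the real service may validate
    tokens differently.
    """
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.strip().encode(), secret_token.encode())
```

A route handler could call such a helper and respond with `403 Forbidden` when it returns `False`, which matches the behaviour described in `docs/security.md`.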
139
+
140
+ ---
141
+
142
+ ## 🛡️ Future Plans
143
+
144
+ * Add **video classifier** module.
145
+ * Expand dataset for **multilingual** AI content detection.
146
+ * Add **fine-tuning UI** for models.
147
+
148
+ ---
149
+
150
+ ## πŸ“„ License
151
+
152
+ See full license terms here: [`LICENSE.md`](license.md)
__init__.py CHANGED
File without changes
app.py CHANGED
@@ -1,37 +1,62 @@
1
  from fastapi import FastAPI, Request
2
  from slowapi import Limiter, _rate_limit_exceeded_handler
 
3
  from slowapi.middleware import SlowAPIMiddleware
4
  from slowapi.errors import RateLimitExceeded
5
  from slowapi.util import get_remote_address
6
  from fastapi.responses import JSONResponse
7
  from features.text_classifier.routes import router as text_classifier_router
8
- from features.nepali_text_classifier.routes import router as nepali_text_classifier_router
 
 
 
 
 
 
9
  from config import ACCESS_RATE
 
10
  import requests
 
11
  limiter = Limiter(key_func=get_remote_address, default_limits=[ACCESS_RATE])
12
 
13
  app = FastAPI()
14
-
15
  # Set up SlowAPI
16
  app.state.limiter = limiter
17
- app.add_exception_handler(RateLimitExceeded, lambda request, exc: JSONResponse(
18
- status_code=429,
19
- content={
20
- "status_code": 429,
21
- "error": "Rate limit exceeded",
22
- "message": "Too many requests. Chill for a bit and try again"
23
- }
24
- ))
 
 
 
25
  app.add_middleware(SlowAPIMiddleware)
26
 
27
  # Include your routes
28
  app.include_router(text_classifier_router, prefix="/text")
29
- app.include_router(nepali_text_classifier_router,prefix="/NP")
 
 
 
 
30
  @app.get("/")
31
  @limiter.limit(ACCESS_RATE)
32
  async def root(request: Request):
33
  return {
34
  "message": "API is working",
35
- "endpoints": ["/text/analyse", "/text/upload", "/text/analyse-sentences", "/text/analyse-sentance-file"]
 
 
 
 
 
 
 
 
 
 
36
  }
37
-
 
1
  from fastapi import FastAPI, Request
2
  from slowapi import Limiter, _rate_limit_exceeded_handler
3
+ from fastapi.responses import FileResponse
4
  from slowapi.middleware import SlowAPIMiddleware
5
  from slowapi.errors import RateLimitExceeded
6
  from slowapi.util import get_remote_address
7
  from fastapi.responses import JSONResponse
8
  from features.text_classifier.routes import router as text_classifier_router
9
+ from features.nepali_text_classifier.routes import (
10
+ router as nepali_text_classifier_router,
11
+ )
12
+ from features.image_classifier.routes import router as image_classifier_router
13
+ from features.image_edit_detector.routes import router as image_edit_detector_router
14
+ from fastapi.staticfiles import StaticFiles
15
+
16
  from config import ACCESS_RATE
17
+
18
  import requests
19
+
20
  limiter = Limiter(key_func=get_remote_address, default_limits=[ACCESS_RATE])
21
 
22
  app = FastAPI()
23
+ # added the robots.txt
24
  # Set up SlowAPI
25
  app.state.limiter = limiter
26
+ app.add_exception_handler(
27
+ RateLimitExceeded,
28
+ lambda request, exc: JSONResponse(
29
+ status_code=429,
30
+ content={
31
+ "status_code": 429,
32
+ "error": "Rate limit exceeded",
33
+ "message": "Too many requests. Chill for a bit and try again",
34
+ },
35
+ ),
36
+ )
37
  app.add_middleware(SlowAPIMiddleware)
38
 
39
  # Include your routes
40
  app.include_router(text_classifier_router, prefix="/text")
41
+ app.include_router(nepali_text_classifier_router, prefix="/NP")
42
+ app.include_router(image_classifier_router, prefix="/AI-image")
43
+ app.include_router(image_edit_detector_router, prefix="/detect")
44
+
45
+
46
  @app.get("/")
47
  @limiter.limit(ACCESS_RATE)
48
  async def root(request: Request):
49
  return {
50
  "message": "API is working",
51
+ "endpoints": [
52
+ "/text/analyse",
53
+ "/text/upload",
54
+ "/text/analyse-sentences",
55
+ "/text/analyse-sentance-file",
56
+ "/NP/analyse",
57
+ "/NP/upload",
58
+ "/NP/analyse-sentences",
59
+ "/NP/file-sentences-analyse",
60
+ "/AI-image/analyse",
61
+ ],
62
  }
 
config.py CHANGED
File without changes
docs/api_endpoints.md CHANGED
@@ -2,13 +2,13 @@
2
 
3
  ### English (GPT-2) - `/text/`
4
 
5
- | Endpoint | Method | Description |
6
- | --------------------------------- | ------ | ----------------------------------------- |
7
- | `/text/analyse` | POST | Classify raw English text |
8
- | `/text/analyse-sentences` | POST | Sentence-by-sentence breakdown |
9
- | `/text/analyse-sentance-file` | POST | Upload file, per-sentence breakdown |
10
- | `/text/upload` | POST | Upload file for overall classification |
11
- | `/text/health` | GET | Health check |
12
 
13
  #### Example: Classify English text
14
 
@@ -20,6 +20,7 @@ curl -X POST http://localhost:8000/text/analyse \
20
  ```
21
 
22
  **Response:**
 
23
  ```json
24
  {
25
  "result": "AI-generated",
@@ -40,13 +41,13 @@ curl -X POST http://localhost:8000/text/upload \
40
 
41
  ### Nepali (SentencePiece) - `/NP/`
42
 
43
- | Endpoint | Method | Description |
44
- | --------------------------------- | ------ | ----------------------------------------- |
45
- | `/NP/analyse` | POST | Classify Nepali text |
46
- | `/NP/analyse-sentences` | POST | Sentence-by-sentence breakdown |
47
- | `/NP/upload` | POST | Upload Nepali PDF for classification |
48
- | `/NP/file-sentences-analyse` | POST | PDF upload, per-sentence breakdown |
49
- | `/NP/health` | GET | Health check |
50
 
51
  #### Example: Nepali text classification
52
 
@@ -58,6 +59,7 @@ curl -X POST http://localhost:8000/NP/analyse \
58
  ```
59
 
60
  **Response:**
 
61
  ```json
62
  {
63
  "label": "Human",
@@ -73,3 +75,18 @@ curl -X POST http://localhost:8000/NP/upload \
73
  -F 'file=@NepaliText.pdf;type=application/pdf'
74
  ```
75
2
 
3
  ### English (GPT-2) - `/text/`
4
 
5
+ | Endpoint | Method | Description |
6
+ | ----------------------------- | ------ | -------------------------------------- |
7
+ | `/text/analyse` | POST | Classify raw English text |
8
+ | `/text/analyse-sentences` | POST | Sentence-by-sentence breakdown |
9
+ | `/text/analyse-sentance-file` | POST | Upload file, per-sentence breakdown |
10
+ | `/text/upload` | POST | Upload file for overall classification |
11
+ | `/text/health` | GET | Health check |
12
 
13
  #### Example: Classify English text
14
 
 
20
  ```
21
 
22
  **Response:**
23
+
24
  ```json
25
  {
26
  "result": "AI-generated",
 
41
 
42
  ### Nepali (SentencePiece) - `/NP/`
43
 
44
+ | Endpoint | Method | Description |
45
+ | ---------------------------- | ------ | ------------------------------------ |
46
+ | `/NP/analyse` | POST | Classify Nepali text |
47
+ | `/NP/analyse-sentences` | POST | Sentence-by-sentence breakdown |
48
+ | `/NP/upload` | POST | Upload Nepali PDF for classification |
49
+ | `/NP/file-sentences-analyse` | POST | PDF upload, per-sentence breakdown |
50
+ | `/NP/health` | GET | Health check |
51
 
52
  #### Example: Nepali text classification
53
 
 
59
  ```
60
 
61
  **Response:**
62
+
63
  ```json
64
  {
65
  "label": "Human",
 
75
  -F 'file=@NepaliText.pdf;type=application/pdf'
76
  ```
77
 
78
+ ### Image Classification - `/AI-image/`
79
+
80
+ | Endpoint            | Method | Description                               |
81
+ | ------------------- | ------ | ----------------------------------------- |
82
+ | `/AI-image/analyse` | POST   | Classify an image as AI-generated or real |
83
+
84
+ #### Example: Image classification
85
+
86
+ ```bash
87
+ curl -X POST http://localhost:8000/AI-image/analyse \
88
+ -H "Authorization: Bearer <SECRET_TOKEN>" \
89
+ -F 'file=@test1.png'
90
+ ```
91
+
92
+ [🔙 Back to Main README](../README.md)
docs/deployment.md CHANGED
@@ -103,3 +103,6 @@ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
103
 
104
  Happy deploying!
105
  **P.S.** Try not to break stuff. 😅
 
 
 
 
103
 
104
  Happy deploying!
105
  **P.S.** Try not to break stuff. 😅
106
+
107
+
108
+ [🔙 Back to Main README](../README.md)
docs/detector/ELA.md ADDED
@@ -0,0 +1,65 @@
1
+ # Error Level Analysis (ELA) Detector
2
+
3
+ This module provides a function to perform Error Level Analysis (ELA) on images to detect potential manipulations or edits.
4
+
5
+ ## Function: `run_ela`
6
+
7
+ ```python
8
+ def run_ela(image: Image.Image, quality: int = 90, threshold: int = 15) -> bool:
9
+ ```
10
+
11
+ ### Description
12
+
13
+ Error Level Analysis (ELA) works by recompressing an image at a specified JPEG quality level and comparing it to the original image. Differences between the two images reveal areas with inconsistent compression artifacts, which often indicate image manipulation.
14
+
15
+ The function computes the maximum pixel difference across all color channels and uses a threshold to determine if the image is likely edited.
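The description above can be sketched in a few lines of Pillow. This is a minimal illustration under stated assumptions, not the contents of `detectors/ela.py`: the name `run_ela_sketch` is hypothetical, and recompressing through an in-memory buffer is an assumed implementation detail.

```python
import io

from PIL import Image, ImageChops


def run_ela_sketch(image: Image.Image, quality: int = 90, threshold: int = 15) -> bool:
    """Flag an image when recompression changes any channel by more than `threshold`."""
    rgb = image.convert("RGB")

    # Recompress as JPEG in memory at the requested quality.
    buffer = io.BytesIO()
    rgb.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    recompressed = Image.open(buffer).convert("RGB")

    # Per-pixel absolute difference between the original and the recompressed copy.
    diff = ImageChops.difference(rgb, recompressed)

    # getextrema() returns one (min, max) pair per channel for RGB images.
    max_diff = max(high for _low, high in diff.getextrema())
    return max_diff > threshold
```

Because the decision rests on a single maximum value, one noisy pixel can flip the result; a mean or percentile of the difference image would be a more forgiving variant.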
16
+
17
+ ### Parameters
18
+
19
+ | Parameter | Type | Default | Description |
20
+ | ----------- | ----------- | ------- | ------------------------------------------------------------------------------------------- |
21
+ | `image` | `PIL.Image` | N/A | Input image in RGB mode to analyze. |
22
+ | `quality` | `int` | 90 | JPEG compression quality used for recompression during analysis (lower = more compression). |
23
+ | `threshold` | `int` | 15 | Pixel difference threshold to flag the image as edited. |
24
+
25
+ ### Returns
26
+
27
+ `bool`
28
+
29
+ - `True` if the image is likely edited (max pixel difference > threshold).
30
+ - `False` if the image appears unedited.
31
+
32
+ ### Usage Example
33
+
34
+ ```python
35
+ from PIL import Image
36
+ from detectors.ela import run_ela
37
+
38
+ # Open and convert image to RGB
39
+ img = Image.open("example.jpg").convert("RGB")
40
+
41
+ # Run ELA detection
42
+ is_edited = run_ela(img, quality=90, threshold=15)
43
+
44
+ print("Image edited:", is_edited)
45
+ ```
46
+
47
+ ### Notes
48
+
49
+ - The input image **must** be in RGB mode for accurate analysis.
50
+ - ELA is a heuristic technique; combining it with other detection methods increases reliability.
51
+ - Visualizing the enhanced difference image can help identify edited regions (not returned by this function but possible to add).
52
+
53
+ ### Installation
54
+
55
+ Make sure you have Pillow installed:
56
+
57
+ ```bash
58
+ pip install pillow
59
+ ```
60
+
61
+ ### Running Locally
62
+
63
+ Just put the function in a notebook or script file and run it with your image. It works well for basic images.
64
+
65
+ [🔙 Back to Main README](../../README.md)
docs/detector/fft.md ADDED
@@ -0,0 +1,136 @@
1
+
2
+ # Fast Fourier Transform (FFT) Detector
3
+
4
+ ```python
5
+ def run_fft(image: Image.Image, threshold: float = 0.92) -> bool:
6
+ ```
7
+
8
+ ## **Overview**
9
+
10
+ The `run_fft` function performs a frequency domain analysis on an image using the **Fast Fourier Transform (FFT)** to detect possible **AI generation or digital manipulation**. It leverages the fact that artificially generated or heavily edited images often exhibit a distinct high-frequency pattern.
11
+
12
+ ---
13
+
14
+ ## **Parameters**
15
+
16
+ | Parameter | Type | Description |
17
+ | ----------- | ----------------- | --------------------------------------------------------------------------------------- |
18
+ | `image` | `PIL.Image.Image` | Input image to analyze. It will be converted to grayscale and resized. |
19
+ | `threshold` | `float` | Proportion threshold of high-frequency components to flag the image. Default is `0.92`. |
20
+
21
+ ---
22
+
23
+ ## **Returns**
24
+
25
+ | Type | Description |
26
+ | ------ | ---------------------------------------------------------------------- |
27
+ | `bool` | `True` if image is likely AI-generated/manipulated; otherwise `False`. |
28
+
29
+ ---
30
+
31
+ ## **Step-by-Step Explanation**
32
+
33
+ ### 1. **Grayscale Conversion**
34
+
35
+ All images are converted to grayscale:
36
+
37
+ ```python
38
+ gray_image = image.convert("L")
39
+ ```
40
+
41
+ ### 2. **Resize**
42
+
43
+ The image is resized to a fixed $512 \times 512$ for uniformity:
44
+
45
+ ```python
46
+ resized_image = gray_image.resize((512, 512))
47
+ ```
48
+
49
+ ### 3. **FFT Calculation**
50
+
51
+ Compute the 2D Discrete Fourier Transform:
52
+
53
+ $$
54
+ F(u, v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \cdot e^{-2\pi i \left( \frac{ux}{M} + \frac{vy}{N} \right)}
55
+ $$
56
+
57
+ ```python
58
+ fft_result = fft2(image_array)
59
+ ```
60
+
61
+ ### 4. **Shift Zero Frequency to Center**
62
+
63
+ Use `fftshift` to center the zero-frequency component:
64
+
65
+ ```python
66
+ fft_shifted = fftshift(fft_result)
67
+ ```
68
+
69
+ ### 5. **Magnitude Spectrum**
70
+
71
+ $$
72
+ |F(u, v)| = \sqrt{\Re(F(u, v))^2 + \Im(F(u, v))^2}
73
+ $$
74
+
75
+ ```python
76
+ magnitude_spectrum = np.abs(fft_shifted)
77
+ ```
78
+
79
+ ### 6. **Normalization**
80
+
81
+ Normalize the spectrum to avoid scale issues:
82
+
83
+ $$
84
+ \text{Normalized}(u,v) = \frac{|F(u,v)|}{\max_{u',v'} |F(u',v')|}
85
+ $$
86
+
87
+ ```python
88
+ normalized_spectrum = magnitude_spectrum / max_magnitude
89
+ ```
90
+
91
+ ### 7. **High-Frequency Detection**
92
+
93
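+ Components counted as high-frequency here are those whose normalized magnitude exceeds 0.5 (a test on magnitude, not on distance from the spectrum center):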
94
+
95
+ $$
96
+ \text{Mask}(u,v) =
97
+ \begin{cases}
98
+ 1 & \text{if } \text{Normalized}(u,v) > 0.5 \\
99
+ 0 & \text{otherwise}
100
+ \end{cases}
101
+ $$
102
+
103
+ ```python
104
+ high_freq_mask = normalized_spectrum > 0.5
105
+ ```
106
+
107
+ ### 8. **Proportion Calculation**
108
+
109
+ $$
110
+ \text{Ratio} = \frac{\sum \text{Mask}}{\text{Total pixels}}
111
+ $$
112
+
113
+ ```python
114
+ high_freq_ratio = np.sum(high_freq_mask) / normalized_spectrum.size
115
+ ```
116
+
117
+ ### 9. **Threshold Decision**
118
+
119
+ The image is flagged when the ratio exceeds the threshold:
120
+
121
+ $$
122
+ \text{is\_fake} = (\text{Ratio} > \text{Threshold})
123
+ $$
124
+
125
+ ```python
126
+ is_fake = high_freq_ratio > threshold
127
+ ```
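Taken together, steps 1 to 9 can be sketched as a single function. This is a minimal sketch rather than the project's actual `detectors/fft.py`: the name `run_fft_sketch` is hypothetical, `numpy.fft` is assumed as the source of `fft2` and `fftshift`, and the guard for an all-black image is an added assumption.

```python
import numpy as np
from PIL import Image


def run_fft_sketch(image: Image.Image, threshold: float = 0.92) -> bool:
    """Return True when the share of strong spectral components exceeds `threshold`."""
    # Steps 1-2: grayscale conversion and resize to 512 x 512.
    gray = image.convert("L").resize((512, 512))
    image_array = np.asarray(gray, dtype=np.float64)

    # Steps 3-5: 2D FFT, shift the zero frequency to the center, take magnitudes.
    magnitude_spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image_array)))

    # Step 6: normalize by the maximum (assumed guard: an all-black image has max 0).
    max_magnitude = magnitude_spectrum.max()
    if max_magnitude == 0:
        return False
    normalized_spectrum = magnitude_spectrum / max_magnitude

    # Steps 7-8: mask components above 0.5 and compute their proportion.
    high_freq_mask = normalized_spectrum > 0.5
    high_freq_ratio = np.count_nonzero(high_freq_mask) / normalized_spectrum.size

    # Step 9: threshold decision.
    return bool(high_freq_ratio > threshold)
```

Because the spectrum is normalized by its maximum, which for most photographs is the zero-frequency (DC) term, the ratio is usually far below `0.92`; it may be worth tuning `threshold` on real data before relying on the result.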
128
+
129
+ This detector is implemented in the API.
130
+
131
+ ### Running Locally
132
+
133
+ Just put the function in a notebook or script file and run it with your image. It works well for basic images.
134
+
135
+
136
+ [🔙 Back to Main README](../../README.md)
docs/detector/meta.md ADDED
@@ -0,0 +1,20 @@
1
+ # Metadata Analysis for Image Edit Detection
2
+
3
+ This module inspects image metadata to detect possible signs of AI-generation or post-processing edits.
4
+
5
+ ## Overview
6
+
7
+ - Many AI-generated images and edited images leave identifiable traces in their metadata.
8
+ - This detector scans image EXIF metadata and raw bytes for known AI generation indicators and common photo editing software signatures.
9
+ - It classifies images as `"ai_generated"`, `"edited"`, or `"undetermined"` based on detected markers.
10
+ - Handles invalid image formats gracefully by reporting errors.
11
+
12
+ ## How It Works
13
+
14
+ - Opens the image from raw bytes using the Python Pillow library (`PIL`).
15
+ - Reads EXIF metadata and specifically looks for the "Software" tag that often contains the editing app name.
16
+ - Checks for common image editors such as Photoshop, GIMP, Snapseed, etc.
17
+ - Scans the entire raw byte content of the image for embedded AI generation identifiers like "midjourney", "stable-diffusion", "openai", etc.
18
+ - Returns a status string indicating the metadata classification.
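The steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's `detectors/metadata.py`: the function name, the short marker lists, the `"error"` return value, and the rule that an AI marker takes precedence over an editor signature are all hypothetical.

```python
import io

from PIL import Image, UnidentifiedImageError

# Hypothetical, deliberately short marker lists; the real detector may check more.
AI_MARKERS = (b"midjourney", b"stable-diffusion", b"openai")
EDITOR_MARKERS = ("photoshop", "gimp", "snapseed")
SOFTWARE_TAG = 0x0131  # EXIF "Software" tag


def classify_metadata_sketch(image_bytes: bytes) -> str:
    """Return "ai_generated", "edited", "undetermined", or "error"."""
    try:
        with Image.open(io.BytesIO(image_bytes)) as img:
            # Read the EXIF "Software" field, if present.
            software = str(img.getexif().get(SOFTWARE_TAG, "")).lower()
    except (UnidentifiedImageError, OSError):
        return "error"  # assumed error signal for invalid image data

    # Scan the raw bytes for embedded AI generator identifiers.
    lowered = image_bytes.lower()
    if any(marker in lowered for marker in AI_MARKERS):
        return "ai_generated"

    # Fall back to known editor names in the Software field.
    if any(marker in software for marker in EDITOR_MARKERS):
        return "edited"
    return "undetermined"
```

A raw-byte scan is cheap but easy to defeat: stripping metadata or re-encoding the file removes the markers, which is why this signal is treated as low confidence elsewhere in these docs.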
19
+
20
+ [🔙 Back to Main README](../../README.md)
docs/detector/note-for-backend.md ADDED
@@ -0,0 +1,94 @@
1
+
2
+ # 📦 API Integration Note
3
+
4
+ ## Overview
5
+
6
+ This system integrates **three image forensics methods** (**ELA**, **FFT**, and **Metadata analysis**) into a single detection pipeline to determine whether an image is AI-generated, manipulated, or authentic.
7
+
8
+ ---
9
+
10
+ ## 🔍 Detection Modules
11
+
12
+ ### 1. **ELA (Error Level Analysis)**
13
+
14
+ * **Purpose:** Detects tampering or editing by analyzing compression error levels.
15
+ * **Accuracy:** ✅ *Most accurate method*
16
+ * **Performance:** ❗ *Slowest method*
17
+ * **Output:** `True` (edited) or `False` (authentic)
18
+
19
+ ### 2. **FFT (Fast Fourier Transform)**
20
+
21
+ * **Purpose:** Identifies high-frequency patterns typical of AI-generated images.
22
+ * **Accuracy:** ⚠️ *Moderately accurate*
23
+ * **Performance:** ❗ *Moderate to slow*
24
+ * **Output:** `True` (likely AI-generated) or `False` (authentic)
25
+
26
+ ### 3. **Metadata Analysis**
27
+
28
+ * **Purpose:** Detects traces of AI tools or editors in image metadata or binary content.
29
+ * **Accuracy:** ⚠️ *Fast but weaker signal*
30
+ * **Performance:** 🚀 *Fastest method*
31
+ * **Output:** One of:
32
+
33
+ * `"ai_generated"` – AI tool or generator identified
34
+ * `"edited"` – Edited using known software
35
+ * `"undetermined"` – No signature found
36
+
37
+ ---
38
+
39
+ ## 🧩 Integration Plan
40
+
41
+ ### ➕ Combine all three APIs into one unified endpoint:
42
+
43
+ ```bash
44
+ POST /api/detect-image
45
+ ```
46
+
47
+ ### Input:
48
+
49
+ * `image`: Image file (binary, any format supported by Pillow)
50
+
51
+ ### Output:
52
+
53
+ ```json
54
+ {
55
+ "ela_result": true,
56
+ "fft_result": false,
57
+ "metadata_result": "ai_generated",
58
+ "final_decision": "ai_generated"
59
+ }
60
+ ```
61
+ > NOTE: Optionally, recommend a default precedence (e.g., trust ELA > FFT > Metadata).
62
+
63
+ ## Result implementation
64
+ | `ela_result` | `fft_result` | `metadata_result` | Suggested Final Decision | Notes |
65
+ | ------------ | ------------ | ----------------- | ------------------------ | ----------------------------------------------------------------------- |
66
+ | `true` | `true` | `"ai_generated"` | `ai_generated` | Strong evidence from all three modules |
67
+ | `true` | `false` | `"edited"` | `edited` | ELA confirms editing, no AI signals |
68
+ | `true` | `false` | `"undetermined"` | `edited` | ELA indicates manipulation |
69
+ | `false` | `true` | `"ai_generated"` | `ai_generated` | No edits, but strong AI frequency & metadata signature |
70
+ | `false` | `true` | `"undetermined"` | `possibly_ai_generated` | Weak metadata, but FFT indicates possible AI generation |
71
+ | `false` | `false` | `"ai_generated"` | `ai_generated` | Metadata alone shows AI use |
72
+ | `false` | `false` | `"edited"` | `possibly_edited` | Weak signal: metadata shows editing but no structural or frequency signs |
73
+ | `false` | `false` | `"undetermined"` | `authentic` | No detectable manipulation or AI indicators |
74
+
75
+
76
+ ### Decision Logic:
77
+
78
+ * Use **ELA** as the **primary indicator** for manipulation.
79
+ * Supplement with **FFT** and **Metadata** to improve reliability.
80
+ * Combine using a simple rule-based or voting system.
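One way to express the table above as a rule-based function is sketched below. The function name `final_decision_sketch` is hypothetical, and the rule order is chosen only so that the eight listed rows are reproduced; combinations the table does not list (for example `true`/`true`/`"edited"`) fall through the same rules, which is an assumption rather than documented behaviour.

```python
def final_decision_sketch(ela_result: bool, fft_result: bool, metadata_result: str) -> str:
    """Map the three detector outputs to a single label, following the table."""
    # An explicit AI signature in the metadata wins in every listed row.
    if metadata_result == "ai_generated":
        return "ai_generated"
    # ELA is the primary indicator of manipulation.
    if ela_result:
        return "edited"
    # FFT alone is only a moderate signal.
    if fft_result:
        return "possibly_ai_generated"
    # Metadata alone hints at editing without structural evidence.
    if metadata_result == "edited":
        return "possibly_edited"
    return "authentic"
```

Note that this ordering lets an `"ai_generated"` metadata result override ELA, as the table does, even though the note earlier suggests trusting ELA first; a voting scheme would need explicit weights to resolve that tension.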
81
+
82
+ ---
83
+
84
+ ## ⚙️ Performance Consideration
85
+
86
+ | Method | Speed | Strength |
87
+ | -------- | ----------- | -------------------- |
88
+ | ELA | ❗ Slow | ✅ Highly accurate |
89
+ | FFT | ⚠️ Moderate | ⚠️ Somewhat reliable |
90
+ | Metadata | 🚀 Fast | ⚠️ Low confidence |
91
+
92
+ > For high-throughput systems, consider running Metadata first and conditionally applying ELA/FFT if suspicious.
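
The staged approach suggested above might look like the following sketch. Everything here is an assumption for illustration: the function name, the injected detector callables, and the reading of "suspicious" as any metadata result other than `"undetermined"`.

```python
from typing import Callable, Dict, Optional, Union

Result = Dict[str, Union[str, Optional[bool]]]


def staged_detect_sketch(
    image_bytes: bytes,
    metadata_check: Callable[[bytes], str],
    ela_check: Callable[[bytes], bool],
    fft_check: Callable[[bytes], bool],
) -> Result:
    """Run the cheap metadata check first; run ELA and FFT only when it is suspicious."""
    metadata_result = metadata_check(image_bytes)
    ela_result: Optional[bool] = None
    fft_result: Optional[bool] = None

    # Assumed definition of "suspicious": metadata found an AI or editor marker.
    if metadata_result != "undetermined":
        ela_result = ela_check(image_bytes)
        fft_result = fft_check(image_bytes)

    return {
        "metadata_result": metadata_result,
        "ela_result": ela_result,
        "fft_result": fft_result,
    }
```

The trade-off is that images with stripped metadata never reach the stronger ELA and FFT checks, so this shortcut suits throughput-sensitive paths better than cases where accuracy matters most.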
93
+
94
+ [🔙 Back to Main README](../../README.md)
docs/features/image_classifier.md ADDED
@@ -0,0 +1,31 @@
1
+ # Image Classifier
2
+
3
+ ## Overview
4
+
5
+ This module classifies whether an input image is AI-generated or a real-life photograph.
6
+
7
+ ## Model
8
+
9
+ - Architecture: InceptionV3
10
+ - Type: Binary Classifier (AI vs Real)
11
+ - Format: H5 model (`latest-my_cnn_model.h5`)
12
+
13
+ ## Dataset
14
+
15
+ - Total images: ~79,950
16
+ - Balanced between real and generated images
17
+ - Preprocessing: Resizing, normalization
18
+
19
+ ## Code Location
20
+
21
+ - Controller: `features/image_classifier/controller.py`
22
+ - Model Loader: `features/image_classifier/model_loader.py`
23
+ - Preprocessor: `features/image_classifier/preprocess.py`
24
+
25
+ ## API
26
+
27
+ - Endpoint: [ENDPOINTS](../api_endpoints.md)
28
+ - Input: Image file (PNG/JPG)
29
+ - Output: JSON response with classification result and confidence
30
+
31
+ [🔙 Back to Main README](../../README.md)
docs/features/nepali_text_classifier.md ADDED
@@ -0,0 +1,30 @@
1
+ # Nepali Text Classifier
2
+
3
+ ## Overview
4
+
5
+ This classifier identifies whether Nepali-language text content is written by a human or AI.
6
+
7
+ ## Model
8
+
9
+ - Base Model: XLM-Roberta (XLMRClassifier)
10
+ - Language: Nepali (Multilingual model)
11
+ - Fine-tuned with scraped web content (~18,000 samples)
12
+
13
+ ## Dataset
14
+
15
+ - Custom scraped dataset with manual labeling
16
+ - Includes news, blogs, and synthetic content from various LLMs
17
+
18
+ ## Code Location
19
+
20
+ - Controller: `features/nepali_text_classifier/controller.py`
21
+ - Inference: `features/nepali_text_classifier/inferencer.py`
22
+ - Model Loader: `features/nepali_text_classifier/model_loader.py`
23
+
24
+ ## API
25
+
26
+ - Endpoint: [ENDPOINTS](../api_endpoints.md)
27
+ - Input: Raw text
28
+ - Output: JSON classification with label and confidence score
29
+
30
+ [🔙 Back to Main README](../../README.md)
docs/features/text_classifier.md ADDED
@@ -0,0 +1,30 @@
+ # English Text Classifier
+
+ ## Overview
+
+ Detects whether English-language text is AI-generated or human-written.
+
+ ## Model Pipeline
+
+ - Tokenizer: GPT-2 Tokenizer
+ - Model: Custom trained binary classifier
+
+ ## Dataset
+
+ - Balanced dataset: Human vs AI-generated (ChatGPT, Claude, etc.)
+ - Tokenized and fed into the model using PyTorch/TensorFlow
+
+ ## Code Location
+
+ - Controller: `features/text_classifier/controller.py`
+ - Inference: `features/text_classifier/inferencer.py`
+ - Model Loader: `features/text_classifier/model_loader.py`
+ - Preprocessor: `features/text_classifier/preprocess.py`
+
+ ## API
+
+ - Endpoint: [ENDPOINTS](../api_endpoints.md)
+ - Input: Raw English text
+ - Output: Prediction result with probability/confidence
+
+ [🔙 Back to Main README](../README.md)
docs/functions.md CHANGED
@@ -49,5 +49,14 @@
  - **`analyze_sentence_file()`**
    Like `handle_file_sentence()`: analyzes sentences in uploaded files.
-
+ ---
  ## for image_classifier
+
+ - **`Classify_Image_router()`** – Handles image classification requests by routing and coordinating preprocessing and inference.
+ - **`classify_image()`** – Performs AI vs human image classification using the loaded model.
+ - **`load_model()`** – Loads the pretrained model from Hugging Face at server startup.
+ - **`preprocess_image()`** – Applies all required preprocessing steps to the input image.
+
+ > Note: While many functions mirror those in the text classifier, the image classifier primarily uses TensorFlow rather than PyTorch.
+
+ [🔙 Back to Main README](../README.md)
docs/nestjs_integration.md CHANGED
@@ -80,3 +80,4 @@ export class AppController {
  }
  }
  ```
+ [🔙 Back to Main README](../README.md)
docs/security.md CHANGED
@@ -7,3 +7,4 @@ All endpoints require authentication via Bearer token:
  Unauthorized requests receive `403 Forbidden`.

+ [🔙 Back to Main README](../README.md)
docs/setup.md CHANGED
@@ -21,3 +21,4 @@ SECRET_TOKEN=your_secret_token_here
  ```bash
  uvicorn app:app --host 0.0.0.0 --port 8000
  ```
+ [🔙 Back to Main README](../README.md)
docs/status_code.md ADDED
@@ -0,0 +1,68 @@
+ # Error Codes Reference
+
+ ## 🔹 Summary Table
+
+ | Code | Message                                               | Description                                |
+ | ---- | ----------------------------------------------------- | ------------------------------------------ |
+ | 400  | Text must contain at least two words                  | Input text too short                       |
+ | 400  | Text should be less than 10,000 characters            | Input text too long                        |
+ | 404  | The file is empty or only contains whitespace         | File has no usable content                 |
+ | 404  | Invalid file type. Only .docx, .pdf, and .txt allowed | Unsupported file format                    |
+ | 403  | Invalid or expired token                              | Authentication token is invalid or expired |
+ | 413  | Text must contain at least two words                  | Text too short (alternative condition)     |
+ | 413  | Text must be less than 10,000 characters              | Text too long (alternative condition)      |
+ | 413  | The image error (preprocessing)                       | Image size/content issue                   |
+ | 500  | Error processing the file                             | Internal server error while processing     |
+
+ ---
+
+ ## 🔍 Error Details
+
+ ### `400` - Bad Request
+
+ - **Text must contain at least two words**
+   The input text field is too short. Submit at least two words to proceed.
+
+ - **Text should be less than 10,000 characters**
+   Input text exceeds the maximum allowed character limit. Consider truncating or summarizing the content.
+
+ ---
+
+ ### `404` - Not Found
+
+ - **The file is empty or only contains whitespace**
+   The uploaded file is invalid due to lack of meaningful content. Ensure the file has readable, non-empty text.
+
+ - **Invalid file type. Only .docx, .pdf, and .txt are allowed**
+   The file format is not supported. Convert the file to one of the allowed formats before uploading.
+
+ ---
+
+ ### `403` - Forbidden
+
+ - **Invalid or expired token**
+   Your access token is either expired or incorrect. Try logging in again or refreshing the token.
+
+ ---
+
+ ### `413` - Payload Too Large
+
+ - **Text must contain at least two words**
+   The text payload is too small or malformed under a large upload context. Add more content.
+
+ - **Text must be less than 10,000 characters**
+   The payload exceeds the allowed character limit for a single request. Break it into smaller chunks if needed.
+
+ - **The image error**
+   The uploaded image is too large or corrupted. Try resizing or compressing it before retrying.
+
+ ---
+
+ ### `500` - Internal Server Error
+
+ - **Error processing the file**
+   An unexpected server-side failure occurred during file analysis. Retry later or contact support if persistent.
+
+ ---
+
+ > 📌 **Note:** Always validate inputs, check token status, and follow file guidelines before making requests.
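The two text limits in the summary table can also be checked on the client before a request is sent. The following is a minimal sketch, assuming only the rules listed above (at least two words, fewer than 10,000 characters); the helper name `validate_text` and the idea of returning a list of messages are illustrative and not part of the API.

```python
MAX_CHARS = 10_000  # character limit taken from the summary table


def validate_text(text: str) -> list[str]:
    """Return a list of problems; an empty list means the text passes both checks."""
    problems = []
    if len(text.split()) < 2:
        problems.append("Text must contain at least two words")
    if len(text) >= MAX_CHARS:
        problems.append("Text should be less than 10,000 characters")
    return problems


print(validate_text("hello"))
print(validate_text("hello world"))
```

Running such a check first avoids a round trip for inputs the server would reject anyway; the server-side validation remains the source of truth.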
docs/structure.md CHANGED
@@ -1,36 +1,58 @@
  ## 🏗️ Project Structure

- ```
- ├── app.py  # Main FastAPI app entrypoint
- ├── config.py  # Configuration loader (.env, settings)
- ├── features/
- │   ├── text_classifier/  # English (GPT-2) classifier
+ ```bash
+ AI-Checker/
+ │
+ ├── app.py  # Main FastAPI entry point
+ ├── config.py  # Configuration settings
+ ├── Dockerfile  # Docker build script
+ ├── Procfile  # Deployment entry for platforms like Heroku/Railway
+ ├── requirements.txt  # Python dependency list
+ ├── README.md  # Main project overview 📘
+ │
+ ├── features/  # Core AI content detection modules
+ │   ├── image_classifier/  # Classifies AI vs Real images
+ │   │   ├── controller.py
+ │   │   ├── model_loader.py
+ │   │   └── preprocess.py
+ │   ├── image_edit_detector/  # Detects tampered or edited images
+ │   ├── nepali_text_classifier/  # Classifies Nepali text as AI or Human
  │   │   ├── controller.py
  │   │   ├── inferencer.py
  │   │   ├── model_loader.py
- │   │   ├── preprocess.py
- │   │   └── routes.py
- │   └── nepali_text_classifier/  # Nepali (sentencepiece) classifier
+ │   │   └── preprocess.py
+ │   └── text_classifier/  # Classifies English text as AI or Human
  │       ├── controller.py
  │       ├── inferencer.py
  │       ├── model_loader.py
- │       ├── preprocess.py
- │       └── routes.py
- ├── np_text_model/  # Nepali model artifacts (auto-downloaded)
- │   ├── classifier/
- │   │   └── sentencepiece.bpe.model
- │   └── model_95_acc.pth
- ├── models/  # English GPT-2 model/tokenizer (auto-downloaded)
- │   ├── merges.txt
- │   ├── tokenizer.json
- │   └── model_weights.pth
- ├── Dockerfile  # Container build config
- ├── Procfile  # Deployment entrypoint (for PaaS)
- ├── requirements.txt  # Python dependencies
- ├── README.md
- ├── Docs  # documents
- └── .env  # Secret token(s), environment config
+ │       └── preprocess.py
+ │
+ ├── docs/  # Internal documentation and API references
+ │   ├── api_endpoints.md
+ │   ├── deployment.md
+ │   ├── detector/
+ │   │   ├── ELA.md
+ │   │   ├── fft.md
+ │   │   ├── meta.md
+ │   │   └── note-for-backend.md
+ │   ├── features/
+ │   │   ├── image_classifier.md
+ │   │   ├── nepali_text_classifier.md
+ │   │   └── text_classifier.md
+ │   ├── functions.md
+ │   ├── nestjs_integration.md
+ │   ├── security.md
+ │   ├── setup.md
+ │   └── structure.md
+ │
+ ├── IMG_Models/  # Stored model weights
+ │   └── latest-my_cnn_model.h5
+ │
+ ├── notebooks/  # Experimental/debug Jupyter notebooks
+ ├── static/  # Static files (e.g., UI assets, test inputs)
+ └── test.md  # Test usage notes
  ```
+
  ### 🌟 Key Files and Their Roles

  - **`app.py`**: Entry point initializing FastAPI app and routes.
@@ -39,16 +61,14 @@
  - **`__init__.py`**: Package initializer for the root module and submodules.
  - **`features/text_classifier/`**
    - **`controller.py`**: Handles logic between routes and the model.
-   - **`inferencer.py`**: Runs inference and returns predictions as well as file system
-     utilities.
+   - **`inferencer.py`**: Runs inference and returns predictions as well as file system
+     utilities.
  - **`features/NP/`**
    - **`controller.py`**: Handles logic between routes and the model.
-   - **`inferencer.py`**: Runs inference and returns predictions as well as file system
-     utilities.
+   - **`inferencer.py`**: Runs inference and returns predictions as well as file system
+     utilities.
    - **`model_loader.py`**: Loads the ML model and tokenizer.
    - **`preprocess.py`**: Prepares input text for the model.
    - **`routes.py`**: Defines API routes for text classification.

-
-
- -[Main](../README.md)
+ [🔙 Back to Main README](../README.md)
features/image_classifier/__init__.py ADDED
File without changes
features/image_classifier/controller.py ADDED
@@ -0,0 +1,16 @@
+ from fastapi import HTTPException, File, UploadFile
+ from .preprocess import preprocess_image
+ from .inferencer import classify_image
+
+
+ async def Classify_Image_router(file: UploadFile = File(...)):
+     try:
+         # Decode, resize and normalize the upload into a model-ready array
+         image_array = preprocess_image(file)
+         try:
+             result = classify_image(image_array)
+             return result
+         except Exception:
+             raise HTTPException(status_code=423, detail="something went wrong")
+
+     except Exception as e:
+         # HTTPException is an Exception too, so errors raised above
+         # (including the 423) are reported to the client as 413 here
+         raise HTTPException(status_code=413, detail=str(e))
features/image_classifier/inferencer.py ADDED
@@ -0,0 +1,42 @@
+ import numpy as np
+ from .model_loader import get_model
+
+ # Thresholds
+ AI_THRESHOLD = 0.55
+ HUMAN_THRESHOLD = 0.45
+
+
+ def classify_image(image_array: np.ndarray) -> dict:
+     try:
+         model = get_model()
+         predictions = model.predict(image_array)
+
+         if predictions.ndim != 2 or predictions.shape[1] != 1:
+             raise ValueError(
+                 "Model output shape is invalid. Expected shape: (batch, 1)"
+             )
+
+         ai_conf = float(np.clip(predictions[0][0], 0.0, 1.0))
+         human_conf = 1.0 - ai_conf
+
+         # Classification logic
+         if ai_conf > AI_THRESHOLD:
+             label = "AI Generated"
+         elif ai_conf < HUMAN_THRESHOLD:
+             label = "Human Generated"
+         else:
+             label = "Uncertain (Maybe AI)"
+
+         return {
+             "label": label,
+             "ai_confidence": round(ai_conf * 100, 2),
+             "human_confidence": round(human_conf * 100, 2),
+         }
+
+     except Exception as e:
+         return {
+             "error": str(e),
+             "label": "Classification Failed",
+             "ai_confidence": None,
+             "human_confidence": None,
+         }
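The two thresholds in `inferencer.py` split the score range into three bands, and scores that land exactly on a threshold fall into the middle band because both comparisons are strict. The following is a minimal sketch of just that mapping, assuming a score already clipped to [0, 1]; the helper name `label_for` is illustrative and does not exist in the repository.

```python
AI_THRESHOLD = 0.55      # same values as in inferencer.py
HUMAN_THRESHOLD = 0.45


def label_for(ai_conf: float) -> str:
    """Map a score in [0, 1] to one of the three labels used by classify_image."""
    if ai_conf > AI_THRESHOLD:
        return "AI Generated"
    if ai_conf < HUMAN_THRESHOLD:
        return "Human Generated"
    return "Uncertain (Maybe AI)"


for score in (0.10, 0.45, 0.50, 0.55, 0.90):
    print(f"{score:.2f} -> {label_for(score)}")
```

Keeping a gap between the two thresholds means a small change in the score near 0.5 cannot flip the label directly from "Human Generated" to "AI Generated".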
features/image_classifier/model_loader.py ADDED
@@ -0,0 +1,58 @@
+ import os
+ import shutil
+ import logging
+ import tensorflow as tf
+ from tensorflow.keras.layers import Layer
+ from huggingface_hub import snapshot_download
+
+ # Model config
+ REPO_ID = "can-org/AI-VS-HUMAN-IMAGE-classifier"
+ MODEL_DIR = "./IMG_Models"
+ WEIGHTS_PATH = os.path.join(MODEL_DIR, "latest-my_cnn_model.h5")
+
+ # Device info (for logging)
+ gpus = tf.config.list_physical_devices("GPU")
+ device = "cuda" if gpus else "cpu"
+
+ # Global model reference
+ _model_img = None
+
+
+ # Custom layer used in the model
+ class Cast(Layer):
+     def call(self, inputs):
+         return tf.cast(inputs, tf.float32)
+
+
+ def warmup():
+     global _model_img
+     download_model_repo()
+     _model_img = load_model()
+     logging.info("Image model is ready.")
+
+
+ def download_model_repo():
+     # Only the directory is checked here, not the weights file inside it
+     if os.path.exists(MODEL_DIR) and os.path.isdir(MODEL_DIR):
+         logging.info("Image model already exists, skipping download.")
+         return
+     snapshot_path = snapshot_download(repo_id=REPO_ID)
+     os.makedirs(MODEL_DIR, exist_ok=True)
+     shutil.copytree(snapshot_path, MODEL_DIR, dirs_exist_ok=True)
+
+
+ def load_model():
+     global _model_img
+     if _model_img is not None:
+         return _model_img
+
+     print(f"{'GPU detected' if device == 'cuda' else 'No GPU detected'}, loading model on {device.upper()}.")
+
+     _model_img = tf.keras.models.load_model(
+         WEIGHTS_PATH, custom_objects={"Cast": Cast}
+     )
+     print("Model input shape:", _model_img.input_shape)
+     return _model_img
+
+
+ def get_model():
+     global _model_img
+     if _model_img is None:
+         download_model_repo()
+         _model_img = load_model()
+     return _model_img
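`download_model_repo()` skips the download whenever `MODEL_DIR` exists, so an empty or partially copied directory would also pass that check. One possible alternative, shown here only as a sketch, is to test for the weights file itself; the helper name `needs_download` is hypothetical and the placeholder bytes stand in for a real model file.

```python
import os
import tempfile


def needs_download(model_dir: str, weights_name: str) -> bool:
    """True when the weights file is missing or empty, even if the directory exists."""
    weights_path = os.path.join(model_dir, weights_name)
    return not (os.path.isfile(weights_path) and os.path.getsize(weights_path) > 0)


with tempfile.TemporaryDirectory() as model_dir:
    name = "latest-my_cnn_model.h5"
    # The directory exists but holds no weights yet
    print(needs_download(model_dir, name))
    with open(os.path.join(model_dir, name), "wb") as fh:
        fh.write(b"placeholder bytes, not a real model")
    # The file is now present and non-empty
    print(needs_download(model_dir, name))
```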
features/image_classifier/preprocess.py ADDED
@@ -0,0 +1,26 @@
+ import numpy as np
+ import cv2
+ from fastapi import HTTPException
+
+
+ def preprocess_image(file):
+     try:
+         file.file.seek(0)
+         image_bytes = file.file.read()
+         nparr = np.frombuffer(image_bytes, np.uint8)
+         img = cv2.imdecode(nparr, cv2.IMREAD_COLOR)
+         if img is None:
+             raise HTTPException(status_code=500, detail="Could not decode image.")
+
+         # Resize to 299x299, convert BGR to RGB, scale to [0, 1], add a batch axis
+         img = cv2.resize(img, (299, 299))
+         img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
+         img = img / 255.0
+         img = np.expand_dims(img, axis=0).astype(np.float32)
+         return img
+
+     except HTTPException:
+         raise  # Re-raise already defined HTTP errors
+     except Exception as e:
+         raise HTTPException(
+             status_code=500, detail=f"Image preprocessing failed: {str(e)}"
+         )
features/image_classifier/routes.py ADDED
@@ -0,0 +1,26 @@
+ from fastapi import APIRouter, File, Request, Depends, HTTPException, UploadFile
+ from fastapi.security import HTTPBearer
+ from slowapi import Limiter
+ from slowapi.util import get_remote_address
+
+ from config import ACCESS_RATE
+ from .controller import Classify_Image_router
+
+ router = APIRouter()
+ limiter = Limiter(key_func=get_remote_address)
+ security = HTTPBearer()
+
+
+ @router.post("/analyse")
+ @limiter.limit(ACCESS_RATE)
+ async def analyse(
+     request: Request,
+     file: UploadFile = File(...),
+     token: str = Depends(security)
+ ):
+     result = await Classify_Image_router(file)  # await the async function
+     return result
+
+
+ @router.get("/health")
+ @limiter.limit(ACCESS_RATE)
+ def health(request: Request):
+     return {"status": "ok"}
features/image_edit_detector/controller.py ADDED
@@ -0,0 +1,49 @@
+ import os
+ from io import BytesIO
+
+ from fastapi import Depends, HTTPException, status
+ from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
+ from PIL import Image
+
+ from .detectors.ela import run_ela
+ from .detectors.fft import run_fft
+ from .detectors.metadata import run_metadata
+ from .preprocess import preprocess_image
+
+ security = HTTPBearer()
+
+
+ async def process_image_ela(image_bytes: bytes, quality: int = 90):
+     image = Image.open(BytesIO(image_bytes))
+
+     if image.mode != "RGB":
+         image = image.convert("RGB")
+
+     compressed_image = preprocess_image(image, quality)
+     ela_result = run_ela(compressed_image, quality)
+
+     # run_ela returns a bool, so both keys currently carry the same flag
+     return {
+         "is_edited": ela_result,
+         "ela_score": ela_result
+     }
+
+
+ async def process_fft_image(image_bytes: bytes, threshold: float = 0.95) -> dict:
+     image = Image.open(BytesIO(image_bytes)).convert("RGB")
+     result = run_fft(image, threshold)
+     return {"edited": bool(result)}
+
+
+ async def process_meta_image(image_bytes: bytes) -> dict:
+     try:
+         result = run_metadata(image_bytes)
+         return {"source": result}  # e.g. "edited", "ai_generated", "undetermined"
+     except Exception as e:
+         # Handle errors gracefully, return useful message or raise HTTPException if preferred
+         return {"error": str(e)}
+
+
+ async def verify_token(credentials: HTTPAuthorizationCredentials = Depends(security)):
+     token = credentials.credentials
+     expected_token = os.getenv("MY_SECRET_TOKEN")
+     if token != expected_token:
+         raise HTTPException(
+             status_code=status.HTTP_403_FORBIDDEN,
+             detail="Invalid or expired token"
+         )
+     return token
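`verify_token` compares the presented token with `!=`. A common hardening step for secret comparison is a constant-time check with the standard library's `hmac.compare_digest`. The following is a minimal sketch of that idea, not the repository's code; the helper name `token_matches` is hypothetical, and treating an unset expected token as "never matches" is an assumption that mirrors the current behaviour, where `os.getenv` returns `None` and every token is rejected.

```python
import hmac
from typing import Optional


def token_matches(presented: str, expected: Optional[str]) -> bool:
    """Compare tokens in constant time; an unset or empty expected token never matches."""
    if not expected:
        return False
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


print(token_matches("s3cret", "s3cret"))
print(token_matches("guess", "s3cret"))
print(token_matches("anything", None))
```

Inside `verify_token`, such a helper would replace only the `token != expected_token` test; the 403 response would stay the same.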
features/image_edit_detector/detectors/ela.py ADDED
@@ -0,0 +1,32 @@
+ from PIL import Image, ImageChops, ImageEnhance
+ import io
+
+
+ def run_ela(image: Image.Image, quality: int = 90, threshold: int = 15) -> bool:
+     """
+     Perform Error Level Analysis to detect image manipulation.
+
+     Parameters:
+         image (PIL.Image): Input image (should be RGB).
+         quality (int): JPEG compression quality for ELA.
+         threshold (int): Maximum pixel difference threshold to classify as edited.
+
+     Returns:
+         bool: True if image appears edited, False otherwise.
+     """
+
+     # Recompress the image into JPEG format in memory
+     buffer = io.BytesIO()
+     image.save(buffer, format="JPEG", quality=quality)
+     buffer.seek(0)
+     recompressed = Image.open(buffer)
+
+     # Compute the pixel-wise difference
+     diff = ImageChops.difference(image, recompressed)
+     extrema = diff.getextrema()
+     max_diff = max([ex[1] for ex in extrema])
+
+     # Enhance difference image for debug (not returned)
+     _ = ImageEnhance.Brightness(diff).enhance(10)
+
+     return max_diff > threshold
features/image_edit_detector/detectors/fft.py ADDED
@@ -0,0 +1,40 @@
+ import numpy as np
+ from PIL import Image
+ from scipy.fft import fft2, fftshift
+
+
+ def run_fft(image: Image.Image, threshold: float = 0.92) -> bool:
+     """
+     Detects potential image manipulation or generation using FFT-based spectral analysis.
+
+     Parameters:
+         image (PIL.Image.Image): The input image.
+         threshold (float): Fraction of spectrum bins with normalized magnitude above 0.5
+             beyond which the image is flagged.
+
+     Returns:
+         bool: True if the image is likely AI-generated or manipulated, False otherwise.
+     """
+     # Work on a fixed-size grayscale copy
+     gray_image = image.convert("L")
+     resized_image = gray_image.resize((512, 512))
+     image_array = np.array(resized_image)
+
+     # 2-D spectrum with the zero-frequency bin shifted to the center
+     fft_result = fft2(image_array)
+     fft_shifted = fftshift(fft_result)
+
+     magnitude_spectrum = np.abs(fft_shifted)
+     max_magnitude = np.max(magnitude_spectrum)
+     if max_magnitude == 0:
+         return False  # Avoid division by zero if image is blank
+     normalized_spectrum = magnitude_spectrum / max_magnitude
+
+     # Bins are selected by magnitude (above half the peak), not by distance from the center
+     high_freq_mask = normalized_spectrum > 0.5
+     high_freq_ratio = np.sum(high_freq_mask) / normalized_spectrum.size
+
+     is_fake = high_freq_ratio > threshold
+     return is_fake
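To see what the `high_freq_ratio` statistic in `run_fft` measures, it can help to compute it by hand on a tiny image. The following is a standard-library sketch that uses a naive 2-D DFT instead of `scipy.fft` (an assumption made only to keep the example dependency-free); the names `dft2` and `strong_bin_ratio` are illustrative, and the 8x8 test images are toy inputs.

```python
import cmath


def dft2(img):
    """Naive 2-D DFT of a small square list-of-lists image (O(n^4), demo only)."""
    n = len(img)
    out = []
    for u in range(n):
        row = []
        for v in range(n):
            total = 0j
            for x in range(n):
                for y in range(n):
                    angle = -2 * cmath.pi * (u * x + v * y) / n
                    total += img[x][y] * cmath.exp(1j * angle)
            row.append(total)
        out.append(row)
    return out


def strong_bin_ratio(img, cutoff=0.5):
    """Fraction of DFT bins whose magnitude exceeds cutoff * peak magnitude."""
    mags = [abs(c) for row in dft2(img) for c in row]
    peak = max(mags)
    if peak == 0:
        return 0.0  # blank image, same guard as in run_fft
    return sum(m / peak > cutoff for m in mags) / len(mags)


flat = [[100] * 8 for _ in range(8)]
checker = [[255 * ((x + y) % 2) for y in range(8)] for x in range(8)]
print(strong_bin_ratio(flat))
print(strong_bin_ratio(checker))
```

In both toy cases only one or two bins come close to the peak, so the ratio stays far below the 0.92 and 0.95 defaults used in this commit. That suggests the threshold is worth calibrating against real images before the result is relied on.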
features/image_edit_detector/detectors/metadata.py ADDED
@@ -0,0 +1,82 @@
+ from PIL import Image, UnidentifiedImageError
+ import io
+
+ # Common AI metadata identifiers in image files.
+ AI_INDICATORS = [
+     b'c2pa', b'claim_generator', b'claim_generator_info',
+     b'created_software_agent', b'actions.v2', b'assertions',
+     b'urn:c2pa', b'jumd', b'jumb', b'jumdcbor', b'jumdc2ma',
+     b'jumdc2as', b'jumdc2cl', b'cbor', b'convertedsfwareagent', b'c2pa.version',
+     b'c2pa.assertions', b'c2pa.actions',
+     b'c2pa.thumbnail', b'c2pa.signature', b'c2pa.manifest',
+     b'c2pa.manifest_store', b'c2pa.ingredient', b'c2pa.parent',
+     b'c2pa.provenance', b'c2pa.claim', b'c2pa.hash', b'c2pa.authority',
+     b'jumdc2pn', b'jumdrefs', b'jumdver', b'jumdmeta',
+
+
+     'midjourney'.encode('utf-8'),
+     'stable-diffusion'.encode('utf-8'),
+     'stable diffusion'.encode('utf-8'),
+     'stable_diffusion'.encode('utf-8'),
+     'artbreeder'.encode('utf-8'),
+     'runwayml'.encode('utf-8'),
+     'remix.ai'.encode('utf-8'),
+     'firefly'.encode('utf-8'),
+     'adobe_firefly'.encode('utf-8'),
+
+     # OpenAI / DALL·E indicators (all encoded to bytes)
+     'openai'.encode('utf-8'),
+     'dalle'.encode('utf-8'),
+     'dalle2'.encode('utf-8'),
+     'DALL-E'.encode('utf-8'),
+     'DALL·E'.encode('utf-8'),
+     'created_by: openai'.encode('utf-8'),
+     'tool: dalle'.encode('utf-8'),
+     'tool: dalle2'.encode('utf-8'),
+     'creator: openai'.encode('utf-8'),
+     'creator: dalle'.encode('utf-8'),
+     'openai.com'.encode('utf-8'),
+     'api.openai.com'.encode('utf-8'),
+     'openai_model'.encode('utf-8'),
+     'openai_gpt'.encode('utf-8'),
+
+     # Further possible AI-generation indicators
+     'generated_by'.encode('utf-8'),
+     'model_id'.encode('utf-8'),
+     'model_version'.encode('utf-8'),
+     'model_info'.encode('utf-8'),
+     'tool_name'.encode('utf-8'),
+     'tool_creator'.encode('utf-8'),
+     'tool_version'.encode('utf-8'),
+     'model_signature'.encode('utf-8'),
+     'ai_model'.encode('utf-8'),
+     'ai_tool'.encode('utf-8'),
+     'generator'.encode('utf-8'),
+     'generated_by_ai'.encode('utf-8'),
+     'ai_generated'.encode('utf-8'),
+     'ai_art'.encode('utf-8')
+ ]
+
+
+ def run_metadata(image_bytes: bytes) -> str:
+     try:
+         img = Image.open(io.BytesIO(image_bytes))
+         img.load()
+
+         # EXIF tag 305 is the "Software" field
+         exif = img.getexif()
+         software = str(exif.get(305, "")).strip()
+
+         suspicious_editors = ["Photoshop", "GIMP", "Snapseed", "Pixlr", "VSCO", "Editor", "Adobe", "Luminar"]
+
+         if any(editor.lower() in software.lower() for editor in suspicious_editors):
+             return "edited"
+
+         # Raw byte scan; the match is case-sensitive
+         if any(indicator in image_bytes for indicator in AI_INDICATORS):
+             return "ai_generated"
+
+         return "undetermined"
+
+     except UnidentifiedImageError:
+         return "error: invalid image format"
+     except Exception as e:
+         return f"error: {str(e)}"
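The byte scan in `run_metadata` uses `indicator in image_bytes`, which is case-sensitive, so a file that spells a tool name as `Midjourney` would not match the lowercase entry. The following is a minimal sketch of a case-insensitive variant, assuming ASCII-only indicators; the helper name `find_indicators` and the sample bytes are made up for illustration.

```python
def find_indicators(data: bytes, indicators: list[bytes]) -> list[bytes]:
    """Return the indicators that occur in data, ignoring ASCII case."""
    haystack = data.lower()  # bytes.lower() only touches ASCII letters
    return [marker for marker in indicators if marker.lower() in haystack]


# Fabricated sample that merely looks like a PNG text chunk
sample = b"\x89PNG...tEXtSoftware\x00Midjourney v6"
print(find_indicators(sample, [b"midjourney", b"openai", b"DALL-E"]))
```

Returning the matched markers rather than a bare flag also makes it easier to explain to a caller why an image was labelled `ai_generated`.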
features/image_edit_detector/preprocess.py ADDED
@@ -0,0 +1,9 @@
+ from PIL import Image
+ import io
+
+
+ def preprocess_image(img: Image.Image, quality: int) -> Image.Image:
+     # Round-trip the image through an in-memory JPEG at the given quality
+     buffer = io.BytesIO()
+     img.save(buffer, format="JPEG", quality=quality)
+     buffer.seek(0)
+     return Image.open(buffer)
features/image_edit_detector/routes.py ADDED
@@ -0,0 +1,53 @@
+ from fastapi import APIRouter, File, Request, Depends, HTTPException, UploadFile
+ from fastapi.security import HTTPBearer
+ from slowapi import Limiter
+ from slowapi.util import get_remote_address
+
+ from config import ACCESS_RATE
+ from .controller import process_image_ela, verify_token, process_fft_image, process_meta_image
+
+ router = APIRouter()
+ limiter = Limiter(key_func=get_remote_address)
+ security = HTTPBearer()
+
+
+ @router.post("/ela")
+ @limiter.limit(ACCESS_RATE)
+ async def detect_ela(request: Request, file: UploadFile = File(...), quality: int = 90, token: str = Depends(verify_token)):
+     # Check the declared content type of the upload
+     allowed_types = ["image/jpeg", "image/png"]
+
+     if file.content_type not in allowed_types:
+         raise HTTPException(
+             status_code=400,
+             detail="Unsupported file type. Only JPEG and PNG images are allowed."
+         )
+
+     content = await file.read()
+     result = await process_image_ela(content, quality)
+     return result
+
+
+ @router.post("/fft")
+ @limiter.limit(ACCESS_RATE)
+ async def detect_fft(request: Request, file: UploadFile = File(...), threshold: float = 0.95, token: str = Depends(verify_token)):
+     if file.content_type not in ["image/jpeg", "image/png"]:
+         raise HTTPException(status_code=400, detail="Unsupported image type.")
+
+     content = await file.read()
+     result = await process_fft_image(content, threshold)
+     return result
+
+
+ @router.post("/meta")
+ @limiter.limit(ACCESS_RATE)
+ async def detect_meta(request: Request, file: UploadFile = File(...), token: str = Depends(verify_token)):
+     if file.content_type not in ["image/jpeg", "image/png"]:
+         raise HTTPException(status_code=400, detail="Unsupported image type.")
+     content = await file.read()
+     result = await process_meta_image(content)
+     return result
+
+
+ @router.post("/health")
+ @limiter.limit(ACCESS_RATE)
+ def health(request: Request):
+     return {"status": "ok"}
features/nepali_text_classifier/__init__.py CHANGED
File without changes
features/nepali_text_classifier/controller.py CHANGED
@@ -3,7 +3,6 @@ from io import BytesIO
  from fastapi import HTTPException, UploadFile, status, Depends
  from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
  import os
-
  from features.nepali_text_classifier.inferencer import classify_text
  from features.nepali_text_classifier.preprocess import *
  import re
features/nepali_text_classifier/inferencer.py CHANGED
File without changes
features/nepali_text_classifier/model_loader.py CHANGED
@@ -8,7 +8,7 @@ from huggingface_hub import snapshot_download
  from transformers import AutoTokenizer, AutoModel

  # Configs
- REPO_ID = "Pujan-Dev/Nepali-AI-VS-HUMAN"
+ REPO_ID = "can-org/Nepali-AI-VS-HUMAN"
  BASE_DIR = "./np_text_model"
  TOKENIZER_DIR = os.path.join(BASE_DIR, "classifier") # <- update this to match your uploaded folder
  WEIGHTS_PATH = os.path.join(BASE_DIR, "model_95_acc.pth") # <- change to match actual uploaded weight
features/nepali_text_classifier/preprocess.py CHANGED
@@ -20,19 +20,17 @@ def parse_pdf(file: BytesIO):
20
  for page_num in range(doc.page_count):
21
  page = doc.load_page(page_num)
22
  text += page.get_text()
23
- return text
24
  except Exception as e:
25
  logging.error(f"Error while processing PDF: {str(e)}")
26
  raise HTTPException(
27
  status_code=500, detail="Error processing PDF file")
28
 
29
-
30
  def parse_txt(file: BytesIO):
31
  return file.read().decode("utf-8")
32
 
33
-
34
- def end_symbol_for_NP_text(text):
35
- if not text.endswith("।"):
36
- text += "।"
37
-
38
-
 
20
  for page_num in range(doc.page_count):
21
  page = doc.load_page(page_num)
22
  text += page.get_text()
23
+ return text
24
  except Exception as e:
25
  logging.error(f"Error while processing PDF: {str(e)}")
26
  raise HTTPException(
27
  status_code=500, detail="Error processing PDF file")
28
 
 
29
  def parse_txt(file: BytesIO):
30
  return file.read().decode("utf-8")
31
 
32
+ def end_symbol_for_NP_text(text: str) -> str:
33
+ text = text.strip()
34
+ if not text.endswith("।"):
35
+ text += "।"
36
+ return text
 
features/nepali_text_classifier/routes.py CHANGED
File without changes
features/text_classifier/__init__.py CHANGED
File without changes
features/text_classifier/controller.py CHANGED
File without changes
features/text_classifier/inferencer.py CHANGED
File without changes
features/text_classifier/model_loader.py CHANGED
@@ -6,7 +6,7 @@ from huggingface_hub import snapshot_download
  import torch
  from dotenv import load_dotenv
  load_dotenv()
- REPO_ID = "Pujan-Dev/AI-Text-Detector"
+ REPO_ID = "can-org/AI-Content-Checker"
  MODEL_DIR = "./models"
  TOKENIZER_DIR = os.path.join(MODEL_DIR, "model")
  WEIGHTS_PATH = os.path.join(MODEL_DIR, "model_weights.pth")
features/text_classifier/preprocess.py CHANGED
File without changes
features/text_classifier/routes.py CHANGED
File without changes
license.md ADDED
@@ -0,0 +1,20 @@
+ # License - All Rights Reserved
+
+ Copyright (c) 2025 CyberAlertNepal
+
+ This software and all associated materials are **not open source** and are protected under a custom license.
+
+ ## Strict Usage Terms
+
+ Unless explicit written permission is granted by **CyberAlertNepal**, **no individual or entity** is allowed to:
+
+ - Use this codebase or its models in any capacity, whether personal, educational, or commercial.
+ - Modify, copy, distribute, or sublicense any part of this project.
+ - Deploy, mirror, or host this project, either publicly or privately.
+ - Incorporate any component of this project into derivative works or other applications.
+
+ This project is intended for **private, internal use by the author(s) only**.
+
+ Any unauthorized usage, reproduction, or distribution is strictly prohibited and may result in legal action.
+
+ **All rights reserved.**
readme.md DELETED
@@ -1,35 +0,0 @@
- # 🚀 FastAPI AI Detector
-
- A production-ready FastAPI app for detecting AI vs. human-written text in English and Nepali. It uses GPT-2 and SentencePiece-based models, with Bearer token security.
-
- ## 📂 Documentation
-
- - [Project Structure](docs/structure.md)
- - [API Endpoints](docs/api_endpoints.md)
- - [Setup & Installation](docs/setup.md)
- - [Deployment](docs/deployment.md)
- - [Security](docs/security.md)
- - [NestJS Integration](docs/nestjs_integration.md)
- - [Core Functions](docs/functions.md)
-
- ## ⚡ Quick Start
- ```bash
- uvicorn app:app --host 0.0.0.0 --port 8000
- ```
- ## 🚀 Deployment
-
- - **Local**: Use `uvicorn` as above.
- - **Railway/Heroku**: Use the provided `Procfile`.
- - **Hugging Face Spaces**: Use the `Dockerfile` for container deployment.
-
- ---
-
- ## 💡 Tips
-
- - **Model files auto-download at first start** if not found.
- - **Keep `requirements.txt` up-to-date** after adding dependencies.
- - **All endpoints require the correct `Authorization` header**.
- - **For security**: Avoid committing `.env` to public repos.
-
- ---
-
requirements.txt CHANGED
@@ -11,3 +11,10 @@ python-multipart
  slowapi
  spacy
  nltk
+ tensorflow
+ opencv-python
+ pillow
+ scipy
+ fitz
+ frontend
+ tools