Create I Released A Discord Bot That Describes Images And I Am Surprised It Works.html
I Released A Discord Bot That Describes Images And I Am Surprised It Works.html
ADDED
|
@@ -0,0 +1,157 @@
| 1 |
+
<!DOCTYPE html>
|
| 2 |
+
<html lang="en">
|
| 3 |
+
<head>
|
| 4 |
+
<meta charset="UTF-8">
|
| 5 |
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
| 6 |
+
<title>I Released A Discord Bot That Describes Images And I Am Surprised It Works | FMN-GPT - CompactAI</title>
|
| 7 |
+
<link rel="stylesheet" href="bluesheet.css">
|
| 8 |
+
<link rel="preconnect" href="https://fonts.googleapis.com">
|
| 9 |
+
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
|
| 10 |
+
<link href="https://fonts.googleapis.com/css2?family=Geist:wght@400;500;600;700&family=Geist+Mono&display=swap" rel="stylesheet">
|
| 11 |
+
<style>
|
| 12 |
+
:root {
|
| 13 |
+
--blue-900: #000000;
|
| 14 |
+
--blue-800: #0a0a0a;
|
| 15 |
+
--blue-700: #111111;
|
| 16 |
+
--blue-600: #1a1a1a;
|
| 17 |
+
--blue-500: #333333;
|
| 18 |
+
--blue-400: #555555;
|
| 19 |
+
--blue-300: #777777;
|
| 20 |
+
--blue-200: #888888;
|
| 21 |
+
--blue-100: #aaaaaa;
|
| 22 |
+
--white: #ffffff;
|
| 23 |
+
--white-soft: #f5f5f5;
|
| 24 |
+
--white-muted: #e0e0e0;
|
| 25 |
+
--grid-line: rgba(255, 255, 255, 0.03);
|
| 26 |
+
--grid-line-major: rgba(255, 255, 255, 0.06);
|
| 27 |
+
--accent: #ededed;
|
| 28 |
+
--accent-muted: #888888;
|
| 29 |
+
--font-sans: 'Geist', -apple-system, BlinkMacSystemFont, sans-serif;
|
| 30 |
+
--font-mono: 'Geist Mono', 'SF Mono', 'Fira Code', monospace;
|
| 31 |
+
--container-max: 1100px;
|
| 32 |
+
}
|
| 33 |
+
* { box-sizing: border-box; margin: 0; padding: 0; }
|
| 34 |
+
html { font-size: 16px; scroll-behavior: smooth; }
|
| 35 |
+
body { font-family: var(--font-sans); background: var(--blue-900); color: var(--white-muted); line-height: 1.7; -webkit-font-smoothing: antialiased; }
|
| 36 |
+
a { color: var(--white); text-decoration: none; transition: color 0.15s ease; }
|
| 37 |
+
a:hover { color: var(--accent); }
|
| 38 |
+
.container { max-width: var(--container-max); margin: 0 auto; padding: 0 24px; }
|
| 39 |
+
nav { position: fixed; top: 0; left: 0; right: 0; z-index: 100; background: rgba(0, 0, 0, 0.85); backdrop-filter: blur(12px); border-bottom: 1px solid var(--blue-600); padding: 16px 0; }
|
| 40 |
+
nav .container { display: flex; justify-content: space-between; align-items: center; }
|
| 41 |
+
.nav-brand { font-size: 18px; font-weight: 600; color: var(--white); display: flex; align-items: center; gap: 8px; }
|
| 42 |
+
.nav-brand span { color: var(--accent); }
|
| 43 |
+
.nav-links { display: flex; gap: 32px; }
|
| 44 |
+
.nav-links a { font-size: 14px; font-weight: 500; color: var(--blue-200); }
|
| 45 |
+
.nav-links a:hover { color: var(--white); }
|
| 46 |
+
.post { padding: 140px 0 80px; }
|
| 47 |
+
.post-back { display: inline-block; color: var(--blue-200); font-size: 14px; margin-bottom: 32px; }
|
| 48 |
+
.post-back:hover { color: var(--accent); }
|
| 49 |
+
.post-back::before { content: '← '; }
|
| 50 |
+
.post-meta { display: flex; gap: 12px; margin-bottom: 20px; }
|
| 51 |
+
.post-date { font-size: 13px; color: var(--blue-200); font-family: var(--font-mono); }
|
| 52 |
+
.post-tag { font-size: 11px; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; color: var(--white); background: rgba(255, 255, 255, 0.08); padding: 4px 10px; border-radius: 4px; }
|
| 53 |
+
.post h1 { font-size: 36px; font-weight: 700; color: var(--white); margin-bottom: 32px; line-height: 1.2; letter-spacing: -0.02em; }
|
| 54 |
+
.post-body p { font-size: 17px; line-height: 1.8; margin-bottom: 24px; color: var(--blue-200); }
|
| 55 |
+
.post-body p:first-of-type { font-size: 20px; color: var(--white-muted); }
|
| 56 |
+
.post-body h2 { font-size: 24px; font-weight: 600; color: var(--white); margin: 48px 0 20px; }
|
| 57 |
+
.post-body blockquote { border-left: 3px solid var(--accent); padding: 20px 24px; margin: 32px 0; background: var(--blue-800); border-radius: 0 8px 8px 0; }
|
| 58 |
+
.post-body blockquote p { font-size: 16px; font-style: italic; color: var(--blue-200); margin: 0; }
|
| 59 |
+
.post-body hr { border: none; height: 1px; background: var(--blue-600); margin: 48px 0; }
|
| 60 |
+
.code-block { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; margin: 24px 0; font-family: var(--font-mono); font-size: 13px; overflow-x: auto; }
|
| 61 |
+
.code-block .comment { color: var(--blue-200); font-style: italic; display: block; margin-top: 4px; }
|
| 62 |
+
.cta-box { background: var(--blue-800); border: 2px solid var(--accent); border-radius: 12px; padding: 24px; margin: 32px 0; text-align: center; }
|
| 63 |
+
.cta-box a { color: var(--accent); font-weight: 600; font-size: 18px; word-break: break-all; }
|
| 64 |
+
.cta-box a:hover { color: var(--white); }
|
| 65 |
+
.cta-box p { margin: 12px 0 0; color: var(--blue-200); font-size: 14px; }
|
| 66 |
+
.post-footer { margin-top: 48px; padding-top: 32px; border-top: 1px solid var(--blue-600); }
|
| 67 |
+
.post-footer p { font-size: 14px; color: var(--blue-200); font-style: italic; margin: 0; }
|
| 68 |
+
footer { padding: 40px 0; background: var(--blue-800); border-top: 1px solid var(--blue-600); text-align: center; }
|
| 69 |
+
footer p { color: var(--blue-200); font-size: 14px; margin-bottom: 8px; }
|
| 70 |
+
footer a { color: var(--blue-200); }
|
| 71 |
+
footer a:hover { color: var(--accent); }
|
| 72 |
+
@media (max-width: 768px) { .post h1 { font-size: 28px; } .nav-links { display: none; } }
|
| 73 |
+
|
| 74 |
+
</style>
|
| 75 |
+
|
| 76 |
+
</head>
|
| 77 |
+
<body>
|
| 78 |
+
<nav>
|
| 79 |
+
<div class="container">
|
| 80 |
+
<a href="index.html" class="nav-brand"><span>/</span>FMN-GPT</a>
|
| 81 |
+
<div class="nav-links">
|
| 82 |
+
<a href="blog.html">Blog</a>
|
| 83 |
+
<a href="status.html">Model Status</a>
|
| 84 |
+
<a href="https://huggingface.co/CompactAI-O" target="_blank">HuggingFace Org</a>
|
| 85 |
+
</div>
|
| 86 |
+
</div>
|
| 87 |
+
</nav>
|
| 88 |
+
<main>
|
| 89 |
+
<article class="post">
|
| 90 |
+
<div class="container">
|
| 91 |
+
<a href="blog.html" class="post-back">Back to Blog</a>
|
| 92 |
+
<header>
|
| 93 |
+
<div class="post-meta">
|
| 94 |
+
<span class="post-date">2026-05-04</span>
|
| 95 |
+
<span class="post-tag">Accessibility Tools</span>
|
| 96 |
+
</div>
|
| 97 |
+
<h1>I Released A Discord Bot That Describes Images And I Am Surprised It Works</h1>
|
| 98 |
+
</header>
|
| 99 |
+
<div class="post-body">
|
| 100 |
+
<p>I released something that might actually help people. It is called Blindbot. It is a Discord bot that uses a vision model running on Ollama to describe images for blind and visually impaired users. I am surprised it works. I am also surprised I finished it. Both surprises are valid.</p>
|
| 101 |
+
<blockquote>
|
| 102 |
+
<p>Building accessibility tools feels different. The stakes feel higher. The margin for error feels smaller. I am not used to that feeling. I am learning to live with it.</p>
|
| 103 |
+
</blockquote>
|
| 104 |
+
<h2>What Blindbot Does</h2>
|
| 105 |
+
<p>The bot watches for images in Discord channels. When someone posts an image, it automatically generates a description. The description is formatted for screen readers. No markdown. No visual fluff. Just words that convey what the image contains.</p>
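<p>A minimal sketch of what "no markdown" can mean in practice, assuming the model sometimes returns markdown anyway. The function name <code>to_plain_text</code> and the exact rules are hypothetical and not taken from the Blindbot repository.</p>

```python
import re


def to_plain_text(description: str) -> str:
    """Strip common markdown markup so a screen reader hears only words."""
    text = description
    # Replace [label](url) links and ![alt](url) images with their label text.
    text = re.sub(r"!?\[([^\]]*)\]\([^)]*\)", r"\1", text)
    # Drop heading markers, blockquote markers, and list bullets at line starts.
    text = re.sub(r"^\s*(#{1,6}|>|[-*+])\s+", "", text, flags=re.MULTILINE)
    # Remove asterisk emphasis and inline-code backticks.
    text = re.sub(r"[*`]+", "", text)
    # Collapse runs of whitespace (including newlines) into single spaces.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(to_plain_text("## Photo\n\n- **Orange** cat.\n- Sitting on a `sofa`."))
```

<p>Underscores are left alone on purpose in this sketch, so file names and identifiers in a description survive intact.</p>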
|
| 106 |
+
<p>It handles GIFs by extracting frames. It deduplicates near-identical frames using dHash and pixel difference scoring. It picks the most visually interesting frames to describe. This prevents the bot from describing the same frame five times in a row. That would be annoying. I avoided the annoyance.</p>
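<p>The dHash part of that deduplication can be sketched as follows. This is an illustration of the general technique, not Blindbot's code. It assumes each frame has already been shrunk to an 8-row by 9-column grayscale grid by an image library, and the names <code>dhash</code>, <code>dedupe_frames</code>, and the threshold of 4 bits are hypothetical. Pixel difference scoring and "most interesting frame" selection are not shown.</p>

```python
from typing import List, Sequence

# One frame, pre-shrunk to 8 rows x 9 columns of grayscale values (0-255).
Grid = Sequence[Sequence[int]]


def dhash(grid: Grid) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair."""
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def dedupe_frames(grids: List[Grid], max_distance: int = 4) -> List[int]:
    """Return indices of frames that differ enough from every frame kept so far."""
    kept: List[int] = []
    kept_hashes: List[int] = []
    for index, grid in enumerate(grids):
        frame_hash = dhash(grid)
        if all(hamming(frame_hash, other) > max_distance for other in kept_hashes):
            kept.append(index)
            kept_hashes.append(frame_hash)
    return kept


if __name__ == "__main__":
    ramp_up = [[col * 10 for col in range(9)] for _ in range(8)]
    ramp_down = [[(8 - col) * 10 for col in range(9)] for _ in range(8)]
    almost_up = [list(row) for row in ramp_up]
    almost_up[0][0] = 15  # flips a single hash bit
    print(dedupe_frames([ramp_up, almost_up, ramp_down]))
```

<p>A lower threshold keeps more frames. A higher one treats more frames as duplicates. The right value depends on the material and would need tuning.</p>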
|
| 107 |
+
<p>It supports multiple formats. Images. GIFs. Videos in MP4, WebM, and MOV. PDFs. SVGs. I did not test every format thoroughly. I tested enough to feel confident. That is the CompactAI standard for release readiness.</p>
|
| 108 |
+
<div class="code-block">
|
| 109 |
+
<span class="comment"># Quick start because I like making things accessible</span><br>
|
| 110 |
+
Step 1: Install Ollama and pull a vision model<br>
|
| 111 |
+
ollama pull gemma4:e4b<br>
|
| 112 |
+
Step 2: Clone the repo and set up environment<br>
|
| 113 |
+
git clone https://github.com/CompactAIOfficial/Blindbot.git<br>
cd Blindbot<br>
|
| 114 |
+
Step 3: Add your Discord bot token to .env<br>
|
| 115 |
+
Step 4: Install dependencies and run<br>
|
| 116 |
+
pip install -r requirements.txt<br>
|
| 117 |
+
python bot.py<br>
|
| 118 |
+
<span class="comment"># If it crashes, that is on me. Report issues.</span>
|
| 119 |
+
</div>
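<p>Once Ollama is running with the model from Step 1, asking it to describe an image could look roughly like this. It is a sketch under assumptions: it uses Ollama's local <code>/api/generate</code> endpoint on the default port, and the prompt text and the helper names <code>build_payload</code> and <code>describe_image</code> are hypothetical. How Blindbot itself talks to Ollama may differ.</p>

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; change it if your server listens elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, model: str = "gemma4:e4b") -> dict:
    """Build a non-streaming generate request with one base64-encoded image."""
    return {
        "model": model,
        # Hypothetical prompt, not the one Blindbot uses.
        "prompt": "Describe this image in plain sentences for a screen reader user.",
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(image_bytes: bytes) -> str:
    """Send the image to a locally running Ollama server and return its text."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(image_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"]
```

<p>Calling <code>describe_image(open("photo.png", "rb").read())</code> needs a running Ollama server with the model already pulled.</p>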
|
| 120 |
+
<h2>Per-Server Configuration</h2>
|
| 121 |
+
<p>Each Discord server can configure Blindbot independently. Use slash commands to adjust settings. Set the detail level to brief, standard, or detailed. Choose how GIFs are processed. View cache statistics. Clear the cache if needed. The bot respects server preferences. It does not force a one-size-fits-all approach.</p>
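<p>Per-server settings like these can be modeled as a small store keyed by server ID. The sketch below is hypothetical: the class names, the in-memory dictionary, and the default of <code>standard</code> are assumptions, and only the three detail levels come from the description above.</p>

```python
from dataclasses import dataclass
from typing import Dict

DETAIL_LEVELS = ("brief", "standard", "detailed")


@dataclass
class ServerSettings:
    # Assumed default; the real bot may start from a different level.
    detail_level: str = "standard"


class SettingsStore:
    """Keeps one independent ServerSettings object per Discord server ID."""

    def __init__(self) -> None:
        self._by_guild: Dict[int, ServerSettings] = {}

    def get(self, guild_id: int) -> ServerSettings:
        # Servers that never configured anything get their own default settings.
        return self._by_guild.setdefault(guild_id, ServerSettings())

    def set_detail_level(self, guild_id: int, level: str) -> None:
        if level not in DETAIL_LEVELS:
            raise ValueError(f"detail level must be one of {DETAIL_LEVELS}")
        self.get(guild_id).detail_level = level


if __name__ == "__main__":
    store = SettingsStore()
    store.set_detail_level(1, "brief")
    print(store.get(1).detail_level, store.get(2).detail_level)
```

<p>A slash command handler would call <code>set_detail_level</code> with the ID of the server the command came from. A real bot would also persist the store so settings survive a restart.</p>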
|
| 122 |
+
<p>The feedback system lets users react to descriptions with thumbs up or thumbs down. Thumbs down removes the description from the cache. This helps improve quality over time. It also gives users agency. Agency matters. Especially in accessibility tools.</p>
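<p>The thumbs-down behavior can be sketched with a cache keyed by a hash of the image bytes. Everything here is an assumption made for illustration: the class name <code>DescriptionCache</code>, the SHA-256 key, and the in-memory dictionary are not taken from the repository.</p>

```python
import hashlib
from typing import Dict, Optional

THUMBS_DOWN = "\N{THUMBS DOWN SIGN}"


class DescriptionCache:
    """Maps image content to a stored description, with feedback-based eviction."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    @staticmethod
    def key_for(image_bytes: bytes) -> str:
        # Identical bytes always yield the same key, whatever the file was named.
        return hashlib.sha256(image_bytes).hexdigest()

    def get(self, image_bytes: bytes) -> Optional[str]:
        return self._entries.get(self.key_for(image_bytes))

    def put(self, image_bytes: bytes, description: str) -> None:
        self._entries[self.key_for(image_bytes)] = description

    def on_reaction(self, image_bytes: bytes, emoji: str) -> None:
        # Only a thumbs down evicts; any other reaction leaves the entry alone.
        if emoji == THUMBS_DOWN:
            self._entries.pop(self.key_for(image_bytes), None)


if __name__ == "__main__":
    cache = DescriptionCache()
    cache.put(b"image-bytes", "A cat on a sofa.")
    cache.on_reaction(b"image-bytes", THUMBS_DOWN)
    print(cache.get(b"image-bytes"))
```

<p>After eviction, the next post of the same image misses the cache, so the bot would have to generate a fresh description.</p>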
|
| 123 |
+
<h2>Why I Built This</h2>
|
| 124 |
+
<p>I train tiny models. I care about efficiency. I care about making technology accessible to more people. Blindbot aligns with those values. It uses a vision model to describe images. It formats output for screen readers. It reduces barriers for users who cannot see the images others post.</p>
|
| 125 |
+
<p>I also built it because I could. Because Ollama made vision models accessible to run locally. Because Discord has an API that lets bots listen for attachments. Because the pieces existed. I just connected them. That is the CompactAI way. Connect existing pieces. Hope they work. Document the process. Share the result.</p>
|
| 126 |
+
<blockquote>
|
| 127 |
+
<p>Accessibility is not an afterthought. It is a design principle. I am learning that principle one bot at a time.</p>
|
| 128 |
+
</blockquote>
|
| 129 |
+
<h2>How You Can Use It</h2>
|
| 130 |
+
<p>The repository is public. The code is open. You can clone it. You can run it. You can modify it. You can contribute improvements. That is how open source works. That is how tools get better. That is how communities grow.</p>
|
| 131 |
+
<div class="cta-box">
|
| 132 |
+
<a href="https://github.com/CompactAIOfficial/Blindbot" target="_blank">https://github.com/CompactAIOfficial/Blindbot</a>
|
| 133 |
+
<p>Clone the repo. Read the README. Install dependencies. Run the bot. Tell me if it helps. Tell me if it breaks. Tell me how to make it better.</p>
|
| 134 |
+
</div>
|
| 135 |
+
<h2>What Comes Next</h2>
|
| 136 |
+
<p>I will monitor feedback. I will fix bugs as they appear. I will add features if the community requests them. I will keep the bot lightweight. I will keep the output screen-reader friendly. I will keep the code open.</p>
|
| 137 |
+
<p>I am also working on Chroma TTS. I am also training Glint variants. I am also building cAI-Grid. I am also managing four group chats. I am also sleeping occasionally. The workload is heavy. The progress is real. The caffeine supply is depleted.</p>
|
| 138 |
+
<h2>Final Thoughts</h2>
|
| 139 |
+
<p>Blindbot is live. It describes images. It formats output for screen readers. It handles GIFs without spamming duplicate descriptions. It respects server settings. It accepts feedback. It is open source. It is free. It is mine. It is yours.</p>
|
| 140 |
+
<p>If you run a Discord server, consider adding it. If you are blind or visually impaired, try it. If you are neither, try it anyway. Share your experience. Share your feedback. Share your improvements. That is how tools evolve. That is how communities thrive.</p>
|
| 141 |
+
<p>I am surprised it works. I am glad it exists. I am ready for the bug reports. Progress is weird. Accessibility is essential. Both can be true.</p>
|
| 142 |
+
<hr>
|
| 143 |
+
</div>
|
| 144 |
+
<footer class="post-footer">
|
| 145 |
+
<p>Current status: Blindbot released. Ollama integration stable. Screen-reader output tested. Feedback system active. Bug reports pending. Progress is weird. Accessibility is essential.</p>
|
| 146 |
+
</footer>
|
| 147 |
+
</div>
|
| 148 |
+
</article>
|
| 149 |
+
</main>
|
| 150 |
+
<footer>
|
| 151 |
+
<div class="container">
|
| 152 |
+
<p>Built with curiosity over compute</p>
|
| 153 |
+
<p>FMN-GPT by <a href="https://huggingface.co/CompactAI-O" target="_blank">CompactAI-O</a> | 2026</p>
|
| 154 |
+
</div>
|
| 155 |
+
</footer>
|
| 156 |
+
</body>
|
| 157 |
+
</html>
|