<html lang="en">
<head>
  <meta charset="UTF-8" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0" />
  <title>OptiTransfer Data</title>
  <style>
    * { box-sizing: border-box; margin: 0; padding: 0; }
    body {
      font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
      background: #f8fafc;
      color: #1e293b;
      padding: 2rem 1rem;
    }
    .container { max-width: 860px; margin: 0 auto; }
    .hero {
      text-align: center;
      padding: 2.5rem 1rem 2rem;
    }
    .hero h1 { font-size: 2rem; font-weight: 700; color: #0f172a; margin-bottom: 0.5rem; }
    .hero p { font-size: 1.05rem; color: #475569; max-width: 640px; margin: 0 auto; line-height: 1.7; }
    .badge {
      display: inline-block;
      background: #f1f5f9;
      color: #334155;
      border: 1px solid #e2e8f0;
      border-radius: 999px;
      padding: 0.25rem 0.85rem;
      font-size: 0.8rem;
      font-weight: 600;
      margin: 0.75rem 0.25rem 0;
    }
    hr { border: none; border-top: 1px solid #e2e8f0; margin: 2rem 0; }
    h2 { font-size: 1.2rem; font-weight: 700; color: #0f172a; margin-bottom: 1rem; letter-spacing: -0.01em; }
    .section { margin-bottom: 2rem; }
    .features {
      display: grid;
      grid-template-columns: repeat(auto-fit, minmax(190px, 1fr));
      gap: 1rem;
      margin-top: 0.5rem;
    }
    .feature-card {
      background: #fff;
      border: 1px solid #e2e8f0;
      border-radius: 10px;
      padding: 1.1rem;
    }
    .feature-card strong { display: block; font-size: 0.9rem; margin-bottom: 0.3rem; color: #0f172a; }
    .feature-card p { font-size: 0.82rem; color: #64748b; line-height: 1.5; }
    table { width: 100%; border-collapse: collapse; font-size: 0.9rem; }
    th { background: #f1f5f9; text-align: left; padding: 0.6rem 0.75rem; font-weight: 600; font-size: 0.78rem; color: #475569; text-transform: uppercase; letter-spacing: 0.04em; }
    td { padding: 0.65rem 0.75rem; border-top: 1px solid #f1f5f9; vertical-align: top; }
    tr:hover td { background: #fafbfc; }
    a { color: #2563eb; text-decoration: none; }
    a:hover { text-decoration: underline; }
    .tag {
      display: inline-block;
      background: #f1f5f9;
      color: #475569;
      border-radius: 4px;
      padding: 0.15rem 0.5rem;
      font-size: 0.73rem;
      margin: 0.15rem 0.15rem 0.15rem 0;
      white-space: nowrap;
    }
    .tag-group { margin-top: 0.5rem; line-height: 1.9; }
    .quality-list { list-style: none; }
    .quality-list li { padding: 0.35rem 0; font-size: 0.88rem; color: #374151; padding-left: 1.2rem; position: relative; }
    .quality-list li::before { content: "\2713"; position: absolute; left: 0; color: #16a34a; font-weight: 700; }
    .pricing-grid {
      display: grid;
      grid-template-columns: repeat(auto-fit, minmax(200px, 1fr));
      gap: 1rem;
      margin-top: 0.5rem;
    }
    .price-card {
      background: #fff;
      border: 1px solid #e2e8f0;
      border-radius: 10px;
      padding: 1.25rem;
    }
    .price-card.featured {
      border-color: #2563eb;
      box-shadow: 0 0 0 1px rgba(37,99,235,0.1);
    }
    .price-card h3 { font-size: 0.95rem; font-weight: 700; margin-bottom: 0.5rem; color: #0f172a; }
    .price-card .price { font-size: 1.2rem; font-weight: 800; color: #2563eb; margin-bottom: 0.5rem; }
    .price-card p { font-size: 0.82rem; color: #64748b; line-height: 1.5; }
    .payment { display: flex; flex-wrap: wrap; gap: 0.5rem; margin-top: 0.75rem; }
    .payment-item {
      background: #fff;
      border: 1px solid #e2e8f0;
      border-radius: 8px;
      padding: 0.4rem 0.85rem;
      font-size: 0.84rem;
      color: #374151;
    }
    .contact-bar {
      background: #0f172a;
      color: #e2e8f0;
      border-radius: 10px;
      padding: 1.5rem;
      text-align: center;
      margin-top: 2rem;
    }
    .contact-bar strong { font-size: 1rem; }
    .contact-bar a { color: #93c5fd; }
    .contact-bar p { font-size: 0.88rem; margin-top: 0.35rem; }
    .dataset-detail { margin-top: 0.75rem; }
    .dataset-detail p { font-size: 0.84rem; color: #64748b; line-height: 1.5; margin-bottom: 0.4rem; }
  </style>
</head>
<body>
  <div class="container">
    <div class="hero">
      <h1>OptiTransfer Data</h1>
      <p>Premium web corpora for LLM pre-training, fine-tuning, RAG, and multilingual NLP. Swiss-registered. EU AI Act compliant. Quality-scored, PII-redacted, SHA-256-verified.</p>
      <div>
        <span class="badge">Swiss-Registered</span>
        <span class="badge">EU AI Act Compliant</span>
        <span class="badge">SHA-256 Verified</span>
        <span class="badge">PII Redacted</span>
      </div>
    </div>
    <hr />
    <div class="section">
      <h2>Capabilities</h2>
      <div class="features">
        <div class="feature-card">
          <strong>LLM Training</strong>
          <p>Sovereign national web corpora at scale for pre-training and supervised fine-tuning</p>
        </div>
        <div class="feature-card">
          <strong>RAG Pipelines</strong>
          <p>Pre-chunked, embedding-ready corpora with quality scores per chunk</p>
        </div>
        <div class="feature-card">
          <strong>Regulatory NLP</strong>
          <p>Domain-classified, jurisdiction-specific government and institutional data</p>
        </div>
        <div class="feature-card">
          <strong>Research</strong>
          <p>Reproducible datasets with full metadata, provenance tracking, and QA reports</p>
        </div>
      </div>
    </div>
    <div class="section">
      <h2>Available Datasets</h2>
      <table>
        <thead>
          <tr>
            <th>Dataset</th>
            <th>Records</th>
            <th>Formats</th>
            <th>Access</th>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td>
              <strong><a href="https://huggingface.co/datasets/OptiTransferData/swiss-web-premium-ch" target="_blank" rel="noopener">*.ch Swiss Web Premium (A+)</a></strong>
            </td>
            <td>110,491</td>
            <td>Parquet, JSONL, Language Splits, RAG Chunks</td>
            <td>
              <a href="https://huggingface.co/datasets/OptiTransferData/swiss-web-premium-ch" target="_blank" rel="noopener">Sample</a> |
              <a href="https://huggingface.co/datasets/OptiTransferData/swiss-web-premium-ch-full" target="_blank" rel="noopener">Full</a>
            </td>
          </tr>
        </tbody>
      </table>
      <div class="dataset-detail">
        <p>Flagship Swiss web corpus from the .ch ccTLD: 112.4M tokens, 78 fields. Multilingual coverage: German (61.2%), French (19.0%), English (10.5%), Italian (4.7%), and 25 additional languages. Includes a nine-component quality model, a full provenance chain, and an independent QA report.</p>
        <div class="tag-group">
          <span class="tag">LLM Pre-Training</span>
          <span class="tag">Supervised Fine-Tuning (SFT)</span>
          <span class="tag">Retrieval-Augmented Generation</span>
          <span class="tag">Multilingual NLP</span>
          <span class="tag">German Language Models</span>
          <span class="tag">French Language Models</span>
          <span class="tag">Swiss Market AI</span>
          <span class="tag">EU AI Act Compliance</span>
          <span class="tag">Domain-Specific Training</span>
          <span class="tag">Web Corpus Research</span>
          <span class="tag">Text Classification</span>
          <span class="tag">Summarisation</span>
          <span class="tag">Question Answering</span>
          <span class="tag">Translation</span>
        </div>
      </div>
      <p style="font-size:0.82rem; color:#64748b; margin-top:1rem;">A free gated sample is available for each dataset. Request access to evaluate it before purchasing.</p>
    </div>
    <div class="section">
      <h2>Quality Standards</h2>
      <ul class="quality-list">
        <li>Independent QA audits with documented accuracy metrics</li>
        <li>SHA-256 integrity verification on all production files</li>
        <li>Quality scoring per record (0 to 100 scale, nine components)</li>
        <li>Domain classification and language detection</li>
        <li>EU AI Act compliance with full data provenance and licensing transparency</li>
        <li>Content-level and URL-level deduplication</li>
        <li>PII detection and redaction (email, phone, IBAN, AHV, credit card)</li>
        <li>Croissant metadata for ML interoperability</li>
      </ul>
    </div>
    <div class="section">
      <h2>Licensing and Pricing</h2>
      <div class="pricing-grid">
        <div class="price-card">
          <h3>Sample</h3>
          <div class="price">Free</div>
          <p>Gated access. Evaluate data quality, schema, and documentation before committing.</p>
        </div>
        <div class="price-card featured">
          <h3>Full Dataset</h3>
          <div class="price">Commercial</div>
          <p>Complete production data with commercial licence. All formats included.</p>
        </div>
        <div class="price-card">
          <h3>Enterprise</h3>
          <div class="price">Custom</div>
          <p>Dedicated support, SLA, bespoke corpora, volume pricing.</p>
        </div>
      </div>
      <p style="margin-top:1rem; font-size:0.88rem; color:#374151;">Contact us for a quote: <a href="mailto:data@optitransfer.ch">data@optitransfer.ch</a></p>
      <div class="payment">
        <div class="payment-item">Bank Transfer (SEPA/SWIFT)</div>
        <div class="payment-item">TWINT (Swiss)</div>
        <div class="payment-item">Crypto (BTC / ETH / SOL)</div>
      </div>
    </div>
    <div class="contact-bar">
      <strong>OptiTransfer Data</strong>
      <p>Swiss-registered | <a href="https://optitransfer.ch">optitransfer.ch</a> | <a href="mailto:data@optitransfer.ch">data@optitransfer.ch</a></p>
    </div>
  </div>
</body>
</html>