MoireFormer (104.9M Proof-of-Concept)
This repository hosts the PyTorch weights (moire_phase2_weights_final.pt) for MoireFormer,
a fundamentally new neural network architecture that replaces standard scalar dot-product attention with Moiré phase-interference wave mechanics.
Instead of computing attention via Q · K^T, this model splits token embeddings into amplitude and phase
(q_amp, q_phase) and computes attention through geometric wave resonance (q_real * k_real + q_imag * k_imag).
The goal is to show that artificial intelligence can be trained on a continuous, wave-based geometry reminiscent of the oscillatory dynamics observed in human EEGs.
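The score described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name, tensor layout, and the amplitude/phase-to-real/imaginary conversion are assumptions; the key point is that `q_real * k_real + q_imag * k_imag` equals the real part of a complex inner product between the two wave representations.

```python
import torch

def moire_attention_scores(q_amp, q_phase, k_amp, k_phase):
    """Hypothetical sketch of the phase-interference score described above.

    Tensors are assumed shaped (batch, heads, seq, head_dim). Each token
    carries an amplitude and a phase; converting to real/imaginary parts
    and summing the two dot products gives the real part of the complex
    inner product between the query and key waves.
    """
    q_real, q_imag = q_amp * torch.cos(q_phase), q_amp * torch.sin(q_phase)
    k_real, k_imag = k_amp * torch.cos(k_phase), k_amp * torch.sin(k_phase)
    # (b, h, seq_q, d) @ (b, h, d, seq_k) -> (b, h, seq_q, seq_k)
    return q_real @ k_real.transpose(-2, -1) + q_imag @ k_imag.transpose(-2, -1)
```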
🔗 GitHub Repository (Code & Inference): anttiluode/MoireFormer
🔗 Theory & Clinical Proof: anttiluode/Geometric-Neuron
Model Details
- Architecture: MoireGPT (Custom Transformer Bolt-on)
- Size: 104.9M Parameters
- Structure: 8 Layers, 8 Heads, 768 Embedding Dimension
- Capabilities: Coherent bilingual (English/Spanish) grammar, persona adoption (Assistant), structural instruction following.
- Disclaimer: At ~100M parameters, this is a proof-of-substrate, not a knowledge oracle. It demonstrates that wave fields can learn discrete human syntax, but it will hallucinate factual data due to its small parameter count.
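The hyperparameters above can be collected into a small config sketch. The field names below are illustrative, not necessarily those used in the MoireFormer repository; only the values (8 layers, 8 heads, 768 dims) come from this card.

```python
from dataclasses import dataclass

@dataclass
class MoireConfig:
    # Values from the model card; field names are assumptions.
    n_layers: int = 8
    n_heads: int = 8
    d_model: int = 768

    @property
    def head_dim(self) -> int:
        # 768 / 8 = 96 dimensions per attention head
        assert self.d_model % self.n_heads == 0
        return self.d_model // self.n_heads
```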
⚠️ How to Use (Read Before Downloading)
Because this is a novel mathematical architecture, you cannot load this model using the standard Hugging Face AutoModel pipeline.
To run inference, you must download these weights and run them through the custom Moiré architecture provided in the GitHub repository.
Step-by-Step Instructions:
1. Clone the GitHub Repository:
git clone https://github.com/anttiluode/MoireFormer.git
cd MoireFormer
2. Download the Weights:
Download moire_phase2_weights_final.pt from the Files and versions tab of this Hugging Face repository and place
it in your cloned MoireFormer folder.
3. Run the Chat Interface:
pip install torch transformers datasets
python moire_chat.py --weights moire_phase2_weights_final.pt --size large
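If you prefer to inspect the checkpoint programmatically rather than through `moire_chat.py`, a plain `torch.load` works, since the file is a standard PyTorch checkpoint. The helper below is a sketch; the checkpoint's key names and how they map onto the repo's `MoireGPT` class are defined by the GitHub code, not here.

```python
import torch

def load_moire_weights(path: str, map_location: str = "cpu") -> dict:
    """Load the raw checkpoint dict from disk.

    The returned keys/shapes depend on the MoireGPT class in the GitHub
    repository; pass the result to that model's load_state_dict().
    """
    return torch.load(path, map_location=map_location)
```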
Training Curriculum
The model was trained in two continuous phases to test whether wave fields can avoid catastrophic forgetting via phase-locking (constructive and destructive interference):
Phase 1 (Base Geometry): 15 Epochs on a mixed dataset of Databricks Dolly-15k, WikiText-2, and OpenAssistant.
This established the foundational phase-space for English and conversational structure.
Phase 2 (Phase-Space Expansion): 5 epochs of fine-tuning on the Guanaco dataset to refine logical geometry and instruction following, with the aim of expanding the model's topological complexity without overwriting previously learned data.
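The two-phase schedule amounts to sequential fine-tuning: Phase 2 simply continues optimization from the Phase 1 weights on a new dataset. A generic sketch of that curriculum is below; the actual datasets, tokenization, model, and loss live in the GitHub repo, so everything here beyond the phase/epoch structure is an assumption.

```python
import torch

def train_phases(model, phases, lr=3e-4):
    """Run sequential training phases on the same model.

    `phases` is a list of (name, dataloader, epochs) tuples; each phase
    continues from the weights left by the previous one, mirroring the
    Phase 1 -> Phase 2 schedule described above.
    """
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    for name, loader, epochs in phases:
        for _ in range(epochs):
            for x, y in loader:
                loss = torch.nn.functional.cross_entropy(model(x), y)
                opt.zero_grad()
                loss.backward()
                opt.step()
    return model
```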