
qqqqaXCN PRO

xcnqa009

AI & ML interests

None yet

Recent Activity

Replied to AbstractPhil's post about 16 hours ago
Today, I'll be determining the codebook capacity and utility potential for the larger batteries: Fresnel, Johanna, Grandmaster, Freckles, and the Johanna-F variants. This should give a good indication of which models are capable of handling codebooks and which are more errant. The earlier models all use SVD while the later ones do not; the differences are noted per model and the behavior is divergent. I anticipate the D=16 will be more errant, and the final-state variants of those could very well be much more difficult or costly to inference, as their axis bends are likely considerably harder to track. However, I'm confident that enough bounces will give the required yield, so I'll set up some high-yield noise barrages to determine how much we can in fact extract from Johanna, and then set up similar barrages for images to map the internals of Fresnel and Grandmaster.

Grandmaster will be tricky, as it was an experimental Johanna-256 finetuned series meant to map sigma-noised image inputs to recreate Fresnel behavioral output. A noised image goes in -> a Fresnel-grade replication comes out in high res. This allowed preliminary Dall-E Mini-esque VAE generation and will be explored further for the stereoscopic translation subsystem, to allow image generation in the unique diffusion format I was working out. I anticipate this system will be more than capable of making monstrosities, so I won't be posting TOO MANY prelims on this one, but the high-capacity potential of these noise makers is meaningfully powerful.

Getting uniform codebooks in place for these models will allow full transformer mapping downstream, instead of just guess-working the MSE piecemeal as the earlier versions and variants were doing. I'm straying from the CLS specifically for this series because CLS creates adjudicated pools of bias orbiting the INCORRECT orbiter in some SVAEs. The orbital target IS the soft-hand accumulated bias with the sphere-norm, so having a competitor isn't going to be a good option.
Replied to Javedalam's post 3 months ago
When an AI Model Solves College-Level Math and Physics - On a Phone

This morning I came across a model called Nanbeige4.1-3B, and what began as simple curiosity quickly became something more significant. I loaded an already 4-bit quantized version and ran it locally on a phone. No GPU, no cloud support, no hidden infrastructure, just a compact reasoning model operating entirely at the edge.

I started with classical mechanics: acceleration, force, friction on an incline. The model worked through them cleanly and correctly. Then I stepped into calculus and gave it a differential equation. It immediately recognized the structure, chose the proper method, carried the mathematics through without confusion, and verified the result. It did not behave like a model trying to sound intelligent; it behaved like a system trained to solve problems. And it was doing this on a phone.

For a long time, we have associated serious reasoning in AI with massive models and enormous compute. Capability was supposed to live inside data centers. Bigger models were expected to mean smarter systems. But watching Nanbeige4.1-3B handle college-level math and physics forces a rethink of that assumption. Intelligence is not only expanding; it is compressing. Better training and sharper reasoning alignment are allowing smaller models to operate far beyond what their size once suggested.

When structured problem-solving runs locally on pocket hardware, the implications are larger than they first appear. Experimentation becomes personal. Engineers can explore ideas without waiting on infrastructure. Students can access serious analytical capability from a device they already carry. Builders are no longer required to send every complex task into the cloud.

What makes moments like this easy to miss is that they rarely arrive with fanfare. There is no dramatic announcement. The model responses are here: https://fate-stingray-0b3.notion.site/AI-model-Nanbeige4-1-3B-304
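The post hinges on the model being 4-bit quantized, which is what makes a 3B-parameter network fit on a phone. As a rough illustration of what 4-bit quantization does to a weight tensor, here is a minimal numpy sketch assuming simple symmetric per-tensor round-to-nearest. This is a deliberately simplified scheme: real formats (GPTQ, AWQ, llama.cpp's Q4 families) use per-group scales and zero points, and the post does not say which scheme Nanbeige4.1-3B's quantization uses.

```python
import numpy as np

def quantize_4bit(weights: np.ndarray):
    """Symmetric round-to-nearest 4-bit quantization (simplified sketch).

    Maps floats to integers in [-7, 7] with one per-tensor scale.
    Production schemes quantize in small groups with their own scales.
    """
    scale = float(np.abs(weights).max()) / 7.0
    q = np.clip(np.round(weights / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q: np.ndarray, scale: float) -> np.ndarray:
    # Reconstruct approximate float weights from the 4-bit codes.
    return q.astype(np.float32) * scale

# Hypothetical weight tensor standing in for one layer of a real model.
rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=(64, 64)).astype(np.float32)

q, scale = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale)

# Round-to-nearest bounds the per-weight error by half a quantization step.
err = float(np.abs(w - w_hat).max())
```

The memory arithmetic is what matters for the edge-deployment point: at 4 bits per weight, 3 billion parameters take roughly 1.5 GB (plus a small overhead for scales), versus about 6 GB at fp16, which is the difference between fitting in a phone's RAM and not.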

Organizations

None yet