Fully Open-Sourced, or just Open-Weight?
I wanted to know whether this model and future models will be open-weight only or fully open-sourced, meaning the datasets, training process, and code are all public.
Thanks for your question! Our goal is to move toward a more open ecosystem, and this model, as well as future models, will be fully open-sourced in terms of weights and key innovations.
However, there are some practical limitations. We won’t be able to open-source the full training datasets or the complete training pipeline/code, mainly due to a mix of privacy, safety, and proprietary considerations.
That said, we are committed to sharing as much as possible. This includes:
- Model weights
- Inference code
- Key research insights and innovations we develop along the way
We believe this approach strikes a balance between openness and responsibility, while still enabling the community to build, experiment, and innovate on top of our work.
Thank you!