OLMo 2 13B Instruct
OLMo (Open Language Model) is AI2's effort to produce models that are genuinely open by every measure: weights, training data, training code, intermediate checkpoints, and evaluation harness are all published. OLMo 2 is the second generation, with 7B and 13B variants.
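As a quick orientation, here is a minimal sketch of running the instruct model with Hugging Face transformers. The hub id "allenai/OLMo-2-1124-13B-Instruct" is an assumption based on AI2's release naming; verify it against the model card before use.

```python
def chat_messages(system: str, user: str) -> list[dict]:
    """Build a chat-format message list suitable for
    tokenizer.apply_chat_template."""
    msgs = []
    if system:  # system turn is optional
        msgs.append({"role": "system", "content": system})
    msgs.append({"role": "user", "content": user})
    return msgs


def run_demo():
    # Not called here: this downloads roughly 26 GB of weights and
    # needs a GPU (or a lot of patience) to run.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "allenai/OLMo-2-1124-13B-Instruct"  # assumed hub id
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = tok.apply_chat_template(
        chat_messages("", "Summarize what 'fully open' means for OLMo."),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tok.decode(out[0][inputs["input_ids"].shape[1]:],
                     skip_special_tokens=True))
```

Because the checkpoints follow the standard transformers interface, the same sketch works for the 7B variant by swapping the model id.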
Why "fully open" matters
Most "open-weight" models are open in name only: the weights are released, but the training data, the data filtering scripts, and the post-training mixes stay proprietary. OLMo publishes all of these. If you care about model provenance, reproducibility, or being able to audit what your model was trained on, OLMo is one of the very few serious options at this scale.
What it's good at
OLMo 2 13B benchmarks competitively with Llama 3.1 8B on most tasks. The point is less to beat the closed-data competition and more to demonstrate that fully open development can produce competitive models — and to provide a research artifact others can build on.
The Dolma and Dolmino training data
OLMo 2's pretraining mix builds on AI2's Dolma corpus, with the newer Dolmino mix used for the later stage of training. Both are filterable, reproducible, and explicitly licensed. If you're doing data-attribution research, this is invaluable.
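Because the corpora are published on the Hugging Face Hub, inspecting them is a few lines of code. A sketch using datasets streaming mode, so nothing is downloaded up front; the dataset id "allenai/dolmino-mix-1124" is an assumption from AI2's hub naming, so check the corpus release for the exact id and available subsets.

```python
def first_texts(stream, n: int) -> list[str]:
    """Pull the 'text' field from the first n records of an iterable of
    dataset records (each record is a dict with at least a 'text' key)."""
    out = []
    for i, record in enumerate(stream):
        if i >= n:
            break
        out.append(record["text"])
    return out


def stream_sample(n: int = 3):
    # Not called here: requires network access to the Hugging Face Hub.
    from datasets import load_dataset

    # "allenai/dolmino-mix-1124" is an assumed id; streaming=True avoids
    # materializing the multi-terabyte corpus locally.
    ds = load_dataset("allenai/dolmino-mix-1124",
                      streaming=True, split="train")
    for text in first_texts(ds, n):
        print(text[:200])
```

The same pattern works for auditing: swap the slice in `first_texts` for whatever per-document check your attribution study needs.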
License
Apache 2.0 for weights and code. Training data licensing is handled per-source and documented in the corpus release.