Mistral 7B Instruct v0.3
Mistral 7B was the first widely-used open-weight model that made local inference genuinely practical for hobbyists. v0.3 is the final maintenance release of that line — same 7-billion-parameter architecture, expanded 32,768-token vocabulary, and support for function calling.
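Function calling in v0.3 uses tool definitions in the common JSON-schema style. A minimal illustrative definition (the `get_weather` name and its parameters are made up for this sketch, not part of any official API):

```json
{
  "type": "function",
  "function": {
    "name": "get_weather",
    "description": "Look up the current weather for a city",
    "parameters": {
      "type": "object",
      "properties": {
        "city": { "type": "string", "description": "City name" }
      },
      "required": ["city"]
    }
  }
}
```

Definitions like this are passed alongside the conversation; the model then emits a structured call naming the tool and its arguments, which your code executes and feeds back.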
Why it still matters
Newer 7B models from Qwen and Meta now outperform it on most benchmarks. But Mistral 7B remains a useful baseline: it's small, fast, has a stable Apache 2.0 license, and has the deepest fine-tune ecosystem of any 7B model — thousands of community variants exist on Hugging Face.
What it's good at
General instruction-following in English and major European languages. Good at structured output. A common choice when you need something that runs in FP16 on a single 16 GB consumer GPU, though the weights alone take about 14 GB, so usable context length at that precision is limited.
Running it locally
Full weights are ~14 GB at FP16, which fits GPUs with 16 GB or more of VRAM. The Q4_K_M GGUF quantization is about 4.4 GB and runs CPU-only at a few tokens per second, or on a laptop's integrated GPU.
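The sizes above follow from simple arithmetic. A sketch, assuming ~7.25B parameters (Mistral 7B's approximate count) and ~4.8 effective bits per weight for Q4_K_M (both figures are approximations; real files also carry embeddings and metadata):

```python
# Rough memory math for a ~7B-parameter model.
PARAMS = 7.25e9          # Mistral 7B's approximate parameter count

fp16_gb = PARAMS * 2 / 1e9        # FP16 stores 2 bytes per weight
q4_gb = PARAMS * 4.8 / 8 / 1e9    # Q4_K_M averages ~4.8 bits per weight

print(f"FP16:   ~{fp16_gb:.1f} GB")   # ~14.5 GB
print(f"Q4_K_M: ~{q4_gb:.1f} GB")     # ~4.3 GB
```

The same two-line calculation works for any model: parameter count times bytes per weight, plus a margin for the KV cache, which grows with context length.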
License
Apache 2.0. No restrictions on commercial or non-commercial use, redistribution, or derivative works.