Fifteen of the most useful open-weight language models in one place — with honest specs, license notes, and direct download links to the original publishers. No login wall, no marketing fluff.

Llama 3.3 70B Instruct
Meta's flagship dense instruction-tuned model. Comes close to the original 405B on many benchmarks at a fraction of the inference cost.
671B-parameter Mixture-of-Experts with 37B active per token. Strong on code and math; competitive with the top closed models at far lower training cost.
A reasoning model trained with reinforcement learning. Exposes a transparent chain-of-thought and is unusually good at multi-step problems.
A general-purpose dense model that performs unusually well in multilingual and coding tasks for its size. The 7B/14B/32B variants are popular bases for fine-tuning.
Sparse Mixture-of-Experts with 8 experts of 22B parameters each. Fast for its size thanks to only 39B active params per token.
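The arithmetic behind "only 39B active params per token" is easy to sketch: every token passes through shared parameters (attention, embeddings) plus only k of the n expert FFNs. A minimal estimate, assuming a Mixtral-style top-2 router; the ~5B shared / ~17B per-expert split below is back-calculated from the published 141B-total / 39B-active figures, not an official breakdown:

```python
def moe_params_billions(shared_b, expert_b, n_experts, k_active):
    """Return (total, active) parameter counts in billions for a
    sparse MoE: shared params plus n expert FFNs, k routed per token."""
    total = shared_b + n_experts * expert_b
    active = shared_b + k_active * expert_b
    return total, active

# Rough split derived from the published totals (assumption, not official):
total, active = moe_params_billions(shared_b=5, expert_b=17, n_experts=8, k_active=2)
# total ≈ 141B, active ≈ 39B
```

The same formula explains why MoE models are fast relative to their parameter count: per-token compute scales with the active figure, while memory scales with the total.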
The 7B that made local inference real for a lot of people. Still a sensible baseline when you want something fast on a single GPU.
Google's open model family. The 27B is the sweet spot: noticeably stronger than the 9B and still fits on a single high-end consumer GPU at 4-bit.
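The "fits on a single high-end consumer GPU at 4-bit" claim follows from simple weight-memory arithmetic. A rough sketch, where the flat 2 GB allowance for KV cache and runtime buffers is an assumption, not a measured figure:

```python
def weight_vram_gb(params_billions, bits_per_weight, overhead_gb=2.0):
    """Rough VRAM needed to hold quantized weights, plus a flat
    allowance for KV cache and runtime buffers (assumed, not measured)."""
    weights_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weights_gb + overhead_gb

# 27B at 4-bit: 13.5 GB of weights, ~15.5 GB with overhead,
# comfortably inside a 24 GB consumer card.
vram = weight_vram_gb(27, 4)
```

The same function shows why the 9B is the choice for 12 GB cards: 9B at 4-bit is about 4.5 GB of weights before overhead.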
A 14B model trained heavily on synthetic data. Punches well above its weight on reasoning and math benchmarks for its size.
104B dense model tuned for retrieval-augmented generation and tool use. Among the best open-weight options for grounded enterprise workflows.
A bilingual (Chinese/English) model with strong long-context retrieval. The 34B is a popular choice for serious self-hosting on a 48GB card.
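For long-context models, the KV cache is what actually eats the headroom on a 48GB card, and it grows linearly with sequence length. A sketch of the standard formula (keys plus values, across all layers, for one sequence); the layer/head/dim numbers below are a hypothetical GQA configuration for illustration, not this model's actual shapes:

```python
def kv_cache_gb(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    """Memory for the key and value caches across all layers for one
    sequence: 2 (K and V) * layers * kv_heads * head_dim * seq_len * bytes."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * seq_len / 1e9

# Hypothetical config (assumption, for illustration only):
# 60 layers, 8 KV heads, head_dim 128, fp16 cache, 32k context -> ~8 GB
gb = kv_cache_gb(n_layers=60, n_kv_heads=8, head_dim=128, seq_len=32768)
```

This is why grouped-query attention (few KV heads) matters so much for self-hosting: with full multi-head attention the same context would need several times this much cache on top of the weights.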
A compact bilingual chat model with surprisingly strong instruction-following. Competitive with much larger Western models in its size class.
TII's third-generation Falcon. The 10B targets edge deployment with respectable English-only performance and a very permissive license.
A genuinely open model: weights, training data, and code all published. The natural pick when reproducibility and provenance matter.
A very small model that's still useful. Designed for on-device deployment where even an 8B is too heavy.
IBM's enterprise-leaning open model. Stable, well-documented, and oriented toward business workflows rather than benchmark chasing.