Models · sarmalink

Model catalogue

Bring keys for any of these 17 providers (40 free-tier models plus the paid flagships from OpenAI, Anthropic and xAI) and sarmalink will route, retry and fail over for you. Free keys are always tried first; paid keys are only used when you add them.
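The ordering rule above (free keys first, paid keys only when present) can be sketched roughly as follows. This is an illustration under stated assumptions, not sarmalink's actual implementation: the `Route` type, `order_routes`, `complete`, and the `call` callback are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Route:
    """Hypothetical record for one provider/model pair."""
    provider: str
    model: str
    paid: bool
    api_key: Optional[str]  # None means no key has been added


def order_routes(routes: List[Route]) -> List[Route]:
    """Drop routes without a key, then put free routes ahead of paid ones."""
    usable = [r for r in routes if r.api_key]
    # sorted() is stable, so routes keep their listed order within each tier.
    return sorted(usable, key=lambda r: r.paid)


def complete(prompt: str, routes: List[Route],
             call: Callable[[Route, str], str],
             retries_per_route: int = 2) -> str:
    """Retry each usable route a few times, then fail over to the next one."""
    errors = []
    for route in order_routes(routes):
        for attempt in range(1, retries_per_route + 1):
            try:
                return call(route, prompt)
            except Exception as exc:  # a real router would be more selective
                errors.append(f"{route.provider}/{route.model} #{attempt}: {exc}")
    raise RuntimeError("all routes failed: " + "; ".join(errors))
```

In this sketch a paid route with no key is simply never tried, which mirrors the "only used when you add them" rule.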

Groq

Free tier

LPU inference, the easiest free key to start with. · 14,400 requests/day, 30 RPM

Get free key →

Llama 3.3 70B Versatile

Free · 131.072k context

llama-3.3-70b-versatile

14,400 req/day · 30 RPM · 12k TPM

Workhorse generalist. The most reliable free Groq model.

Fast · Smart · Tools
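To see how the three limits on this card interact, here is a small arithmetic sketch. It assumes "12k TPM" means 12,000 tokens per minute and that the daily quota covers a 24-hour window; the function names are made up for illustration.

```python
MINUTES_PER_DAY = 24 * 60


def sustained_rpm(requests_per_day: int) -> float:
    """Average requests per minute you can hold for a full day."""
    return requests_per_day / MINUTES_PER_DAY


def minutes_to_exhaust(requests_per_day: int, rpm: int) -> float:
    """Minutes until the daily quota is gone when sending at the full RPM cap."""
    return requests_per_day / rpm


def tokens_per_request_at_full_rpm(tpm: int, rpm: int) -> float:
    """Average token budget per request when sending at the full RPM cap."""
    return tpm / rpm


print(sustained_rpm(14_400))                       # 10.0
print(minutes_to_exhaust(14_400, 30))              # 480.0
print(tokens_per_request_at_full_rpm(12_000, 30))  # 400.0
```

Under those assumptions, bursting at 30 RPM burns the daily quota in eight hours, and the token limit, not the request limit, is likely to bite first on long prompts.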

Llama 3.1 8B Instant

Free · 131.072k context

llama-3.1-8b-instant

14,400 req/day · 30 RPM

Sub-second answers for cheap workflows.

Fast

Llama 4 Scout 17B

Free · 131.072k context

meta-llama/llama-4-scout-17b-16e-instruct

14,400 req/day · 30 RPM

Vision-capable MoE. Image upload is on the roadmap but not yet available.

Smart · Vision

Qwen3 32B

Free · 131.072k context

qwen/qwen3-32b

14,400 req/day · 30 RPM

Smart · Multilingual

DeepSeek R1 Distill 70B

Free · 131.072k context

deepseek-r1-distill-llama-70b

1,000 req/day · 30 RPM

Cheap chain-of-thought without the latency tax.

Reasoner

gpt-oss 120B

Free · 131.072k context

openai/gpt-oss-120b

1,000 req/day · 30 RPM

Smart · Tools

Kimi K2 Instruct

Free · 131.072k context

moonshotai/kimi-k2-instruct

1,000 req/day · 30 RPM

Smart · Tools

SambaNova

Free tier

DeepSeek V3.1 frontier model on RDU hardware. · Frontier model, generous daily quota

Get free key →

DeepSeek V3.1

Free · 131.072k context

DeepSeek-V3.1

Free tier, generous daily quota

Smart · Coder

DeepSeek R1

Free · 131.072k context

DeepSeek-R1

Free tier, lower daily quota

Reasoner

Llama 3.3 70B Instruct

Free · 131.072k context

Meta-Llama-3.3-70B-Instruct

Free tier

Smart

Cerebras

Free tier

WSE-3 wafer-scale inference at extreme speed. · 1M tokens/day free

Get free key →

Qwen3 Coder 480B

Free · 131.072k context

qwen-3-coder-480b

1M tokens/day free · ~2,400 tokens/sec

Wafer-scale silicon, fast enough for editor-level autocomplete.

Coder · Fast

Llama 3.3 70B

Free · 131.072k context

llama3.3-70b

1M tokens/day free

Fast · Smart

Google Gemini

Free tier

Flash, Pro, and Google Search grounding for Live mode. · Flash + Pro, Search grounding

Get free key →

Gemini 2.5 Flash

Free · 1,048.576k context

gemini-2.5-flash

500 req/day · 15 RPM (free)

Best free big-context option with Google Search grounding.

Fast · Vision · Tools

Gemini 2.5 Flash Lite

Free · 1,048.576k context

gemini-2.5-flash-lite

1,000 req/day · 30 RPM (free)

Fast

Gemini 2.5 Pro

Free · 1,048.576k context

gemini-2.5-pro

50 req/day · 5 RPM (free)

Smart · Reasoner · Vision

OpenRouter

Free tier

Aggregator with 17+ models behind one key. · 17+ models, free variants

Get free key →

DeepSeek Chat v3.1

Free · 163.84k context

deepseek/deepseek-chat-v3.1:free

50 req/day · 20 RPM

Frontier open-weight, completely free on OpenRouter.

Smart · Coder

DeepSeek R1

Free · 163.84k context

deepseek/deepseek-r1:free

50 req/day · 20 RPM

Reasoner

Qwen3 Coder

Free · 262.144k context

qwen/qwen3-coder:free

50 req/day · 20 RPM

Coder · Tools

Qwen3 235B A22B

Free · 131.072k context

qwen/qwen3-235b-a22b:free

50 req/day · 20 RPM

Smart · Multilingual

Llama 3.3 70B Instruct

Free · 131.072k context

meta-llama/llama-3.3-70b-instruct:free

50 req/day · 20 RPM

Smart

Gemini 2.0 Flash Exp

Free · 1,048.576k context

google/gemini-2.0-flash-exp:free

50 req/day · 20 RPM

Fast · Vision

Microsoft MAI DS R1

Free · 163.84k context

microsoft/mai-ds-r1:free

50 req/day · 20 RPM

Reasoner

Kimi K2

Free · 131.072k context

moonshotai/kimi-k2:free

50 req/day · 20 RPM

Smart · Tools

GLM-4.5 Air

Free · 131.072k context

z-ai/glm-4.5-air:free

50 req/day · 20 RPM

Smart

Nemotron Nano 9B v2

Free · 131.072k context

nvidia/nemotron-nano-9b-v2:free

50 req/day · 20 RPM

Fast

Mistral Small 3.2 24B

Free · 131.072k context

mistralai/mistral-small-3.2-24b-instruct:free

50 req/day · 20 RPM

Smart

NVIDIA NIM

Free tier

Llama-Nemotron, Nemotron Mini, plus a mix of frontier open weights. · 1,000 free credits at sign-up

Get free key →

Llama Nemotron 70B

Free · 128k context

nvidia/llama-3.1-nemotron-70b-instruct

1,000 free credits at sign-up

Smart · Tools

Llama 3.3 70B

Free · 128k context

meta/llama-3.3-70b-instruct

1,000 free credits at sign-up

Smart

Qwen2.5 Coder 32B

Free · 32k context

qwen/qwen2.5-coder-32b-instruct

1,000 free credits at sign-up

Coder

DeepSeek

Paid

DeepSeek V3.2 + R1 reasoner. Best price per token on the market. · Pay-as-you-go, very cheap

Get API key →

DeepSeek V3.2 Chat

Paid · 128k context

deepseek-chat

Pay-as-you-go, ≈ $0.27/M input

Smart · Coder

DeepSeek R1 Reasoner

Paid · 128k context

deepseek-reasoner

Pay-as-you-go, ≈ $0.55/M input

Reasoner
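For a feel of what the approximate input prices above mean per request, here is a minimal cost sketch. It covers input tokens only, because output prices are not listed here, and the helper name `input_cost_usd` is invented for this example.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def input_cost_usd(input_tokens: int, usd_per_million: str) -> Decimal:
    """Input-side cost in USD for one request at a per-million-token price."""
    return Decimal(input_tokens) * Decimal(usd_per_million) / TOKENS_PER_MILLION


# Filling the whole 128k context once, at the approximate prices listed above.
chat_cost = input_cost_usd(128_000, "0.27")      # deepseek-chat
reasoner_cost = input_cost_usd(128_000, "0.55")  # deepseek-reasoner
print(chat_cost, reasoner_cost)
```

Taking 128k as 128,000 tokens, a full-context prompt costs roughly 3.5 cents of input on deepseek-chat and about 7 cents on deepseek-reasoner.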

Alibaba Qwen

Free tier

Qwen3 Max, Qwen-Coder, Qwen-VL via Alibaba DashScope. · Free tier on DashScope

Get free key →

Qwen3 Max

Free · 256k context

qwen3-max

1M tokens free trial

Smart · Multilingual

Qwen3 Coder Plus

Free · 256k context

qwen3-coder-plus

1M tokens free trial

Coder

Qwen VL Max

Free · 32k context

qwen-vl-max

1M tokens free trial

Vision

Moonshot Kimi

Free tier

Kimi K2 with 256k context — long-doc and agent workloads. · Free trial credits

Get free key →

Kimi K2

Free · 256k context

kimi-k2

Free trial credits

Smart · Tools

Kimi K2 Thinking

Free · 256k context

kimi-k2-thinking

Free trial credits

Reasoner

Zhipu GLM

Free tier

GLM-4.6 + GLM-Z1 reasoner. Strong agentic tool-use. · Generous free quota

Get free key →

GLM-4.6 Plus

Free · 128k context

glm-4-plus

Generous free quota

Smart · Tools

GLM-Z1 Air

Free · 128k context

glm-z1-air

Free quota

Reasoner

Mistral

Free tier

Mistral Large + Codestral 25.08 for European data residency. · La Plateforme free tier

Get free key →

Mistral Small

Free · 128k context

mistral-small-latest

1 req/sec · 500k tokens/min (free)

Fast · Smart

Codestral 25.08

Free · 256k context

codestral-latest

Free trial credits

Coder · Fast

OpenAI

Paid

GPT-5.1 and GPT-5 mini straight from the source. · Paid — usage-based

Get API key →

GPT-5.1

Paid · 400k context

gpt-5.1

Paid — usage-based

OpenAI flagship with adaptive reasoning.

Smart · Reasoner · Coder · Vision · Tools

GPT-5 mini

Paid · 400k context

gpt-5-mini

Paid — usage-based

Fast · Smart · Tools

Anthropic

Paid

Claude Sonnet, Haiku and Opus via the OpenAI-compatible endpoint. · Paid — usage-based

Get API key →

Claude Sonnet 4.5

Paid · 200k context

claude-sonnet-4-5

Paid — usage-based

The everyday Claude: strong coding and agentic tool use.

Smart · Coder · Vision · Tools

Claude Haiku 4.5

Paid · 200k context

claude-haiku-4-5

Paid — usage-based

Fast · Tools

Claude Opus 4.5

Paid · 200k context

claude-opus-4-5

Paid — usage-based

Smart · Reasoner · Coder

xAI Grok

Paid

Grok 4 and the fast Grok 4 variants from xAI. · Paid — usage-based

Get API key →

Grok 4

Paid · 256k context

grok-4

Paid — usage-based

Smart · Reasoner · Tools

Grok 4 Fast

Paid · 2,000k context

grok-4-fast-non-reasoning

Paid — usage-based

Fast · Smart

Together AI

Free tier

Llama, DeepSeek and Qwen open weights on serverless GPUs. · Free endpoints on select open models

Get free key →

Llama 3.3 70B Turbo

Paid · 131.072k context

meta-llama/Llama-3.3-70B-Instruct-Turbo

Paid — usage-based (free variant available)

Smart · Fast

DeepSeek R1

Paid · 163.84k context

deepseek-ai/DeepSeek-R1

Paid — usage-based

Reasoner

Qwen2.5 Coder 32B

Paid · 32.768k context

Qwen/Qwen2.5-Coder-32B-Instruct

Paid — usage-based

Coder

Fireworks AI

Paid

Low-latency serving for open models: Llama, DeepSeek, Qwen. · Paid — usage-based

Get API key →

Llama 3.3 70B

Paid · 131.072k context

accounts/fireworks/models/llama-v3p3-70b-instruct

Paid — usage-based

Smart · Fast

DeepSeek R1

Paid · 163.84k context

accounts/fireworks/models/deepseek-r1

Paid — usage-based

Reasoner

DeepSeek V3

Paid · 131.072k context

accounts/fireworks/models/deepseek-v3

Paid — usage-based

Smart · Coder

Novita AI

Free tier

Budget serving of open models with trial credits at sign-up. · Free trial credits

Get free key →

Llama 3.3 70B

Free · 131.072k context

meta-llama/llama-3.3-70b-instruct

Trial credits, then usage-based

Smart · Fast

DeepSeek R1

Free · 64k context

deepseek/deepseek-r1

Trial credits, then usage-based

Reasoner

We never put a hand in the cookie jar

sarmalink is a wrapper, not a wallet. Whatever Groq, Gemini, DeepSeek or anyone else bills for your tokens is paid straight to them at the rate you signed up for. We do not skim, mark up, or proxy through a paid pool. If you want a paid-tier model, bring the paid-tier key.

Start free in 60 seconds →