§03 — Explore Models


Discover 225+ credit-based AI models.

Aion 1.0

AionLabs

Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It …

131K tokens

2 credits

aion-labs/aion-1.0

Aion 2.0

AionLabs

Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong a…

131K tokens

3 credits

aion-labs/aion-2.0

Alex

Replicate

A relaxed, informal male voice for chatting with friends.

5 credits

Casual_Guy

Arthur

Replicate

A noble, chivalrous young male voice for heroic tales.

5 credits

Young_Knight

Chloe

Replicate

A cheerful, bubbly young female voice that radiates positivity.

5 credits

Lively_Girl

Claude 3 Haiku

Anthropic

Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targete…

200K tokens

Vision

5 credits

anthropic/claude-3-haiku

Claude 3.5 Haiku

Anthropic

Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in r…

200K tokens

Vision

10 credits

anthropic/claude-3.5-hai…

Claude Haiku (Latest)

Anthropic

This model always redirects to the latest model in the Anthropic Claude Haiku family.

200K tokens

Vision

25 credits

~anthropic/claude-haiku-…

Claude Haiku 4.5

Anthropic

Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of…

200K tokens

Vision

25 credits

anthropic/claude-haiku-4…

Claude Opus 4

Anthropic

Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on com…

200K tokens

Vision

250 credits

anthropic/claude-opus-4

Claude Opus 4.1

Anthropic

Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning,…

200K tokens

Vision

250 credits

anthropic/claude-opus-4.…

Claude Opus 4.5

Anthropic

Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, a…

200K tokens

Vision

100 credits

anthropic/claude-opus-4.…

Claude Opus 4.6

Anthropic

Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that oper…

1000K tokens

Vision

100 credits

anthropic/claude-opus-4.…

Claude Opus 4.6 Fast

Anthropic

Fast-mode variant of Opus 4.6 - identical capabilities with higher output speed at premium 6x pricing.

1000K tokens

Vision

600 credits

anthropic/claude-opus-4.…

Claude Opus 4.7

Anthropic

Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the…

1000K tokens

Vision

80 credits

anthropic/claude-opus-4.…

Claude Opus 4.7 Fast

Anthropic

Fast-mode variant of Opus 4.7 - identical capabilities with higher output speed at premium 6x pricing.

1000K tokens

Vision

480 credits

anthropic/claude-opus-4.…

Claude Opus 4.8

Anthropic

Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and f…

1000K tokens

Vision

120 credits

anthropic/claude-opus-4.…

Claude Opus 4.8 Fast

Anthropic

Fast-mode variant of Opus 4.8 - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4…

1000K tokens

Vision

250 credits

anthropic/claude-opus-4.…

Claude Sonnet (Latest)

Anthropic

This model always redirects to the latest model in the Anthropic Claude Sonnet family.

1000K tokens

Vision

60 credits

~anthropic/claude-sonnet…

Claude Sonnet 4.5

Anthropic

Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflow…

1000K tokens

Vision

60 credits

anthropic/claude-sonnet-…

Claude Sonnet 4.6

Anthropic

Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and prof…

1000K tokens

Vision

60 credits

anthropic/claude-sonnet-…

Command R7B

Cohere

Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, too…

128K tokens

2 credits

cohere/command-r7b-12-20…

Cydonia 24B V4.1

TheDrummer

Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligenc…

131K tokens

5 credits

thedrummer/cydonia-24b-v…

David

Replicate

A deep, authoritative male voice ideal for professional presentations.

5 credits

Deep_Voice_Man

DeepSeek Chat V3

Specialized chat model from DeepSeek, optimized for conversation.

2 credits

deepseek/deepseek-chat-v…

DeepSeek R1

DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tok…

164K tokens

10 credits

deepseek/deepseek-r1

DeepSeek R1 0528

DeepSeek

May 28th update to the original DeepSeek R1. Performance on par with OpenAI o1, but open-sourced and with fully open reas…

164K tokens

12 credits

deepseek/deepseek-r1-052…

DeepSeek R1 Distill Llama 70B

DeepSeek

DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.…

131K tokens

3 credits

deepseek/deepseek-r1-dis…

DeepSeek R1 Distill Qwen 32B

DeepSeek

DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwe…

128K tokens

2 credits

deepseek/deepseek-r1-dis…

DeepSeek V3

DeepSeek

DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of …

131K tokens

5 credits

deepseek/deepseek-chat

DeepSeek V3 0324

DeepSeek

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from …

164K tokens

5 credits

deepseek/deepseek-chat-v…

DeepSeek V3.1

DeepSeek

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinki…

164K tokens

6 credits

deepseek/deepseek-chat-v…

DeepSeek V3.1 Nex N1

Nex AGI

DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designed to highlight agent aut…

131K tokens

6 credits

nex-agi/deepseek-v3.1-ne…

DeepSeek V3.1 Terminus

DeepSeek

DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original…

164K tokens

6 credits

deepseek/deepseek-v3.1-t…

DeepSeek V3.2

DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and ag…

131K tokens

2 credits

deepseek/deepseek-v3.2

DeepSeek V3.2 Exp

DeepSeek

DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and …

164K tokens

6 credits

deepseek/deepseek-v3.2-e…

DeepSeek V4 Flash

DeepSeek

DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B a…

1049K tokens

1 credit

deepseek/deepseek-v4-fla…

DeepSeek V4 Pro

DeepSeek

DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated par…

1049K tokens

3 credits

deepseek/deepseek-v4-pro

Devstral 2512

Mistral AI

Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter …

262K tokens

4 credits

mistralai/devstral-2512

Dewi

Replicate

A serene, calm female voice in Indonesian.

5 credits

Indonesian_CalmWoman

Emily

Replicate

An energetic, uplifting young female voice full of motivation.

5 credits

Inspirational_girl

ERNIE 4.5 300B A47B

Baidu

ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE …

131K tokens

20 credits

baidu/ernie-4.5-vl-424b-…

Ethan

Replicate

A polite, well-mannered young male voice for formal settings.

5 credits

Decent_Boy

Gemini 2.5 Flash

Google

Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mat…

1049K tokens

Vision

4 credits

google/gemini-2.5-flash

Gemini 2.5 Flash Lite

Google

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cos…

1049K tokens

Vision

1 credit

google/gemini-2.5-flash-…

Gemini 2.5 Flash Lite (Sep 2025)

Google

Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cos…

1049K tokens

Vision

3 credits

google/gemini-2.5-flash-…

Gemini 2.5 Pro

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientifi…

1049K tokens

Vision

5 credits

google/gemini-2.5-pro

Gemini 3 Flash Preview

Google

Gemini 3 Flash Preview is a high speed, high value thinking model designed for agentic workflows, multi turn chat, and c…

1049K tokens

Vision

5 credits

google/gemini-3-flash-pr…

Gemini 3.1 Flash Lite

Google

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. …

1049K tokens

Vision

3 credits

google/gemini-3.1-flash-…

Gemini 3.1 Flash Lite

Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemi…

1049K tokens

Vision

3 credits

google/gemini-3.1-flash-…

Gemini 3.1 Pro

Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, impro…

1049K tokens

Vision

30 credits

google/gemini-3.1-pro-pr…

Gemini 3.1 Pro (Custom Tools)

Google

Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing o…

1049K tokens

Vision

30 credits

google/gemini-3.1-pro-pr…

Gemini 3.5 Flash

Google

Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tie…

1049K tokens

Vision

5 credits

google/gemini-3.5-flash

Gemini Flash (Latest)

Google

This model always redirects to the latest model in the Google Gemini Flash family.

1049K tokens

Vision

5 credits

~google/gemini-flash-lat…

Gemini Pro (Latest)

Google

This model always redirects to the latest model in the Google Gemini Pro family.

1049K tokens

Vision

30 credits

~google/gemini-pro-lates…

GLM 4.5

Z.AI

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-E…

131K tokens

2 credits

z-ai/glm-4.5

GLM 4.5 Air

Z.AI

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applica…

131K tokens

2 credits

z-ai/glm-4.5-air

GLM 4.5V

Z.AI

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) ar…

66K tokens

Vision

3 credits

z-ai/glm-4.5v

GLM 4.6

Z.AI

Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has be…

203K tokens

2 credits

z-ai/glm-4.6

GLM 4.6V

Z.AI

GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across i…

131K tokens

Vision

3 credits

z-ai/glm-4.6v

GLM 4.7

Z.AI

GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more…

203K tokens

3 credits

z-ai/glm-4.7

GLM 4.7 Flash

Z.AI

As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further opt…

203K tokens

2 credits

z-ai/glm-4.7-flash

GLM 5.1 Reasoning

Z.AI

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. …

203K tokens

10 credits

z-ai/glm-5.1

GLM-5

Z.AI

GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workf…

203K tokens

11 credits

z-ai/glm-5

GLM-5 Turbo

Z.AI

GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments suc…

203K tokens

12 credits

z-ai/glm-5-turbo

GLM-5V-Turbo

Z.AI

GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven ta…

203K tokens

Vision

12 credits

z-ai/glm-5v-turbo

GPT (Latest)

OpenAI

This model always redirects to the latest model in the OpenAI GPT family.

1050K tokens

Vision

30 credits

~openai/gpt-latest

GPT Chat Latest

OpenAI

GPT Chat Latest points to OpenAI's stable API alias chat-latest that always resolves to the latest Instant chat model us…

400K tokens

Vision

150 credits

openai/gpt-chat-latest

GPT Mini (Latest)

OpenAI

This model always redirects to the latest model in the OpenAI GPT Mini family.

400K tokens

Vision

3 credits

~openai/gpt-mini-latest

GPT-3.5 Turbo

OpenAI

GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for c…

16K tokens

2 credits

openai/gpt-3.5-turbo

GPT-4o

OpenAI

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintai…

128K tokens

Vision

99 credits

openai/gpt-4o

GPT-4o Mini

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs wi…

128K tokens

Vision

99 credits

openai/gpt-4o-mini

GPT-4o Mini Search Preview

OpenAI

GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and ex…

128K tokens

Web Search

5 credits

openai/gpt-4o-mini-searc…

GPT-5 Mini

OpenAI

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instru…

400K tokens

Vision

3 credits

openai/gpt-5-mini

GPT-5 Nano

OpenAI

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, a…

400K tokens

Vision

4 credits

openai/gpt-5-nano

GPT-5.1 Codex Max

OpenAI

GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development …

400K tokens

Vision

8 credits

openai/gpt-5.1-codex-max

GPT-5.2

GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance co…

400K tokens

Vision

5 credits

openai/gpt-5.2

GPT-5.2 Chat

OpenAI

GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retai…

128K tokens

Vision

4 credits

openai/gpt-5.2-chat

GPT-5.2 Codex

OpenAI

GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is des…

400K tokens

Vision

6 credits

openai/gpt-5.2-codex

GPT-5.3 Chat

OpenAI

GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more…

128K tokens

Vision

4 credits

openai/gpt-5.3-chat

GPT-5.3 Codex

OpenAI

GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of…

400K tokens

Vision

15 credits

openai/gpt-5.3-codex

GPT-5.4

GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ toke…

1050K tokens

Vision

28 credits

openai/gpt-5.4

GPT-5.4 Mini

OpenAI

GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput wor…

400K tokens

Vision

3 credits

openai/gpt-5.4-mini

GPT-5.4 Nano

OpenAI

GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and …

400K tokens

Vision

2 credits

openai/gpt-5.4-nano

GPT-5.4 Pro

GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabili…

1050K tokens

Vision

30 credits

openai/gpt-5.4-pro

GPT-5.5

OpenAI

GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reason…

1050K tokens

Vision

120 credits

openai/gpt-5.5

GPT-5.5 Pro

OpenAI

GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workload…

1050K tokens

Vision

680 credits

openai/gpt-5.5-pro

Grace

Replicate

A gentle, soothing female voice ideal for relaxation and meditation.

5 credits

Calm_Woman

Granite 4.0 H Micro

IBM

Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These models are the latest in a series of mo…

131K tokens

2 credits

ibm-granite/granite-4.0-…

Granite 4.1 8B

IBM

Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It …

131K tokens

2 credits

ibm-granite/granite-4.1-…

Grok 4.20 Beta

xAI

Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines t…

2000K tokens

Vision

28 credits

x-ai/grok-4.20

Grok 4.20 Multi-Agent

xAI

Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents…

2000K tokens

Vision

30 credits

x-ai/grok-4.20-multi-age…

Grok 4.3

xAI

Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic wor…

1000K tokens

Vision

50 credits

x-ai/grok-4.3

Grok Build 0.1

xAI

Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports t…

256K tokens

Vision

20 credits

x-ai/grok-build-0.1

Henry

Replicate

A sophisticated, refined male voice for high‑end presentations.

5 credits

Elegant_Man

Hermes 3 405B Instruct

Nous Research

Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m…

131K tokens

25 credits

nousresearch/hermes-3-ll…

Hermes 3 70B Instruct

Nous Research

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistra…

131K tokens

10 credits

nousresearch/hermes-3-ll…

Hermes 4 405B

Nous Research

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hy…

131K tokens

25 credits

nousresearch/hermes-4-40…

Hermes 4 70B

Nous Research

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid …

131K tokens

12 credits

nousresearch/hermes-4-70…

Hunyuan A13B Instruct

Tencent

Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parame…

131K tokens

4 credits

tencent/hunyuan-a13b-ins…

Hunyuan3D 3.1

Tencent

Generate 3D models from text or images.

Free

tencent/hunyuan-3d-3.1

HY3 Preview

Tencent

Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use…

262K tokens

1 credit

tencent/hy3-preview

Intellect-3

Prime Intellect

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervise…

131K tokens

3 credits

prime-intellect/intellec…

Jamba Large 1.7

AI21

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following,…

256K tokens

12 credits

ai21/jamba-large-1.7

James

Replicate

A warm, approachable male voice perfect for everyday conversation.

5 credits

Friendly_Person

Kimi (Latest)

Moonshot AI

This model always redirects to the latest model in the MoonshotAI Kimi family.

262K tokens

Vision

10 credits

~moonshotai/kimi-latest

Kimi K2

Moonshot AI

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion…

131K tokens

5 credits

moonshotai/kimi-k2

Kimi K2 Thinking

Moonshot AI

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long…

262K tokens

6 credits

moonshotai/kimi-k2-think…

Kimi K2-0905

Moonshot AI

Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE)…

262K tokens

5 credits

moonshotai/kimi-k2-0905

Kimi K2.5

Moonshot AI

Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-dire…

262K tokens

Vision

8 credits

moonshotai/kimi-k2.5

Kimi K2.6

Moonshot AI

Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX gener…

262K tokens

Vision

10 credits

moonshotai/kimi-k2.6

Kimi K2.7 Code

Moonshot AI

Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end progr…

262K tokens

Vision

20 credits

moonshotai/kimi-k2.7-cod…

Krenola 5 Max

RainSpeed

This is a fake model that always fails.

4K tokens

1 credit

rainspeed/krenola-5-max

Leo

Replicate

A firm, resolute male voice that conveys confidence and purpose.

5 credits

Determined_Man

Llama 3 8B Lunaris

Sao10K

Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, …

8K tokens

2 credits

sao10k/l3-lunaris-8b

Llama 3.1 Euryale 70B v2.2

Sao10K

Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the success…

131K tokens

8 credits

sao10k/l3.1-euryale-70b

Llama 3.2 11B Vision Instruct

Meta

Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and tex…

131K tokens

Vision

3 credits

meta-llama/llama-3.2-11b…

Llama 3.2 1B Instruct

Meta

Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as s…

131K tokens

0 credits

meta-llama/llama-3.2-1b-…

Llama 3.2 3B Instruct

Meta

Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language process…

131K tokens

0 credits

meta-llama/llama-3.2-3b-…

Llama 3.3 70B Instruct

Meta

The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B…

131K tokens

5 credits

meta-llama/llama-3.3-70b…

Llama 4 Maverick

Meta

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-exper…

1049K tokens

Vision

6 credits

meta-llama/llama-4-maver…

Llama Nemotron Embed VL

Nvidia

Free embedding model with vision-language understanding.

1 credit

nvidia/llama-nemotron-em…

Lyria 3

Google

Generates short music clips up to 30 seconds. Ideal for quick prototyping.

30 credits

google/lyria-3

Lyria 3 Pro

Google

Generates full songs up to 3 minutes with detailed structure and high‑quality vocals.

40 credits

google/lyria-3-pro

Magnum v4 72B

Anthracite

This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet (https://o…

33K tokens

8 credits

anthracite-org/magnum-v4…

Maya

Replicate

An assertive, confident female voice in Indonesian.

5 credits

Indonesian_ConfidentWoma…

Mercury 2

Inception

Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens…

128K tokens

2 credits

inception/mercury-2

Mia

Replicate

A light, airy female voice, great for children's content.

5 credits

Sweet_Girl_2

MiMo V2 Flash

Xiaomi

MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309…

262K tokens

17 credits

xiaomi/mimo-v2-flash

MiMo V2.5

Xiaomi

MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference…

1049K tokens

Vision

20 credits

xiaomi/mimo-v2.5

MiMo V2.5 Pro

Xiaomi

MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex softwar…

1049K tokens

28 credits

xiaomi/mimo-v2.5-pro

MiniMax M2 Her

Replicate

MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and express…

66K tokens

2 credits

minimax/minimax-m2-her

MiniMax M2.5

Replicate

MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex …

205K tokens

3 credits

minimax/minimax-m2.5

MiniMax M2.7

Replicate

MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous i…

205K tokens

4 credits

minimax/minimax-m2.7

MiniMax M3

Replicate

MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a …

1000K tokens

Vision

4 credits

minimax/minimax-m3

MiniMax Music 1.5

Replicate

Creates songs up to 4 minutes. 2 free trials!

30 credits

minimax/music-1.5

MiniMax Music 2.5

Replicate

Full song generation with rich instrumentation and natural vocals.

50 credits

minimax/music-2.5

Ministral 14B 2512

Mistral AI

The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to …

262K tokens

Vision

3 credits

mistralai/ministral-14b-…

Ministral 3B 2512

Mistral AI

The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision ca…

131K tokens

Vision

4 credits

mistralai/ministral-3b-2…

Ministral 8B 2512

Mistral AI

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capa…

262K tokens

Vision

5 credits

mistralai/ministral-8b-2…

Mistral Large

This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-availabl…

128K tokens

8 credits

mistralai/mistral-large

Mistral Medium 3.5

Mistral AI

Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with t…

262K tokens

Vision

8 credits

mistralai/mistral-medium…

Mixtral 8x22B Instruct

Mistral AI

Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active p…

66K tokens

5 credits

mistralai/mixtral-8x22b-…

Morph V3 Fast

Morph

Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The mod…

82K tokens

3 credits

morph/morph-v3-fast

Morph V3 Large

Morph

Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transform…

262K tokens

4 credits

morph/morph-v3-large

Nemotron 3 Nano

Nvidia

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers…

256K tokens

2 credits

nvidia/nemotron-3-nano-3…

Nemotron 3 Super

Nvidia

NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute ef…

1000K tokens

2 credits

nvidia/nemotron-3-super-…

Nora

Replicate

A serene, spiritual female voice with a calm and measured tone.

5 credits

Abbess

Nova 2 Lite V1

Amazon

Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos t…

1000K tokens

Vision

2 credits

amazon/nova-2-lite-v1

Nova Lite V1

Amazon

Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon focused on fast processing of image, video, an…

300K tokens

Vision

2 credits

amazon/nova-lite-v1

Nova Micro V1

Amazon

Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of model…

128K tokens

99999999 credits

amazon/nova-micro-v1

Nova Premier V1

Amazon

Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the bes…

1000K tokens

Vision

5 credits

amazon/nova-premier-v1

Nova Pro V1

Amazon

Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and…

300K tokens

Vision

4 credits

amazon/nova-pro-v1

Oliver

Replicate

A calm, deliberate male speaker, excellent for educational content.

5 credits

Patient_Man

OpenAI o1

The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 …

200K tokens

Vision

350 credits

openai/o1

OpenAI o1-pro

The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasonin…

200K tokens

Vision

220000 credits

openai/o1-pro

Pak Budi

Replicate

A calm, authoritative male leader voice in Indonesian.

5 credits

Indonesian_BossyLeader

Perceptron MK1

Perceptron

Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It a…

33K tokens

Vision

5 credits

perceptron/perceptron-mk…

Phi-4

Microsoft

[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficientl…

16K tokens

3 credits

microsoft/phi-4

Phi-4 Mini Instruct

Microsoft

Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - wit…

131K tokens

2 credits

microsoft/phi-4-mini-ins…

Pixtral Large 2411

Mistral AI

Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-la…

131K tokens

Vision

12 credits

mistralai/mistral-large-…

Qwen 3 Max

Model from Alibaba, very economical for daily use.

4 credits

qwen/qwen-3-max

Qwen 3.5 35B

Alibaba

The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear …

262K tokens

Vision

4 credits

qwen/qwen3.5-35b-a3b

Qwen 3.5 9B

Alibaba

Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and v…

262K tokens

Vision

4 credits

qwen/qwen3.5-9b

Qwen 3.5 Flash (02-23)

Alibaba

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention me…

1000K tokens

Vision

2 credits

qwen/qwen3.5-flash-02-23

Qwen 3.5 Plus (2026-04-20)

Alibaba

Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video inp…

1000K tokens

Vision

10 credits

qwen/qwen3.5-plus-202604…

Qwen 3.6 27B

Alibaba

Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It fea…

262K tokens

Vision

15 credits

qwen/qwen3.6-27b

Qwen 3.6 35B A3B

Alibaba

Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion act…

262K tokens

Vision

5 credits

qwen/qwen3.6-35b-a3b

Qwen 3.6 Flash

Alibaba

Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video inp…

1000K tokens

Vision

10 credits

qwen/qwen3.6-flash

Qwen 3.6 Max Preview

Alibaba

Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture…

262K tokens

19 credits

qwen/qwen3.6-max-preview

Qwen 3.6 Plus

Alibaba

Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts ro…

1000K tokens

Vision

10 credits

qwen/qwen3.6-plus

Qwen 3.7 Max

Alibaba

Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for age…

1000K tokens

18 credits

qwen/qwen3.7-max

Qwen 3.7 Plus

Alibaba

Qwen3.7-Plus is a cost‑effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, b…

1000K tokens

Vision

8 credits

qwen/qwen3.7-plus

Qwen Plus 0728

Alibaba

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced perfo…

1000K tokens

4 credits

qwen/qwen-plus-2025-07-2…

Qwen Plus 0728 (thinking)

Alibaba

Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced perfo…

1000K tokens

5 credits · qwen/qwen-plus-2025-07-2…

Qwen-Plus

Alibaba

Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost c…

1000K tokens

4 credits · qwen/qwen-plus

Qwen2.5 72B Instruct

Alibaba

Qwen2.5 72B is part of the Qwen2.5 series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: - …

131K tokens

8 credits · qwen/qwen-2.5-72b-instru…

Qwen2.5 7B Instruct

Alibaba

Qwen2.5 7B is part of the Qwen2.5 series of Qwen large language models. Qwen2.5 brings the following improvements over Qwen2: - S…

131K tokens

2 credits · qwen/qwen-2.5-7b-instruc…

Qwen2.5 Coder 32B Instruct

Alibaba

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Cod…

128K tokens

6 credits · qwen/qwen-2.5-coder-32b-…

Qwen2.5 VL 72B Instruct

Alibaba

Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capa…

131K tokens · Vision

12 credits · qwen/qwen2.5-vl-72b-inst…

Qwen3 14B

Alibaba

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning an…

132K tokens

3 credits · qwen/qwen3-14b

Qwen3 235B A22B

Alibaba

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forw…

131K tokens

15 credits · qwen/qwen3-235b-a22b

Qwen3 235B A22B Instruct 2507

Alibaba

Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-…

262K tokens

8 credits · qwen/qwen3-235b-a22b-250…

Qwen3 235B A22B Thinking 2507

Alibaba

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for c…

262K tokens

18 credits · qwen/qwen3-235b-a22b-thi…

Qwen3 30B A3B Thinking 2507

Alibaba

Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring …

131K tokens

8 credits · qwen/qwen3-30b-a3b-think…

Qwen3 32B

Alibaba

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning a…

131K tokens

5 credits · qwen/qwen3-32b

Qwen3 8B

Alibaba

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks …

131K tokens

2 credits · qwen/qwen3-8b

Qwen3 Coder 480B A35B

Alibaba

Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt…

1049K tokens

15 credits · qwen/qwen3-coder

Qwen3 Coder Flash

Alibaba

Qwen3 Coder Flash is Alibaba's fast, cost-efficient version of its proprietary Qwen3 Coder Plus. It is a powerful c…

1000K tokens

6 credits · qwen/qwen3-coder-flash

Qwen3 Coder Next

Alibaba

Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It…

262K tokens

4 credits · qwen/qwen3-coder-next

Qwen3 Coder Plus

Alibaba

Qwen3 Coder Plus is Alibaba's proprietary version of the open-source Qwen3 Coder 480B A35B. It is a powerful coding agen…

1000K tokens

20 credits · qwen/qwen3-coder-plus

Qwen3 Max Thinking

Alibaba

Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that re…

262K tokens

25 credits · qwen/qwen3-max-thinking

Qwen3 Next 80B A3B Instruct

Alibaba

Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo…

262K tokens

6 credits · qwen/qwen3-next-80b-a3b-…

Qwen3 Next 80B A3B Thinking

Alibaba

Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” tr…

262K tokens

8 credits · qwen/qwen3-next-80b-a3b-…

Qwen3 VL 235B A22B Instruct

Alibaba

Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understan…

262K tokens · Vision

15 credits · qwen/qwen3-vl-235b-a22b-…

Qwen3 VL 235B Thinking

Alibaba

Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across i…

131K tokens · Vision

5 credits · qwen/qwen3-vl-235b-a22b-…

Qwen3 VL 30B A3B Instruct

Alibaba

Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images…

262K tokens · Vision

8 credits · qwen/qwen3-vl-30b-a3b-in…

Qwen3 VL 30B A3B Thinking

Alibaba

Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images…

131K tokens · Vision

10 credits · qwen/qwen3-vl-30b-a3b-th…

Qwen3 VL 32B Instruct

Alibaba

Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and re…

262K tokens · Vision

10 credits · qwen/qwen3-vl-32b-instru…

Qwen3 VL 8B Thinking

Alibaba

Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visua…

256K tokens · Vision

5 credits · qwen/qwen3-vl-8b-thinkin…

Relace Apply 3

Relace

Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can…

256K tokens

4 credits · relace/relace-apply-3

Relace Search

Relace

The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant fil…

256K tokens

6 credits · relace/relace-search

Rina

Replicate

A soft‑spoken, gentle female voice in Indonesian.

5 credits · Indonesian_GentleGirl

Ring 2.6 1T

Inclusion AI

Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that…

262K tokens

6 credits · inclusionai/ring-2.6-1t

Rocinante 12B

TheDrummer

Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary w…

33K tokens

3 credits · thedrummer/rocinante-12b

Rudi

Replicate

A compassionate, caring male voice in Indonesian.

5 credits · Indonesian_CaringMan

Sarah

Replicate

A wise, mature female voice, perfect for storytelling and narration.

5 credits · Wise_Woman

Sari

Replicate

A cute, sweet young female voice in Indonesian.

5 credits · Indonesian_SweetGirl

Seed 1.6

Replicate

Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and ada…

262K tokens · Vision

12 credits · bytedance-seed/seed-1.6

Seed 1.6 Flash

Replicate

Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual unders…

262K tokens · Vision

8 credits · bytedance-seed/seed-1.6-…

Seed 2.0 Lite

Replicate

Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities…

262K tokens · Vision

5 credits · bytedance-seed/seed-2.0-…

Sonar

Perplexity

Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources…

127K tokens · Vision · Web Search

20 credits · perplexity/sonar

Sonar Deep Research

Perplexity

Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across compl…

128K tokens · Web Search

120 credits · perplexity/sonar-deep-re…

Sonar Pro

Perplexity

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing…

200K tokens · Vision · Web Search

50 credits · perplexity/sonar-pro

Sonar Pro Search

Perplexity

Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic searc…

200K tokens · Vision

10 credits · perplexity/sonar-pro-sea…

Sonar Reasoning Pro

Perplexity

Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing…

128K tokens · Vision · Web Search

70 credits · perplexity/sonar-reasoni…

Sophie

Replicate

A sweet, pleasant young female voice perfect for audiobooks.

5 credits · Lovely_Girl

Step 3.5 Flash

StepFun

Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) archit…

262K tokens

3 credits · stepfun/step-3.5-flash

Step 3.7 Flash

StepFun

Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter langua…

256K tokens · Vision

9 credits · stepfun/step-3.7-flash

Switchpoint Router

Switchpoint

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. …

131K tokens

5 credits · switchpoint/router

Trinity Large Thinking

Arcee AI

Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance …

262K tokens

3 credits · arcee-ai/trinity-large-t…

UI-TARS 1.5 7B

Replicate

UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, we…

128K tokens · Vision

4 credits · bytedance/ui-tars-1.5-7b

Victoria

Replicate

A commanding, regal female voice that exudes authority.

5 credits · Imposing_Manner

WizardLM-2 8x22B

Microsoft

WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared t…

66K tokens

8 credits · microsoft/wizardlm-2-8x2…

Zara

Replicate

An outgoing, lively female voice full of energy and enthusiasm.

5 credits · Exuberant_Girl