Discover 225+ credit-based AI models.
AionLabs
Aion-1.0 is a multi-model system designed for high performance across various tasks, including reasoning and coding. It …
Aion-2.0 is a variant of DeepSeek V3.2 optimized for immersive roleplaying and storytelling. It is particularly strong a…
Replicate
A relaxed, informal male voice for chatting with friends.
A noble, chivalrous young male voice for heroic tales.
A cheerful, bubbly young female voice that radiates positivity.
Anthropic
Claude 3 Haiku is Anthropic's fastest and most compact model for near-instant responsiveness. Quick and accurate targete…
Claude 3.5 Haiku offers enhanced capabilities in speed, coding accuracy, and tool use. Engineered to excel in r…
This model always redirects to the latest model in the Anthropic Claude Haiku family.
Claude Haiku 4.5 is Anthropic’s fastest and most efficient model, delivering near-frontier intelligence at a fraction of…
Claude Opus 4 is benchmarked as the world’s best coding model, at time of release, bringing sustained performance on com…
Claude Opus 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning,…
Claude Opus 4.5 is Anthropic’s frontier reasoning model optimized for complex software engineering, agentic workflows, a…
Opus 4.6 is Anthropic’s strongest model for coding and long-running professional tasks. It is built for agents that oper…
Fast-mode variant of Opus 4.6 - identical capabilities with higher output speed at premium 6x pricing.
Opus 4.7 is the next generation of Anthropic's Opus family, built for long-running, asynchronous agents. Building on the…
Fast-mode variant of Opus 4.7 - identical capabilities with higher output speed at premium 6x pricing.
Claude Opus 4.8 is Anthropic's most capable generally available model in the Opus family. It supports text, image, and f…
Fast-mode variant of Opus 4.8 - identical capabilities with higher output speed at 2x pricing relative to regular Opus 4…
This model always redirects to the latest model in the Anthropic Claude Sonnet family.
Claude Sonnet 4.5 is Anthropic’s most advanced Sonnet model to date, optimized for real-world agents and coding workflow…
Sonnet 4.6 is Anthropic's most capable Sonnet-class model yet, with frontier performance across coding, agents, and prof…
Cohere
Command R7B (12-2024) is a small, fast update of the Command R+ model, delivered in December 2024. It excels at RAG, too…
TheDrummer
Uncensored and creative writing model based on Mistral Small 3.2 24B with good recall, prompt adherence, and intelligenc…
A deep, authoritative male voice ideal for professional presentations.
Specialized chat model from DeepSeek, optimized for conversation.
DeepSeek R1 is here: Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tok…
DeepSeek
May 28th update to the original DeepSeek R1. Performance on par with OpenAI o1, but open-sourced and with fully open reas…
DeepSeek R1 Distill Llama 70B is a distilled large language model based on [Llama-3.3-70B-Instruct](/meta-llama/llama-3.…
DeepSeek R1 Distill Qwen 32B is a distilled large language model based on [Qwen 2.5 32B](https://huggingface.co/Qwen/Qwe…
DeepSeek-V3 is the latest model from the DeepSeek team, building upon the instruction following and coding abilities of …
DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from …
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinki…
Nex AGI
DeepSeek V3.1 Nex-N1 is the flagship release of the Nex-N1 series — a post-trained model designed to highlight agent aut…
DeepSeek-V3.1 Terminus is an update to [DeepSeek V3.1](/deepseek/deepseek-chat-v3.1) that maintains the model's original…
DeepSeek-V3.2 is a large language model designed to harmonize high computational efficiency with strong reasoning and ag…
DeepSeek-V3.2-Exp is an experimental large language model released by DeepSeek as an intermediate step between V3.1 and …
DeepSeek V4 Flash is an efficiency-optimized Mixture-of-Experts model from DeepSeek with 284B total parameters and 13B a…
DeepSeek V4 Pro is a large-scale Mixture-of-Experts model from DeepSeek with 1.6T total parameters and 49B activated par…
Mistral AI
Devstral 2 is a state-of-the-art open-source model by Mistral AI specializing in agentic coding. It is a 123B-parameter …
A serene, calm female voice in Indonesian.
An energetic, uplifting young female voice full of motivation.
Baidu
ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE …
A polite, well-mannered young male voice for formal settings.
Google
Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mat…
Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cos…
Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientifi…
Gemini 3 Flash Preview is a high-speed, high-value thinking model designed for agentic workflows, multi-turn chat, and c…
Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. …
Gemini 3.1 Flash Lite Preview is Google's high-efficiency model optimized for high-volume use cases. It outperforms Gemi…
Gemini 3.1 Pro Preview is Google’s frontier reasoning model, delivering enhanced software engineering performance, impro…
Gemini 3.1 Pro Preview Custom Tools is a variant of Gemini 3.1 Pro that improves tool selection behavior by preventing o…
Gemini 3.5 Flash is Google's high-efficiency multimodal model, bringing near-Pro level coding and reasoning at Flash-tie…
This model always redirects to the latest model in the Google Gemini Flash family.
This model always redirects to the latest model in the Google Gemini Pro family.
Z.AI
GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-E…
GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applica…
GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) ar…
Compared with GLM-4.5, this generation brings several key improvements: Longer context window: The context window has be…
GLM-4.6V is a large multimodal model designed for high-fidelity visual understanding and long-context reasoning across i…
GLM-4.7 is Z.ai’s latest flagship model, featuring upgrades in two key areas: enhanced programming capabilities and more…
As a 30B-class SOTA model, GLM-4.7-Flash offers a new option that balances performance and efficiency. It is further opt…
Z.AI
GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. …
GLM-5 is Z.ai’s flagship open-source foundation model engineered for complex systems design and long-horizon agent workf…
GLM-5 Turbo is a new model from Z.ai designed for fast inference and strong performance in agent-driven environments suc…
GLM-5V-Turbo is Z.ai’s first native multimodal agent foundation model, built for vision-based coding and agent-driven ta…
OpenAI
This model always redirects to the latest model in the OpenAI GPT family.
GPT Chat Latest points to OpenAI's stable API alias chat-latest that always resolves to the latest Instant chat model us…
This model always redirects to the latest model in the OpenAI GPT Mini family.
GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for c…
GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintai…
GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs wi…
GPT-4o mini Search Preview is a specialized model for web search in Chat Completions. It is trained to understand and ex…
GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instru…
GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, a…
GPT-5.1-Codex-Max is OpenAI’s latest agentic coding model, designed for long-running, high-context software development …
GPT-5.2 is the latest frontier-grade model in the GPT-5 series, offering stronger agentic and long-context performance co…
GPT-5.2 Chat (AKA Instant) is the fast, lightweight member of the 5.2 family, optimized for low-latency chat while retai…
GPT-5.2-Codex is an upgraded version of GPT-5.1-Codex optimized for software engineering and coding workflows. It is des…
GPT-5.3 Chat is an update to ChatGPT's most-used model that makes everyday conversations smoother, more useful, and more…
GPT-5.3-Codex is OpenAI’s most advanced agentic coding model, combining the frontier software engineering performance of…
GPT-5.4 is OpenAI’s latest frontier model, unifying the Codex and GPT lines into a single system. It features a 1M+ toke…
GPT-5.4 mini brings the core capabilities of GPT-5.4 to a faster, more efficient model optimized for high-throughput wor…
GPT-5.4 nano is the most lightweight and cost-efficient variant of the GPT-5.4 family, optimized for speed-critical and …
GPT-5.4 Pro is OpenAI's most advanced model, building on GPT-5.4's unified architecture with enhanced reasoning capabili…
GPT-5.5 is OpenAI’s frontier model designed for complex professional workloads, building on GPT-5.4 with stronger reason…
GPT-5.5 Pro is OpenAI’s high-capability model optimized for deep reasoning and accuracy on complex, high-stakes workload…
A gentle, soothing female voice ideal for relaxation and meditation.
IBM
Granite-4.0-H-Micro is a 3B-parameter model from the Granite 4 family of models. These models are the latest in a series of mo…
Granite 4.1 8B is a dense, decoder-only 8-billion-parameter language model from IBM, part of the Granite 4.1 family. It …
xAI
Grok 4.20 is a reasoning model from xAI with industry-leading speed and agentic tool calling capabilities. It combines t…
Grok 4.20 Multi-Agent is a variant of xAI’s Grok 4.20 designed for collaborative, agent-based workflows. Multiple agents…
Grok 4.3 is a reasoning model from xAI. It accepts text and image inputs with text output, and is suited for agentic wor…
Grok Build 0.1 is xAI’s fast coding model trained specifically for agentic software engineering workflows. It supports t…
A sophisticated, refined male voice for high‑end presentations.
Nous Research
Hermes 3 is a generalist language model with many improvements over Hermes 2, including advanced agentic capabilities, m…
Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistra…
Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hy…
Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid …
Tencent
Hunyuan-A13B is a 13B active parameter Mixture-of-Experts (MoE) language model developed by Tencent, with a total parame…
Generate 3D models from text or images.
Hy3 preview is a high-efficiency Mixture-of-Experts model from Tencent designed for agentic workflows and production use…
Prime Intellect
INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (12B active) post-trained from GLM-4.5-Air-Base using supervise…
AI21
Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following,…
A warm, approachable male voice perfect for everyday conversation.
Moonshot AI
This model always redirects to the latest model in the MoonshotAI Kimi family.
Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion…
Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long…
Kimi K2 0905 is the September update of [Kimi K2 0711](moonshotai/kimi-k2). It is a large-scale Mixture-of-Experts (MoE)…
Kimi K2.5 is Moonshot AI's native multimodal model, delivering state-of-the-art visual coding capability and a self-dire…
Kimi K2.6 is Moonshot AI's next-generation multimodal model, designed for long-horizon coding, coding-driven UI/UX gener…
Kimi K2.7 Code is a coding-focused model in Moonshot AI's Kimi K2 family, built to complete end-to-end progr…
RainSpeed
This is a fake model that always fails.
A firm, resolute male voice that conveys confidence and purpose.
Sao10K
Lunaris 8B is a versatile generalist and roleplaying model based on Llama 3. It's a strategic merge of multiple models, …
Euryale L3.1 70B v2.2 is a model focused on creative roleplay from [Sao10k](https://ko-fi.com/sao10k). It is the success…
Meta
Llama 3.2 11B Vision is a multimodal model with 11 billion parameters, designed to handle tasks combining visual and tex…
Llama 3.2 1B is a 1-billion-parameter language model focused on efficiently performing natural language tasks, such as s…
Llama 3.2 3B is a 3-billion-parameter multilingual large language model, optimized for advanced natural language process…
The Meta Llama 3.3 multilingual large language model (LLM) is a pretrained and instruction tuned generative model in 70B…
Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-exper…
Nvidia
Free embedding model with vision-language understanding.
Generates short music clips up to 30 seconds. Ideal for quick prototyping.
Generates full songs up to 3 minutes with detailed structure and high‑quality vocals.
Anthracite
This is a series of models designed to replicate the prose quality of the Claude 3 models, specifically Sonnet(https://o…
An assertive, confident female voice in Indonesian.
Inception
Mercury 2 is an extremely fast reasoning LLM, and the first reasoning diffusion LLM (dLLM). Instead of generating tokens…
A light, airy female voice, great for children's content.
Xiaomi
MiMo-V2-Flash is an open-source foundation language model developed by Xiaomi. It is a Mixture-of-Experts model with 309…
MiMo-V2.5 is a native omnimodal model by Xiaomi. It delivers Pro-level agentic performance at roughly half the inference…
MiMo-V2.5-Pro is Xiaomi’s flagship model, delivering strong performance in general agentic capabilities, complex softwar…
MiniMax M2-her is a dialogue-first large language model built for immersive roleplay, character-driven chat, and express…
MiniMax-M2.5 is a SOTA large language model designed for real-world productivity. Trained in a diverse range of complex …
MiniMax-M2.7 is a next-generation large language model designed for autonomous, real-world productivity and continuous i…
MiniMax-M3 is a multimodal foundation model from MiniMax. It supports text, image, and video inputs with text output, a …
Creates songs up to 4 minutes. 2 free trials!
Full song generation with rich instrumentation and natural vocals.
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to …
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision ca…
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capa…
This is Mistral AI's flagship model, Mistral Large 2 (version `mistral-large-2407`). It's a proprietary weights-availabl…
Mistral Medium 3.5 is a dense 128B instruction-following model from Mistral AI. It supports text and image inputs with t…
Mistral's official instruct fine-tuned version of [Mixtral 8x22B](/models/mistralai/mixtral-8x22b). It uses 39B active p…
Morph
Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The mod…
Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transform…
NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers…
NVIDIA Nemotron 3 Super is a 120B-parameter open hybrid MoE model, activating just 12B parameters for maximum compute ef…
A serene, spiritual female voice with a calm and measured tone.
Amazon
Nova 2 Lite is a fast, cost-effective reasoning model for everyday workloads that can process text, images, and videos t…
Amazon Nova Lite 1.0 is a very low-cost multimodal model from Amazon that focuses on fast processing of image, video, an…
Amazon Nova Micro 1.0 is a text-only model that delivers the lowest latency responses in the Amazon Nova family of model…
Amazon Nova Premier is the most capable of Amazon’s multimodal models for complex reasoning tasks and for use as the bes…
Amazon Nova Pro 1.0 is a capable multimodal model from Amazon focused on providing a combination of accuracy, speed, and…
A calm, deliberate male speaker, excellent for educational content.
The latest and strongest model family from OpenAI, o1 is designed to spend more time thinking before responding. The o1 …
The o1 series of models are trained with reinforcement learning to think before they answer and perform complex reasonin…
A calm, authoritative male leader voice in Indonesian.
Perceptron
Perceptron Mk1 (Mark One) is Perceptron's highest-quality vision-language model for video and embodied reasoning. It a…
Microsoft
[Microsoft Research](/microsoft) Phi-4 is designed to perform well in complex reasoning tasks and can operate efficientl…
Phi-4-mini-instruct is a lightweight open model built upon synthetic data and filtered publicly available websites - wit…
Pixtral Large is a 124B parameter, open-weight, multimodal model built on top of [Mistral Large 2](/mistralai/mistral-la…
Model from Alibaba, very economical for daily use.
Alibaba
The Qwen3.5 Series 35B-A3B is a native vision-language model designed with a hybrid architecture that integrates linear …
Qwen3.5-9B is a multimodal foundation model from the Qwen3.5 family, designed to deliver strong reasoning, coding, and v…
The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention me…
Qwen3.5 Plus (April 2026) is a large-scale multimodal language model from Alibaba. It accepts text, image, and video inp…
Qwen3.6 27B is a dense 27-billion-parameter language model from the Qwen Team at Alibaba, released in April 2026. It fea…
Qwen3.6-35B-A3B is an open-weight multimodal model from Alibaba Cloud with 35 billion total parameters and 3 billion act…
Qwen3.6 Flash is a fast, efficient language model from Alibaba's Qwen 3.6 series. It supports text, image, and video inp…
Qwen3.6-Max-Preview is a proprietary frontier model from Alibaba Cloud built on a sparse mixture-of-experts architecture…
Qwen 3.6 Plus builds on a hybrid architecture that combines efficient linear attention with sparse mixture-of-experts ro…
Qwen3.7-Max is the flagship model in Alibaba's Qwen3.7 series. It supports text input and output and is designed for age…
Qwen3.7-Plus is a cost‑effective model in Alibaba's Qwen3.7 series. It supports text and image input with text output, b…
Qwen Plus 0728, based on the Qwen3 foundation model, is a 1 million context hybrid reasoning model with a balanced perfo…
Qwen-Plus, based on the Qwen2.5 foundation model, is a 131K context model with a balanced performance, speed, and cost c…
Qwen2.5 72B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - …
Qwen2.5 7B is the latest series of Qwen large language models. Qwen2.5 brings the following improvements upon Qwen2: - S…
Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). Qwen2.5-Cod…
Qwen2.5-VL is proficient in recognizing common objects such as flowers, birds, fish, and insects. It is also highly capa…
Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning an…
Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forw…
Qwen3-235B-A22B-Instruct-2507 is a multilingual, instruction-tuned mixture-of-experts language model based on the Qwen3-…
Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for c…
Qwen3-30B-A3B-Thinking-2507 is a 30B parameter Mixture-of-Experts reasoning model optimized for complex tasks requiring …
Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning a…
Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks …
Qwen3-Coder-480B-A35B-Instruct is a Mixture-of-Experts (MoE) code generation model developed by the Qwen team. It is opt…
Qwen3 Coder Flash is Alibaba's fast and cost-efficient version of their proprietary Qwen3 Coder Plus. It is a powerful c…
Qwen3-Coder-Next is an open-weight causal language model optimized for coding agents and local development workflows. It…
Qwen3 Coder Plus is Alibaba's proprietary version of the Open Source Qwen3 Coder 480B A35B. It is a powerful coding agen…
Qwen3-Max-Thinking is the flagship reasoning model in the Qwen3 series, designed for high-stakes cognitive tasks that re…
Qwen3-Next-80B-A3B-Instruct is an instruction-tuned chat model in the Qwen3-Next series optimized for fast, stable respo…
Qwen3-Next-80B-A3B-Thinking is a reasoning-first chat model in the Qwen3-Next line that outputs structured “thinking” tr…
Qwen3-VL-235B-A22B Instruct is an open-weight multimodal model that unifies strong text generation with visual understan…
Qwen3-VL-235B-A22B Thinking is a multimodal model that unifies strong text generation with visual understanding across i…
Qwen3-VL-30B-A3B-Instruct is a multimodal model that unifies strong text generation with visual understanding for images…
Qwen3-VL-30B-A3B-Thinking is a multimodal model that unifies strong text generation with visual understanding for images…
Qwen3-VL-32B-Instruct is a large-scale multimodal vision-language model designed for high-precision understanding and re…
Qwen3-VL-8B-Thinking is the reasoning-optimized variant of the Qwen3-VL-8B multimodal model, designed for advanced visua…
Relace
Relace Apply 3 is a specialized code-patching LLM that merges AI-suggested edits straight into your source files. It can…
The relace-search model uses 4-12 `view_file` and `grep` tools in parallel to explore a codebase and return relevant fil…
A soft‑spoken, gentle female voice in Indonesian.
Inclusion AI
Ring-2.6-1T is a 1T-parameter-scale thinking model with 63B active parameters, built for real-world agent workflows that…
Rocinante 12B is designed for engaging storytelling and rich prose. Early testers have reported: - Expanded vocabulary w…
A compassionate, caring male voice in Indonesian.
A wise, mature female voice, perfect for storytelling and narration.
A cute, sweet young female voice in Indonesian.
Seed 1.6 is a general-purpose model released by the ByteDance Seed team. It incorporates multimodal capabilities and ada…
Seed 1.6 Flash is an ultra-fast multimodal deep thinking model by ByteDance Seed, supporting both text and visual unders…
Seed-2.0-Lite is a versatile, cost‑efficient enterprise workhorse that delivers strong multimodal and agent capabilities…
Perplexity
Sonar is lightweight, affordable, fast, and simple to use — now featuring citations and the ability to customize sources…
Sonar Deep Research is a research-focused model designed for multi-step retrieval, synthesis, and reasoning across compl…
Note: Sonar Pro pricing includes Perplexity search pricing. See [details here](https://docs.perplexity.ai/guides/pricing…
Exclusively available on the OpenRouter API, Sonar Pro's new Pro Search mode is Perplexity's most advanced agentic searc…
A sweet, pleasant young female voice perfect for audiobooks.
StepFun
Step 3.5 Flash is StepFun's most capable open-source foundation model. Built on a sparse Mixture of Experts (MoE) archit…
Step 3.7 Flash is StepFun's latest high-efficiency multimodal Mixture-of-Experts model. It pairs a 196B-parameter langua…
Switchpoint
Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. …
Arcee AI
Trinity Large Thinking is a powerful open source reasoning model from the team at Arcee AI. It shows strong performance …
UI-TARS-1.5 is a multimodal vision-language agent optimized for GUI-based environments, including desktop interfaces, we…
A commanding, regal female voice that exudes authority.
WizardLM-2 8x22B is Microsoft AI's most advanced Wizard model. It demonstrates highly competitive performance compared t…
An outgoing, lively female voice full of energy and enthusiasm.