Back to Explore

MiMo V2 Omni

Chat

Xiaomi

xiaomi/mimo-v2-omni

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

23

credits / gen

Try this model
Vision File Support Reasoning 262K Context Vision (OR)

About this model

MiMo-V2-Omni is a frontier omni-modal model that natively processes image, video, and audio inputs within a unified architecture. It combines strong multimodal perception with agentic capability - visual grounding, multi-step...

Technical Specifications

Provider

Xiaomi

Type

Chat

Context Window

262,144 tokens

Pricing

23 credits

Capabilities

Vision

Can process and understand images

File Support

Can read PDF, DOCX, XLSX & more

Reasoning

Chain-of-thought reasoning exposed

262K Context

Large context window for long documents

Vision (OR)

OpenRouter reports vision support