💫 Cosmic Routers
While technically “routers” and not “models” - 💫 Cosmic Routers act through our OpenAI and Anthropic compatible API as if they were models themselves. They intelligently route every request to one of their underlying models.💫 Orbit
costa/orbitSWEBench: 57.9
Description: Orbit always includes current best practices in intelligent model routing - it is Costa’s “frontier” model router. Orbit uses a mix of routing strategies, including semantic task analysis, performance-based benchmarking,
and adaptive load balancing. It is designed to minimize manual intervention. It is particularly good at architecting projects and writing code in Python, C++, Javascript, Go and Ruby (although it does well in other languages).
Models Included:
anthropic/claude-sonnet-4.5
anthropic/claude-sonnet-4
anthropic/claude-3.7-sonnet
anthropic/claude-3.5-haiku
anthropic/claude-haiku-4.5
openai/gpt-5-mini
openai/gpt-4.1
google/gemini-2.5-flash
google/gemini-2.5-pro
qwen/qwen3-coder
moonshotai/kimi-k2
Stability: 🚀 Active Tuning
May Add Models: ✅ Yes
Last Changed 2025-10-16
💫 Quasar
costa/quasarSWEBench: 56.2
Description: Quasar is a stable model router based on Avengers Pro routing logic. It trades performance for time efficiency in an extremely clever manner. Expect fewer updates to Quasar as it is relatively stable
Models Included:
anthropic/claude-sonnet-4
anthropic/claude-3.5-haiku
qwen/qwen3-coder
moonshotai/kimi-k2
Stability: 🛰️ Stable
May Add Models: 🛑 No
Last Changed 2025-08-23
Other Top Models
Costa provides access to almost 100 models - here are the top ones used for coding:GPT 4.1
gpt-4.1SWEBench: 39.6
Provider:
OpenAIDescription: GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and GPT-4.5 across coding (54.6% SWE-bench Verified), instruction compliance (87.4% IFEval), and multimodal understanding benchmarks. It is tuned for precise code diffs, agent reliability, and high recall in large document contexts, making it ideal for agents, IDE tooling, and enterprise knowledge retrieval.
Claude 3.7 Sonnet
claude-sonnet-3.7SWEBench: 52.8
Provider:
AnthropicDescription: Claude 3.7 Sonnet is an advanced large language model with improved reasoning, coding, and problem-solving capabilities. It introduces a hybrid reasoning approach, allowing users to choose between rapid responses and extended, step-by-step processing for complex tasks. The model demonstrates notable improvements in coding, particularly in front-end development and full-stack updates, and excels in agentic workflows, where it can autonomously navigate multi-step processes.
Claude 4 Sonnet
claude-sonnet-4SWEBench: 64.9
Provider:
AnthropicDescription: Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%), Sonnet 4 balances capability and computational efficiency, making it suitable for a broad range of applications from routine coding tasks to complex software development projects. Key enhancements include improved autonomous codebase navigation, reduced error rates in agent-driven workflows, and increased reliability in following intricate instructions.
Meta Llama 4 Maverick 17b
llama-4-maverick-17b-128e-instructSWEBench: 21
Provider:
Meta (via Groq/Google Vertex)Description: Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward pass (400B total). It supports multilingual text and image input, and produces multilingual text and code output across 12 supported languages.
Gemini 2.5 Pro
gemini-2.5-proSWEBench: 53.6
Provider:
GoogleDescription: Gemini 2.5 Pro is Google's state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs "thinking" capabilities, enabling it to reason through responses with enhanced accuracy and nuanced context handling. Gemini 2.5 Pro achieves top-tier performance on multiple benchmarks, including first-place positioning on the LMArena leaderboard, reflecting superior human-preference alignment and complex problem-solving abilities.
Gemini Flash 2.5
gemini-2.5-flashSWEBench: 28.7
Provider:
GoogleDescription: Gemini 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, mathematics, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater accuracy and nuanced context handling.
Claude Haiku 4.5
claude-haiku-4.5Provider:
AnthropicDescription: Claude Haiku 4.5 is Anthropic's fastest and most efficient model, delivering near-frontier intelligence at a fraction of the cost. Matching Claude Sonnet 4's performance across reasoning and coding tasks, it brings frontier-level capability to real-time and high-volume applications.
Qwen3 Coder
qwen3-coderSWEBench: 55.4
Provider:
Qwen (via Groq/Google Vertex)Description: A Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use.
O4 Mini
o4-miniSWEBench: 45
Provider:
OpenAIDescription: A compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities.
Qwen3 32b
qwen3-32bSWEBench: 42.2
Provider:
Qwen (via Groq/Google Vertex)Description: Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for tasks like math, coding, and logical inference, and a "non-thinking" mode for faster, general-purpose conversation.
Kimi K2
kimi-k2SWEBench: 43.8
Provider:
MoonshotAI (via Groq/Google Vertex)Description: Kimi K2 is optimized for agentic capabilities, including advanced tool use, reasoning, and code synthesis. It excels across coding (LiveCodeBench, SWE-bench), reasoning (ZebraLogic, GPQA), and tool-use (Tau2, AceBench) benchmarks.
GPT OSS 20b
gpt-oss-20bSWEBench: 5
Provider:
OpenAI (via Groq/Google Vertex)Description: Trained in OpenAI's Harmony response format and supports reasoning level configuration, fine-tuning, and agentic capabilities including function calling, tool use, and structured outputs.
Qwen 2.5 Coder
qwen-2.5-coderSWEBench: 9
Provider:
Qwen (via Groq/Google Vertex)Description: One of the world's first code-optimized open weight models created by the Qwen team