GPT-5.4 Mini & Nano
Source: OpenAI Blog | Date: March 2026
Overview
OpenAI has released GPT-5.4 mini and GPT-5.4 nano, which it describes as its most capable small models yet. They bring many of GPT-5.4's strengths to faster, more efficient versions designed for high-volume workloads.
Key Insight: The best model is often not the largest one; it is the one that responds quickly, uses tools reliably, and still performs well on complex professional tasks.
GPT-5.4 Mini
- Performance: Significantly improves over GPT-5 mini across coding, reasoning, multimodal understanding, and tool use
- Speed: Runs more than 2x faster than GPT-5 mini
- Benchmark Results:
  - SWE-Bench Pro: 54.4% (approaches GPT-5.4's 57.7%)
  - Terminal-Bench 2.0: 60.0%
  - GPQA Diamond: 88.0%
  - OSWorld-Verified: 72.1%
- Pricing: $0.75 per 1M input tokens, $4.50 per 1M output tokens
- Context: 400k tokens
GPT-5.4 Nano
- Purpose: Smallest, cheapest version for tasks where speed and cost matter most
- Use Cases: Classification, data extraction, ranking, coding subagents for simpler tasks
- Pricing: $0.20 per 1M input tokens, $1.25 per 1M output tokens
- Benchmark Results:
  - SWE-Bench Pro: 52.4%
  - Terminal-Bench 2.0: 46.3%
  - GPQA Diamond: 82.8%
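To make the listed prices concrete, here is a minimal cost-estimation sketch. The per-token prices come from the figures above; the model keys (`gpt-5.4-mini`, `gpt-5.4-nano`) are illustrative labels chosen for this example, not confirmed API identifiers, and the token counts in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices as listed in this summary. The dictionary keys are hypothetical labels.
PRICES = {
    "gpt-5.4-mini": Pricing(input_per_million=0.75, output_per_million=4.50),
    "gpt-5.4-nano": Pricing(input_per_million=0.20, output_per_million=1.25),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on the given model."""
    p = PRICES[model]
    total = input_tokens * p.input_per_million + output_tokens * p.output_per_million
    return total / 1_000_000
```

For a hypothetical workload of 1M input tokens and 1M output tokens, this gives $5.25 on mini and $1.45 on nano, so nano costs a little over a quarter as much for the same token volume.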
Architecture Pattern: Model Composition
GPT-5.4 mini makes a composition pattern practical, in which a larger model handles planning and coordination and delegates narrower work to smaller subagents:
- Larger model (GPT-5.4): Planning, coordination, final judgment
- Smaller models (GPT-5.4 mini): Execute narrower subtasks in parallel
- Example subtasks: Searching a codebase, reviewing large files, processing supporting documents
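The division of labor above can be sketched as a small orchestrator. This is only an illustration of the control flow: `plan`, `run_subagent`, and `judge` are hypothetical stand-ins that return canned strings rather than calling any model, and in a real system each would wrap a request to the larger or smaller model.

```python
from concurrent.futures import ThreadPoolExecutor


def plan(task: str) -> list[str]:
    """Stand-in for the larger model: split a task into narrower subtasks."""
    steps = ("search codebase", "review large files", "process supporting documents")
    return [f"{task}: {step}" for step in steps]


def run_subagent(subtask: str) -> str:
    """Stand-in for a smaller-model subagent that handles one subtask."""
    return f"done({subtask})"


def judge(results: list[str]) -> str:
    """Stand-in for the larger model's final judgment over all results."""
    return f"{len(results)} subtasks completed"


def orchestrate(task: str, max_workers: int = 3) -> str:
    """Plan with the large model, fan out to subagents in parallel, then judge."""
    subtasks = plan(task)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input subtasks.
        results = list(pool.map(run_subagent, subtasks))
    return judge(results)
```

Threads are used here on the assumption that subagent calls would be I/O-bound network requests; an async client would fit the same shape.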
Benchmark Comparison Table
| Benchmark | GPT-5.4 | GPT-5.4 mini | GPT-5.4 nano | GPT-5 mini |
|---|---|---|---|---|
| SWE-Bench Pro | 57.7% | 54.4% | 52.4% | 45.7% |
| Terminal-Bench 2.0 | 75.1% | 60.0% | 46.3% | 38.2% |
| GPQA Diamond | 93.0% | 88.0% | 82.8% | 81.6% |
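One way to read the table is to compute how many percentage points each smaller model trails GPT-5.4 on each benchmark. The sketch below uses only the scores in the table; the helper name `points_behind` is chosen for this example.

```python
# Scores (percent) copied from the comparison table.
SCORES = {
    "SWE-Bench Pro": {
        "GPT-5.4": 57.7, "GPT-5.4 mini": 54.4, "GPT-5.4 nano": 52.4, "GPT-5 mini": 45.7,
    },
    "Terminal-Bench 2.0": {
        "GPT-5.4": 75.1, "GPT-5.4 mini": 60.0, "GPT-5.4 nano": 46.3, "GPT-5 mini": 38.2,
    },
    "GPQA Diamond": {
        "GPT-5.4": 93.0, "GPT-5.4 mini": 88.0, "GPT-5.4 nano": 82.8, "GPT-5 mini": 81.6,
    },
}


def points_behind(benchmark: str, model: str, baseline: str = "GPT-5.4") -> float:
    """Percentage-point gap between the baseline and the given model."""
    row = SCORES[benchmark]
    return round(row[baseline] - row[model], 1)


if __name__ == "__main__":
    for benchmark in SCORES:
        gap = points_behind(benchmark, "GPT-5.4 mini")
        print(f"{benchmark}: GPT-5.4 mini trails GPT-5.4 by {gap} points")
```

On these numbers GPT-5.4 mini is 3.3 points behind GPT-5.4 on SWE-Bench Pro and 5.0 points behind on GPQA Diamond, but 15.1 points behind on Terminal-Bench 2.0, so how closely it approaches the larger model depends on the benchmark.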
Availability
- API: Text/image inputs, tool use, function calling, web search, file search, computer use, skills
- Codex: Available in the CLI, IDE extension, and web; uses only 30% of the GPT-5.4 quota
- ChatGPT: Available to Free and Go users via the "Thinking" feature
Why This Matters
The release marks a significant shift in how AI systems are built:
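- No longer "one model does everything": combining multiple models gives the best results
- Small models can approach the performance of large models while being faster and cheaper
- The subagent pattern is becoming mainstream: the large model plans, the small models execute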
Explored: 2026-03-18 | Source: Hacker News Best