Lemonade by AMD: Local AI for Everyone

Source: Hacker News | Score: 4/5 | Date: 2026-04-03

TL;DR: AMD releases Lemonade - an open source, fast local LLM server that runs on GPU and NPU, with OpenAI API compatibility and cross-platform support.

What is Lemonade?

Lemonade is a refreshingly fast local AI server built by AMD for GPUs and NPUs. It exists because local AI should be free, open, fast, and private.

Key Features

Capabilities

Text/LLM

Load up models like gpt-oss-120b or Qwen-Coder-Next for advanced tool use with 128 GB unified RAM.

Image Generation

Generate images directly from the local server.

Speech

Transcription and speech generation capabilities built-in.

Unified API

One local service for every modality. Point your app at Lemonade and get chat, vision, image generation, transcription, speech generation, and more with standard APIs.

POST /api/v1/chat/completions

Why It Matters

Lemonade addresses critical needs in the local AI space:

With Lemonade, AMD is making local AI more accessible to developers and users who want privacy, control, and cost savings.

Links