🚀 Public Preview • CPU-first runtime

Local Edge AI for builders and bold teams

Run billion-parameter LLMs on CPUs with a unified multimodal runtime: no GPUs, no cloud, no data leaks.

Connect with Slack

Bring local AI into your channels — no cloud relay.

Slash command

/ask “summarize this thread & action items”

Result (local)

• Summary of last 50 messages
• 3 action items with owners
• Links back to messages
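
For readers wiring this up themselves, here is a minimal sketch of how a `/ask` slash command could be handled. Slack delivers slash commands as a form-encoded POST body with fields such as `command`, `text`, and `user_id`. The `summarize_locally` function and the sample user ID are hypothetical stand-ins, not part of any published Open Machine AI API.

```python
import json
from urllib.parse import parse_qs


def summarize_locally(prompt: str) -> str:
    """Hypothetical placeholder for a call into a local model runtime."""
    return f"(local preview) {prompt}"


def handle_slash_command(body: str) -> str:
    """Parse a Slack slash-command body and build a JSON reply."""
    # parse_qs returns a list per key; slash-command fields are single-valued.
    fields = {key: values[0] for key, values in parse_qs(body).items()}
    if fields.get("command") != "/ask":
        # Ephemeral replies are shown only to the user who ran the command.
        reply = {"response_type": "ephemeral", "text": "Unknown command."}
    else:
        reply = {
            "response_type": "in_channel",
            "text": summarize_locally(fields.get("text", "")),
        }
    return json.dumps(reply)


# Made-up example payload; U123 is not a real user ID.
example_body = "command=%2Fask&text=summarize+this+thread&user_id=U123"
print(handle_slash_command(example_body))
```

In a real deployment the returned JSON would be the HTTP response to Slack's request; request verification and the actual model call are left out of this sketch.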

Open Machine AI — Desktop

v0.1 • Preview

You

Draft a product spec for a CPU-first local AI assistant.

Open Machine AI

Here’s an outline with goals, constraints, and an implementation plan. (Preview output…)

Why Open Machine AI

Edge-grade capability. Cloud-free control.

Bring LLM-class power to CPUs with a unified runtime and zero data leakage.

CPU-first perf

Optimized kernels + scheduler for commodity CPUs. Deploy on laptops, servers, or edge boxes.

Sovereignty

No data leaves your device. On-prem by default. Bring-your-own models when available.

Unified runtime

Single decoder with multimodal I/O, tools and grounding wired into the loop.

Concurrency

Thread-pinned decoding, streaming I/O, and a backpressure-aware gateway.
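
To make "backpressure-aware" concrete, here is a minimal sketch of the general idea using a bounded queue: when the decoder falls behind, the gateway's `put` call suspends instead of letting pending requests pile up. This illustrates the technique only and is not the product's implementation; all names (`gateway`, `decoder`, `run`, `max_pending`) are assumptions, and thread pinning is not shown.

```python
import asyncio


async def gateway(requests, queue: asyncio.Queue) -> None:
    """Admit requests; `put` suspends while the queue is full (backpressure)."""
    for request in requests:
        await queue.put(request)
    await queue.put(None)  # sentinel: no more work


async def decoder(queue: asyncio.Queue, results: list) -> None:
    """Drain the queue one request at a time, simulating decode work."""
    while True:
        request = await queue.get()
        if request is None:
            break
        await asyncio.sleep(0)  # stand-in for real decoding
        results.append(request.upper())


async def run(requests, max_pending: int = 2) -> list:
    """Run gateway and decoder together with at most `max_pending` queued items."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=max_pending)
    results: list = []
    await asyncio.gather(gateway(requests, queue), decoder(queue, results))
    return results


if __name__ == "__main__":
    print(asyncio.run(run(["a", "b", "c", "d"])))
```

The bound on the queue is the design choice that matters: it caps memory held by waiting requests and pushes the slowdown back to the caller rather than hiding it.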

Connectors

APIs and CLI. Integrations for Slack and more on the beta track.

SDKs

TypeScript, Python (coming). Stable, minimal surface.

Proof points

Preview metrics on our 7M baseline model (illustrative).

Cold start

< 200 ms

Median latency

~25 ms/token

CPU decode

~30 tok/s

Memory

< 2.4 GB RSS
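
Per-token latency and decode rate are two views of the same speed, related by a reciprocal. As a rough sanity check rather than a benchmark, the sketch below converts between them; the function names are illustrative. The two published figures need not match exactly, since a median per-token latency and a sustained decode rate are different measurements.

```python
def tokens_per_second(ms_per_token: float) -> float:
    """Convert a per-token latency in milliseconds to tokens per second."""
    return 1000.0 / ms_per_token


def ms_per_token(tok_per_s: float) -> float:
    """Convert a decode rate in tokens per second to milliseconds per token."""
    return 1000.0 / tok_per_s


print(tokens_per_second(25.0))        # 40.0
print(round(ms_per_token(30.0), 1))   # 33.3
```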