🚀 Public Preview · CPU-first runtime

Local Edge AI for builders and bold teams

Run billion-parameter LLMs on CPUs with a unified multimodal runtime: no GPUs, no cloud, no data leaks.

Connect with Slack
Bring local AI into your channels — no cloud relay.
Slash command
/ask “summarize this thread & action items”
Result (local)
• Summary of last 50 messages
• 3 action items with owners
• Links back to messages
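As a rough illustration of the slash-command flow above, here is a minimal sketch of how a local handler might parse an incoming `/ask` request. It assumes the form-encoded payload that Slack sends for slash commands (`command`, `text`, `channel_id`, `user_id`). The function name `parse_slash_command` is hypothetical and is not part of the Open Machine AI API, and how the parsed text reaches the local model is left out on purpose.

```python
from urllib.parse import parse_qs


def parse_slash_command(body: str) -> dict:
    """Parse a form-encoded slash-command body into the fields a handler needs.

    Hypothetical helper for illustration only; not an Open Machine AI API.
    """
    fields = parse_qs(body, keep_blank_values=True)

    def first(key: str) -> str:
        # parse_qs returns a list per key; take the first value or "".
        return fields.get(key, [""])[0]

    return {
        "command": first("command"),
        # Drop surrounding whitespace and straight or curly double quotes.
        "text": first("text").strip().strip('"“”'),
        "channel_id": first("channel_id"),
        "user_id": first("user_id"),
    }


if __name__ == "__main__":
    sample = (
        "command=%2Fask"
        "&text=summarize+this+thread+%26+action+items"
        "&channel_id=C123&user_id=U456"
    )
    print(parse_slash_command(sample))
```

The parsed `text` would then be handed to whatever local inference call the runtime exposes, so the thread content stays on the machine that runs the handler.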
Open Machine AI — Desktop
v0.1 • Preview
You
Draft a product spec for a CPU-first local AI assistant.
Open Machine AI
Here’s an outline with goals, constraints, and an implementation plan. (Preview output…)
Why Open Machine AI

Edge-grade capability. Cloud-free control.

Bring LLM-class power to CPUs with a unified runtime and zero data leakage.

CPU-first perf

Optimized kernels + scheduler for commodity CPUs. Deploy on laptops, servers, or edge boxes.

Sovereignty

No data leaves your device. On-prem by default. Bring-your-own models when available.

Unified runtime

Single decoder with multimodal I/O, tools and grounding wired into the loop.

Concurrency

Thread-pinned decoding, streaming I/O, and a backpressure-aware gateway.
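To make "backpressure-aware" concrete, here is a minimal sketch of one common approach: a bounded queue that rejects new requests when it is full instead of letting them pile up. This is an assumption about the general technique, not a description of how the Open Machine AI gateway is built. The class name `BoundedGateway` and its methods are hypothetical.

```python
import queue
from typing import Optional


class BoundedGateway:
    """Illustrative gateway that applies backpressure with a bounded queue.

    Hypothetical sketch; not the Open Machine AI implementation.
    """

    def __init__(self, max_pending: int) -> None:
        # At most max_pending requests may wait for a decoder at once.
        self._pending: "queue.Queue[str]" = queue.Queue(maxsize=max_pending)

    def submit(self, prompt: str) -> bool:
        """Enqueue a request; return False if the caller should back off."""
        try:
            self._pending.put_nowait(prompt)
            return True
        except queue.Full:
            return False

    def next_request(self) -> Optional[str]:
        """Hand the oldest waiting request to a decoder, or None if idle."""
        try:
            return self._pending.get_nowait()
        except queue.Empty:
            return None


if __name__ == "__main__":
    gateway = BoundedGateway(max_pending=2)
    print(gateway.submit("first"), gateway.submit("second"), gateway.submit("third"))
    print(gateway.next_request())
```

Rejecting early keeps memory use and queueing delay bounded on a CPU-only box. A caller that receives `False` can retry later or surface a "busy" response.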

Connectors

APIs and CLI. Integrations for Slack and more on the beta track.

SDKs

TypeScript, Python (coming). Stable, minimal surface.

Proof points

Fast paths, small footprints

Preview metrics on our 7M baseline model (illustrative).

Cold start: < 200 ms
Median latency: ~25 ms/token
CPU decode: ~30 tok/s
Memory: < 2.4 GB RSS
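
For a sense of what these preview figures mean in practice, the sketch below estimates the wall-clock time for a reply of a given length from the cold-start bound and the decode rate quoted above. It is back-of-the-envelope arithmetic on illustrative numbers, it ignores prompt processing time, and the function name is hypothetical.

```python
COLD_START_S = 0.200     # upper bound taken from the cold-start figure
DECODE_TOK_PER_S = 30.0  # taken from the CPU decode figure


def estimate_response_seconds(output_tokens: int, cold: bool = True) -> float:
    """Rough time to generate output_tokens tokens, ignoring prompt processing."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    startup = COLD_START_S if cold else 0.0
    return startup + output_tokens / DECODE_TOK_PER_S


if __name__ == "__main__":
    # A 300-token reply from a cold start: 0.2 s + 300 / 30 s.
    print(round(estimate_response_seconds(300), 1))
```

Under these assumptions, a 300-token reply takes about 10.2 seconds from a cold start and about 10 seconds once the runtime is warm.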