🚀 Public Preview • CPU-first runtime
Local Edge AI for builders and bold teams
Run billion-parameter LLMs on CPUs with a unified multimodal runtime: no GPUs, no cloud, no data leaks.
Slash command
/ask “summarize this thread & action items”
Result (local)
• Summary of last 50 messages
• 3 action items with owners
• Links back to messages
Open Machine AI — Desktop
v0.1 • Preview
You
Draft a product spec for a CPU-first local AI assistant.
Open Machine AI
Here’s an outline with goals, constraints, and an implementation plan. (Preview output…)
Why Open Machine AI
Edge-grade capability. Cloud-free control.
Bring LLM-class power to CPUs with a unified runtime and zero data leakage.
CPU-first performance
Optimized kernels + scheduler for commodity CPUs. Deploy on laptops, servers, or edge boxes.
Sovereignty
No data leaves your device. On-prem by default. Bring-your-own models when available.
Unified runtime
Single decoder with multimodal I/O, tools and grounding wired into the loop.
Concurrency
Thread-pinned decoding, streaming I/O, and a backpressure-aware gateway.
Connectors
APIs and CLI. Integrations for Slack and more on the beta track.
SDKs
TypeScript, Python (coming). Stable, minimal surface.
Proof points
Fast paths, small footprints
Preview metrics on our 7M baseline model (illustrative).
Cold start
< 200 ms
Median latency
~25 ms/token
CPU decode
~30 tok/s
Memory
< 2.4 GB RSS