Pipeline live — dogfooding daily

Ambient voice intelligence
for AI agents.

Not just transcription. Percept builds a knowledge graph from your conversations — entity extraction, speaker identification, relationship mapping, semantic search — so your agent actually understands what's being said.

Open source · Local-first · Works with 🦞 OpenClaw or any agent framework

Speak. It happens.

Seven action types by voice — email, text, reminder, search, calendar, note, and order.

"Hey Atlas, email Sarah about the quarterly review"
→ Contact resolved, email sent. Percept knows Sarah's address from your relationship graph.
Walk out of a meeting
→ Auto-summary delivered with action items, attendees identified, entities extracted — in 60 seconds.
"Hey Atlas, remind me in thirty minutes to call the contractor"
→ Reminder set. Spoken numbers parsed naturally — "an hour and a half" just works.
"What did we discuss about the API migration last week?"
→ Semantic search across all conversations. Vector + full-text, speaker-attributed.
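The spoken-number handling can be sketched in a few lines. This is an illustrative normalizer, not Percept's actual parser; the word tables and function name are assumptions:

```python
# Illustrative sketch: normalize spoken durations ("thirty minutes",
# "an hour and a half") into minutes. Not Percept's real parser.
WORDS = {"a": 1, "an": 1, "one": 1, "two": 2, "five": 5, "ten": 10,
         "fifteen": 15, "twenty": 20, "thirty": 30, "forty": 40}
UNITS = {"minute": 1, "minutes": 1, "hour": 60, "hours": 60}

def parse_duration_minutes(phrase: str) -> int:
    total, value, last_unit = 0, 0, 0
    for tok in phrase.lower().split():
        if tok in WORDS:
            value = WORDS[tok]
        elif tok.isdigit():
            value = int(tok)
        elif tok in UNITS:
            last_unit = UNITS[tok]
            total += value * last_unit
            value = 0
        elif tok == "half":
            total += last_unit // 2  # "and a half" of the last unit heard
    return total
```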
The Moat

Context Intelligence Layer

Transcription is a commodity. What happens after is what matters. The CIL transforms raw speech into structured, actionable context — so "email the client" actually works because your agent knows who the client is.

  • Two-pass entity extraction — fast regex + LLM semantic pass
  • Relationship graph with weighted edges and linear decay
  • 5-tier entity resolution — exact, fuzzy, contextual, recency, semantic
  • NVIDIA NIM embeddings + LanceDB vector search
  • FTS5 full-text search with porter stemming
  • Context packets — single JSON blob with everything an agent needs
  • SQLite persistence — single file, zero deps, WAL mode
// Context Packet for "email the client"
{
  "resolved_entity": "Sarah Chen",
  "confidence": 0.94,
  "resolution_path": "contextual",
  "evidence": [
    "mentioned 3x in last conversation",
    "relationship: client_of (weight: 0.87)"
  ],
  "contact": {
    "email": "sarah@acme.com",
    "org": "Acme Corp"
  }
}

Everything that's working today.

Shipped in 5 days. Dogfooded daily. Not a roadmap — a product.

Entity Extraction

Two-pass pipeline extracts people, orgs, locations, projects, and topics. Maps relationships automatically from co-occurrence.
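As a sketch of what the fast first pass might look like: a cheap regex over runs of capitalized words yields candidates, and the LLM semantic pass (not shown) classifies and filters them. The pattern and names are assumptions:

```python
import re

# Runs of capitalized words become entity candidates. This overshoots
# on purpose (sentence-initial words, titles); the second pass decides
# person vs. org vs. noise.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b")

def fast_entity_pass(text: str) -> list[str]:
    return CANDIDATE.findall(text)
```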

Knowledge Graph

Relationship graph with weighted edges and linear decay. 5-tier entity resolution — exact, fuzzy, contextual, recency, semantic. Your agent knows who "the client" is.
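Linear decay as described could look like this: a sketch under the assumption that an edge's weight falls to zero over a fixed window. The 180-day default mirrors the relationship retention period, and the function name is hypothetical:

```python
from datetime import datetime, timedelta

def decayed_weight(weight: float, last_seen: datetime,
                   now: datetime, window_days: int = 180) -> float:
    """Linear decay: full weight when fresh, zero after window_days."""
    age = (now - last_seen).days
    return max(0.0, weight * (1 - age / window_days))
```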

Speaker ID

Knows who's talking. Resolves contacts, builds per-speaker analytics. "That was Sarah" teaches it new voices.

Semantic Search

NVIDIA NIM embeddings + LanceDB vectors, with FTS5 keyword fallback. Search what anyone said, anytime.
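The FTS5 side can be sketched with nothing but the standard library. Table and column names are assumptions, and the vector half (NIM embeddings + LanceDB) is omitted:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Porter stemming means "migration" also matches "migrating", "migrated"
db.execute(
    "CREATE VIRTUAL TABLE utterances USING fts5(speaker, text, tokenize='porter')"
)
db.execute(
    "INSERT INTO utterances VALUES "
    "('Sarah', 'We should finish migrating the API by Friday')"
)

rows = db.execute(
    "SELECT speaker, text FROM utterances WHERE utterances MATCH ?",
    ("migration",),
).fetchall()
```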

Dashboard

Full management UI — conversation history, speaker management, contacts, settings, analytics, search, data export and purge.

Three-Tier Transcriber

Local (faster-whisper) → NVIDIA Riva → Cloud. Privacy by default. Your audio never leaves your machine unless you choose.
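The escalation logic is simple to sketch. The backend functions here are placeholders standing in for faster-whisper, a Riva call, and a cloud API, not Percept's real interfaces:

```python
class TranscriberError(Exception):
    pass

def local_whisper(audio: bytes) -> str:
    raise TranscriberError("local model unavailable")  # force escalation here

def riva_grpc(audio: bytes) -> str:
    return "transcribed by riva"  # placeholder for an NVIDIA Riva call

def cloud_api(audio: bytes) -> str:
    return "transcribed by cloud"  # last resort, opt-in only

def transcribe(audio: bytes) -> str:
    """Local -> Riva -> Cloud: audio escalates only when a tier fails."""
    for backend in (local_whisper, riva_grpc, cloud_api):
        try:
            return backend(audio)
        except TranscriberError:
            continue
    raise TranscriberError("all transcription tiers failed")
```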

TTL Auto-Purge

Configurable retention — utterances 30d, summaries 90d, relationships 180d. Your data, your rules. Export anytime.
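A purge pass over a SQLite file can be this small. Table and column names are assumptions that mirror the retention defaults above:

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION_DAYS = {"utterances": 30, "summaries": 90, "relationships": 180}

def purge_expired(db: sqlite3.Connection, now: datetime) -> None:
    # ISO-8601 timestamps compare correctly as plain strings
    for table, days in RETENTION_DAYS.items():
        cutoff = (now - timedelta(days=days)).isoformat()
        db.execute(f"DELETE FROM {table} WHERE created_at < ?", (cutoff,))
    db.commit()
```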

Works with what you wear.

Omi pendant for ambient intelligence. Apple Watch for push-to-talk. More coming.

Omi Pendant

All-day battery. BLE to phone. Forget it's there.
✓ Live

Apple Watch

Push-to-talk walkie-talkie style. Raise to speak.
Beta — launching with v1

Any Webhook Source

Standard HTTP endpoint. POST transcripts from anything.
✓ Ready
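Posting from code is one request. The `/webhook` path and payload fields below are assumptions, not a published schema; the port is the default receiver port:

```python
import json
from urllib import request

# Hypothetical transcript event; field names are illustrative.
event = {
    "type": "transcript",
    "speaker": "Sarah Chen",
    "text": "Let's push the proposal deadline to Friday",
}

req = request.Request(
    "http://localhost:8900/webhook",  # assumed path on the receiver port
    data=json.dumps(event).encode(),
    headers={"Content-Type": "application/json"},
)
# request.urlopen(req)  # fire it at a running receiver
```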

Percept Protocol

A framework-agnostic JSON schema for voice → intent → action handoff. Six event types. Three transports. Unix composable.

Your agent framework doesn't matter. LangChain, CrewAI, AutoGen, 🦞 OpenClaw, or a bash script — if it reads JSON, it works with Percept.

Event types: transcript · conversation · intent · action_request · action_response · summary
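For a flavor of the handoff, an intent event might look like this; the field names are illustrative assumptions, not the published spec:

```json
{
  "type": "intent",
  "intent": "send_email",
  "entities": { "recipient": "Sarah Chen" },
  "transcript": "email Sarah about the quarterly review",
  "confidence": 0.92
}
```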
# Pipe voice events to any consumer
percept listen \
  | jq 'select(.type == "intent")' \
  | my-agent --stdin

# Or use webhooks
percept serve \
  --webhook https://my-agent.com/voice

# Or WebSocket
percept serve --ws
// Voice → Context → Agent → Action

You: "Hey Atlas, email the client
    about the proposal deadline"

// CIL resolves "the client"
Percept: entity: Sarah Chen (0.94)
Percept: action: email → sarah@acme.com

// 🦞 OpenClaw agent executes
Agent: ✉️ Email sent to Sarah Chen

// Confirmation delivered
Text: "Emailed Sarah about the
    proposal deadline ✓"

Built for 🦞 OpenClaw

Percept is designed as a native 🦞 OpenClaw skill — your agent gets ears, context intelligence, and ambient awareness out of the box.

Every voice command flows through your agent's full context: memory, tools, integrations, relationship graph. "Email the client" just works because the system knows.

Works standalone too. Any framework that consumes JSON can integrate via the Percept Protocol.

Self-host in 5 minutes.

MIT license. Your audio stays on your machine. Single SQLite file — nothing to configure.

# Install
pip install getpercept

# Start (receiver + dashboard + CIL)
percept serve

# Dashboard at :8960 · Receiver at :8900
# Your agent can hear now.
MIT License · Python 3.10+ · Local-First · SQLite · NVIDIA NIM · 🦞 OpenClaw

Self-host free. Forever.

Open source is the product, not a teaser. Run it on your hardware, no strings attached.

$0
Self-hosted · MIT License · No limits
☁️ Hosted API coming soon.
Don't want to self-host? Join the waitlist — we'll notify you when the managed cloud is ready.

Get early access.

Be first to know when the hosted API launches and the repo goes public.
