Ambient voice intelligence
for AI agents.
Not just transcription. Percept builds a knowledge graph from your conversations — entity extraction, speaker identification, relationship mapping, semantic search — so your agent actually understands what's being said.
Open source · Local-first · Works with 🦞 OpenClaw or any agent framework
Speak. It happens.
Seven action types by voice — email, text, reminder, search, calendar, note, and order.
Context Intelligence Layer
Transcription is a commodity. What happens after is what matters. The CIL transforms raw speech into structured, actionable context — so "email the client" actually works because your agent knows who the client is.
- Two-pass entity extraction — fast regex + LLM semantic pass
- Relationship graph with weighted edges and linear decay
- 5-tier entity resolution — exact, fuzzy, contextual, recency, semantic
- NVIDIA NIM embeddings + LanceDB vector search
- FTS5 full-text search with porter stemming
- Context packets — single JSON blob with everything an agent needs
- SQLite persistence — single file, zero deps, WAL mode
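The context packet format isn't published yet, but the idea is simple: one JSON blob per event that any agent can slice with standard tools. A rough sketch — every field name below is an assumption, not the real schema:

```shell
# Hypothetical context packet -- field names are illustrative, not the real schema.
cat > packet.json <<'EOF'
{
  "type": "context_packet",
  "transcript": "email the client about the proposal",
  "intent": "email",
  "entities": [
    {"name": "Acme Corp", "kind": "org", "role": "client"}
  ],
  "speaker": "sarah"
}
EOF

# Any agent that reads JSON can pull out what it needs with jq.
jq -r '.intent' packet.json            # -> email
jq -r '.entities[0].name' packet.json  # -> Acme Corp
```

The point of the single-blob design: the consumer never has to query the graph itself — the packet already carries the resolved entities.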
Everything that's working today.
Shipped in 5 days. Dogfooded daily. Not a roadmap — a product.
Entity Extraction
Two-pass pipeline extracts people, orgs, locations, projects, and topics. Maps relationships automatically from co-occurrence.
Knowledge Graph
Relationship graph with weighted edges and linear decay. 5-tier entity resolution — exact, fuzzy, contextual, recency, semantic. Your agent knows who "the client" is.
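Linear decay just means an edge's weight drops by a fixed amount per day since the last co-occurrence, floored at zero. A toy sketch — the rate and floor here are made-up numbers, not Percept's internal values:

```shell
# Edge weight after N days of linear decay, floored at zero.
# weight(t) = max(0, w0 - rate * days); the rate is illustrative.
decayed_weight() {
  awk -v w0="$1" -v rate="$2" -v days="$3" \
    'BEGIN { w = w0 - rate * days; print (w > 0 ? w : 0) }'
}

decayed_weight 1.0 0.01 30   # -> 0.7
decayed_weight 1.0 0.01 200  # -> 0 (edge has fully decayed)
```

Linear (rather than exponential) decay gives edges a hard expiry, which pairs naturally with the TTL purge windows below.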
Speaker ID
Knows who's talking. Resolves contacts, builds per-speaker analytics. "That was Sarah" teaches it new voices.
Semantic Search
NVIDIA NIM embeddings + LanceDB vectors, with FTS5 keyword fallback. Search what anyone said, anytime.
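The keyword fallback leans on SQLite's FTS5 porter stemmer, so morphological variants match each other — "deadlines" finds "deadline". A standalone sketch with the sqlite3 CLI (the table name is illustrative):

```shell
rm -f demo.db
sqlite3 demo.db <<'EOF'
CREATE VIRTUAL TABLE utterances USING fts5(text, tokenize='porter');
INSERT INTO utterances(text) VALUES ('Sarah asked about the proposal deadline');
-- 'deadlines' stems to the same token as 'deadline', so this matches.
SELECT text FROM utterances WHERE utterances MATCH 'deadlines';
EOF
```

Stemming is what makes keyword search a credible fallback when the embedding service is unavailable.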
Dashboard
Full management UI — conversation history, speaker management, contacts, settings, analytics, search, data export and purge.
Three-Tier Transcriber
Local (faster-whisper) → NVIDIA Riva → Cloud. Privacy by default. Your audio never leaves your machine unless you choose.
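The tier order is ordinary short-circuit logic: try local first, escalate only on failure. A sketch with stub functions — these stand in for the real faster-whisper, Riva, and cloud backends, which aren't shown here:

```shell
# Placeholder tiers -- stand-ins for faster-whisper, NVIDIA Riva, and a cloud API.
tier_local() { return 1; }                 # simulate the local tier failing
tier_riva()  { echo "riva: hello world"; } # next tier succeeds
tier_cloud() { echo "cloud: hello world"; }

# Each tier only runs if the previous one failed.
transcribe() {
  tier_local "$1" || tier_riva "$1" || tier_cloud "$1"
}

transcribe clip.wav   # -> riva: hello world
```

Because escalation only happens on failure, audio reaches the cloud tier only when both local options are down — that's the "privacy by default" claim in mechanical terms.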
TTL Auto-Purge
Configurable retention — utterances 30d, summaries 90d, relationships 180d. Your data, your rules. Export anytime.
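Those retention windows might be expressed in a config file along these lines — purely illustrative, since the real key names aren't published yet:

```shell
# Hypothetical retention config -- section and key names are assumptions.
cat > percept.toml <<'EOF'
[retention]
utterances_days    = 30
summaries_days     = 90
relationships_days = 180
EOF
```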
Works with what you wear.
Omi pendant for ambient intelligence. Apple Watch for push-to-talk. More coming.
Omi Pendant
Apple Watch
Any Webhook Source
Percept Protocol
A framework-agnostic JSON schema for voice → intent → action handoff. Six event types. Three transports. Unix-composable.
Your agent framework doesn't matter. LangChain, CrewAI, AutoGen, 🦞 OpenClaw, or a bash script — if it reads JSON, it works with Percept.
# Pipe voice events to any consumer
percept listen \
| jq 'select(.type == "intent")' \
| my-agent --stdin
# Or use webhooks
percept serve \
--webhook https://my-agent.com/voice
# Or WebSocket
percept serve --ws
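"A bash script" is meant literally: a consumer only has to read newline-delimited JSON from stdin. A minimal sketch — the event fields (`type`, `action`) are assumptions about the protocol, not its published schema:

```shell
# Minimal Percept consumer: one JSON event per line on stdin.
# Field names ('type', 'action') are assumed, not the published schema.
cat > consumer.sh <<'EOF'
#!/bin/sh
while IFS= read -r event; do
  type=$(printf '%s' "$event" | jq -r '.type')
  if [ "$type" = "intent" ]; then
    action=$(printf '%s' "$event" | jq -r '.action')
    echo "handling intent: $action"
  fi
done
EOF
chmod +x consumer.sh

# Wire it up: percept listen | ./consumer.sh
```

No SDK, no client library — the transport is the contract.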
Built for 🦞 OpenClaw
Percept is designed as a native 🦞 OpenClaw skill — your agent gets ears, context intelligence, and ambient awareness out of the box.
Every voice command flows through your agent's full context: memory, tools, integrations, relationship graph. "Email the client" just works because the system knows.
Works standalone too. Any framework that consumes JSON can integrate via the Percept Protocol.
Self-host in 5 minutes.
MIT license. Your audio stays on your machine. Single SQLite file — nothing to configure.
# Install
pip install getpercept
# Start (receiver + dashboard + CIL)
percept serve
# Dashboard at :8960 · Receiver at :8900
# Your agent can hear now.
Self-host free. Forever.
Open source is the product, not a teaser. Run it on your hardware, no strings attached.
Don't want to self-host? Join the waitlist — we'll notify you when the managed cloud is ready.
Get early access.
Be first to know when the hosted API launches and the repo goes public.