Project · msearch
an intelligence runtime — composable pipelines for search, inference, training, and agents
A DAG execution engine for composable intelligence pipelines. Search, infer, traverse, train, coordinate — composed into a single execution plan, parallel across CPU and GPU, zero-copy throughout. Profiled for Apple Silicon's unified memory, deployable on any server.
Intelligence pipelines, treated as a first-class runtime.
The dominant pattern for AI applications today is to chain separate services — search here, inference there, coordination somewhere else — across the network. As pipelines become more adaptive and agents compose their own plans at runtime, that architecture becomes the bottleneck.
msearch flips that. Pipelines are composed as a directed graph that the engine executes, built up from common intelligence primitives: vector search, inference, training, reasoning, indexing. Execution runs across GPU and CPU cores in parallel, each operation consuming the compute framework best suited to it.
The result is a class of application that hasn't really existed before: a dynamic, on-device intelligence runtime, with latency budgets measured in milliseconds and privacy guarantees that hold because the data never leaves the device.
Pipelines are graphs of operators, not chained service calls. Search, infer, traverse, train, coordinate — composed and optimised together as one execution plan.
Data stays in place from query to result. No intermediate copies, no serialisation between stages. The pipeline never duplicates what's already in memory.
Operators choose where they run. Graph traversal stays on CPU; distance kernels and inference lift onto GPU. Both lanes coordinate through shared memory without copies.
On unified memory, the zero-copy architecture extends all the way to GPU compute. No transfer tax between processors. A single binary that also deploys on any server.
mgraph carries structured state across environments. msearch turns that structured state into pipelines that produce structured output. The two are co-designed.
We reply within two business days. If a call would be faster, book a thirty-minute conversation.