
msearch

an intelligence runtime — composable pipelines for search, inference, training, and agents

A DAG execution engine for composable intelligence pipelines. Search, infer, traverse, train, coordinate — composed into a single execution plan, parallel across CPU and GPU, zero-copy throughout. Profiled for Apple Silicon's unified memory, deployable on any server.

Intelligence pipelines as first-class programs in a single runtime.

01 The thesis

The dominant pattern for AI applications today is to chain separate services — search here, inference there, coordination somewhere else — across the network. As pipelines become more adaptive and agents compose their own plans at runtime, that architecture becomes the bottleneck.

msearch flips that. Pipelines are composed as a directed graph that the engine executes. A pipeline is made up of common intelligence primitives: vector search, inference, training, reasoning, indexing. Execution runs across GPU and CPU cores in parallel, each operation consuming the compute framework best suited to it.
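A minimal sketch of the idea in Rust, not msearch's actual API: operators become nodes in a dependency graph, and one executor walks the graph in order, keeping every stage's output in a single arena. The names (`Op`, `Pipeline`) and the toy `f32` payload are illustrative assumptions.

```rust
// Illustrative sketch: a pipeline as a graph of operators run in
// dependency order. Real operators (search, infer, rerank) would
// carry richer payloads than f32.

type Op = fn(f32) -> f32;

struct Pipeline {
    ops: Vec<Op>,          // assumed already topologically sorted
    deps: Vec<Vec<usize>>, // deps[i] = indices of operators feeding op i
}

impl Pipeline {
    // One execution plan: every stage writes into the same arena,
    // and downstream stages read predecessors' results in place.
    fn run(&self, input: f32) -> Vec<f32> {
        let mut out = vec![0.0; self.ops.len()];
        for (i, op) in self.ops.iter().enumerate() {
            let x = if self.deps[i].is_empty() {
                input
            } else {
                self.deps[i].iter().map(|&j| out[j]).sum()
            };
            out[i] = op(x);
        }
        out
    }
}

fn main() {
    // A three-node chain standing in for search -> rerank -> answer.
    let p = Pipeline {
        ops: vec![|x| x * 2.0, |x| x + 1.0, |x| x * 10.0],
        deps: vec![vec![], vec![0], vec![1]],
    };
    println!("{:?}", p.run(3.0)); // [6.0, 7.0, 70.0]
}
```

The point of the shape: because the whole plan is one graph, the executor can schedule independent nodes in parallel and skip the serialisation that a chain of service calls would force.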

The result is a class of application that hasn't really existed before: a dynamic, on-device intelligence runtime, with latency budgets measured in milliseconds and privacy guarantees that hold because the data never leaves the device.

02 How it's built

The architecture in one read.

01

Composable DAG execution

Pipelines are graphs of operators, not chained service calls. Search, infer, traverse, train, coordinate — composed and optimised together as one execution plan.

02

Zero-copy across the pipeline

Data stays in place from query to result. No intermediate copies, no serialisation between stages. The pipeline never duplicates what's already in memory.
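In Rust terms, the zero-copy contract can be sketched with shared ownership: stages hand each other views into one allocation rather than copies of it. The stage names and types below are illustrative assumptions, not msearch's real buffer types.

```rust
// Hedged sketch of zero-copy hand-off: an "index" stage owns the
// corpus once, and a "search" stage borrows the same allocation.
use std::sync::Arc;

// Index stage: the corpus is allocated exactly once.
fn build_corpus() -> Arc<Vec<f32>> {
    Arc::new(vec![0.1, 0.4, 0.9, 0.2])
}

// Search stage: operates on a borrowed slice; no data is duplicated.
fn top1(corpus: &[f32]) -> usize {
    corpus
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap()
}

fn main() {
    let corpus = build_corpus();
    let reader = Arc::clone(&corpus); // clones the handle, not the data
    println!("best = {}", top1(&reader));
    // Both handles point at the same allocation:
    assert!(Arc::ptr_eq(&corpus, &reader));
}
```

Cloning the `Arc` bumps a reference count; the vectors themselves are never serialised or copied between stages, which is the property the pipeline preserves end to end.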

03

Parallel CPU and GPU lanes

Operators choose where they run. Graph traversal stays on CPU; distance kernels and inference lift onto GPU. Both lanes coordinate through shared memory without copies.
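A rough picture of two lanes over shared memory, with the caveat that this is illustrative only: here a plain OS thread stands in for the GPU queue, since the point is that both lanes read one buffer without copying it between them. The "traversal" and "distance kernel" bodies are toy stand-ins.

```rust
// Illustrative: two "lanes" as threads over one shared, read-only
// buffer. In msearch the second lane would be a GPU command queue;
// a thread stands in here.
use std::sync::Arc;
use std::thread;

// CPU lane: graph-style traversal (here: walk the even indices).
fn traversal(d: &[f32]) -> f32 {
    d.iter().step_by(2).sum()
}

// GPU lane stand-in: a distance kernel (squared distance to a query).
fn distance(d: &[f32], q: f32) -> f32 {
    d.iter().map(|x| (x - q) * (x - q)).sum()
}

fn main() {
    let data = Arc::new((0..8).map(|i| i as f32).collect::<Vec<f32>>());

    let cpu = {
        let d = Arc::clone(&data);
        thread::spawn(move || traversal(&d))
    };
    let gpu = {
        let d = Arc::clone(&data);
        thread::spawn(move || distance(&d, 3.0))
    };

    let (walk, dist) = (cpu.join().unwrap(), gpu.join().unwrap());
    println!("traversal = {walk}, distance = {dist}");
}
```

Both lanes see the same bytes through `Arc` handles; on unified memory the same property extends to actual GPU kernels, which is what the next item describes.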

04

Profiled for Apple Silicon

On unified memory, the zero-copy architecture extends all the way to GPU compute. No transfer tax between processors. A single binary that also deploys on any server.

05

Built on mgraph

mgraph carries structured state across environments. msearch turns that structured state into pipelines that produce structured output. The two are co-designed.

03 Status

Status: Active development
Since: 2024
Stack: Rust · Metal · UMA
04 Roadmap

Where it goes from here.

2025 · foundation
  • Pipeline runtime over UMA
    Operator graph executor on Apple Silicon, zero-copy between stages.
    developed
  • First-class vector index
    GPU-accelerated index built into the runtime — the vector DB project.
    developed
  • Reranker and structured output operators
    The pipeline can return JSON-shaped, schema-validated results.
    developed
2026 · adoption
  • Public open-core release
    Source, spec, examples, and the first wave of partner integrations.
    in flight
  • Graph-traversal operator
    Typed graph queries as a native operator alongside vector and rerank.
    in flight
  • Cross-platform runtime
    Linux + CUDA backend that mirrors the UMA contract for non-Apple GPUs.
    next
2027 · horizon
  • Distributed pipeline coordination
    Pipelines that span device, edge, and cloud as one execution plan.
    next
  • Specialised model operators
    Native operators for graph-neural-network reasoning over typed knowledge.
    next
Contact

Tell us what you're working on.

We reply within two business days. If a call would be faster, book a thirty-minute conversation.

We don't share your details. Replies come from a real person, not a CRM.