Hive online · v0.1 · bootstrap

The supercomputer you already own

cadencIA shards giant AI models across a P2P hive of phones, PCs and home NPUs. No data centers. No billion-dollar power bills. Just the heat your hardware already wastes.

P2P · Dynamic mesh
INT4 · KV cache
BFT · Consensus
Wasm · Universal runtime
cadencia ~ hive · live
cadencia infer --model llama3-70b "explain Raft"

Sharding · 12 shards → 4 alpha · 8 beta nodes

Routing · WebRTC mesh · 5ms local cluster

KV cache · INT4 quantized · −80% bandwidth

Consensus · Merkle hash OK · 0 byzantine

Response assembled · 247 tokens · 1.2s · 0 datacenters used
Live mesh

This is how the hive breathes

Each dot is a real device (phone, PC, NPU). The blue pulses are tensors traveling between shards in real time.

Legend: Alpha node (PC / NPU) · Beta node (mobile) · Tensor in flight
01 Engine room

Four pillars holding the hive together

Not an academic paper: the physiology of a distributed nervous system.

01

Dynamic P2P mesh

Core

Elastic WebRTC network with a tweaked DHT. Distance is logical, not geographic: with few nodes, Spain and New Zealand are neighbors.

WebRTC · Kademlia · Gossip protocol · Encrypted UDP
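To make "distance is logical, not geographic" concrete, here is a minimal sketch of the standard Kademlia XOR metric. The page only says the DHT is "tweaked", so this shows the textbook version; `node_id`, `xor_distance` and `closest` are hypothetical names, and hashing a label with SHA-1 is an assumed ID scheme.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a 160-bit ID by hashing a label (assumed ID scheme)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's metric: the XOR of two IDs, read as an integer."""
    return a ^ b


def closest(target: int, peers: list[int], k: int = 3) -> list[int]:
    """Return the k peers whose IDs are logically nearest to the target."""
    return sorted(peers, key=lambda p: xor_distance(target, p))[:k]
```

Because IDs come from a hash, two devices on opposite sides of the planet can land next to each other in ID space, which is why a small network can make Spain and New Zealand neighbors.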
02

Hybrid inference sharding

Pipeline parallelism across layers + cross-node attention. MLP blocks go to beta (mobile) nodes; attention goes to alpha nodes with ample RAM and AC power. KV cache quantized to INT4.

Pipeline parallel · MLP shard · KV cache INT4 · Speculative decoding
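A rough sketch of what INT4 quantization of a KV cache slice could look like. The page does not specify the scheme, so this assumes symmetric quantization with a single scale per slice; `quantize_int4` and `dequantize_int4` are hypothetical names.

```python
def quantize_int4(values: list[float]) -> tuple[bytes, float]:
    """Map floats to integers in [-8, 7] and pack two of them per byte."""
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / 7 if peak > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in values]
    if len(q) % 2:
        q.append(0)  # pad to an even count so nibbles pair up
    packed = bytes(((hi & 0xF) << 4) | (lo & 0xF)
                   for hi, lo in zip(q[0::2], q[1::2]))
    return packed, scale


def dequantize_int4(packed: bytes, scale: float, count: int) -> list[float]:
    """Unpack nibbles, restore the sign, and rescale to floats."""
    out = []
    for byte in packed:
        for nibble in (byte >> 4, byte & 0xF):
            signed = nibble - 16 if nibble >= 8 else nibble
            out.append(signed * scale)
    return out[:count]
```

Each value travels as half a byte plus a shared scale, at the cost of a rounding error of at most half the scale per value.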
03

BFT + zero-trust

Dynamic Merkle trees hash every tensor. Cryptographic slashing for toxic nodes. Sharding = anonymization: no node ever sees the full prompt.

Merkle proofs · Slashing · TEE · Trust score
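As an illustration of how a tensor could be fingerprinted, the sketch below computes a Merkle root over serialized tensor chunks with SHA-256. The chunking, the hash function and the rule of duplicating the last hash on odd levels are all assumptions; the page does not describe its tree layout.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(chunks: list[bytes]) -> bytes:
    """Hash each chunk, then pair and re-hash levels until one root remains."""
    if not chunks:
        return sha256(b"")
    level = [sha256(c) for c in chunks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Changing a single byte in any chunk changes the root, so a verifier only needs to compare 32 bytes to detect a tampered tensor.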
04

Universal runtime

Same Wasm kernel on iOS, Android, Windows and Linux. WebGPU talks directly to mobile NPUs or NVIDIA Tensor Cores — no rewrites.

WebAssembly · WebGPU · NPU · Sandbox
02 Pipeline flow

The lifecycle of a prompt

From your thumb to 12 devices spread across the planet and back — without ever touching AWS.

01

Local tokenization

Your device turns the prompt into tokens and end-to-end encrypts it.

02

Segmentation

The local orchestrator splits the model into 12 shards using the nearby-nodes table.
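One simple way to picture segmentation is to cut the model's layers into contiguous, near-equal ranges. This is only a sketch: `split_layers` is a hypothetical helper, and the real orchestrator presumably also weighs the nearby-nodes table, which is not modeled here.

```python
def split_layers(num_layers: int, num_shards: int) -> list[range]:
    """Partition layers 0..num_layers-1 into contiguous, near-equal shards."""
    if not 1 <= num_shards <= num_layers:
        raise ValueError("need 1 <= num_shards <= num_layers")
    base, extra = divmod(num_layers, num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)  # first `extra` shards get one more
        shards.append(range(start, start + size))
        start += size
    return shards
```

For example, `split_layers(80, 12)` (80 being the transformer layer count of Llama 3 70B) yields eight shards of 7 layers followed by four shards of 6.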

03

P2P injection

Shards travel through WebRTC tunnels. Each shard is broadcast to 3 nodes: the fastest response wins.
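The "fastest wins" rule can be sketched with `asyncio`: dispatch the same work to every candidate, keep the first result, and cancel the stragglers. `run_on_node` is a hypothetical stand-in that simulates latency with a sleep; a real client would send the shard over a WebRTC data channel instead.

```python
import asyncio


async def run_on_node(node: str, delay: float) -> str:
    """Stand-in for a remote call; `delay` simulates the node's latency."""
    await asyncio.sleep(delay)
    return node


async def fastest_wins(latencies: dict[str, float]) -> str:
    """Dispatch to all candidates, keep the first result, cancel the rest."""
    tasks = [asyncio.create_task(run_on_node(n, d))
             for n, d in latencies.items()]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)  # let cancels settle
    return done.pop().result()
```

Calling `asyncio.run(fastest_wins({"a": 0.05, "b": 0.01, "c": 0.09}))` returns `"b"`, the node with the shortest simulated delay.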

04

Compute cascade

Node A layers 1-5 → Node B layers 6-10 → Node C final block. KV cache shared in INT4.

05

Merkle consensus

The assembler validates partial hashes. If a node lied, instant slashing.
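The page does not say how the assembler decides that a node lied. One plausible reading, given that each shard is broadcast to several nodes, is a majority vote over the result hashes the replicas report. The sketch below assumes exactly that; `audit_replicas` is a hypothetical name.

```python
from collections import Counter
from typing import Optional


def audit_replicas(reported: dict[str, str]) -> tuple[Optional[str], list[str]]:
    """Return (accepted_hash, nodes_to_slash) from per-node result hashes."""
    if not reported:
        return None, []
    winner, votes = Counter(reported.values()).most_common(1)[0]
    if votes * 2 <= len(reported):
        return None, []  # no strict majority: accept nothing, slash no one
    dissenters = sorted(n for n, h in reported.items() if h != winner)
    return winner, dissenters
```

With three replicas, a single dishonest node is outvoted and flagged; when there is no strict majority, the sketch accepts nothing rather than guess.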

06

De-tokenization

The answer flows back to the user. Target latency: < 1.5s in a local cluster.

SCORING ENGINE

Every node emits a Health Vector

The hive does not assign tasks at random. A fitness function combines TFLOPS, battery, latency and historical stability to decide who processes what — and the weights shift per task.

// Node fitness function
Score
= w₁ · TFLOPS
+ w₂ · Battery%
+ w₃ · 1 / Latency
+ w₄ · Stability
w₁…w₄ tunable per task type · LLM inference → w₃ dominates
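The fitness function above translates almost directly into code. In this sketch the field names and units of `HealthVector` are assumptions, and the terms are combined unnormalized exactly as the formula shows; a production scorer would likely rescale each term first.

```python
from dataclasses import dataclass


@dataclass
class HealthVector:
    tflops: float       # compute throughput
    battery_pct: float  # 0..100 (assumed unit)
    latency_ms: float   # round-trip latency (assumed unit)
    stability: float    # 0..1 historical stability (assumed range)


def score(h: HealthVector, w: tuple[float, float, float, float]) -> float:
    """Score = w1*TFLOPS + w2*Battery% + w3*(1/Latency) + w4*Stability."""
    if h.latency_ms <= 0:
        raise ValueError("latency must be positive")
    w1, w2, w3, w4 = w
    return (w1 * h.tflops + w2 * h.battery_pct
            + w3 * (1.0 / h.latency_ms) + w4 * h.stability)
```

With latency-dominated weights such as `(0, 0, 1, 0)`, a modest phone at 5 ms outranks a powerful PC at 50 ms, which matches the "w₃ dominates" note for LLM inference.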
03 Accordion World

The metaverse is the interface, not the product

Every tensor op projects as a game mechanic. Latency is visible. Consensus is played. Compute is built.

MDS projection

Inter-node latencies are not Euclidean. A force algorithm collapses the distance matrix into a 2D/3D map where "close" means "fast".
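A minimal sketch of a force-style layout, assuming a simple spring model: every pair of nodes is nudged until its on-screen distance approaches its measured latency. This is one common way to approximate MDS, not necessarily the algorithm the project uses; `force_layout` and its parameters are hypothetical.

```python
import math
import random


def force_layout(dist: list[list[float]], iters: int = 500,
                 lr: float = 0.05, seed: int = 0) -> list[list[float]]:
    """Place n nodes in 2D so pairwise distances approximate `dist`."""
    rng = random.Random(seed)
    n = len(dist)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(iters):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                d = math.hypot(dx, dy) or 1e-9
                # Spring: push apart if too close, pull together if too far.
                f = lr * (dist[i][j] - d) / d
                pos[i][0] += f * dx
                pos[i][1] += f * dy
    return pos
```

Feeding it a latency matrix where two nodes are 1 unit apart and both are 10 units from a third places the first two side by side and the third far away, so "close" on the map reads as "fast".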

CRDT synchronization

Without a central server, CRDTs (the family of techniques that inspired Figma's multiplayer sync) guarantee that your city and your neighbor's converge regardless of packet order.
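To show why packet order stops mattering, here is a sketch of one of the simplest CRDTs, a last-writer-wins map. The page does not say which CRDT the city state uses, so the tile keys, the `Entry` layout and `merge` are illustrative assumptions; the sketch also assumes a writer never reuses a timestamp.

```python
# (logical timestamp, writer id, value)
Entry = tuple[int, str, str]


def merge(a: dict[str, Entry], b: dict[str, Entry]) -> dict[str, Entry]:
    """Per key, keep the entry with the highest (timestamp, writer id)."""
    out = dict(a)
    for key, entry in b.items():
        if key not in out or entry[:2] > out[key][:2]:
            out[key] = entry
    return out
```

Because `merge` is commutative, associative and idempotent, two peers that exchange states in any order, any number of times, end up with the same map.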

Watt economy

Reward = ∫ (Task_Complexity · Uptime) · Efficiency_Factor. If your phone overheats, your factor drops. This incentivizes healthy hardware, not blind mining.
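The reward integral can be approximated as a sum over sampled uptime intervals. The interpretation below (complexity times efficiency, accumulated over the seconds a node was up) is an assumption, since the page does not define the integration variable; `Sample` and `reward` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    seconds: float     # time the node was up during this interval
    complexity: float  # task complexity, arbitrary units
    efficiency: float  # 0..1, drops when the device overheats


def reward(samples: list[Sample]) -> float:
    """Discrete stand-in for the integral: sum of complexity * efficiency * dt."""
    return sum(s.complexity * s.efficiency * s.seconds for s in samples)
```

Two identical 10-second intervals at complexity 2 earn 20 credits at full efficiency but only 10 when overheating halves the factor, so a throttled phone earns less for the same uptime.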

While OpenAI pays for cooling,
cadencIA uses the heat you already waste.

The energy is already paid for. The NPUs sit idle. Home networks are highways with potholes: we do not treat them as perfect pipes; we design for constant packet loss. That is the most sustainable and democratic processing model that exists.

04 Roadmap

From cloud seed to a 100% edge hive

Controlled bootstrap: our sentinel nodes spin the network up, then the community inherits it.

Q2 2026 · In progress

Phase 0 · Sentinels

AWS/Azure seed nodes acting as "ghost players". They carry 90% of initial compute.

  • Local orchestrator v0.1
  • Wasm runtime POC
  • 3-region mesh
Q3 2026 · Next

Phase 1 · Public alpha

Open to 500 real devices. Pipeline parallelism with Llama-3 8B sharded across 4 nodes.

  • iOS / Android client
  • INT4 KV cache
  • Trust score v1
Q4 2026 · Planned

Phase 2 · Accordion World

Gamified visual layer. CRDTs syncing state. MDS projection of latencies.

  • City editor
  • Verifiable credits
  • Speculative decoding
2027 · Planned

Phase 3 · Sentinels off

Cloud nodes shut down. 100% community infrastructure. 70B models running on the hive.

  • Cross-node attention
  • Full BFT
  • Hive federation