furcate

Science

Methodology, cited end-to-end.

Furcate is a composition of open-source projects, not a new framework. Every layer of the stack maps to projects that already work at production scale; every interface is explicit. We do not lock customers into proprietary runtimes, opaque orchestrators, or vendor-only mesh protocols.

The stack

Open foundations, cited and named.

Furcate composes the best open-source projects in edge AI into one coherent fabric — never a black box. Every decision the runtime, orchestrator, or agent makes is traceable to the project that produced it.


TensorRT Edge-LLM

NVIDIA

LLM + VLM inference on Jetson / DRIVE

High-performance C++ runtime for LLM and VLM inference on resource-constrained NVIDIA platforms — FP8, NVFP4, INT4 quantization, EAGLE-3 speculative decoding, KV-cache compression. Demonstrated at CES 2026 with Bosch, ThunderSoft, MediaTek partner showcases.

LiteRT

Google

Lightweight cross-platform inference

TensorFlow Lite's evolution into LiteRT — built-in quantization + compression; runs on Android, embedded Linux, and microcontrollers via TFLite Micro. The default for cross-vendor mobile/edge deployments.

ONNX Runtime

Microsoft

Cross-hardware AI inference

Cross-platform inference engine optimising AI models across CPUs, GPUs, NPUs, and specialised accelerators with minimal model modification. Deployed widely as a hardware-agnostic runtime.

ExecuTorch

Meta

PyTorch on microcontrollers + mobile

Bytecode VM with AOT compilation for PyTorch models — built for microcontrollers and embedded edge devices. Pairs with NVIDIA FLARE for federated fine-tuning on mobile.

OpenVINO

Intel

Intel hardware-tuned inference

Optimised for Intel CPUs / GPUs / VPUs / FPGAs. Strong for industrial IoT vision (smart cameras, intelligent retail) and any deployment that's standardised on Intel silicon.

NVIDIA FLARE

NVIDIA

Production federated-learning runtime

Domain-agnostic SDK for federated learning. Hierarchical FL architecture for thousands of edge devices. Production deployments include Eli Lilly TuneLab, Taiwan MOHW national healthcare FL, and a Tri-Labs (Sandia / LANL / LLNL) federated AI pilot. Integrates with Flower; pairs with ExecuTorch for mobile FL.

Flower

Flower Labs

Open FL framework + community

Cohesive approach to federated learning, analytics, and evaluation. Strong research community + extensive strategy library. Interoperates with FLARE so Flower-built apps run inside the FLARE runtime without modification.

OpenFL

Intel

Intel federated learning

Open-source FL implementation focused on sensitive-data deployments. Used in healthcare and regulated-industry pilots.

KubeEdge

CNCF / open-source

Kubernetes for edge devices

CNCF incubation project. Scales to 100,000 concurrent edge nodes managing 1,000,000+ active pods. The default when a customer's edge fleet is large enough to need cloud-native ops.

OpenYurt

Alibaba / open-source

Edge-native K8s with offline ops

Brings edge computing capabilities to Kubernetes — edge nodes can run K8s without continuous cloud connectivity. Strong choice for intermittent / offline edge.

K3s

Rancher / SUSE

Lightweight Kubernetes

Optimised K8s with significantly reduced memory footprint — full K8s experience in resource-constrained environments. Lowest resource consumption among lightweight K8s distributions in 2026 benchmarks.

Akri

open-source

K8s leaf-device discovery + lifecycle

Built on Kubernetes Device Plugins; discovers small leaf devices via ONVIF / udev / OPC UA handlers and exposes each as a Kubernetes service, maintaining availability when nodes lose connectivity or fail.

EdgeX Foundry

LF Edge

Vendor-neutral edge IoT framework

LF Edge open-source edge platform. Modular reference services for device-data ingestion, normalisation, analysis, and sharing. The default integration substrate when a customer's deployment spans many vendor protocols.

Dapr

CNCF / open-source

Distributed application runtime

Portable runtime for distributed applications across cloud and edge. Built-in workflow, pub/sub, state, secrets, bindings, actors, distributed lock, cryptography. State of Dapr 2026 reports 20-40% developer productivity uplift.
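The pub/sub building block is a good illustration of why Dapr matters at the edge: application code talks only to the local sidecar over HTTP, never to the broker. A minimal stdlib sketch of building that sidecar request — the `edge-pubsub` component name and topic are illustrative, not Furcate defaults:

```python
import json

DAPR_PORT = 3500  # default Dapr sidecar HTTP port

def dapr_publish_request(pubsub: str, topic: str, payload: dict) -> tuple[str, bytes]:
    """Build the sidecar HTTP request for Dapr's pub/sub building block.

    Dapr exposes publishing as POST /v1.0/publish/{pubsub-name}/{topic};
    the app never addresses the broker (MQTT, Kafka, ...) directly.
    """
    url = f"http://localhost:{DAPR_PORT}/v1.0/publish/{pubsub}/{topic}"
    return url, json.dumps(payload).encode()

# Component and topic names here are made up for illustration.
url, body = dapr_publish_request("edge-pubsub", "sensors/temperature", {"celsius": 21.5})
```

POSTing `body` to `url` with any HTTP client completes the publish; swapping the underlying broker is then a component-YAML change, not a code change.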

Eclipse Hono

Eclipse Foundation

Multi-protocol IoT messaging

Uniform API surface over MQTT, AMQP, HTTP, LoRaWAN — abstracts the protocol mess so application code can stay protocol-agnostic.

Wasmtime

Bytecode Alliance

Standalone WebAssembly runtime

Bytecode Alliance reference runtime — leads cold-start performance among JIT/AOT WASM runtimes (Jan 2026). 1-5 ms cold starts vs 100ms-1s+ for Linux containers — the 100× improvement that lets edge inference become serverless.

WasmEdge

CNCF / open-source

Edge-optimised WebAssembly runtime

Lightweight, high-performance WASM runtime for cloud-native, edge, and decentralised apps. Powers serverless apps, embedded functions, microservices, smart contracts, IoT devices. ~100× faster startup and ~1/100 the size of equivalent Linux containers.

Matter / Thread

Connectivity Standards Alliance

Residential interoperability

Matter 1.4 (November 2024) added energy-management device classes. Thread provides the IPv6 mesh underneath. The default for Furcate's residential / SMB device class.

LoRaWAN

LoRa Alliance

Long-range low-power IoT

Sub-GHz mesh for outdoor / industrial / agricultural deployments where Wi-Fi and cellular don't reach. Ultra-low-power, multi-km range, kilobit-class throughput.

Private 5G + TSN

3GPP + IEEE

Industrial-grade wireless

Private 5G slices + Time-Sensitive Networking (TSN) for deterministic latency on the factory floor. The wireless backbone Furcate's industrial customers run on.

TPM 2.0 + TEE

Trusted Computing Group + ARM + Intel

Hardware root of trust

Trusted Platform Module 2.0 + Trusted Execution Environment (Intel SGX, ARM TrustZone) for secure boot, attested device identity, and confidential inference. The hardware backstop for Furcate's sovereign-by-design posture.

ROS 2

Open Source Robotics Foundation

Robot middleware

Robot Operating System 2 — DDS-based publish/subscribe middleware for cobots, AMRs, manipulators, and drones. Native integration so industrial robotic workflows speak the same protocol as the rest of the fleet.

NVIDIA Triton + vLLM + Ray Serve

NVIDIA / Anyscale / open-source

Distributed model serving

Triton for general inference serving, vLLM for LLM throughput, Ray Serve for distributed Python services. The serving substrate behind Furcate's edge gateways.

Benchmarks

Documented in the field.

Edge AI silicon: Jetson Orin Nano Super delivers 67 TOPS at $249 and 7-25 W (NVIDIA, 2025). The Hailo-10H AI HAT+ 2 lifts a Raspberry Pi 5 to 40 TOPS INT4 at 2.5 W and runs 2B-parameter LLMs at ~10 tok/s for $130 (Hailo, 2025). Google Coral Edge TPU sits at 4 TOPS / 2 W for quantised TFLite vision workloads.
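The figures above imply very different efficiency envelopes. A back-of-envelope comparison in TOPS per watt, computed directly from the cited numbers — note the parts quote different precisions (Hailo's figure is INT4), so this is an envelope, not an apples-to-apples benchmark:

```python
# TOPS-per-watt from the figures cited above. Jetson's power is a 7-25 W
# range, so both ends are shown.
parts = {
    "Jetson Orin Nano Super (25 W)": 67 / 25,
    "Jetson Orin Nano Super (7 W)": 67 / 7,
    "Hailo-10H AI HAT+ 2": 40 / 2.5,
    "Coral Edge TPU": 4 / 2,
}
for name, tops_per_watt in sorted(parts.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {tops_per_watt:.1f} TOPS/W")
```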

WebAssembly cold start: Wasmtime leads JIT/AOT cold-start performance among standalone WASM runtimes (January 2026 benchmarks). 1-5 ms WASI cold starts vs 100ms-1s+ for traditional Linux containers — a 100× improvement that lets edge inference become serverless. Cloudflare Workers runs ~10M WebAssembly requests per second across 300+ edge locations.

Kubernetes-at-edge: KubeEdge published scaling to 100,000 concurrent edge nodes managing 1,000,000+ active pods. K3s exhibits the lowest resource consumption among lightweight K8s distributions; OpenYurt offers the strongest offline-capable edge story. Akri extends K8s-native device discovery to OPC UA / udev / ONVIF leaf devices.

Federated learning in production: NVIDIA FLARE deploys hierarchical FL across thousands of edge devices. Production deployments include Eli Lilly TuneLab (built by Rhino Federated Computing on FLARE), Taiwan MOHW national healthcare FL, and a Tri-Labs (Sandia/LANL/LLNL) federated AI pilot. Flower interoperates natively with FLARE; ExecuTorch handles mobile-side fine-tuning.
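FLARE, Flower, and OpenFL all build on variants of the same aggregation step: clients train locally, ship model updates, and the server averages them weighted by local sample count. A minimal stdlib sketch of that federated-averaging step, with model vectors as plain lists (real frameworks operate on framework tensors):

```python
def fed_avg(client_updates: list[tuple[list[float], int]]) -> list[float]:
    """Federated averaging: weight each client's model vector by its local
    sample count, so data-rich clients pull the global model harder."""
    total = sum(n for _, n in client_updates)
    dim = len(client_updates[0][0])
    agg = [0.0] * dim
    for weights, n in client_updates:
        for i, w in enumerate(weights):
            agg[i] += w * (n / total)
    return agg

# Three clients with unequal data: the result leans toward the 600-sample client.
global_model = fed_avg([([1.0, 0.0], 100), ([0.0, 1.0], 300), ([0.5, 0.5], 600)])
```

The raw training data never leaves the device — only the weighted updates travel, which is what makes the healthcare and national-scale deployments above possible.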

Sovereign edge inference: Microsoft Sovereign Private Cloud scales to thousands of nodes per sovereign environment (announced April 27, 2026). Azure Local Disconnected Operations enables fully air-gapped deployment with consistent management UX. HPE Private Cloud AI offers turnkey air-gapped AI training and inference.

Standards

Native to the protocols of the edge.

Edge AI runs on a thicket of standards that span radio, security, orchestration, and ML frameworks. A fabric that doesn’t speak them natively is doomed to integration debt. We speak them natively.

  • Matter 1.4 + Thread

    Energy-class device support, IPv6 mesh

  • LoRaWAN

    Long-range low-power IoT

  • Private 5G + TSN

    Deterministic latency wireless

  • MQTT 5 + Sparkplug B

    Pub/sub bus, ISA-95 namespace

  • OPC UA + DDS + ROS 2

    Industrial + robotics middleware

  • TPM 2.0 + TEE

    Hardware root of trust, secure boot

  • WASI 0.3 / 1.0

    WebAssembly System Interface (0.3 Feb 2026 / 1.0 late 2026)

  • IEC 62443

    Industrial cybersecurity zones-and-conduits

  • NIS2 (EU mandatory)

    Critical-infra cybersecurity

  • FIPS 140-3 + NIST SP 800-series

    Cryptography compliance

  • GDPR + HIPAA + CMMC

    Data-residency + handling regimes

  • ONNX + GGUF + GGML

    Model interchange formats
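The interchange formats in the last bullet are distinguishable at the byte level, which is how a loader can route a model file to the right runtime. A minimal sniffer sketch — the routing is illustrative; GGUF's ASCII magic `GGUF` and the TFLite FlatBuffer identifier `TFL3` come from the respective specs, while ONNX is a bare protobuf with no magic, so it falls back to the file extension:

```python
def sniff_model_format(header: bytes, filename: str = "") -> str:
    """Guess a model file's format from its leading bytes (illustrative)."""
    if header[:4] == b"GGUF":
        return "gguf"          # GGUF magic at offset 0
    if header[4:8] == b"TFL3":
        return "tflite"        # FlatBuffer file identifier at offset 4
    if filename.endswith(".onnx"):
        return "onnx"          # ONNX protobuf has no magic bytes
    return "unknown"

fmt = sniff_model_format(b"GGUF\x03\x00\x00\x00", "model.bin")
```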

Selected references

Where the work comes from.

  • TensorRT Edge-LLM: Accelerating LLM and VLM Inference for Automotive and Robotics

    NVIDIA Technical Blog, CES 2026

  • NVIDIA FLARE: Federated Learning from Simulation to Real-World

    Roth et al., arXiv:2210.13291

  • Supercharging Federated Learning with Flower and NVIDIA FLARE

    arXiv:2407.00031

  • KubeEdge: Performance Test — Scaling to 100,000 Edge Nodes

    CNCF / KubeEdge community

  • WASI 0.3.0 Release — WebAssembly Replaces Containers for Edge

    Bytecode Alliance, February 2026

  • Effortless Federated Learning on Mobile with NVIDIA FLARE and Meta ExecuTorch

    NVIDIA Technical Blog 2025

  • Microsoft Sovereign Private Cloud Scales to Thousands of Nodes

    Microsoft Official Blog, April 2026

  • Build sovereign AI at the edge with Azure Local

    Microsoft Azure Blog 2026

  • 8 CNCF Tools to Run Kubernetes at the Edge and Bare Metal

    Cloud Native Now, 2026

  • Comparative Analysis of Lightweight Kubernetes Distributions for Edge Computing

    Springer Nature 2024

  • WebAssembly Runtime Benchmarks 2026: Wasmtime vs Wasmer vs WasmEdge

    wasmRuntime.com

  • TinyML on ESP32 with TensorFlow Lite Micro

    Hackster / EloquentArduino