Ferrum — GA4GH infrastructure that actually runs.
Complete GA4GH stack on-premise — for clinics, data integration centres, and genomDE data nodes that cannot send raw data to the cloud. Tested, documented, in Rust.
Evidence you can inspect
Five public repositories form the GA4GH stack: Ferrum (data/compute), ga4gh-infra (identity), Lab Kit (deploy), Demo (benchmark), and HelixTest (conformance). Apache-2.0 where stated; BUSL-1.1 covers the integrated Ferrum runtime with a clear research allowance and four-year conversion to Apache-2.0 (see LICENSE).
Why Ferrum exists
Most GA4GH implementations are cloud-first, hard to verify, or simply not finished. Ferrum is built for teams that know their data must stay on-premise — while still needing to interoperate with GA4GH-compatible networks. The GA4GH APIs are good. There should be a system that implements them consistently. So we built it.
Designed for resource-constrained environments
Ferrum is not built only for European institutions with stable infrastructure. A growing group of users works in settings with intermittent connectivity, limited hardware, and a need for sovereign data infrastructure without cloud dependency. Ferrum runs in laptop mode on a single device, recovers from power loss via checkpoints, and ingests Nanopore data directly from the lab.
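Recovering from power loss via checkpoints usually relies on writing each checkpoint atomically, so that a crash mid-write never leaves a half-written file behind. The following is a minimal sketch of that general technique in Python, not Ferrum's actual code (Ferrum is written in Rust). The function names, the JSON format, and the file layout are assumptions made for illustration.

```python
import json
import os
import tempfile


def write_checkpoint(path, state):
    """Atomically replace the checkpoint at `path` with `state` (a JSON-serialisable dict)."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".checkpoint-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # force the bytes to disk before renaming
        # Swap the new file into place: readers see the old or the new
        # checkpoint, never a partial one.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def read_checkpoint(path, default=None):
    """Return the last completed checkpoint, or `default` if none exists."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return default
```

On restart, a job would call `read_checkpoint` and resume from the recorded position. A stricter variant would also fsync the containing directory after the rename; that step is omitted here for brevity.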
Features
| Feature | Description |
|---|---|
| Offline-first / laptop mode | Runs on a single laptop with SQLite and local storage. No PostgreSQL, no MinIO required. Recovers cleanly from power loss. |
| Nanopore / ONT integration | Native ingestion of POD5/FAST5/BLOW5 files. Stores ONT quality metrics (Q-score, N50) alongside GA4GH DRS objects. |
| Multi-pathogen Beacon | Beacon v2 queries across TB, malaria, AMR, and viral pathogens, on the same infrastructure as human genomics data. |
| Outbreak mode | Controlled emergency data sharing: pre-configured policies that grant authorised public-health beacon access during an outbreak, with a full audit trail. Not open, not closed: controlled. |
| Federated Beacon (P2P) | Ferrum instances query each other directly, with no central coordinator and no cloud dependency. Works across slow or intermittent links. |
| Battery / solar aware | Detects power source (AC/battery/UPS). Reduces load on battery, writes a checkpoint before emergency shutdown. |
| Data residency audit | Cryptographically chained, append-only log of all data movements. Proves data stayed within your institution. |
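To make the "cryptographically chained, append-only log" in the last row concrete, here is a minimal Python sketch of a hash chain in which each entry stores the SHA-256 hash of its predecessor. It illustrates the general technique only. The entry layout, the field names, and the event contents are assumptions, not Ferrum's actual log format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, event):
    # Canonical JSON (sorted keys, fixed separators) so the same event
    # always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append `event` to `log`, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log):
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev_hash = GENESIS
    for index, entry in enumerate(log):
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return index
        prev_hash = entry["hash"]
    return None
```

Editing any recorded event changes its hash and breaks every later link, which `verify_chain` reports. A chain on its own does not detect removal of the newest entries, so such designs typically also record the latest hash somewhere outside the log.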
What we need to say honestly
Ferrum is tested: HelixTest runs in CI, the GA4GH demo is reproducible, and the architecture is designed to scale. What we do not yet have is a deployment with truly large clinical datasets. That is not an architectural limitation; we have simply not had the resources for it yet. It is something we want to do with the first real partner.
We are looking for a first productive pilot
Ferrum is tested and documented, but it has not yet been deployed with real clinical data volumes, for example at a DIC site or genomDE data node. We want to do that with the right partner: an institution that wants to become GA4GH-compliant on-premise, without cloud dependency, and that is open to genuine collaboration rather than a classical software purchase.
Ferrum + GHGA — two sides of one infrastructure
GHGA is the national archive for genomic data. Ferrum is the local side: GA4GH-compliant infrastructure at the clinic or data node, before data is submitted to GHGA. Ferrum implements Crypt4GH, the same encryption GHGA uses. Data is transferred over the DRS interface, with no cloud as an intermediate step.
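As a rough illustration of the DRS side of such a transfer, the Python sketch below maps a hostname-based DRS URI to the object endpoint defined in the GA4GH DRS v1 specification (`/ga4gh/drs/v1/objects/{object_id}`) and checks a payload against the `sha-256` entry in a DRS object's `checksums` list. The host name and object ID in the usage note are placeholders, and the helper names are assumptions for illustration, not part of Ferrum or GHGA.

```python
import hashlib
from urllib.parse import quote, urlparse


def drs_uri_to_url(drs_uri):
    """Map a hostname-based DRS URI to the HTTPS object endpoint from the DRS v1 spec."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parsed.path.lstrip("/")
    if not object_id:
        raise ValueError("DRS URI has no object id")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


def verify_checksum(drs_object, payload):
    """Check `payload` bytes against the object's sha-256 checksum, if it lists one."""
    for entry in drs_object.get("checksums", []):
        if entry["type"] == "sha-256":
            return hashlib.sha256(payload).hexdigest() == entry["checksum"].lower()
    raise LookupError("DRS object lists no sha-256 checksum")
```

For example, `drs_uri_to_url("drs://drs.example.org/abc123")` yields the object URL on `drs.example.org`. Whether a listed checksum covers the Crypt4GH-encrypted or the decrypted bytes depends on the deployment, so that needs checking before relying on `verify_checksum`.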
MII core dataset integration
Ferrum MII Connect checks FHIR profiles against the MII core dataset offline, with no external API dependency. It produces JSON/SARIF reports for ETL CI and is aimed at DIC sites that want to embed MII conformance checks into their pipeline CI.
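To show what a SARIF report in a CI pipeline can look like, the sketch below builds a minimal SARIF 2.1.0 log from a list of findings and derives an exit code from it. It is written in Python and is not MII Connect's actual output. The tool name, the rule IDs, and the shape of the input findings are hypothetical.

```python
import json


def findings_to_sarif(tool_name, findings):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts.

    Each finding needs: rule_id, level ("error", "warning" or "note"),
    message, and file (path of the checked FHIR resource).
    """
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["rule_id"],
            "level": finding["level"],
            "message": {"text": finding["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["file"]}
                }
            }],
        })
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": results,
        }],
    }


def exit_code(sarif_log):
    """Return 1 if any result is an error, else 0, so CI can fail the job."""
    has_error = any(
        result["level"] == "error"
        for run in sarif_log["runs"]
        for result in run["results"]
    )
    return 1 if has_error else 0
```

A CI step could write `json.dumps(findings_to_sarif(...))` to a file for the code-scanning UI and pass `exit_code(...)` to `sys.exit` so that profile violations fail the ETL job.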
MII Connect documentation →
GA4GH Services
TRS · DRS · WES · TES · htsget · Beacon v2 · Passports — all under one gateway, with shared authentication.
GA4GH implementation on GitHub →
Deployment options
Demo, single node, HPC cluster, Kubernetes, and laptop/offline: all options are documented.
| Option | Description |
|---|---|
| Demo | Docker stack for evaluation and conformance testing (HelixTest, GA4GH demo). |
| Single Node | PostgreSQL and MinIO on one server — standard layout for production on-premise deployments. |
| HPC Cluster | SLURM/LSF integration for workflow execution in cluster environments. |
| Kubernetes | Kubernetes manifests and Helm charts for scalable cluster deployments. |
| Laptop / offline | No external dependencies. SQLite and local filesystem. Suitable for field labs and resource-constrained environments. Field & offline deployment → |
Installation
Quickstart on GitHub →
Licence & collaboration
Ferrum is licensed under BUSL-1.1: free for research and non-commercial use, with a commercial licence for production deployments. After four years the licence converts to Apache-2.0. You can license it, request features, or use it as a starting point for your own infrastructure. No vendor lock-in.
Designed for regulated environments (GDPR, EHDS, NIS2, HIPAA as orientation). Includes technical compliance tools (MII profile checks, CI reports). No certification promise — technical evidence, not legal advice.
Read the Ferrum & GA4GH white paper — HelixTest conformance checks, GA4GH demo benchmarks, architecture, and how we work.
Regulatory context: EHDS · NIS2 · GDPR & health data
We typically reply within two business days. Repository and licence details are available on request.