White paper: Ferrum & GA4GH
How Synaptic Four built unified GA4GH infrastructure—and what that shows about our engineering practice.
March 2026 · Synaptic Four · Stuttgart, Germany
For decision makers
Executive reading guide
Short plain-language context. GA4GH in one sentence. How inclusion fits us. The rest of this page keeps the full technical depth.
This document explains why we built Ferrum, how the architecture fits together, how HelixTest validates it, and how we work—including transparent use of AI as a tool.
- You get a defensible story for internal sponsors: motivation, scope, and limitations (small benchmark slices—no inflated claims).
- It separates “technical signal” from “official certification”—important language for legal and quality stakeholders.
- It shows our operating model: precision, reproducibility, and inclusion as part of the same company—not a CSR slide deck separate from engineering.
- Share the PDF as a handout; use this page for quick orientation and live links to repositories.
What is GA4GH?
The Global Alliance for Genomics and Health (GA4GH) is an international initiative that defines shared technical interfaces for genomic data and analyses. Partners connect under clear contracts instead of rebuilding bespoke integrations for every link.
Synaptic Four combines rigorous engineering with an explicit, lived commitment to neurodiversity and autism inclusion. That is part of who we are, not an add-on to the tech story.
Download (PDF)
You can download the full papers for printing, your archive, or to share with colleagues. This page is a shorter on-screen overview with links to the repositories.
Abstract
Ferrum implements a broad set of Global Alliance for Genomics and Health (GA4GH) APIs in one composable, on-premises-first runtime. This paper summarises motivation, architecture, Crypt4GH and multi-engine execution, validation with the independent HelixTest suite, and reproducible benchmarks via the Apache-2.0 Ferrum GA4GH Demo. It also explains how we use AI-assisted engineering transparently—as acceleration, not a substitute for engineering judgement.
1. Origin and motivation
The gap we saw
GA4GH defines interoperable APIs for genomic data and compute: the Task Execution Service (TES), Workflow Execution Service (WES), Data Repository Service (DRS), Tool Registry Service (TRS), htsget, Beacon, Passports, and Crypt4GH. Many production systems implement only a subset, and reproducible stacks that span several services are rare. The ecosystem benefits from working implementations that exercise the standards together in real pipelines.
Related work
Strong component implementations exist (e.g. TESK, Funnel, cwl-WES, WESkit, GA4GH Starter Kit pieces, Galaxy integrations). Ferrum targets the combination of TRS + DRS + WES + TES + htsget + Beacon + Passports + Crypt4GH in one runtime with a unified gateway, shared authentication, and continuous cross-service conformance tests—not as a replacement for those projects, but as a focused integration point.
Why we built it
Synaptic Four is a small consultancy. Instead of waiting for a client to commission an integrated stack, we became our own first customer: Ferrum is both product and public proof of how we work—precise, documented, and test-driven.
“We went looking for this integrated tool. We did not find it. So we built it—in the language the GA4GH community speaks.”
2. What Ferrum is
Services are composed behind a single gateway (Rust, async). Metadata lives in PostgreSQL; objects live in S3-compatible storage (MinIO, cloud S3) or, where applicable, POSIX and OpenDAL-backed backends. Crypt4GH is integrated at the DRS layer with O(1) header re-wrapping for per-requester delivery: only the small file header is re-encrypted for each requester, so the cost does not grow with file size.
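Why header re-wrapping can be constant-time follows from the Crypt4GH layout: the payload is encrypted once under a random data key, and only a small header carrying that key is encrypted for a given recipient. The sketch below illustrates the idea only. It is not Ferrum's code and not real cryptography: a SHA-256 keystream XOR stands in for the X25519 and ChaCha20-Poly1305 primitives that Crypt4GH specifies, and all function names are hypothetical.

```python
import hashlib
import os


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream derived from SHA-256. NOT secure, illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_xor(key: bytes, data: bytes) -> bytes:
    """Symmetric toy cipher: applying it twice with the same key is a no-op."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


def seal(payload: bytes, storage_key: bytes) -> tuple[bytes, bytes]:
    """Encrypt the payload once; wrap the data key for the storage key."""
    data_key = os.urandom(32)
    header = toy_xor(storage_key, data_key)  # small, fixed size
    body = toy_xor(data_key, payload)        # grows with the payload
    return header, body


def rewrap_header(header: bytes, storage_key: bytes, requester_key: bytes) -> bytes:
    """Re-encrypt only the header for a requester; the body is never touched."""
    data_key = toy_xor(storage_key, header)
    return toy_xor(requester_key, data_key)


def open_payload(header: bytes, body: bytes, key: bytes) -> bytes:
    """Unwrap the data key with `key`, then decrypt the body."""
    return toy_xor(toy_xor(key, header), body)


payload = b"ACGT" * 1000
storage_key, requester_key = b"s" * 32, b"r" * 32
header, body = seal(payload, storage_key)
requester_header = rewrap_header(header, storage_key, requester_key)
recovered = open_payload(requester_header, body, requester_key)
```

The work done in rewrap_header depends only on the 32-byte header, which is the sense in which per-requester delivery is independent of file size.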
Deployment is selective: run only the GA4GH services you need. Ferrum Lab Kit (BUSL) provides an opinionated on-ramp for labs, ELIXIR node candidates, and GDI-style footprints—see the Lab Kit repository.
WES backends include Cromwell (WDL), Nextflow, CWL, and Snakemake, routed via TES; SLURM and LSF are supported for on-premises HPC.
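The routing described above can be pictured as a lookup from the WES workflow_type to an engine, followed by wrapping the engine invocation in a TES task. This is a minimal sketch under stated assumptions: the type identifiers (WDL, NFL, CWL, SMK), engine labels, container image names, and command lines are all hypothetical placeholders, and only the "name" and "executors" (image, command) keys are taken from the TES task schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    name: str   # hypothetical engine label
    image: str  # hypothetical container image reference


# Assumed mapping from workflow_type to engine; not Ferrum's actual table.
ENGINES = {
    "WDL": Engine("cromwell", "registry.example.org/engines/cromwell:pinned"),
    "NFL": Engine("nextflow", "registry.example.org/engines/nextflow:pinned"),
    "CWL": Engine("cwl", "registry.example.org/engines/cwl:pinned"),
    "SMK": Engine("snakemake", "registry.example.org/engines/snakemake:pinned"),
}


def route(workflow_type: str) -> Engine:
    """Pick an engine for a workflow type, rejecting unknown types early."""
    try:
        return ENGINES[workflow_type.upper()]
    except KeyError:
        raise ValueError(f"unsupported workflow_type: {workflow_type!r}") from None


def to_tes_task(run_id: str, workflow_type: str, workflow_url: str) -> dict:
    """Build a minimal TES-style task that runs the selected engine."""
    engine = route(workflow_type)
    return {
        "name": f"{engine.name}-{run_id}",
        "executors": [
            {
                "image": engine.image,
                # Placeholder command; real engines have their own CLIs.
                "command": [engine.name, "run", workflow_url],
            }
        ],
    }


task = to_tes_task("run-001", "wdl", "https://example.org/workflows/align.wdl")
```

Keeping the mapping in one table means an unsupported workflow type fails at submission time rather than inside a scheduler job.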
3. Validation and benchmarks
End-to-end flows span TRS workflow retrieval, DRS data access, WES/TES execution, and optional Crypt4GH profiles—documented in the Ferrum GA4GH Demo with hap.py metrics and DRS micro-benchmarks (plain vs at-rest encryption). Scope is intentionally modest (small synthetic-style slices); we do not over-claim beyond what we measure.
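To make the chain concrete, the following sketch builds the three requests a client would issue: a TRS descriptor lookup, a DRS object lookup, and a WES run submission. The URL paths follow the published GA4GH TRS v2 and DRS v1 conventions, and the form field names follow the WES run request. The hostnames, tool ID, and input names are made-up examples, and the sketch only constructs requests; it does not contact any server.

```python
import json
from urllib.parse import quote, urlparse


def trs_descriptor_url(base: str, tool_id: str, version_id: str, descriptor_type: str) -> str:
    """URL of a workflow descriptor in a TRS v2 registry (IDs are percent-encoded)."""
    return (
        f"{base.rstrip('/')}/ga4gh/trs/v2/tools/{quote(tool_id, safe='')}"
        f"/versions/{quote(version_id, safe='')}/{descriptor_type}/descriptor"
    )


def drs_uri_to_url(drs_uri: str) -> str:
    """Resolve a hostname-based drs:// URI to its HTTPS DRS v1 object endpoint."""
    parsed = urlparse(drs_uri)
    if parsed.scheme != "drs" or not parsed.netloc:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    object_id = parsed.path.lstrip("/")
    return f"https://{parsed.netloc}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"


def wes_run_form(workflow_url: str, workflow_type: str, type_version: str, inputs: dict) -> dict:
    """Form fields for a WES run submission; workflow_params is JSON-encoded."""
    return {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": type_version,
        "workflow_params": json.dumps(inputs, sort_keys=True),
    }


# Hypothetical hosts and identifiers, for illustration only.
descriptor = trs_descriptor_url("https://trs.example.org", "example.org/org/align", "1.0.0", "WDL")
reads_url = drs_uri_to_url("drs://drs.example.org/abc123")
form = wes_run_form(descriptor, "WDL", "1.0", {"align.reads": reads_url})
```

Each step consumes the output of the previous one, which is the property a cross-service test needs to exercise.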
4. HelixTest: conformance in CI
HelixTest is an Apache-2.0 suite maintained by Synaptic Four. It runs on every Ferrum push/PR against a live stack (Postgres, MinIO, Keycloak, seeded data), covering API contracts, workflow E2E, cross-service chains, Passports/OIDC-style auth, and Crypt4GH-oriented tests. Anyone can reuse HelixTest for their own GA4GH-compatible platform. Results are a technical signal—not official GA4GH certification.
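As an illustration of what an API contract check can look like, the sketch below validates a GA4GH service-info document against the fields that specification marks as required. It is not HelixTest's code; the function name and the shape of the problem messages are assumptions, and a real suite would fetch the document from a live endpoint rather than use an inline dictionary.

```python
REQUIRED_TOP_LEVEL = ("id", "name", "type", "organization", "version")
REQUIRED_TYPE = ("group", "artifact", "version")
REQUIRED_ORGANIZATION = ("name", "url")


def check_service_info(doc: dict, expected_artifact: str) -> list[str]:
    """Return a list of contract problems; an empty list means the check passed."""
    problems = []
    for field in REQUIRED_TOP_LEVEL:
        if field not in doc:
            problems.append(f"missing field: {field}")

    service_type = doc.get("type")
    if isinstance(service_type, dict):
        for field in REQUIRED_TYPE:
            if field not in service_type:
                problems.append(f"missing field: type.{field}")
        artifact = service_type.get("artifact")
        if artifact is not None and artifact != expected_artifact:
            problems.append(f"type.artifact is {artifact!r}, expected {expected_artifact!r}")
    elif service_type is not None:
        problems.append("type must be an object")

    organization = doc.get("organization")
    if isinstance(organization, dict):
        for field in REQUIRED_ORGANIZATION:
            if field not in organization:
                problems.append(f"missing field: organization.{field}")
    elif organization is not None:
        problems.append("organization must be an object")
    return problems


# Hypothetical response bodies, for illustration only.
good = {
    "id": "org.example.drs",
    "name": "Example DRS",
    "type": {"group": "org.ga4gh", "artifact": "drs", "version": "1.2.0"},
    "organization": {"name": "Example Org", "url": "https://example.org"},
    "version": "0.1.0",
}
bad = {
    "id": "org.example.drs",
    "name": "Example DRS",
    "type": {"group": "org.ga4gh", "artifact": "wes", "version": "1.0.0"},
    "organization": {"name": "Example Org", "url": "https://example.org"},
}
good_problems = check_service_info(good, "drs")
bad_problems = check_service_info(bad, "drs")
```

Returning every problem instead of failing on the first one gives a CI log that shows the full gap between a service and its contract in a single run.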
5. How Synaptic Four works
AI as a tool, not a shortcut
We use AI-assisted engineering for scaffolding, spec navigation, and repetitive but precise coding—never as a substitute for architecture, review, or accountability. Correctness and design remain human-owned.
“AI is a tool. Responsibility for correctness, design, and consequences stays with the engineer.”
Precision and transparency
Public benchmarks state dataset scope, repetition counts, and limitations. Reproducibility matters: demo repos pin versions and emit machine-readable artefacts.
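A machine-readable artefact of this kind can be as simple as a JSON record that carries the scope and limitations alongside the numbers. The following is a minimal sketch, not the demo repository's actual format: the field names, the benchmark name, and the placeholder workload are all assumptions.

```python
import json
import statistics
import time


def run_benchmark(name, fn, repetitions, dataset_scope, limitations):
    """Time fn() `repetitions` times and return a self-describing record."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "benchmark": name,
        "dataset_scope": dataset_scope,
        "repetitions": repetitions,
        "samples_seconds": samples,
        "median_seconds": statistics.median(samples),
        "limitations": limitations,
    }


def placeholder_workload():
    """Stand-in for a real measurement such as a DRS object download."""
    sum(range(10_000))


record = run_benchmark(
    name="drs-read-plain",  # hypothetical benchmark name
    fn=placeholder_workload,
    repetitions=5,
    dataset_scope="small synthetic slice",
    limitations=["single host", "not representative of production load"],
)
artefact = json.dumps(record, indent=2, sort_keys=True)
```

Storing the raw samples next to the summary statistic lets a reader recompute or challenge the reported figure.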
Licensing snapshot
Ferrum: BUSL-1.1 (see LICENSE and docs/BUSINESS-MODEL.md). HelixTest and Ferrum GA4GH Demo: Apache-2.0. Always follow the LICENSE file in each repository.
6. Conclusion
A small team can deliver a unified GA4GH-oriented runtime when standards, tests, and honest scope are treated as first-class. HelixTest and the GA4GH Demo make that claim checkable. We welcome collaboration, deployment partnerships, and research contact.
This white paper may be shared freely for non-commercial purposes with attribution.