Overview
This document compares Cascade Protocol against four commonly used approaches to health data representation: FHIR R4, Apple HealthKit, OpenEHR, and ad-hoc JSON. The goal is to give implementors and evaluators an honest picture of where each approach fits, where Cascade has genuine advantages, and where it falls short.
This is reference material, not a sales pitch. Each system solves real problems. The right choice depends on your use case, infrastructure, and constraints. The comparison focuses on seven dimensions that matter most for modern health data applications, particularly those involving AI agents, patient ownership, and longitudinal data.
Scope of this comparison
This comparison addresses health data representation and storage, not clinical workflow software or EHR systems broadly. FHIR, for example, covers far more than data modeling—it includes APIs, messaging, terminology, and clinical decision support. This comparison focuses only on its role as a data model.
Systems Compared
FHIR R4
The dominant interoperability standard for clinical health data exchange. Mandated by US federal regulation for EHR data access. Resource-based JSON/XML with REST API conventions. Maintained by HL7 International since 2011; R4 was released in 2019.
Apple HealthKit
Apple's on-device health data store for iOS and watchOS. Provides a walled-garden API for consumer wellness and clinical records imported from participating hospitals. Access is limited to Apple platforms and mediated through Apple's permission system.
OpenEHR
Archetype-based clinical data modeling standard developed since 1999. Separates the information model from the knowledge model using ADL archetypes. Strong in complex clinical modeling; widely used in Europe and Australia.
Ad-hoc JSON
Custom JSON schemas designed per application. Most digital health startups begin here. Flexible and fast to ship, but typically lacks provenance, standard terminology mappings, and interoperability outside the originating system.
Cascade Protocol
RDF/Turtle-based health data vocabulary built on W3C Linked Data standards. Three-layer ontology (standards → domain → patient-facing), W3C PROV-O provenance, local-first encrypted storage via Solid Pods, and AI agent access via MCP. First production use: POTS Check, App Store, December 2025.
Comparison Table
Each dimension is rated qualitatively. Ratings reflect the system's design intent and out-of-the-box capability, not what is theoretically achievable with custom extensions.
| Dimension | FHIR R4 | Apple HealthKit | OpenEHR | Ad-hoc JSON | Cascade Protocol |
|---|---|---|---|---|---|
| Data Model Expressiveness | Rich resource types covering most clinical domains. Extensions for edge cases. Standardized terminology bindings. | Good for consumer wellness and common vitals. Limited clinical modeling depth. Opaque type system. | Extremely detailed clinical modeling via archetypes. Steeper learning curve. Designed for the most complex clinical data. | Completely flexible. Quality depends entirely on the team's design choices and domain expertise. | Three-layer ontology covers wellness, clinical, and patient-facing views. Newer vocabulary; covers all common EHR export types (procedures, encounters, lab panels, medications, devices, imaging studies, claims, EOBs) plus consumer wellness and patient-facing summaries. Fewer total resource types than FHIR's full specification. Extensible via domain-specific namespaces. |
| Provenance Support | Provenance resource exists but is optional and often omitted in practice. Meta.source and meta.lastUpdated are widely used but coarse. | Source bundle tracks originating app. No provenance model for distinguishing clinical vs device vs user-entered data at the record level. | Audit trails and contribution metadata are part of the reference model. Detailed contributor tracking built in. | No standard mechanism. Teams typically add ad-hoc created_by and updated_at fields, if anything. | W3C PROV-O on every record. Five typed provenance classes: ClinicalGenerated, DeviceGenerated, SelfReported, AIExtracted, AIGenerated. Non-optional by design. |
| Agent Readability | JSON is processable, but lacks self-describing semantics. FHIR profiles and terminology require external schema documentation for correct agent interpretation. | Platform API; not directly accessible to AI agents. Requires iOS app intermediary. No standard query protocol. | Rich semantic model but complex to parse. Agents need archetype definitions to interpret data correctly. | Fully opaque to agents without per-application schema documentation. No standard semantics. | RDF data is self-describing. Every predicate is a URI with a defined meaning. Agents can reason over data without external schema docs. Native MCP server integration for structured AI access. |
| Local-First Capability | FHIR is server-centric by design (REST API). Local-only FHIR is possible but requires additional infrastructure (e.g., embedded HAPI FHIR). | Fully on-device by design. Apple manages local storage; no server required for core functionality. | Reference implementations are typically server-based. Local-only deployments are possible but non-standard. | Depends on implementation choices. Can be local or cloud depending on where the database is deployed. | Local-first by design. Turtle files on disk. No server required. AES-256-GCM encryption at rest via CryptoKit. Zero network calls in core SDK. |
| Patient Ownership | Patient access via SMART on FHIR APIs. Data still lives on provider or payer servers. Export is possible but not frictionless. | Patient holds data on their device, but it is locked to the Apple ecosystem. No standard export format; data can leave via Health Records XML but loses queryability. | OpenEHR is primarily an institutional data store. Patient-controlled deployment is architecturally possible but not the primary use case. | No standard ownership or portability model. Typically controlled by whoever operates the server. | Patient holds the Pod on their device. Open RDF/Turtle format is readable without proprietary tools. Solid-compatible for future server sync. Designed around individual data sovereignty. |
| Format Conversion | Rich converter ecosystem (HAPI FHIR, Azure API for FHIR, Google Cloud Healthcare). FHIR is the de facto exchange format for clinical data. | Apple Health Records XML, FHIR import via CDA documents. No standard outbound conversion path. | FHIR ↔ OpenEHR converters exist. Archetype complexity makes lossless conversion difficult. | Custom converters required for every integration. No standard tooling. | CLI cascade convert command supports FHIR output. RDF’s graph model maps naturally to FHIR resources. Tooling is early-stage and limited compared to the FHIR ecosystem. |
| Ecosystem Maturity | Mandated by US federal regulations (21st Century Cures Act). Supported by every major EHR vendor. Enormous tooling ecosystem. HL7 International backing. | Billions of iOS devices. Every iOS app with health integration uses HealthKit. Apple's ecosystem lock-in is both a strength and a limitation. | 25+ years of development. Significant adoption in Europe (particularly Scandinavia and the UK) and Australia. Smaller ecosystem than FHIR in the US. | No ecosystem by definition. Each implementation is bespoke. | One production application (POTS Check). Open-source Swift SDK and CLI. No enterprise adoption. No third-party tooling yet. This is an honest limitation. |
Data Model Expressiveness
"Expressiveness" here means: how completely can a system represent the structure, relationships, and semantics of health data without lossy simplification?
FHIR R4
FHIR has the most comprehensive standard resource library for clinical data. Resources like MedicationRequest, Observation, Condition, AllergyIntolerance, and DiagnosticReport cover the majority of clinical documentation needs with well-defined semantics. FHIR Extensions allow customization, though they reduce interoperability if not profiled properly.
OpenEHR
OpenEHR is arguably the most expressive system in this comparison for complex clinical data. Its archetype model separates the generic information model from domain knowledge, allowing extremely detailed clinical modeling. The CKM (Clinical Knowledge Manager) contains thousands of community-reviewed archetypes. The tradeoff is significant implementation complexity.
Apple HealthKit
HealthKit is strong for consumer wellness data: heart rate, step count, sleep analysis, blood oxygen, and similar metrics. It imports clinical records (via CDA/FHIR) from participating institutions. Its type system is opaque and not designed for extensibility or complex clinical relationships between records.
Ad-hoc JSON
Completely flexible but offers nothing by default. The expressiveness of any given ad-hoc JSON API depends entirely on the design team's domain knowledge. Without standard terminology bindings, relationships between records often cannot be resolved without custom integration code.
Cascade Protocol
Cascade's three-layer ontology provides good coverage for its target domains: consumer wellness data (health:), clinical records imported from EHR systems (clinical:), patient-facing summaries (checkup:), and specialized screening protocols (pots:). The protocol maps all domain terms to SNOMED CT and LOINC codes where they exist, maintaining FHIR alignment.
Cascade currently covers fewer resource types than FHIR R4. It is well-suited for individual patient data management and AI agent access; it is not designed as an enterprise clinical data exchange format.
Cascade uses its own namespaces (clinical:, health:, coverage:) for storage: flat, queryable properties designed for patient-owned data rather than clinical system exchange. FHIR's official RDF serialization (FHIR RDF) is not used as storage because it requires verbose blank-node nesting and preserves clinical-system metadata irrelevant to patients. Instead, every Cascade Layer 2 class declares rdfs:subClassOf alignment to the corresponding FHIR type, maintaining semantic traceability without inheriting FHIR RDF's ergonomic costs.
Provenance Support
Provenance answers: Where did this data come from? Who created it? When? Through what process? This matters for clinical decision-making (a self-reported symptom score has different weight than a lab result from a certified instrument) and for AI agent interactions (an AI-generated summary should be distinguishable from a clinician-authored note).
Cascade's provenance model is a first-class design concern rather than an optional extension. Every record carries a typed provenance class from the core vocabulary:
- cascade:ClinicalGenerated: originated from an EHR or clinical system
- cascade:DeviceGenerated: measured by a wearable or medical device
- cascade:SelfReported: entered directly by the patient
- cascade:AIExtracted: structured data parsed from an existing clinical document by AI
- cascade:AIGenerated: created or synthesized by an AI agent
These types are enforced at the serialization layer, not optional metadata. The MCP server automatically applies AIGenerated provenance to any record written through agent access, and this cannot be overridden by the agent.
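The enforcement described above can be sketched in a few lines. This is an illustration only, not the Cascade SDK or MCP server: the `Record` shape and the `serialize` and `write_via_agent` functions are hypothetical names, and only the five class names and the `cascade:dataProvenance` predicate come from this document.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Provenance(Enum):
    """The five provenance classes named in the core vocabulary."""
    CLINICAL_GENERATED = "cascade:ClinicalGenerated"
    DEVICE_GENERATED = "cascade:DeviceGenerated"
    SELF_REPORTED = "cascade:SelfReported"
    AI_EXTRACTED = "cascade:AIExtracted"
    AI_GENERATED = "cascade:AIGenerated"


@dataclass(frozen=True)
class Record:
    # Hypothetical record shape. provenance has no default, so it cannot be omitted.
    subject: str
    predicate: str
    value: str
    provenance: Provenance


def serialize(record: Record) -> str:
    """Refuse to serialize anything that lacks a typed provenance class."""
    if not isinstance(record.provenance, Provenance):
        raise ValueError("record must carry a typed provenance class")
    return (
        f"<{record.subject}>\n"
        f"    {record.predicate} {record.value} ;\n"
        f"    cascade:dataProvenance {record.provenance.value} .\n"
    )


def write_via_agent(record: Record) -> str:
    """Agent write path: whatever the agent claimed, the stored class is AIGenerated."""
    return serialize(replace(record, provenance=Provenance.AI_GENERATED))
```

The design point is that the check lives in the only function that produces output, so a caller has no path that skips it, and the agent path rewrites the provenance before that check runs.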
FHIR's Provenance resource provides similar capabilities, but in practice it is rarely populated in EHR exports. HL7's own implementation guides acknowledge that provenance data completeness varies widely across FHIR implementations. OpenEHR has strong audit trail support as part of its reference model, making it a genuine peer to Cascade in this dimension.
Agent Readability
Agent readability means: can an AI agent correctly interpret health data without requiring hand-crafted system prompts, per-schema documentation, or custom parsing code?
This is where RDF-based data formats have a structural advantage over JSON or binary formats. In RDF, every predicate is a globally unique URI that resolves to a defined meaning. A snippet like:
<urn:uuid:abc123>
    health:heartRate "72"^^xsd:integer ;
    health:heartRateSnomedCode "364075005" .
…is self-describing. An agent encountering health:heartRate can resolve the URI https://ns.cascadeprotocol.org/health/v1#heartRate to retrieve a definition. The SNOMED code 364075005 maps to the universal clinical concept for heart rate. No system prompt is required to tell the agent what this field means.
By contrast, a FHIR JSON payload like:
{
  "resourceType": "Observation",
  "code": { "coding": [{ "system": "http://loinc.org", "code": "8867-4" }] },
  "valueQuantity": { "value": 72, "unit": "bpm" }
}
…is machine-readable, but the agent needs to know the FHIR Observation resource structure to correctly extract the heart rate value, and needs LOINC code 8867-4 to be in its training data to understand the concept. FHIR is well-covered by LLM training data, so this is less of a practical concern for established resource types, but the requirement for external schema knowledge remains.
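The difference between the two lookups can be made concrete with a small sketch. It assumes nothing beyond the namespace URI and the FHIR payload quoted above; the function names `expand` and `fhir_heart_rate` are hypothetical. Expanding the prefixed predicate yields a URI the agent can look up, whereas reading the FHIR value means hard-coding paths and a code that the payload itself does not explain.

```python
import json
from typing import Dict, Optional

# Prefix map built from the namespace URI quoted in this section.
PREFIXES: Dict[str, str] = {"health": "https://ns.cascadeprotocol.org/health/v1#"}


def expand(curie: str, prefixes: Dict[str, str] = PREFIXES) -> str:
    """Expand a prefixed name such as health:heartRate into its full URI."""
    prefix, _, local = curie.partition(":")
    return prefixes[prefix] + local


def fhir_heart_rate(payload: str) -> Optional[float]:
    """Extract a heart rate from a FHIR Observation, or return None.

    The caller has to know from the FHIR specification, not from the payload,
    that the concept sits under code.coding[], that the number sits under
    valueQuantity.value, and that LOINC 8867-4 means heart rate.
    """
    obs = json.loads(payload)
    if obs.get("resourceType") != "Observation":
        return None
    codings = obs.get("code", {}).get("coding", [])
    is_heart_rate = any(
        c.get("system") == "http://loinc.org" and c.get("code") == "8867-4"
        for c in codings
    )
    if not is_heart_rate:
        return None
    return obs.get("valueQuantity", {}).get("value")
```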
The Cascade Protocol also provides a native MCP server (cascade serve --mcp) that exposes structured query and write tools directly to AI agents, eliminating the need to parse raw Turtle files. Agents interact with typed tools that return provenance-annotated records.
Local-First Capability
A local-first system stores data on the user's device and operates without a network connection. This has significant implications for privacy, reliability, and patient control.
FHIR was designed for server-based data exchange. Running FHIR locally requires deploying a FHIR server (HAPI FHIR, IBM FHIR Server, etc.) on the local machine, which is viable for development but adds operational overhead for consumer applications.
Apple HealthKit is natively local-first. Data lives on the device. The limitation is ecosystem lock-in: the data is only accessible through Apple's APIs, on Apple hardware.
Cascade Protocol's local-first design is intentional and complete. The CascadeSDK (Swift) stores Turtle files directly on the local filesystem, encrypted with AES-256-GCM using keys stored in the platform keychain. There are no server dependencies. The CLI and MCP server run as local processes. This is verifiable by auditing the source code for network calls—there are none in the core SDK.
Patient Ownership
Patient ownership means the patient holds, controls, and can portably export their health data without depending on a provider, payer, or platform.
FHIR improves patient access significantly compared to pre-2016 EHR data silos. SMART on FHIR enables patient-mediated access. However, the data still lives on the provider's or payer's FHIR server. Patients can read their data; they do not hold it.
Apple HealthKit gives patients strong local control, but the data is encoded in Apple's proprietary format and accessible only through Apple's APIs. Exporting HealthKit data in a portable, queryable format requires significant custom development.
Cascade Protocol stores data as plain Turtle files that any RDF tool can read. The open .ttl format is readable in a text editor. The Pod structure is compatible with the Solid specification, enabling future server sync while maintaining the patient as the primary data holder. The protocol does not require any Cascade-specific server or tooling to access stored data.
Code Examples
The following examples show how each system represents the same data: a heart rate measurement of 72 bpm, taken by a wearable device on a specific patient at a specific time. This illustrates the structural and semantic differences between formats.
FHIR R4 (JSON)
{
  "resourceType": "Observation",
  "id": "heart-rate-example-001",
  "status": "final",
  "category": [{
    "coding": [{
      "system": "http://terminology.hl7.org/CodeSystem/observation-category",
      "code": "vital-signs",
      "display": "Vital Signs"
    }]
  }],
  "code": {
    "coding": [{
      "system": "http://loinc.org",
      "code": "8867-4",
      "display": "Heart rate"
    }]
  },
  "subject": {
    "reference": "Patient/patient-123"
  },
  "effectiveDateTime": "2026-01-15T10:30:00Z",
  "valueQuantity": {
    "value": 72,
    "unit": "beats/minute",
    "system": "http://unitsofmeasure.org",
    "code": "/min"
  },
  "device": {
    "display": "Apple Watch Series 9"
  }
}
FHIR is explicit and well-specified. The LOINC code 8867-4 is the universal identifier for heart rate. The resource is parseable without further documentation once you know the FHIR Observation structure. The verbosity is a deliberate tradeoff for unambiguous interoperability.
Apple HealthKit (Swift API)
import HealthKit

// Writing a heart rate sample (assumes write authorization was already granted)
let healthStore = HKHealthStore()
let heartRateType = HKQuantityType(.heartRate)
let heartRateQuantity = HKQuantity(
    unit: HKUnit(from: "count/min"),
    doubleValue: 72.0
)
let now = Date()  // single timestamp so start and end are identical
let sample = HKQuantitySample(
    type: heartRateType,
    quantity: heartRateQuantity,
    start: now,
    end: now,
    metadata: [HKMetadataKeyDeviceName: "Apple Watch Series 9"]
)
healthStore.save(sample) { success, error in
    // handle result
}
// Reading is via HKSampleQuery -- data is opaque until queried
// No direct file access; no standard export format
HealthKit is an API, not a file format. Data is stored in Apple's on-device database and is programmatically accessible only through the HealthKit API. An app cannot inspect or export raw data without going through Apple's query interfaces. The metadata dictionary provides some flexibility, but it is not semantically typed.
OpenEHR (ADL Archetype snippet)
-- Excerpt from openEHR-EHR-OBSERVATION.pulse.v2 archetype
-- Full archetype is ~300 lines; this shows the heart rate data point
OBSERVATION[at0000] matches { -- Pulse/Heart beat
  data matches {
    HISTORY[at0002] matches {
      events cardinality matches {1..*; unordered} matches {
        EVENT[at0003] occurrences matches {0..*} matches { -- Any event
          data matches {
            ITEM_TREE[at0001] matches {
              items cardinality matches {0..*; unordered} matches {
                ELEMENT[at0004] occurrences matches {0..1} matches { -- Rate
                  value matches {
                    C_DV_QUANTITY <
                      list = <
                        ["1"] = <
                          units = <"1/min">
                          magnitude = <|>=0.0|>
                        >
                      >
                    >
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
OpenEHR archetypes are thorough and clinically precise. The pulse archetype covers far more than the rate: rhythm, character, clinical interpretation, and many other attributes. This completeness is valuable for complex clinical scenarios. For a simple wearable heart rate sample, it is substantially more structure than required.
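One way to read the excerpt is as a validation rule: a rate value is acceptable only if its units are 1/min and its magnitude is at least 0.0. The sketch below expresses just that one constraint. It is not openEHR tooling, and `QuantityConstraint` is a hypothetical stand-in for the C_DV_QUANTITY item, not a class from any openEHR library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantityConstraint:
    # Hypothetical stand-in for the C_DV_QUANTITY item in the excerpt.
    units: str
    min_magnitude: float

    def accepts(self, magnitude: float, units: str) -> bool:
        """True when the value satisfies both the units and magnitude constraints."""
        return units == self.units and magnitude >= self.min_magnitude


# Mirrors the excerpt: units = "1/min", magnitude = |>=0.0|
PULSE_RATE = QuantityConstraint(units="1/min", min_magnitude=0.0)
```

A real archetype engine evaluates the whole tree (occurrences, cardinality, node identifiers), which is where the implementation complexity mentioned earlier comes from.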
Ad-hoc JSON
{
  "user_id": "patient-123",
  "metric": "heart_rate",
  "value": 72,
  "unit": "bpm",
  "timestamp": "2026-01-15T10:30:00Z",
  "source": "apple_watch",
  "created_by": "device_sync"
}
Fast to implement, but the semantics are entirely private to this application. The string "heart_rate" has no universal meaning outside this system. There is no standard code that a clinical system could map to. The "created_by": "device_sync" field is a provenance attempt, but it is neither typed nor standardized. Any integration with a third system requires custom mapping code.
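To make the "custom mapping code" concrete, here is a minimal sketch of what an integrator would have to write to turn the record above into a FHIR-style Observation. The lookup tables and the function name are hypothetical. The point is that the knowledge linking "heart_rate" to LOINC 8867-4 and "bpm" to a UCUM unit exists only in tables like these, maintained by hand for each source system.

```python
# Hand-maintained lookup tables: this knowledge lives only in the integration code.
METRIC_TO_LOINC = {"heart_rate": ("8867-4", "Heart rate")}
UNIT_TO_UCUM = {"bpm": ("beats/minute", "/min")}


def adhoc_to_fhir_observation(record: dict) -> dict:
    """Map one ad-hoc record (shape from the example above) to a FHIR-style Observation."""
    metric = record["metric"]
    if metric not in METRIC_TO_LOINC:
        raise KeyError(f"no LOINC mapping for private metric name {metric!r}")
    code, display = METRIC_TO_LOINC[metric]
    unit, ucum = UNIT_TO_UCUM[record["unit"]]
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{"system": "http://loinc.org", "code": code, "display": display}]
        },
        "subject": {"reference": f"Patient/{record['user_id']}"},
        "effectiveDateTime": record["timestamp"],
        "valueQuantity": {
            "value": record["value"],
            "unit": unit,
            "system": "http://unitsofmeasure.org",
            "code": ucum,
        },
    }
```

Every new metric name or unit string in the source application means another table entry, and a second source application with its own field names means a second mapper.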
Cascade Protocol (Turtle RDF)
@prefix cascade: <https://ns.cascadeprotocol.org/core/v1#> .
@prefix health: <https://ns.cascadeprotocol.org/health/v1#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<urn:uuid:f7a3c21b-8e4d-4f09-9a2c-3b4e5f6a7b8c>
    a health:HeartRateRecord ;
    # Semantic value with standard code mapping
    health:heartRate "72"^^xsd:integer ;
    health:heartRateUnit "beats/min" ;
    health:heartRateSnomedCode "364075005" ;  # SNOMED: Heart rate (observable entity)
    health:heartRateLoincCode "8867-4" ;      # LOINC: Heart rate
    # Timestamp
    health:recordedAt "2026-01-15T10:30:00Z"^^xsd:dateTime ;
    # Provenance (non-optional, typed)
    cascade:dataProvenance cascade:DeviceGenerated ;
    cascade:schemaVersion "2.2" ;
    # PROV-O linkage
    prov:wasAttributedTo <https://id.cascadeprotocol.org/users/patient-123> ;
    prov:generatedAtTime "2026-01-15T10:30:00Z"^^xsd:dateTime ;
    # Device context
    health:sourceDevice "Apple Watch Series 9" .
The Cascade Turtle representation is more verbose than ad-hoc JSON, but each predicate is a globally dereferenceable URI. The SNOMED and LOINC codes provide standard clinical mappings. Provenance is typed (DeviceGenerated) rather than a free-text string. Any RDF-capable agent or tool can process this without Cascade-specific knowledge.
Cascade Weaknesses
Honest assessment
The following are genuine limitations of the Cascade Protocol as of February 2026. They are not engineering oversights—they reflect the protocol's current stage of development and its deliberately narrow initial focus on patient-controlled data and AI agent access.
Small ecosystem
There is one production application (POTS Check), one organization building with the protocol, and no third-party tooling. Compare this to FHIR, which has thousands of implementations and is mandated by US law for EHR access. If your project requires enterprise EHR integration, broad vendor support, or regulatory credibility, FHIR is the appropriate choice.
No enterprise EHR integration
Cascade does not have certified integrations with Epic, Cerner, or other major EHR vendors. There is no Cascade FHIR Server that hospitals can connect to. Data can be imported from FHIR sources (via the Swift SDK's ClinicalRecordImporter) and exported to FHIR format, but the protocol has no native position in the institutional health IT stack.
Newer and less proven
OpenEHR has been in development since 1999. FHIR has been in development since 2011. The Cascade Protocol published its first stable vocabulary in 2025. The provenance model and three-layer architecture are well-reasoned, but they have not been subjected to the scale, adversarial use, and edge-case pressure that decades of production systems accumulate.
RDF tooling is less approachable
Turtle and RDF are familiar to semantic web practitioners but represent a learning curve for most web and mobile developers. JSON is universally understood; SPARQL and Turtle are not. The protocol mitigates this with high-level SDK abstractions, but implementors who need to work directly with the Turtle layer will encounter a steeper onramp than FHIR JSON.
Limited vocabulary coverage
Cascade vocabularies cover the data types used by the protocol's existing applications. There are currently no Cascade vocabulary definitions for radiology, genomics, surgical procedures, or many other clinical domains. FHIR covers these with standardized resources; Cascade does not.
When to Use Each
| Use Case | Recommended Approach | Rationale |
|---|---|---|
| Enterprise EHR integration | FHIR R4 | Mandated by regulation, supported by all major EHR vendors, vast tooling ecosystem |
| iOS wellness or consumer health app | HealthKit or Cascade | HealthKit for Apple-ecosystem-only apps; Cascade if you need provenance, AI agent access, or portability outside Apple |
| Complex clinical data modeling | OpenEHR | Most expressive for nuanced clinical models; strong institutional adoption in Europe |
| Individual patient data + AI agent access | Cascade Protocol | Self-describing RDF, typed provenance, native MCP server, local-first, open format |
| Patient-controlled longitudinal records | Cascade Protocol | Pod-based storage on user device, open format readable without proprietary tools, Solid-compatible for sync |
| Quick prototype or single-app data store | Ad-hoc JSON | Lowest friction. Acceptable if interoperability and provenance are not requirements |
| Specialized screening protocol (e.g., POTS, diabetes) | Cascade Protocol | Domain-specific vocabulary layers (pots:, diabetes:) provide specialized modeling while retaining standard code mappings |
These are not mutually exclusive
Many production systems use multiple approaches. A hospital EHR exchanges data via FHIR; the patient imports that data into HealthKit; a Cascade Protocol application imports it from HealthKit via the Swift SDK and stores it in a local Pod for AI agent access. The protocol is designed to be a downstream consumer of FHIR data, not a replacement for it.
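The "downstream consumer" role can be illustrated with a short sketch that renders a FHIR heart rate Observation in the Turtle shape shown in the Code Examples section. This is not the Swift SDK's ClinicalRecordImporter or the cascade convert command; the function name and its parameters are hypothetical, and only the prefixes, predicates, and provenance class names are taken from this document.

```python
PREFIXES = (
    "@prefix cascade: <https://ns.cascadeprotocol.org/core/v1#> .\n"
    "@prefix health: <https://ns.cascadeprotocol.org/health/v1#> .\n"
    "@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .\n"
)

PROVENANCE_CLASSES = {
    "ClinicalGenerated", "DeviceGenerated", "SelfReported", "AIExtracted", "AIGenerated",
}


def fhir_heart_rate_to_turtle(obs: dict, record_uri: str, provenance: str) -> str:
    """Render a FHIR heart rate Observation as a Cascade-style Turtle record."""
    if provenance not in PROVENANCE_CLASSES:
        raise ValueError(f"unknown provenance class: {provenance}")
    loinc_codes = {
        c.get("code")
        for c in obs.get("code", {}).get("coding", [])
        if c.get("system") == "http://loinc.org"
    }
    if "8867-4" not in loinc_codes:
        raise ValueError("not a heart rate Observation (LOINC 8867-4)")
    value = obs["valueQuantity"]["value"]
    # Assumption for this sketch: whole numbers are typed as integers, anything else as decimals.
    datatype = "xsd:integer" if isinstance(value, int) else "xsd:decimal"
    when = obs["effectiveDateTime"]
    body = [
        f"<{record_uri}>",
        "    a health:HeartRateRecord ;",
        f'    health:heartRate "{value}"^^{datatype} ;',
        '    health:heartRateLoincCode "8867-4" ;',
        f'    health:recordedAt "{when}"^^xsd:dateTime ;',
        f"    cascade:dataProvenance cascade:{provenance} .",
    ]
    return PREFIXES + "\n".join(body) + "\n"
```

The provenance class is a required argument because the FHIR payload alone does not say whether the value came from an EHR, a device, or the patient; the importer has to decide and record it.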