Prerequisites
Before you begin, make sure you have the following:
- Python 3.9+ (runtime)
- pip (package manager, recommended)
Verify your Python version:
python3 --version
# Python 3.9.0 or higher
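The same check can be run from inside Python, which helps when several interpreters are installed and python3 on the command line may not be the one that runs your scripts. This is a minimal standard-library sketch; the helper name meets_minimum is illustrative and not part of the SDK.

```python
import sys

# Minimum version stated in the prerequisites above.
MIN_VERSION = (3, 9)


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the given (or running) interpreter is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); patch releases do not matter here.
    return tuple(version_info[:2]) >= tuple(minimum)


status = "OK" if meets_minimum() else "too old"
print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```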
Install the SDK
Install the core SDK with pip:
pip install cascade-protocol
The SDK supports optional extras for specific use cases:
| Extra | Install command | Adds |
|---|---|---|
| pandas | pip install "cascade-protocol[pandas]" | DataFrame conversion via to_dataframe() |
| validation | pip install "cascade-protocol[validation]" | SHACL validation via pyshacl |
| notebooks | pip install "cascade-protocol[notebooks]" | Jupyter notebook support |
| all | pip install "cascade-protocol[all]" | All of the above |
The SDK has zero mandatory runtime dependencies beyond the Python standard library. Optional extras are installed only when you need them.
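Because the extras are optional, a script may want to check which ones are present before calling features that need them. The sketch below uses importlib from the standard library. It assumes the importable module names are pandas and pyshacl, as the table suggests; the notebooks extra is left out because this page does not say which packages it installs. The name available_extras is hypothetical.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to the module it makes importable.
EXTRA_MODULES = {"pandas": "pandas", "validation": "pyshacl"}


def available_extras(modules=EXTRA_MODULES):
    """Map each extra name to True if its module can be found on the import path."""
    return {extra: find_spec(name) is not None for extra, name in modules.items()}


print(available_extras())
```

A False entry means the corresponding pip install command from the table has not been run in the current environment.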
Create your first record
Create a file called create_record.py with the following contents. The script creates a medication record and serializes it to RDF/Turtle:
from cascade_protocol import Medication, serialize
# Create a medication record with required fields
med = Medication(
    id="urn:uuid:med0-0001-aaaa-bbbb-ccccddddeeee",
    medication_name="Metoprolol Succinate",
    is_active=True,
    dose="25mg",
    data_provenance="ClinicalGenerated",
    schema_version="1.3",
)
# Serialize to Turtle
turtle = serialize(med)
print(turtle)
Run it:
python3 create_record.py
@prefix cascade: <https://ns.cascadeprotocol.org/core/v1#> .
@prefix health: <https://ns.cascadeprotocol.org/health/v1#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<urn:uuid:med0-0001-aaaa-bbbb-ccccddddeeee> a health:MedicationRecord ;
    health:medicationName "Metoprolol Succinate" ;
    health:isActive true ;
    cascade:dataProvenance cascade:ClinicalGenerated ;
    cascade:schemaVersion "1.3" ;
    health:dose "25mg" .
The serialize() function converts the Python object into a complete Turtle document with correct namespace prefixes, RDF types, and XSD datatype annotations. Python snake_case field names map automatically to the camelCase RDF predicates used in the Cascade Protocol vocabulary.
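The naming convention can be illustrated with a small conversion function. This is only a sketch of the snake_case to camelCase rule described above; the SDK's actual mapping is not shown on this page and may be table-driven, and snake_to_camel is a hypothetical name.

```python
def snake_to_camel(name):
    """Convert a snake_case field name to a camelCase local name."""
    head, *rest = name.split("_")
    # The first word stays lowercase; each later word gets a capital initial.
    return head + "".join(part.capitalize() for part in rest)


for field in ("medication_name", "is_active", "dose", "schema_version"):
    print(field, "->", snake_to_camel(field))
```

Applied to the fields in the example record, this yields the local names seen in the Turtle output, such as medicationName and isActive.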
Supported data types
The SDK supports twelve data model classes covering medications, conditions, allergies, lab results, vital signs, immunizations, procedures, family history, insurance coverage, patient profile, activity, and sleep. See the schema reference for the full list.
Validate your data
The SDK includes a validate() function that checks Turtle output against the Cascade Protocol's SHACL constraint shapes. Install the validation extra first, then call validate() on any serialized Turtle string:
pip install "cascade-protocol[validation]"
from cascade_protocol import Medication, serialize, validate
med = Medication(
    id="urn:uuid:med0-0001-aaaa-bbbb-ccccddddeeee",
    medication_name="Metoprolol Succinate",
    is_active=True,
    dose="25mg",
    data_provenance="ClinicalGenerated",
    schema_version="1.3",
)
turtle = serialize(med)
# Validate structural integrity against SHACL shapes
result = validate(turtle)
print("Valid:", result.is_valid)
print("Errors:", result.errors)
Valid: True
Errors: []
The validator uses SHACL (Shapes Constraint Language) to check required fields, data types, value constraints, and vocabulary usage against the Cascade Protocol schema.
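In a pipeline you may prefer to stop on invalid data rather than print the result. The sketch below wraps any object that exposes the is_valid and errors attributes shown above. The helper ensure_valid, the choice of ValueError, and the sample error text are assumptions for illustration; stand-in objects are used so the example runs without the SDK installed.

```python
from types import SimpleNamespace


def ensure_valid(result):
    """Return `result` unchanged if valid, otherwise raise ValueError with the errors."""
    if not result.is_valid:
        details = "; ".join(str(error) for error in result.errors)
        raise ValueError(f"SHACL validation failed: {details}")
    return result


# Stand-ins for the object returned by validate(); the error text is made up.
ok = SimpleNamespace(is_valid=True, errors=[])
bad = SimpleNamespace(is_valid=False, errors=["dose: missing value"])

ensure_valid(ok)
try:
    ensure_valid(bad)
except ValueError as exc:
    print(exc)
```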
Open a Pod
A Pod is the Cascade Protocol's local data container — a directory of RDF/Turtle files organized by data category. Use Pod.open() to open an existing Pod and pod.query() to read records by type:
from cascade_protocol import Pod
# Open a Cascade Pod from a local directory
pod = Pod.open("./my-pod")
# Query by data type
meds = pod.query("medications")
vitals = pod.query("vital-signs")
profile = pod.query("patient-profile")
# Iterate records
for med in meds:
    print(med.medication_name, med.dose)
# Query other data categories
conditions = pod.query("conditions")
labs = pod.query("lab-results")
allergies = pod.query("allergies")
The pod.query() method reads the appropriate .ttl file from the Pod directory, parses the Turtle, and returns a typed result set. Results are Python objects with the same field names as the model classes.
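Since a Pod is a plain directory of Turtle files, you can inspect it with the standard library before querying it. The sketch below lists every .ttl file under a directory. It makes no assumption about how the SDK names or nests those files, and list_turtle_files is a hypothetical helper, not an SDK function.

```python
from pathlib import Path


def list_turtle_files(pod_dir):
    """Return the relative paths of all .ttl files under `pod_dir`, sorted."""
    root = Path(pod_dir)
    # rglob walks subdirectories too, in case categories are nested.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.ttl"))
```

Calling list_turtle_files("./my-pod") on the Pod opened above would show which Turtle files are available to read.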
You can also parse Turtle strings directly if you have Turtle data from another source:
from cascade_protocol.deserializer import parse, parse_one
# Parse multiple records from a Turtle string
meds = parse(turtle_string, "MedicationRecord")
# Parse a single record (raises if zero or more than one)
med = parse_one(turtle_string, "MedicationRecord")
print(med.medication_name)
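The exactly-one contract of parse_one can be pictured as a small wrapper around a list of parsed records. This is a sketch of the semantics only: the exception type the SDK raises is not stated on this page, so ValueError is an assumption, and exactly_one is a hypothetical name.

```python
def exactly_one(records):
    """Return the only item in `records`; raise ValueError for zero or several."""
    items = list(records)
    if len(items) != 1:
        raise ValueError(f"expected exactly one record, found {len(items)}")
    return items[0]


print(exactly_one(["only-record"]))
```

Use the plural parse() when the Turtle string may contain any number of records, and reserve the single-record form for inputs you expect to hold exactly one.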
Pandas integration
Install the pandas extra, then call .to_dataframe() on any query result to convert records into a pandas DataFrame. This is useful for data analysis, visualization, and export workflows:
pip install "cascade-protocol[pandas]"
from cascade_protocol import Pod, Medication
pod = Pod.open("./my-pod")
meds = pod.query("medications")
# Convert to pandas DataFrame
df = meds.to_dataframe()
print(df.head())
# Standard pandas operations work as expected
active_meds = df[df["is_active"] == True]
print(f"Active medications: {len(active_meds)}")
# Reconstruct typed models from DataFrame rows
restored = Medication.from_dataframe(df)
for med in restored:
    print(med.medication_name, med.dose)
The from_dataframe() class method reconstructs typed model instances from a DataFrame, preserving all field values and provenance metadata. This enables round-trip workflows: Pod → DataFrame → analysis → back to models → re-serialize.
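The round trip itself (models to rows, filter, rows back to models) can be sketched without pandas or the SDK. The example below uses a dataclass and plain dicts as stand-ins for the Medication model and DataFrame rows. MedicationRow and the second sample record are hypothetical, and the SDK's own conversion may handle more fields and types.

```python
from dataclasses import asdict, dataclass


@dataclass
class MedicationRow:
    """Hypothetical stand-in for the SDK's Medication model."""
    medication_name: str
    is_active: bool
    dose: str


records = [
    MedicationRow("Metoprolol Succinate", True, "25mg"),
    MedicationRow("Sample Medication B", False, "10mg"),  # made-up sample row
]

# Models -> rows: one dict per record, keyed by field name.
rows = [asdict(record) for record in records]

# Analysis step: keep only the active medications.
active_rows = [row for row in rows if row["is_active"]]

# Rows -> models: rebuild typed instances from the surviving rows.
restored = [MedicationRow(**row) for row in active_rows]
print(restored)
```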
Jupyter notebooks
The SDK ships with three Jupyter notebooks that demonstrate common workflows interactively. Install the notebooks extra, then open them from the SDK directory:
pip install "cascade-protocol[notebooks]"
- Quick Start (notebooks/01-quickstart.ipynb): Covers installation, creating records, serialization, and validation with runnable cells.
- Pod Exploration (notebooks/02-pod-exploration.ipynb): Opening a Pod, querying multiple data types, and building summary visualizations with pandas and matplotlib.
- Data Analysis (notebooks/03-data-analysis.ipynb): Statistical analysis of wellness time-series data, including trend detection, provenance filtering, and correlation across data types.
The notebooks use the reference patient Pod included in the SDK repository, so you can run them immediately without creating your own data.
Privacy & Security
The Python SDK is designed with the same privacy-first principles as the rest of the Cascade Protocol.
- Zero network calls during normal operation. All processing is local.
- No telemetry or analytics. The package does not phone home.
- Data never leaves your machine. Serialization and validation run entirely in-process.
- Delete your Pod directory at any time to remove all data:
rm -rf ./my-pod
Read the Security & Compliance Guide for the full trust model and data flow architecture.
Next Steps
Now that you have the SDK working, here is where to go next: