Semantic web representation for Synthea™, and a conversion tool from Synthea CSVs to Turtle (.ttl).
- clone the repo
python3 -m venv .venv
source .venv/bin/activate
poetry install
🔌 Activate the .venv environment every time before using synthea-rdf by running the source .venv/bin/activate command.
pip install synthea-rdf
All conversion configurations should be specified in configuration.yaml.
Here is a sample configuration.yaml:
model_path: synthea_ontology/synthea_ontology.ttl
synthea_csv_path: ../synthea/output/1000k/csv
output_path: result/1000k
chunk_size: 300000
include_dua: True
include_trustscore: True
skip:
- allergies.csv
- careplans.csv
- claims_transactions.csv
- claims.csv
- conditions.csv
- devices.csv
- encounters.csv
- imaging_studies.csv
- immunizations.csv
- medications.csv
- observations.csv
- organizations.csv
- patients.csv
- patient_expenses.csv
- payer_transitions.csv
do_shutdown: False
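As a rough illustration of two of the keys above, the sketch below assumes that skip lists CSV files to leave out of the conversion and that chunk_size is the number of rows handled per batch. These readings are assumptions, not a description of how synthea-rdf is implemented. The helper names files_to_convert and iter_chunks, the in-memory config dict, and the sample file names are hypothetical and exist only for this example.

```python
import csv
import io
from itertools import islice


def files_to_convert(available, skip):
    """Return the file names that are not listed under `skip`, sorted."""
    skipped = set(skip or [])
    return sorted(name for name in available if name not in skipped)


def iter_chunks(rows, chunk_size):
    """Yield lists of at most `chunk_size` rows from any iterable of rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# Hypothetical config holding a subset of the keys from the sample above.
# A tiny chunk_size is used so the batching is visible on three rows.
config = {"chunk_size": 2, "skip": ["allergies.csv", "careplans.csv"]}

# Illustrative file names; a real run would list the CSV directory instead.
available = ["allergies.csv", "careplans.csv", "payers.csv", "procedures.csv"]
selected = files_to_convert(available, config["skip"])

# In-memory stand-in for one CSV file, read in batches of chunk_size rows.
sample = io.StringIO("Id,Name\n1,a\n2,b\n3,c\n")
chunks = list(iter_chunks(csv.DictReader(sample), config["chunk_size"]))
chunk_lengths = [len(chunk) for chunk in chunks]
```

Under these assumptions, a larger chunk_size would mean fewer but bigger batches held in memory at once, and every file named under skip would simply be left out of the run.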
Once configuration.yaml is filled in, run the conversion:
python3 conversion.py
Running conversion process with TMUX
The larger the dataset, the longer the conversion takes. For large inputs, it is better to run the conversion in the background and check its progress from time to time. A convenient way to do this is to start the process in a TMUX session and detach from it. You can check the progress later by reattaching to the session.
Example:
$ tmux
$ python3 conversion.py
- Press [CTRL]+[b], then [d] to detach the TMUX session.
- Now it is okay to log off. (:warning: DO NOT SHUT DOWN THE MACHINE!!)
- Later, reattach to the session to check the progress:
$ tmux a
Use the Trust Score and Data Usage Agreement (DUA) generator to generate optional Trust Score and DUA data:
python3 trustscore_dua_generator.py