⚕️ Interpretable Clinical Decision Rules ⚕️️

Validating and deriving clinical-decision rules. Work-in-progress.

This is a collaborative repository intended to validate and derive clinical-decision rules. We use a unified pipeline across a variety of contributed datasets to vet previous modeling practices for clinical decision rules. Additionally, we hope to externally validate the rules under study here with data from UCSF.

Rule derivation datasets

Dataset	Task	Size	References	Processed
iai_pecarn	Predict intra-abdominal injury requiring acute intervention before CT	12,044 patients, 203 with IAI-I	📄, 🔗	✅
tbi_pecarn	Predict traumatic brain injuries before CT	42,412 patients, 376 with ciTBI	📄, 🔗	❌
csi_pecarn	Predict cervical spine injury in children	3,314 patients, 540 with CSI	📄, 🔗	❌
tig_pecarn	Predict bacterial/non-bacterial infections in febrile infants from RNA transcriptional biosignatures	279 patients, ? with infection	🔗	❌
exxagerate	Predict 30-day mortality for acute exacerbations of chronic obstructive pulmonary disease (AECOPD)	1,696 patients, 17 mortalities	📄, 🔗	❌
heart_disease_uci	Predict heart disease presence from basic attributes / screening	920 patients, 509 with heart disease	📄, 🔗	❌

Research paper 📄, Data download link 🔗

Datasets are all tabular (or at least have interpretable input features), reasonably large (e.g. have at least 100 positive and negative cases), and have a binary outcome. For PECARN datasets, please read and agree to the research data use agreement on the PECARN website.

Contributing checklist

To contribute a new project (e.g. a new dataset + modeling), create a pull request following the steps below. The easiest way to do this is to copy-paste an existing project (e.g. iai_pecarn) into a new folder and then edit that one.

Helpful docs: Collaboration details | Lab writeup | Slides

[ ] Repo set up
[ ] Create a fork of this repo (see tutorial on forking/merging here)
[ ] Install the repo as shown below
[ ] Select a dataset - once you've selected, open an issue in this repo with the name of the dataset + a brief description so others don't work on the same dataset
[ ] Assign a project_name to the new project (e.g. iai_pecarn)
[ ] Data preprocessing
[ ] Download the raw data into data/{project_name}/raw
- Don't commit any very large files
[ ] Copy the template files from rulevetting/projects/iai_pecarn to a new folder rulevetting/projects/{project_name}
- [ ] Rewrite the functions in dataset.py for processing the new dataset (e.g. see the dataset for iai_pecarn)
- [ ] Document any judgement calls you aren't sure about using the dataset.get_judgement_calls_dictionary function
  - See the template file for documentation of each function or the API documentation
- Notebooks / helper functions are optional, all files should be within rulevetting/projects/{project_name}
[ ] Data description
[ ] Describe each feature in the processed data in a file named data_dictionary.md
[ ] Summarize the data and the prediction task in a file named readme.md. This should include basic details of data collection (who, how, when, where), why the task is important, and how a clinical decision rule may be used in this context. Should also include your names/affiliations.
[ ] Modeling
[ ] Baseline model - implement baseline.py for predicting given a baseline rule (e.g. from the existing paper)
- should override the model template in a class named Baseline
[ ] New model - implement model_best.py for making predictions using your newly derived best model
- also should override the model template in a class named Model
[ ] Lab writeup (see instructions)
[ ] Save writeup into writeup.pdf + include source files
Should contain details on exploratory analysis, modeling, validation, comparisons with baseline, etc.
[ ] Submitting
[ ] Ensure that all tests pass by running pytest --project {project_name} from the repo directory
[ ] Open a pull request and it will be reviewed / merged
[ ] Reviewing submissions
[ ] Each pull request will be reviewed by others before being merged

Installation

Note: requires python 3.7 and pytest (for running the automated tests). It is best practice to create a venv or pipenv for this project.

python -m venv rule-env
source rule-env/bin/activate

Then, clone the repo and install the package and its dependencies.

git clone https://github.com/Yu-Group/rule-vetting
cd rule-vetting
pip install -e .

Now run the automatic tests to ensure everything works.

pytest --project iai_pecarn

To use with jupyter, might have to add this venv as a jupyter kernel.

python -m ipykernel install --user --name=rule-env

Clinical Trial Datasets

Dataset	Task	Size	References	Processed
bronch_pecarn	Effectiveness of oral dexamethasone for acute bronchiolitisintra-abdominal injury requiring acute intervention before CT	600 patients, 50% control	📄, 🔗	❌
gastro_pecarn	Impact of Emergency Department Probiotic Treatment of Pediatric Gastroenteritis	886 patients, 50% control	📄, 🔗	❌

Research paper 📄, Data download link 🔗

Reference

Background reading

Be familiar with the imodels: package
See the TRIPOD statement on medical reporting
See the Veridical data science paper

Related packages

imodels: rule-based modeling
veridical-flow: stability-based analysis
gplearn: symbolic regression/classification
pygam: generative additive models
interpretml: boosting-based gam

Updates

For updates, star the repo, see this related repo, or follow @csinva_
Please make sure to give authors of original datasets appropriate credit!
Contributing: pull requests very welcome!

Related open-source collaborations

The imodels package maintains many of the rule-based models here
Inspired by the BIG-bench effort.
See also NL-Augmenter and NLI-Expansion

Expand source code

"""
.. include:: ../readme.md
"""
import os
from os.path import join as oj

MRULES_PATH = os.path.dirname(os.path.abspath(__file__))
REPO_PATH = os.path.dirname(MRULES_PATH)
DATA_PATH = oj(REPO_PATH, 'data')
PROJECTS_PATH = oj(MRULES_PATH, 'projects')
AUTOGLUON_CACHE_PATH = oj(DATA_PATH, 'autogluon_cache')

Sub-modules

rulevetting.api
rulevetting.projects
rulevetting.templates