Skip to content

antoniolopezmc/Discovering-multiple-antibiotic-resistance-phenotypes-using-diverse-top-k-subgroup-list-discovery

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery

This repository contains all the needed resources in order to reproduce the research from the paper "Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery". For this, it is necessary to execute the files in the following order:

1. Initial query

The first step consists of executing a SQL query over the MIMIC-III database, which was previously stored in a PostgreSQL database. The Initial-query/query.txt file contains this query, which produces a dataset in CSV format.

2. Preprocessing

After executing the SQL query, preprocessing is applied over the obtained dataset in order to generate the mining view. This preprocessing is included and detailed in the Preprocessing/notebook.ipynb file.

3. Experiments

Once having the mining view, it is necessary to execute the Experiments/notebook.ipynb file in order to prepare it for the experiments.

Next, the Experiments/subgroups/mine_subgroups.py file contains all the needed code in order to execute the VLSD algorithm and mine the candidate subgroups.

After that, the format of the file generated above must be transformed to be able to execute our approach. This transformation is carried out with the Experiments/phenotypes/transform_subgroups.py file.

Now, we can use this file with the candidate subgroups in order to mine diverse top-k phenotypes. This mining process is included in the Experiments/phenotypes/mine_phenotypes.py file.

When we have mined all the diverse top-k phenotypes, we select the best model using the code included in the Experiments/phenotypes/select_best_model.ipynb file.

Finally, the code to carry out the statistical validation of the best diverse top-k phenotypes is included in the Experiments/statistical_validation/notebook.ipynb file.

About

This repository contains all the resources used for the development of the paper "Discovering multiple antibiotic resistance phenotypes using diverse top-k subgroup list discovery".

Resources

Stars

Watchers

Forks

Contributors