ADNI metabolomics QC
Metabolomics data processing for the ADNI data sets. Currently, it supports only the Biocrates p180 and Nightingale NMR platforms.
Installation
metabo_adni is distributed as a Python package, so install it by running:
pip install metabo_adni
Usage
In the folder with the required datasets, simply run:
clean_files
metabo_adni will then run with the default parameters. Note: do not rename the original files.
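For scripted runs, the same command can be launched from Python. The sketch below only assembles the argument list. It assumes that the -D and -P options described under Commands take their value as the following argument, which is the usual command-line convention but is not confirmed here. The helper name build_clean_files_command is hypothetical and not part of metabo_adni.

```python
def build_clean_files_command(directory=".", platform="p180"):
    """Assemble the argument list for the clean_files command.

    Hypothetical helper: assumes -D and -P are each followed by their value.
    """
    if platform not in ("p180", "nmr"):
        raise ValueError("platform must be 'p180' or 'nmr'")
    return ["clean_files", "-D", str(directory), "-P", platform]


command = build_clean_files_command("data", "nmr")
print(" ".join(command))

# To actually run it (requires metabo_adni to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```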
Commands
-D: define the directory where the files are located. Default: current working directory
-P: define the platform, either p180 or nmr. Default: p180
-F: define the fasting file. Default: BIOMARK.csv
-L: define the directory where the LOD p180 files are located. Default: current working directory
--mmc: remove metabolites with missing proportions greater than cutoff. Default: 0.2
--mpc: remove participants with missing proportions greater than cutoff. Default: 0.2
--cv: remove metabolites with CV values greater than cutoff. Default: 0.2
--icc: remove metabolites with ICC values lower than cutoff. Default: 0.65
--log2: apply log2 transformation to metabolite concentration values
--merge: merge data frames across cohorts
--zscore: apply z-score transformation to metabolite concentration values
--winsorize: winsorize extreme values (more than 3 standard deviations from the mean)
--remove-moutliers: remove multivariate outliers using the Mahalanobis distance
--residualize-meds: replace metabolite values with residuals from a regression with medication intake. Note that residuals are scaled to unit variance
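To make the cutoff options more concrete, here is a minimal sketch of three of the steps: the missing-proportion filter (--mmc), the CV filter (--cv), and winsorizing (--winsorize). It is not the package's implementation. It uses plain Python lists with None for missing values, assumes CV is the sample standard deviation divided by the mean, and assumes winsorizing clips values to the mean plus or minus 3 standard deviations. Which samples the package uses to compute CV is not stated here, so the sketch simply uses the values it is given. All function names are hypothetical.

```python
from statistics import mean, stdev


def missing_proportion(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, ignoring missing values."""
    observed = [v for v in values if v is not None]
    return stdev(observed) / mean(observed)


def winsorize(values, n_std=3.0):
    """Clip observed values to mean +/- n_std sample standard deviations."""
    observed = [v for v in values if v is not None]
    center, spread = mean(observed), stdev(observed)
    low, high = center - n_std * spread, center + n_std * spread
    # Missing values are passed through unchanged.
    return [v if v is None else min(max(v, low), high) for v in values]


def filter_metabolites(data, missing_cutoff=0.2, cv_cutoff=0.2):
    """Keep metabolites whose missing proportion and CV are within the cutoffs.

    `data` maps a metabolite name to its list of values across samples.
    The missing-proportion check runs first, so CV is only computed for
    metabolites that passed it.
    """
    return {
        name: values
        for name, values in data.items()
        if missing_proportion(values) <= missing_cutoff
        and coefficient_of_variation(values) <= cv_cutoff
    }


example = {
    "stable": [10.0, 10.5, 9.5, 10.0, 10.0],
    "noisy": [1.0, 5.0, 10.0, 2.0, 8.0],
    "sparse": [1.0, None, None, 1.1, 1.0],
}
print(sorted(filter_metabolites(example)))
```

With the default cutoffs of 0.2, a metabolite is dropped either because more than 20% of its values are missing or because its CV exceeds 0.2.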
Files
The following are the files needed to run metabo_adni.
NMR
For Nightingale's NMR metabolomics platform, the recommended file is the one reanalyzed using the "2020 update", an updated quantification library.
The name of the item to download is ADMC Nightingale Platform NMR Post-Unblinding Re-Analysis of Lipoproteins and Metabolites [ADNI1,GO,2], which downloads the ADNINIGHTINGALE2.csv file.
Biocrates p180
For the Biocrates p180 platform, several files need to be downloaded that contain the proper data, QC tags, and LOD values:
Data
Four datasets contain the proper data, divided by method (FIA, UPLC) and cohort (ADNI1, ADNI2GO):
ADMCDUKEP180FIA_01_15_16.csv, obtained from the ADMC Duke Biocrates P180 Kit Flow injection analysis [ADNI1] item
ADMCDUKEP180FIAADNI2GO.csv, obtained from the ADMC Duke Biocrates p180 Kit Flow injection analysis [ADNIGO,2] item
ADMCDUKEP180UPLC_01_15_16.csv, obtained from the ADMC Duke Biocrates P180 Kit Ultra Performance Liquid Chromatography [ADNI1] item
ADMCDUKEP180UPLCADNI2GO.csv, obtained from the ADMC Duke p180 Ultra Performance Liquid Chromatography [ADNIGO,2] item. Note: make sure to add one double-quote character (") at the end of this file; otherwise, pandas will not read the file correctly.
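The quote fix can be done by hand in a text editor or scripted. A minimal sketch follows, assuming the problem is a quoted field left unterminated at the end of the file, so that the file holds an odd number of double-quote characters. The function name is hypothetical, and the file is modified in place, so keep a copy of the original download.

```python
from pathlib import Path


def ensure_closing_quote(path):
    """Append a double quote if the file contains an odd number of them.

    Returns True when the file was modified, False when it was left alone.
    Works on raw bytes so the file's encoding and line endings are untouched.
    """
    path = Path(path)
    data = path.read_bytes()
    if data.count(b'"') % 2 == 0:
        return False  # quotes already balanced, nothing to do
    # Place the quote before any trailing newline so it closes the last field.
    stripped = data.rstrip(b"\r\n")
    trailing = data[len(stripped):]
    path.write_bytes(stripped + b'"' + trailing)
    return True
```

Calling ensure_closing_quote("ADMCDUKEP180UPLCADNI2GO.csv") a second time leaves the file unchanged, because the quotes are then balanced.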
LOD
LOD values for the ADNI2GO cohort can be found in the supplementary material ADMC Duke p180 Supplementary Files [ADNIGO,2] item.
LOD values for the ADNI1 cohort can be found in the supplementary material ADMC Supplemental Materials item.
Fasting
To remove non-fasting participants, we need the BIOMARK.csv file, downloaded from the Biomarker Samples [ADNI1,GO,2,3,4] item from ADNI.
Medications
Medication intake information can be found in the ADMC Duke ADNI2/GO Drug Classes item, which downloads the ADMCPATIENTDRUGCLASSES_20170512.csv file.
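Because metabo_adni relies on the original file names, a quick pre-flight check can catch a missing or renamed download before running clean_files. The sketch below is not part of the package. The helper name and the grouping keys are hypothetical, the file names are the ones listed in the sections above, and the LOD files are left out because their names are not given here.

```python
from pathlib import Path

# File names as listed in this README, grouped by the section that mentions them.
EXPECTED_FILES = {
    "nmr": ["ADNINIGHTINGALE2.csv"],
    "p180": [
        "ADMCDUKEP180FIA_01_15_16.csv",
        "ADMCDUKEP180FIAADNI2GO.csv",
        "ADMCDUKEP180UPLC_01_15_16.csv",
        "ADMCDUKEP180UPLCADNI2GO.csv",
    ],
    "fasting": ["BIOMARK.csv"],
    "medications": ["ADMCPATIENTDRUGCLASSES_20170512.csv"],
}


def missing_files(directory, groups):
    """Return the expected file names from `groups` that are absent in `directory`."""
    folder = Path(directory)
    return [
        name
        for group in groups
        for name in EXPECTED_FILES[group]
        if not (folder / name).is_file()
    ]


if __name__ == "__main__":
    # Example: check the current folder before a p180 run with fasting removal.
    print(missing_files(".", ["p180", "fasting"]))
```

An empty list means every file in the selected groups was found under its original name.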