Package 'MAP'

Title: Multimodal Automated Phenotyping
Description: Electronic health records (EHR) linked with biorepositories are a powerful platform for translational studies. A major bottleneck is the ability to phenotype patients accurately and efficiently. To that end, we developed an automated high-throughput phenotyping method that integrates International Classification of Diseases (ICD) codes with narrative data extracted using natural language processing (NLP). Specifically, our proposed method, called MAP (Multimodal Automated Phenotyping), fits an ensemble of latent mixture models on aggregated ICD and NLP counts along with healthcare utilization. The MAP algorithm yields a predicted probability of the phenotype for each patient and a threshold for classifying subjects as phenotype-positive or phenotype-negative (see Katherine P. Liao, et al. (2019) <doi:10.1093/jamia/ocz066>).
Authors: Jiehuan Sun [aut, cre], Katherine Liao [aut], Sheng Yu [aut], Tianxi Cai [aut]
Maintainer: Jiehuan Sun <[email protected]>
License: GPL-3
Version: 0.1.4
Built: 2024-12-03 05:57:08 UTC
Source: https://github.com/celehs/map


MAP algorithm

Description

Main function that runs the MAP algorithm to calculate the predicted probability of a positive phenotype for each patient, based on NLP and ICD counts adjusted for healthcare utilization.
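To give a feel for what "counts adjusted for healthcare utilization" can mean in a latent mixture model, the sketch below fits a single two-component Poisson mixture by EM, using the note count as an exposure term. This is only an illustration under stated assumptions: the Poisson-with-exposure form, the simulated rates, and all variable names are hypothetical, and the code is not the package's implementation, which fits an ensemble of mixture models.

```r
# Illustrative sketch only: one two-component Poisson mixture fitted by EM.
# Assumption: count_i ~ Poisson(note_i * rate_k) for latent class k.
set.seed(1)
n    <- 400
note <- rpois(n, 10) + 5                          # utilization (exposure)
case <- rep(c(TRUE, FALSE), times = c(100, 300))  # simulated true status
y    <- rpois(n, note * ifelse(case, 0.8, 0.05))  # simulated feature counts

p  <- 0.5    # starting prevalence
r1 <- 1      # starting rate per note, phenotype-positive class
r0 <- 0.01   # starting rate per note, phenotype-negative class

for (iter in 1:200) {
  # E-step: posterior probability that each patient is in the positive class
  d1 <- p * dpois(y, note * r1)
  d0 <- (1 - p) * dpois(y, note * r0)
  w  <- d1 / (d1 + d0)
  # M-step: update the prevalence and the exposure-adjusted rates
  p  <- mean(w)
  r1 <- sum(w * y) / sum(w * note)
  r0 <- sum((1 - w) * y) / sum((1 - w) * note)
}

head(round(w, 3))  # posterior probabilities, analogous in spirit to scores
```

Dividing by the summed note counts in the M-step is what makes the estimated rates "per note", so a patient with many notes needs proportionally more mentions to look like a case.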

Usage

MAP(mat = NULL, note = NULL, yes.con = FALSE, full.output = FALSE)

Arguments

mat

Count data (sparse matrix). One column must contain the ICD counts and must be named ICD.

note

Note counts (sparse matrix) indicating healthcare utilization.

yes.con

A logical value indicating whether concomitant variables are desired. Currently not used.

full.output

A logical value indicating whether the full output is desired.
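As a rough sketch of how the two inputs might be assembled from raw data, the code below builds a sparse count matrix with a column named ICD and a matching one-column note matrix. The long-format table, the patient IDs, and the variable names are hypothetical; only sparseMatrix() and Matrix() from the Matrix package are real calls.

```r
library(Matrix)

# Hypothetical long-format counts: one row per (patient, feature) pair.
counts <- data.frame(
  patient = c("p1", "p1", "p2", "p3", "p3"),
  feature = c("ICD", "NLP", "ICD", "ICD", "NLP"),
  count   = c(4, 7, 1, 9, 12)
)
# Hypothetical note counts per patient (healthcare utilization).
notes <- c(p1 = 20, p2 = 8, p3 = 35, p4 = 5)

patients <- names(notes)
features <- c("ICD", "NLP")

# Patients with no recorded counts (here p4) become all-zero rows.
mat <- sparseMatrix(
  i = match(counts$patient, patients),
  j = match(counts$feature, features),
  x = counts$count,
  dims = c(length(patients), length(features)),
  dimnames = list(patients, features)
)

# One-column sparse matrix with rows in the same patient order as mat.
note <- Matrix(unname(notes), ncol = 1, sparse = TRUE)
```

Keeping the rows of mat and note in the same patient order matters, since nothing in the usage shown here suggests that MAP() aligns them by name.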

Value

Returns a list with the following objects:

scores

the predicted probability of the phenotype for each patient.

cut.MAP

the cutoff value that can be used to derive a binary phenotype.
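A minimal sketch of deriving the binary phenotype from these two objects follows. The numeric values are hypothetical stand-ins for res$scores and res$cut.MAP, and treating scores at or above the cutoff as positive is an assumption, since the help page does not say whether the comparison is strict.

```r
# Hypothetical stand-ins for res$scores and res$cut.MAP.
scores  <- c(0.02, 0.91, 0.47, 0.66, 0.10)
cut_map <- 0.5

# Assumption: a score at or above the cutoff is called phenotype-positive.
phenotype <- as.integer(scores >= cut_map)

table(phenotype)  # number of negatives (0) and positives (1)
```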

References

High-throughput Multimodal Automated Phenotyping (MAP) with Application to PheWAS. Katherine P. Liao, Jiehuan Sun, Tianrun A. Cai, Nicholas Link, Chuan Hong, Jie Huang, Jennifer Huffman, Jessica Gronsbell, Yichi Zhang, Yuk-Lam Ho, Victor Castro, Vivian Gainer, Shawn Murphy, Christopher J. O’Donnell, J. Michael Gaziano, Kelly Cho, Peter Szolovits, Isaac Kohane, Sheng Yu, and Tianxi Cai with the VA Million Veteran Program (2019) <doi:10.1101/587436>.

Examples

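## simulate data to test the algorithm
library(MAP)
library(Matrix)
set.seed(123)  # make the simulated counts reproducible
n <- 400
## first quarter: high counts; second quarter: low counts; remainder: zero counts
ICD <- c(rpois(n / 4, 10), rpois(n / 4, 1), rep(0, n / 2))
NLP <- c(rpois(n / 4, 10), rpois(n / 4, 1), rep(0, n / 2))
## cbind() keeps the column names, so the ICD column is named "ICD" as required
mat <- Matrix(data = cbind(ICD, NLP), sparse = TRUE)
## note counts (healthcare utilization), one per patient
note <- Matrix(rpois(n, 10) + 5, ncol = 1, sparse = TRUE)
res <- MAP(mat = mat, note = note)
head(res$scores)  # predicted probabilities
res$cut.MAP       # cutoff for deriving a binary phenotype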

MAP dictionary

Description

MAP dictionary that maps phecodes to CUIs (UMLS Concept Unique Identifiers).

Usage

phecode.cuis.list

Format

A list of length 1866.

Examples

head(phecode.cuis.list)
tail(phecode.cuis.list)
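The lookups one would typically run against such a dictionary can be sketched on a toy stand-in. The structure assumed here (a named list keyed by phecode, each element a character vector of CUIs) and every key and value in toy_dict are hypothetical placeholders; inspect phecode.cuis.list with str() before relying on this shape.

```r
# Toy stand-in for phecode.cuis.list (placeholder phecodes and CUIs).
toy_dict <- list(
  "000.1" = c("C0000001", "C0000002"),
  "000.2" = "C0000003"
)

# CUIs mapped to one phecode.
cuis <- toy_dict[["000.1"]]

# Number of CUIs mapped to each phecode.
n_cuis <- lengths(toy_dict)

# Reverse lookup: phecodes whose CUI set contains a given CUI.
has_cui  <- vapply(toy_dict, function(x) "C0000003" %in% x, logical(1))
phecodes <- names(toy_dict)[has_cui]
```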