--- title: "Pediatric Complex Chronic Conditions" author: - Peter DeWitt - Seth Russell - James Feinstein - Tell Bennett date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{pccc-overview} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set(collapse = TRUE, comment = "#>") ``` # Introduction This vignette describes how the `pccc` package generates the Complex Chronic Condition Categories (CCC) from ICD-9 and ICD-10 codes. A CCC is “any medical condition that can be reasonably expected to last at least 12 months (unless death intervenes) and to involve either several different organ systems or 1 organ system severely enough to require specialty pediatric care and probably some period of hospitalization in a tertiary care center." The categorization is based on the work of Feudtner et al. (2000 & 2014), as referenced below. A supplemental reference document showing the lists of codes for each category was published as a supplement to Feudtner et al. (2014) and we have made it available as part of the `pccc` package. After installing the package, you can find the file on your system with the below `system.file` call. Open the file with your preferred/available program for `.docx` files (Word, etc.). ```{r, eval = FALSE} system.file("pccc_references/Categories_of_CCCv2_and_Corresponding_ICD.docx", package = "pccc") ``` To evaluate the code chunks in this example you will need to load the following R packages. ```{r, message = FALSE} library(pccc) library(dplyr) ``` # Logic employed There are 12 total categories of CCCs used in this package. The first group of 10 are mutually exclusive - only one of them can be derived from a single ICD code: * Neurologic and Neuromuscular * Cardiovascular * Respiratory * Renal and Urologic * Gastrointestinal * Hematologic or immunologic * Metabolic * Other Congenital or Genetic Defect * Malignancy * Premature and Neonatal The last 2 can be be selected in addition to the above codes - for example, one ICD code could generate CCC categorization as both Gastrointestinal and Technology Dependency: * Technology Dependency * Transplant To see actual specific ICD codes by category, see [pccc-icd-codes](pccc-icd-codes.html). # Generating CCC categories from ICD codes The `ccc` function is the workhorse here. Simply put, a user will provide ICD codes as strings and `ccc` will return CCC categories. CCC codes for ICD-9-CM are matched on substrings and ICD 10 codes are matched on full codes, but the `ccc` function uses the same "starts with substring" matching logic for both, except in a few cases described in the next paragraph. ## Substring matching exceptions Some datasets may contain different degrees of specificity of ICD-9-CM codes, which can lead to issues with substring matching for certain codes. For example, consider a patient with _Congenital hereditary muscular dystrophy_. The least specific ICD-9-CM code for _Muscular dystrophy_ is 359, which is a CCC code. The exact ICD-9-CM code specifying _Congenital hereditary muscular dystrophy_ is 3590. Even when describing the same patient, one dataset may contain the 359 code while another dataset may contain the 3590 code. If we use substring matching logic above and match on 359, we would capture the patient in both datasets. However, we would also capture non-CCC diagnoses like 3594, _Toxic myopathy_. If we use substring matching logic and match on 3590, we would only capture the patient in the dataset with more specific ICD-9-CM codes. We address this problem by exact matching for less specific codes (e.g., the code 359 will match only if the dataset contains the 3-digit code 359) and substring matching for more specific codes (e.g., code 3590 will match any code _beginning with_ 3590). This approach improves the sensitivity of detecting CCCs in datasets with less specific codes (e.g. 359) and also reduces misclassification errors in datasets with more specific codes (e.g. 3590). **We have listed these exact match exceptions under their corresponding CCC category in the [pccc-icd-codes](pccc-icd-codes.html) description.** # Preparing ICD-9-CM and ICD-10-CM codes for analysis using the PCCC package **Users of the `pccc` package will need to pre-process the ICD-9 and ICD-10 codes in their data so that the strings are formatted in the way that the `pccc` package will recognize them.** Specific rules to format ICD Codes correctly: * Codes should be alphanumeric only (e.g. _Diabetes with renal manifestations, type II or unspecified type, uncontrolled_ should be sent as 25042) * Codes should NOT contain periods, spaces or other separator characters periods (e.g. ICD-9-CM 04.92 will only be matched by the string "0492") * ICD-9-CM codes should be at a minimum 3 digits long: * Codes less than 10 should be left padded with 2 zeros. E.g. _Cholera due to vibrio cholerae el tor_, ICD-9-CM 001.1, should be sent as 0011) * Codes less than 100 should be left padded with 1 zero. E.g. _Whooping cough, unspecified organism_, ICD-9-CM 033.9, should be sent as 0339) Potential issues with improperly formatted ICD codes: * All codes in all categories employ "starts with substring" matching logic. Because of this, if a code to be evaluated starts with a code listed in one of the CCC categories, a match will be found. As an example, if a record with an ICD-9-CM procedure code of "0492,25042" is passed due to failure to properly parse an input file, PCCC would indicate a match for the Neuromuscular CCC since one of the Neuromuscular CCC procedure substrings is 0492. * CCCs are matched in the order of the CCCs shown in the "Logic employed" section. Once a match is found, other categories are not evaluated with the exception of Technology Dependency and Transplant CCCs. * If there are changes in either ICD-9-CM or ICD-10-CM, this library may require updating to continue functioning correctly. Users of PCCC may find the R Package [ICD](https://jackwasey.github.io/icd/) useful. # PCCC Examples To illustrate the how the input formatting impacts the identification of a CCC, consider the data `data.frame` named `dat` below. These data have information about three patients (A-C). Each subject has the same ICD-9-CM diagnosis code (e.g. _Hypertrophic obstructive cardiomyopathy_, ICD-9-CM 425.11, which should be sent as 4251) and the same ICD-9-CM procedure code (e.g. _Heart transplantation_, ICD-9-CM 37.51, which should be sent as 3751), but each input is formatted differently. Based on the ICD-9-CM diagnosis code, the `ccc` function will only identify subject `A` as having a CCC. Based on the ICD-9-CM procedure code, the `ccc` function will only identify subject `B` as having a CCC and will also flag the Transplantation category. ## Basic Example ```{r} dat <- data.frame(ids = c("A", "B", "C"), dxs = c("4251", "425.1", "425.1"), procs = c("37.51", "3751", "37.51")) dat ccc(dat, id = ids, dx_cols = dxs, pc_cols = procs, icdv = 9) ``` ## Extended Example This example used a tool developed by Seth Russell (available at [icd_file_generator](https://github.com/magic-lantern/icd_file_generator)) to create a sample data file for ICD-9-CM and ICD-10-CM. The generated data files contain randomly generated ICD codes for 1,000 patients and is comprised of 10 columns of diagnosis codes (d_cols), 10 columns of procedure codes (p_cols), and 10 columns of other data (g_cols). Sample of how ICD-9-CM test file was generated: ```{r eval = FALSE} pccc_icd9_dataset <- generate_sample( v = 9, n_rows = 10000, d_cols = 10, p_cols = 10, g_cols = 10 ) save(pccc_icd9_dataset, file="pccc_icd9_dataset.rda") ``` Example using sample patient data set: ```{r} library(dplyr) library(pccc) ccc_result <- ccc(pccc::pccc_icd9_dataset[, c(1:21)], # get id, dx, and pc columns id = id, dx_cols = dplyr::starts_with("dx"), pc_cols = dplyr::starts_with("pc"), icdv = 09) # review results head(ccc_result) # view number of patients with each CCC sum_results <- dplyr::summarize_at(ccc_result, vars(-id), sum) %>% print.data.frame # view percent of total population with each CCC dplyr::summarize_at(ccc_result, vars(-id), mean) %>% print.data.frame ``` # References * Feudtner C, et al. Pediatric complex chronic conditions classification system version 2: updated for ICD-10 and complex medical technology dependence and transplantation, BMC Pediatrics, 2014, 14:199, DOI: 10.1186/1471-2431-14-199. * Feudtner C, et al. Pediatric deaths attributable to complex chronic conditions: a population-based study of Washington State, 1980-1997. Pediatrics. 2000;106(1 Pt 2):205-209.