Title: | Healthcare Analysis Methods |
Version: | 1.0.0 |
Description: | Conducts analyses for healthcare program evaluations or intervention studies. Calculates regression analyses for standard ordinary least squares (OLS or linear) or logistic models. Performs regression models used for causal modeling such as differences-in-differences (DID) and interrupted time series (ITS) models. Provides limited interpretations of model results and a ranking of variable importance in models. Performs propensity score models, top-coding of model outcome variables, and can return new data with the newly formed variables. Also calculates Cronbach's alpha for various scale items (e.g., survey questions). See the GitHub URL for examples in the README file. For more details on the statistical methods, see Allen & Yen (1979, ISBN:0-8185-0283-5), Angrist & Pischke (2009, ISBN:9780691120355), Harrell (2016, ISBN:978-3-319-19424-0), Kline (1999, ISBN:9780415211581), and Linden (2015) <doi:10.1177/1536867X1501500208>. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Depends: | R (≥ 2.10) |
LazyData: | true |
URL: | https://github.com/szuniga07/ham |
BugReports: | https://github.com/szuniga07/ham/issues |
Imports: | methods |
Suggests: | testthat (≥ 3.0.0) |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-08-23 03:28:31 UTC; szuni |
Author: | Stephen Zuniga |
Maintainer: | Stephen Zuniga <rms.shiny@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2025-08-28 08:50:11 UTC |
Calculates Cronbach's alpha on scale items
Description
Calculates Cronbach's alpha for specified items from a data frame. Cronbach's alpha is a formula for estimating the internal consistency reliability of a measurement instrument such as survey items (see Allen & Yen, 1979; Kline, 1999). Items can have 2 or more response categories (e.g., 5-point scales), and a scale must contain 2 or more items.
Usage
alpha(items, data)
Arguments
items |
Vector of item names that form a scale (e.g., 5-point Likert scales) |
data |
Data frame object. |
Value
A list object with Cronbach's alpha summary statistics.
References
Allen, M. J., & Yen, W. M. (1979). Introduction to Measurement Theory. Brooks/Cole. ISBN: 0-8185-0283-5.
Kline, P. (1999). Handbook of Psychological Testing (2nd ed). Routledge, New York. ISBN: 9780415211581.
Examples
alpha(items=c("i1","i2","i3","i4","i5"), data=cas)
# remove i1 as suggested in the previous example, returns higher alpha
alpha(items=c("i2","i3","i4","i5"), data=cas)
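The quantity described above can be illustrated with base R alone. The following is a minimal sketch that assumes the classical variance form of Cronbach's alpha, k/(k-1) * (1 - sum of item variances / variance of the total score). The helper name cronbach_alpha and the simulated data are hypothetical and are not part of the package, whose alpha function may differ in its handling of missing values and in what it returns.

```r
# Hypothetical helper (not part of the package): Cronbach's alpha from the
# classical variance formula, k/(k-1) * (1 - sum(item var) / var(total)).
cronbach_alpha <- function(items, data) {
  x <- stats::na.omit(data[, items, drop = FALSE])  # complete cases only (assumption)
  k <- ncol(x)
  if (k < 2) stop("a scale needs at least 2 items")
  item_var  <- vapply(x, stats::var, numeric(1))    # variance of each item
  total_var <- stats::var(rowSums(x))               # variance of the summed scale
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Simulated 5-point items that share a common component (illustrative only)
set.seed(1)
common <- rnorm(100)
sim <- as.data.frame(lapply(1:5, function(j) {
  pmin(5, pmax(1, round(3 + common + rnorm(100, sd = 0.8))))
}))
names(sim) <- paste0("i", 1:5)

cronbach_alpha(c("i1", "i2", "i3", "i4", "i5"), sim)
```

Items that move together inflate the variance of the total score relative to the sum of the item variances, which is what pushes alpha toward 1.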
Assess models with regression
Description
Fits ordinary least squares (OLS) and logistic models, as well as models for causal inference such as differences-in-differences (DID) and interrupted time series (ITS). Run these models to evaluate program performance or test intervention effects (e.g., healthcare programs). Options are available for top coding the outcome variable and for propensity scores. New data can optionally be returned that include these additional variables and the constructed variables used for DID and ITS models.
Usage
assess(
formula,
data,
regression = "none",
did = "none",
its = "none",
intervention = NULL,
int.time = NULL,
treatment = NULL,
interrupt = NULL,
topcode = NULL,
propensity = NULL,
newdata = FALSE
)
Arguments
formula |
a formula object. Use 'Y ~ .' in DID and ITS models to only specify the constructed model variables (e.g., right side of the DID model: Y ~ Post.All + Int.Var + DID). If regression="ols" or regression="logistic", 'Y ~ .' will use all variables in the data.frame as is standard in formulas. |
data |
a data.frame in which to interpret the variables named in the formula. |
regression |
Select a regression method for standard regression models (i.e., neither DID nor ITS). Options are regression="ols" (ordinary least squares AKA linear) or regression="logistic". Default is regression="none" for no standard regression model. |
did |
option for Differences-in-Differences (DID) regression. Select did="two" for models with only 2 time points (e.g., pre/post-test). Select did="many" for >= 3 time points (e.g., monthly time points in 12 months of data). Default is did="none" for no DID. |
its |
option for Interrupted Time Series (ITS) regression. Select its="one" for one group (e.g., intervention only). Select its="two" for two groups (intervention and control). Default is its="none" for no ITS. |
intervention |
optional intervention variable name selected for DID, ITS, and propensity score models that indicate which cases are in the intervention or not. |
int.time |
optional intervention time variable name selected for DID or ITS models. This indicates the duration of time relative to when the intervention started. |
treatment |
optional treatment start period value selected for DID models. Select 1 value from 'int.time' to indicate the start of the intervention (e.g., treatment = 5). |
interrupt |
optional interruption (or intervention) period value(s) selected for ITS models. Select 1 or 2 values from 'int.time' to indicate the start and/or key intervention periods (e.g., interrupt = c(5, 9)). |
topcode |
optional value selected to top code Y (or left-hand side) of the formula. Analyses will be performed using the new top coded variable. |
propensity |
optional character vector of variable names to perform a propensity score model. This requires the 'intervention' option to be selected. All models will include 'pscore' (propensity score) in the analysis as a covariate adjustment using the propensity score. |
newdata |
optional logical value that indicates if you want the new data returned. newdata=TRUE will return the data with any new columns created from the DID, ITS, propensity score, or top coding. The default is newdata=FALSE. No new data will be returned if none was created. |
Value
a list of results from the selected regression models, including new data if requested. Also returns relevant model information such as variable names, type of analysis, formula, study information, and a summary of ITS effects if analyzed.
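The topcode and propensity options can be pictured with a short base R sketch. It assumes that top coding caps the outcome at the chosen value and that the propensity score is the fitted probability from a logistic model of the intervention indicator on the listed covariates. Both are common definitions, but the package's internal implementation is not shown in this manual, so treat the details as assumptions. The simulated data are illustrative; the column names top.los and pscore follow the examples in this manual.

```r
set.seed(42)
n <- 200
dat <- data.frame(
  age    = round(runif(n, 20, 90)),
  female = rbinom(n, 1, 0.5),
  risk   = runif(n)
)
# Simulated program participation and length of stay (illustrative only)
dat$program <- rbinom(n, 1, plogis(-1 + 2 * dat$risk))
dat$los <- 3 + 2 * dat$risk - 0.5 * dat$program + rexp(n)

# Top coding (assumed definition): cap the outcome at a chosen ceiling
cap <- unname(quantile(dat$los, 0.95))
dat$top.los <- pmin(dat$los, cap)

# Propensity score (assumed definition): predicted probability of intervention
ps_fit <- glm(program ~ female + age + risk, data = dat, family = binomial)
dat$pscore <- fitted(ps_fit)

# Outcome model with the propensity score as a covariate adjustment
fit <- lm(top.los ~ program + pscore, data = dat)
coef(summary(fit))
```

Capping limits the influence of extreme stays on the fit, while the pscore covariate adjusts the program coefficient for measured differences between the groups.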
References
Angrist, J. D., & Pischke, J. S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. ISBN: 9780691120355.
Linden, A. (2015). Conducting Interrupted Time-series Analysis for Single- and Multiple-group Comparisons. The Stata Journal, 15, 2, 480-500. https://doi.org/10.1177/1536867X1501500208
Examples
# ordinary least squares (summary reports R^2)
summary(assess(hp ~ mpg+wt, data=mtcars, regression="ols")$model)
# logistic
summary(assess(formula=vs~mpg+wt+hp, data=mtcars, regression="logistic")$model)
# OLS with a propensity score
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", propensity=c("female","age","risk"))$model)
# OLS: top coding los at 8.27 and propensity score means (top.los and pscore)
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", topcode=8.27, propensity=c("female","age","risk"),
newdata=TRUE)$newdata[, c("los", "top.los", "pscore")])
# differences-in-differences model: using 2 time periods, pre- and post-intervention
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two")$DID)
# DID model: using many time points
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="many")$DID)
# interrupted time series model: two groups and 1 interruption (interrupt= 5)
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)$ITS)
# interrupted time series model: two groups and 2 interruptions (interrupt= c(5,9))
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = c(5,9))$ITS)
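The constructed DID variables named in the formula argument (Post.All, Int.Var, DID) can be illustrated with base R. The following is a minimal sketch on simulated data for a two-period design with the intervention starting at month 5. How assess builds these variables internally is not documented here, so the construction below is an assumption based on the standard DID specification.

```r
set.seed(7)
n <- 400
d <- data.frame(
  month   = sample(1:12, n, replace = TRUE),
  program = rbinom(n, 1, 0.5)
)
# Simulated outcome with a true DID effect of -1 from month 5 onward
d$los <- 5 + 0.5 * d$program + 0.3 * (d$month >= 5) -
  1 * d$program * (d$month >= 5) + rnorm(n, sd = 0.5)

# Assumed construction of the DID model variables
d$Post.All <- as.integer(d$month >= 5)  # post-period indicator
d$Int.Var  <- d$program                 # intervention group indicator
d$DID      <- d$Post.All * d$Int.Var    # interaction term: the DID estimate

did_fit <- lm(los ~ Post.All + Int.Var + DID, data = d)
round(coef(did_fit), 2)
```

The DID coefficient estimates how much more the intervention group changed from pre to post than the control group did, which is the quantity of interest in this design.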
Patient survey data
Description
Artificial data from a 5-item hospital satisfaction survey, used for a Cronbach's alpha scale (cas).
Usage
cas
Format
cas
An artificial data frame with 100 rows and 5 columns:
- i1 - i5
5 survey items
...
Source
Artificial dataset created with rbinom for 5 items. For example, rbinom(100, 5, .9) generates 1 item. The prob argument is modified to give more or less consistent ratings per item.
Patient hospital program/intervention data, intervention group only
Description
Patient hospital program/intervention data, intervention group only
Usage
hosp1
Format
hosp1
An artificial data frame with 352 rows and 10 columns, intervention patients only:
- survey
Patient satisfaction survey mean score.
- los
Hospital length of stay (los)
- cost
Hospital stay cost
- rdm30
Patient readmission within 30 days of discharge
- death30
Patient death within 30 days of discharge
- female
Patient sex, 1 indicates female, 0 otherwise
- age
Patient age
- risk
Patient health risk score ranging from 0 to 1
- month
12 month indicator (1 to 12)
- program
Indicates patient program participation. 1='yes', 0='no'
...
Source
hosp1 is a subset of the artificial dataset hosprog. It is the intervention group's data used for single group interrupted time series.
Patient hospital program/intervention data
Description
Patient hospital program/intervention data
Usage
hosprog
Format
hosprog
An artificial data frame with 720 rows and 10 columns:
- survey
Patient satisfaction survey mean score.
- los
Hospital length of stay (los)
- cost
Hospital stay cost
- rdm30
Patient readmission within 30 days of discharge
- death30
Patient death within 30 days of discharge
- female
Patient sex, 1 indicates female, 0 otherwise
- age
Patient age
- risk
Patient health risk score ranging from 0 to 1
- month
12 month indicator (1 to 12)
- program
Indicates patient program participation. 1='yes', 0='no'
...
Source
Artificial dataset created by using runif. The strength of the association between each predictor and the outcome is set by multiplying each subsequent predictor by a weight that increases in increments of 1. For example, Y equals runif(720) multiplied by 1 plus runif(720) multiplied by 2 and so on. This allows some predictors to have stronger correlations with Y.
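The weighting scheme described above can be sketched in a few lines of base R. This is an illustration under the stated description only: the variable names x1 to x3 and the seed are hypothetical, and the actual script used to build hosprog is not part of this manual.

```r
set.seed(2025)
n <- 720
# Three uniform predictors (hypothetical names)
x1 <- runif(n)
x2 <- runif(n)
x3 <- runif(n)

# Each subsequent predictor receives a weight that grows in increments of 1
y <- 1 * x1 + 2 * x2 + 3 * x3

# Correlation of each predictor with Y
r <- cor(cbind(y, x1, x2, x3))[1, -1]
round(r, 2)
```

Because the predictors share the same variance, a larger weight means a larger share of the variance in Y, so the correlations with Y rise from x1 to x3.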
Importance of variables based on partial chi-square statistic
Description
Calculates partial chi-square (Wald chi-square for individual coefficients) from assess class objects. The importance is the partial chi-square minus its degrees of freedom based on the regression coefficients (Harrell, 2016). A higher chi-square indicates a larger effect of the predictor. The rank of the chi-square can therefore indicate which predictors contribute more to explaining the variation in the outcome variable.
Usage
importance(model)
Arguments
model |
an assess class object or models with lm or glm class. |
Value
a data.frame object with partial X^2 summary statistics.
References
Harrell, F. E., Jr. (2016). Regression Modeling Strategies. Springer International Publishing. ISBN: 978-3-319-19424-0.
Examples
# OLS regression
importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model)
# logistic regression
importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model)
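The importance metric can be approximated by hand for a model in which every predictor uses a single coefficient. The sketch below assumes the Wald chi-square for a 1-df term is the squared ratio of the estimate to its standard error, then subtracts the degrees of freedom. It is an illustration with base R and mtcars, not the package's own code, and its values need not match the output of importance exactly.

```r
fit <- lm(mpg ~ hp + wt + cyl, data = mtcars)

# Coefficient table without the intercept
est <- coef(summary(fit))
est <- est[rownames(est) != "(Intercept)", , drop = FALSE]

# Wald chi-square for each 1-df coefficient (assumed definition)
wald_chisq <- (est[, "Estimate"] / est[, "Std. Error"])^2

imp <- data.frame(
  chisq      = wald_chisq,
  df         = 1,
  importance = wald_chisq - 1,  # chi-square minus its degrees of freedom
  p.value    = pchisq(wald_chisq, df = 1, lower.tail = FALSE)
)

# Rank predictors from most to least important
imp[order(-imp$importance), ]
```

Subtracting the degrees of freedom puts terms with different numbers of coefficients on a comparable footing, since a chi-square statistic has an expected value equal to its degrees of freedom under the null hypothesis.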
Interpret model output
Description
Provides simple interpretations of regression coefficients and Cronbach's alpha from assess and alpha class objects. The interpretations describe coefficients and significance values, as well as suggestions for modifying item scales. The interpretations are text comments associated with specific parameters of the various analyses.
Usage
interpret(object)
Arguments
object |
alpha and assess class objects: alpha, ITS, DID, linear (ols) or logistic models. |
Value
a list with interpretations of Cronbach's alpha scales or regression model results.
Examples
# Interpret Cronbach's alpha
interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))
# interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
interpret(hos1)$model
# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did #interpret(hos2) also runs, returns ITS results if present
# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its
Interrupted time series analysis effects
Description
Calculates effects for intervention and control groups based on interrupted time series models from an assess class object. Within a period (or interruption), the effect that represents the trend during the period is calculated for both groups as well as the difference between the groups. Summary statistics are provided that include the effect sizes, t-statistic, standard errors, p-values, and 95% confidence intervals of the effect sizes. These values are provided for the intervention group, control group (when applicable), and the differences between the two groups (Linden, 2015). These values are automatically generated while running a model in assess.
Usage
itsEffect(model, type)
Arguments
model |
an interrupted time series (ITS) model with the "lm" class. |
type |
analysis type for single or multiple groups and single or multiple time periods. Options are type="sgst" (single-group single-time), type="sgmt" (single-group multiple-time), type="mgst" (multiple-group single-time), and type="mgmt" (multiple-group multiple-time). |
Value
a data.frame object of ITS effects and summary statistics. Generally run within the assess function.
References
Linden, A. (2015). Conducting Interrupted Time-series Analysis for Single- and Multiple-group Comparisons. The Stata Journal, 15, 2, 480-500. https://doi.org/10.1177/1536867X1501500208
Examples
i21 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",topcode =NULL,
int.time="month", regression="none", interrupt=5, its="two", newdata=TRUE, propensity=NULL)
itsEffect(model= i21$ITS, type= "mgst")
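The trend effect within a period can be illustrated for the simplest case, a single group with one interruption. The sketch below assumes the segmented regression form described by Linden (2015), in which the post-interruption trend is the sum of the pre-interruption slope and the change in slope, and it computes the standard error of that sum from the coefficient covariance matrix. Variable names and simulated data are hypothetical, and the output layout of itsEffect may differ.

```r
set.seed(11)
month      <- rep(1:12, each = 30)
post       <- as.integer(month >= 5)  # 1 from the interruption onward
time_after <- pmax(0, month - 5)      # months elapsed since the interruption

# Simulated outcome: pre-trend 0.10, level change 0.5, trend change -0.30
y <- 4 + 0.10 * month + 0.5 * post - 0.30 * time_after +
  rnorm(length(month), sd = 0.5)

its_fit <- lm(y ~ month + post + time_after)

# Post-interruption trend as a linear contrast of the coefficients
L    <- c(0, 1, 0, 1)  # order: (Intercept), month, post, time_after
est  <- sum(L * coef(its_fit))
se   <- sqrt(drop(t(L) %*% vcov(its_fit) %*% L))
tval <- est / se
pval <- 2 * pt(abs(tval), df = df.residual(its_fit), lower.tail = FALSE)
ci   <- est + c(-1, 1) * qt(0.975, df.residual(its_fit)) * se

data.frame(effect = est, se = se, t = tval, p = pval,
           lower = ci[1], upper = ci[2])
```

Using the covariance matrix matters here because the two slope coefficients are correlated, so their standard errors cannot simply be added.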
Prediction plot of treatment and control groups for DID and ITS models
Description
Provides partial prediction plots for treatment and control groups from difference-in-difference (DID) and interrupted time series (ITS) models. The graph will produce lines for treatment/intervention and control groups to gain understanding through a visual representation of the regression coefficients. The treatment/intervention group is represented with a blue line, the control group is represented with a red line, and the counterfactual line, when available, is a dashed line.
Usage
## S3 method for class 'assess'
plot(x, y, xlim = NULL, ylim = NULL, add.legend = NULL, ...)
Arguments
x |
assess object. Either difference-in-difference or interrupted time series model with no covariate adjustment. |
y |
type of model, specify either 'DID' (difference-in-difference) or 'ITS' (interrupted time series). Will not accept other models. |
xlim |
specify plot's x-axis limits with a 2 value vector. |
ylim |
specify plot's y-axis limits with a 2 value vector. |
add.legend |
add a legend by selecting the location as "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center". No legend if nothing selected. |
... |
additional arguments. |
Value
plot of partial predictions for treatment and control groups.
Examples
am2 <- assess(formula= los ~ ., data=hosprog, intervention = "program",
topcode =NULL, int.time="month", regression="none", treatment= 5,
interrupt=c(5,9), did="many", its="two", newdata=TRUE, propensity=NULL)
plot(am2, "DID", add.legend="bottomleft", ylim=c(2, 8)) #DID model
plot(am2, "ITS", add.legend="top", ylim=c(2, 8)) #ITS model
Plot of variable importance ranked by partial chi-square statistic
Description
Plots an importance class object. Produces a dot chart that places the predictor variable with the highest partial chi-square (Wald chi-square for individual coefficients) at the bottom. The plotted metric is the partial chi-square minus its degrees of freedom (Harrell, 2016). Predictor variables with significant p-values at the 0.05 alpha level are highlighted in red. Consider graphical parameters of mar=c(4.2, 2, 3.5, 3) and oma = c(0, 0, 0, 3).
Usage
## S3 method for class 'importance'
plot(x, y, ...)
Arguments
x |
importance object. |
y |
not currently used. |
... |
additional arguments. |
Value
plot of variable importance, significant variables highlighted in red.
References
Harrell, F. E., Jr. (2016). Regression Modeling Strategies. Springer International Publishing. ISBN: 978-3-319-19424-0.
Examples
# OLS regression
plot(importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model))
# logistic regression
plot(importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model))
Print alpha results
Description
Formats alpha class results to display summary statistics of scale information. These include the overall alpha, scale mean and standard deviation, item statistics, scale statistics if an item is removed from the scale, and the total sample size.
Usage
## S3 method for class 'alpha'
print(x, ...)
Arguments
x |
alpha object from Cronbach's alpha calculation. |
... |
Additional arguments. |
Value
formatted alpha results.
Examples
print(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))
Print interpret object
Description
Formats interpretations from interpret class objects. Provides simple interpretations of regression coefficients and Cronbach's alpha. Print specific model interpretations (or run all), returned in sentence and paragraph formats.
Usage
## S3 method for class 'interpret'
print(x, ...)
Arguments
x |
interpret object. |
... |
Additional arguments. |
Value
formatted interpret object results.
Examples
# Cronbach's alpha
print(interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas)))
# interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
print(interpret(hos1)$model)
# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did
# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its