Title: Healthcare Analysis Methods
Version: 1.0.0
Description: Conducts analyses for healthcare program evaluations or intervention studies. Fits standard ordinary least squares (OLS or linear) and logistic regression models, as well as regression models used for causal modeling such as differences-in-differences (DID) and interrupted time series (ITS). Provides limited interpretations of model results and a ranking of variable importance in models. Fits propensity score models, top-codes model outcome variables, and can return new data with the newly formed variables. Also calculates Cronbach's alpha for various scale items (e.g., survey questions). See the GitHub URL for examples in the README file. For more details on the statistical methods, see Allen & Yen (1979, ISBN:0-8185-0283-5), Angrist & Pischke (2009, ISBN:9780691120355), Harrell (2016, ISBN:978-3-319-19424-0), Kline (1999, ISBN:9780415211581), and Linden (2015) <doi:10.1177/1536867X1501500208>.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.3.2
Depends: R (≥ 2.10)
LazyData: true
URL: https://github.com/szuniga07/ham
BugReports: https://github.com/szuniga07/ham/issues
Imports: methods
Suggests: testthat (≥ 3.0.0)
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2025-08-23 03:28:31 UTC; szuni
Author: Stephen Zuniga [aut, cre, cph]
Maintainer: Stephen Zuniga <rms.shiny@gmail.com>
Repository: CRAN
Date/Publication: 2025-08-28 08:50:11 UTC

Calculates Cronbach's alpha on scale items

Description

Calculates Cronbach's alpha for specified items in a data frame. Cronbach's alpha is a formula for estimating the internal consistency reliability of a measurement instrument such as a set of survey items (see Allen & Yen, 1979; Kline, 1999). A scale must contain 2 or more items, and each item can have 2 or more response categories (e.g., 5-point scales).

Usage

alpha(items, data)

Arguments

items

Vector of item names that form a scale (e.g., 5-point Likert scales)

data

Data frame object.

Value

A list object with Cronbach's alpha summary statistics.

References

Allen, M. J., & Yen, W. M. (1979). Introduction to Measurement Theory. Brooks/Cole. ISBN: 0-8185-0283-5.

Kline, P. (1999). Handbook of Psychological Testing (2nd ed). Routledge, New York. ISBN: 9780415211581.

Examples

alpha(items=c("i1","i2","i3","i4","i5"), data=cas)

# remove i1 as suggested in the previous example, returns higher alpha
alpha(items=c("i2","i3","i4","i5"), data=cas)
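
The calculation behind these examples can be sketched in base R. This is a minimal illustration of the standard Cronbach's alpha formula and not the package's internal code: the helper name cronbach_sketch, the simulated data frame toy, and the use of listwise deletion for missing values are all assumptions made for the example.

```r
# Minimal base-R sketch of Cronbach's alpha (not the package's internal code).
# alpha = k/(k - 1) * (1 - sum(item variances) / variance of the total score)
cronbach_sketch <- function(items, data) {            # hypothetical helper name
  x <- data[, items, drop = FALSE]
  x <- x[stats::complete.cases(x), , drop = FALSE]    # listwise deletion (assumption)
  k <- ncol(x)
  item_var  <- vapply(x, stats::var, numeric(1))      # variance of each item
  total_var <- stats::var(rowSums(x))                 # variance of the summed scale
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Simulated 5-point items that share a common component
set.seed(1)
trait <- rnorm(200)
toy <- as.data.frame(sapply(1:5, function(j) {
  pmin(5, pmax(1, round(3 + trait + rnorm(200, sd = 0.7))))
}))
names(toy) <- paste0("i", 1:5)

cronbach_sketch(c("i1", "i2", "i3", "i4", "i5"), toy)
```

Because alpha depends on the ratio of summed item variances to total-score variance, dropping an item that correlates weakly with the rest can raise alpha, which is the pattern the second example above points to.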

Assess models with regression

Description

Fits ordinary least squares (OLS) and logistic models, as well as models for causal inference such as differences-in-differences (DID) and interrupted time series (ITS). Run these models to evaluate program performance or test intervention effects (e.g., healthcare programs). Options are available for top coding the outcome variable and for adding propensity scores. New data can optionally be returned that contains these additional variables along with the constructed variables used in the DID and ITS models.

Usage

assess(
  formula,
  data,
  regression = "none",
  did = "none",
  its = "none",
  intervention = NULL,
  int.time = NULL,
  treatment = NULL,
  interrupt = NULL,
  topcode = NULL,
  propensity = NULL,
  newdata = FALSE
)

Arguments

formula

a formula object. Use 'Y ~ .' in DID and ITS models to only specify the constructed model variables (e.g., right side of the DID model: Y ~ Post.All + Int.Var + DID). If regression="ols" or regression="logistic", 'Y ~ .' will use all variables in the data.frame, as is standard in formulas.

data

a data.frame in which to interpret the variables named in the formula.

regression

Select a regression method for standard regression models (i.e., neither DID nor ITS). Options are regression="ols" (ordinary least squares AKA linear) or regression="logistic". Default is regression="none" for no standard regression model.

did

option for Differences-in-Differences (DID) regression. Select did="two" for models with only 2 time points (e.g., pre/post-test). Select did="many" for >= 3 time points (e.g., monthly time points in 12 months of data). Default is did="none" for no DID.

its

option for Interrupted Time Series (ITS) regression. Select its="one" for one group (e.g., intervention only). Select its="two" for two groups (intervention and control). Default is its="none" for no ITS.

intervention

optional name of the intervention variable for DID, ITS, and propensity score models. It indicates which cases are in the intervention group.

int.time

optional intervention time variable name selected for DID or ITS models. This indicates the duration of time relative to when the intervention started.

treatment

optional value marking the treatment start period in DID models. Select 1 value from 'int.time' to indicate the start of the intervention (e.g., treatment=5).

interrupt

optional value(s) marking the interruption (or intervention) period(s) in ITS models. Select 1 or 2 values from 'int.time' to indicate the start and/or key intervention periods (e.g., interrupt=5 or interrupt=c(5,9)).

topcode

optional value selected to top code Y (or left-hand side) of the formula. Analyses will be performed using the new top coded variable.

propensity

optional character vector of variable names to perform a propensity score model. This requires the 'intervention' option to be selected. All models will include 'pscore' (the propensity score) as a covariate adjustment.

newdata

optional logical value that indicates if you want the new data returned. newdata=TRUE will return the data with any new columns created from the DID, ITS, propensity score, or top coding. The default is newdata=FALSE. No new data will be returned if none was created.
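
The topcode and propensity options can be pictured with a short base-R sketch. It is only an illustration under stated assumptions: the data are simulated, the propensity score is estimated with a logistic regression (the manual does not say which model the package uses), and the column names top.los and pscore are borrowed from the Examples section.

```r
# Base-R sketch of top coding and a propensity score covariate
# (illustrative only; simulated data, not the package's internal code).
set.seed(2)
n <- 300
toy <- data.frame(
  female = rbinom(n, 1, 0.5),
  age    = round(runif(n, 20, 90)),
  risk   = runif(n)
)
# Program participation depends on risk, so the groups differ at baseline
toy$program <- rbinom(n, 1, plogis(-1 + 2 * toy$risk))
toy$los <- 2 + 3 * toy$risk - 0.5 * toy$program + rexp(n, rate = 0.5)

# Top coding: values above the cap are replaced by the cap
cap <- 8.27
toy$top.los <- pmin(toy$los, cap)

# Propensity score: predicted probability of being in the intervention group
# (logistic regression is an assumption made for this sketch)
ps_fit <- glm(program ~ female + age + risk, data = toy, family = binomial)
toy$pscore <- fitted(ps_fit)

# Outcome model with the propensity score as a covariate adjustment
fit <- lm(top.los ~ program + pscore, data = toy)
coef(summary(fit))
```

Top coding limits the influence of extreme outcome values, while the propensity score summarizes baseline differences between groups in a single covariate.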

Value

a list of results from the selected regression models. New data are returned if requested, along with relevant model information such as variable names, type of analysis, formula, study information, and a summary of ITS effects if analyzed.

References

Angrist, J. D., & Pischke, J. S. (2009). Mostly Harmless Econometrics: An Empiricist's Companion. Princeton University Press. ISBN: 9780691120355.

Linden, A. (2015). Conducting Interrupted Time-series Analysis for Single- and Multiple-group Comparisons. The Stata Journal, 15(2), 480-500. https://doi.org/10.1177/1536867X1501500208

Examples

# ordinary least squares R^2
summary(assess(hp ~ mpg+wt, data=mtcars, regression="ols")$model)

# logistic
summary(assess(formula=vs~mpg+wt+hp, data=mtcars, regression="logistic")$model)

# OLS with a propensity score
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", propensity=c("female","age","risk"))$model)

# OLS: top coding los at 8.27 and propensity score means (top.los and pscore)
summary(assess(formula=los ~ month+program, data=hosprog, intervention = "program",
regression="ols", topcode=8.27, propensity=c("female","age","risk"),
newdata=TRUE)$newdata[, c("los", "top.los", "pscore")])

# differences-in-differences model: using 2 time periods, pre- and post-intervention
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two")$DID)

# DID model: using time points
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="many")$DID)

# interrupted time series model: two groups and 1 interruption (interrupt = 5)
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)$ITS)

# interrupted time series model: two groups and 2 interruptions (interrupt = c(5,9))
summary(assess(formula=los ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = c(5,9))$ITS)
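
The two-period DID examples above rest on a regression of the form Y ~ Post.All + Int.Var + DID, as quoted under the formula argument. A minimal base-R sketch of that model follows. The data are simulated, and the way the three variables are constructed here (post period defined as month >= start, DID as the product of the two indicators) is the textbook construction assumed for illustration, not a confirmed description of the package internals.

```r
# Base-R sketch of a two-period DID regression (simulated data).
# Post.All, Int.Var and DID mirror the names quoted in the formula argument;
# how the package builds them internally is an assumption here.
set.seed(3)
n <- 400
toy <- data.frame(
  month   = sample(1:12, n, replace = TRUE),
  program = rbinom(n, 1, 0.5)
)
start <- 5                                       # intervention begins at month 5
toy$Post.All <- as.integer(toy$month >= start)   # 1 = post period, all cases
toy$Int.Var  <- toy$program                      # 1 = intervention group
toy$DID      <- toy$Post.All * toy$Int.Var       # interaction term

# Simulated outcome with a true DID effect of -1
toy$los <- 5 + 0.5 * toy$Post.All + 0.3 * toy$Int.Var - 1 * toy$DID +
  rnorm(n, sd = 0.5)

did_fit <- lm(los ~ Post.All + Int.Var + DID, data = toy)
round(coef(did_fit), 2)
```

The coefficient on DID estimates how much more the intervention group changed from the pre to the post period than the control group did.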


Patient survey data

Description

Artificial data of a 5 item hospital satisfaction survey for a Cronbach's alpha scale (cas).

Usage

cas

Format

cas

An artificial data frame with 100 rows and 5 columns:

i1 - i5

5 survey items

...

Source

Artificial dataset created with rbinom for 5 items. For example, rbinom(100, 5, .9) generates 1 item. The prob argument is modified to give more or less consistent ratings per item.
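
The generation approach described above can be sketched as follows. Only rbinom(100, 5, .9) comes from the description; the other prob values and the object name toy_cas are hypothetical choices for illustration and are not the values used to build cas.

```r
# Sketch of generating survey-like items with rbinom, as the Source describes.
# The prob values other than .9 are illustrative guesses, not the ones
# actually used to build cas.
set.seed(4)
probs <- c(0.5, 0.9, 0.9, 0.85, 0.8)    # hypothetical per-item probabilities
toy_cas <- data.frame(sapply(probs, function(p) rbinom(100, 5, p)))
names(toy_cas) <- paste0("i", 1:5)

dim(toy_cas)              # rows and columns
sapply(toy_cas, range)    # each item takes integer values between 0 and 5
```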


Patient hospital program/intervention data, intervention group only

Description

Patient hospital program/intervention data, intervention group only

Usage

hosp1

Format

hosp1

An artificial data frame with 352 rows and 10 columns, intervention patients only:

survey

Patient satisfaction survey mean score.

los

Hospital length of stay (los)

cost

Hospital stay cost

rdm30

Patient readmission within 30 days of discharge

death30

Patient death within 30 days of discharge

female

Patient sex, 1 indicates female, 0 otherwise

age

Patient age

risk

Patient health risk score ranging from 0 to 1

month

12 month indicator (1 to 12)

program

Indicates patient program participation. 1='yes', 0='no'

...

Source

hosp1 is a subset of the artificial dataset hosprog. It is the intervention group's data used for single group interrupted time series.


Patient hospital program/intervention data

Description

Patient hospital program/intervention data

Usage

hosprog

Format

hosprog

An artificial data frame with 720 rows and 10 columns:

survey

Patient satisfaction survey mean score.

los

Hospital length of stay (los)

cost

Hospital stay cost

rdm30

Patient readmission within 30 days of discharge

death30

Patient death within 30 days of discharge

female

Patient sex, 1 indicates female, 0 otherwise

age

Patient age

risk

Patient health risk score ranging from 0 to 1

month

12 month indicator (1 to 12)

program

Indicates patient program participation. 1='yes', 0='no'

...

Source

Artificial dataset created by using runif. The strength in the association between each variable is weighted by multiplying each subsequent predictor in increments of 1. For example, Y equals runif(720) multiplied by 1 plus runif(720) multiplied by 2 and so on. This allows some predictors to have stronger correlations with Y.
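
A small sketch of this weighting idea, assuming three hypothetical predictors x1 to x3 rather than the actual hosprog variables:

```r
# Sketch of the weighting idea described in the Source
# (variable names are hypothetical, not the hosprog columns).
set.seed(5)
n  <- 720
x1 <- runif(n)
x2 <- runif(n)
x3 <- runif(n)
y  <- x1 * 1 + x2 * 2 + x3 * 3     # later predictors get larger weights

# Correlation of each predictor with y; it grows with the weight
r <- c(x1 = cor(x1, y), x2 = cor(x2, y), x3 = cor(x3, y))
round(r, 2)
```

Because each predictor has the same variance, a larger multiplier gives that predictor a larger share of the variance in Y and therefore a stronger correlation with it.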


Importance of variables based on partial chi-square statistic

Description

Calculates partial chi-square (Wald chi-square for individual coefficients) from assess class objects. The importance is the partial chi-square minus its degrees of freedom based on the regression coefficients (Harrell, 2016). A higher chi-square indicates a larger effect of the predictor. Therefore, the rank of the chi-square can indicate which predictors contribute more to explaining the variation in the outcome variable.

Usage

importance(model)

Arguments

model

an assess class object or models with lm or glm class.

Value

a data.frame object with partial X^2 summary statistics.

References

Harrell, F. E., Jr. (2016). Regression Modeling Strategies. Springer International Publishing. ISBN: 978-3-319-19424-0.

Examples

# OLS regression
importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model)

# logistic regression
importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model)
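
The metric described above can be approximated by hand in base R. The sketch below assumes every predictor contributes a single coefficient (1 degree of freedom) and squares the coefficient's test statistic to obtain a Wald chi-square; the package's own computation and output column names may differ, so the names used here are hypothetical.

```r
# Base-R sketch: Wald chi-square minus degrees of freedom per coefficient.
# Assumes each predictor has a single coefficient (1 df); the package's own
# computation may differ in detail.
fit <- lm(mpg ~ hp + wt + cyl, data = mtcars)
est <- coef(summary(fit))[-1, , drop = FALSE]           # drop the intercept
chi_sq <- (est[, "Estimate"] / est[, "Std. Error"])^2   # squared test statistic

imp <- data.frame(
  X               = rownames(est),
  Chi.Sq          = unname(chi_sq),
  df              = 1,
  Chi.Sq.minus.df = unname(chi_sq) - 1,
  row.names       = NULL
)
imp[order(imp$Chi.Sq.minus.df, decreasing = TRUE), ]
```

Subtracting the degrees of freedom puts predictors with different numbers of coefficients on a comparable footing, since a chi-square statistic has an expected value equal to its degrees of freedom under the null hypothesis.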


Interpret model output

Description

Provides simple interpretations of regression coefficients and Cronbach's alpha from assess and alpha function classes. The interpretations describe coefficients and significance values as well as modifying item scales. The interpretations are text comments associated with specific parameters of the various analyses.

Usage

interpret(object)

Arguments

object

alpha and assess class objects: alpha, ITS, DID, linear (ols) or logistic models.

Value

a list with interpretations of Cronbach's alpha scales or regression model results.

Examples

# Interpret Cronbach's alpha
interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))

# interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
interpret(hos1)$model

# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did  # interpret(hos2) also runs and returns ITS results if present

# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its

Interrupted time series analysis effects

Description

Calculates effects for intervention and control groups based on interrupted time series models from an assess class object. Within a period (or interruption), the effect that represents the trend during the period is calculated for both groups as well as the difference between the groups. Summary statistics are provided that include the effect sizes, t-statistic, standard errors, p-values, and 95% confidence intervals of the effect sizes. These values are provided for the intervention group, control group (when applicable), and the differences between the two groups (Linden, 2015). These values are automatically generated while running a model in assess.

Usage

itsEffect(model, type)

Arguments

model

an interrupted time series (ITS) model with the "lm" class.

type

analysis type for single or multiple groups and single or multiple time periods. Options are type="sgst" (single-group single-time), type="sgmt" (single-group multiple-time), type="mgst" (multiple-group single-time), and type="mgmt" (multiple-group multiple-time).

Value

a data.frame object of ITS effects and summary statistics. Generally run within the assess function.

References

Linden, A. (2015). Conducting Interrupted Time-series Analysis for Single- and Multiple-group Comparisons. The Stata Journal, 15(2), 480-500. https://doi.org/10.1177/1536867X1501500208

Examples

i21 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",topcode =NULL,
int.time="month", regression="none", interrupt=5, its="two", newdata=TRUE, propensity=NULL)
itsEffect(model= i21$ITS, type= "mgst")
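
For intuition about what such an effect represents, the single-group, single-interruption case can be sketched in base R. The sketch follows the general segmented-regression form discussed by Linden (2015), with a pre-period slope, a level change, and a slope change. All variable names, the simulated data, and the coding of the post-period terms are assumptions for illustration and are not taken from the package.

```r
# Base-R sketch of a single-group ITS model (simulated data):
#   Y = b0 + b1*month + b2*post + b3*time_since + e
# Variable names and the coding of post/time_since are assumptions.
set.seed(6)
n <- 360
toy <- data.frame(month = rep(1:12, each = n / 12))
interrupt <- 5
toy$post       <- as.integer(toy$month >= interrupt)   # level change
toy$time_since <- pmax(0, toy$month - interrupt)       # slope change
toy$survey <- 3 + 0.10 * toy$month + 0.5 * toy$post -
  0.25 * toy$time_since + rnorm(n, sd = 0.3)

fit <- lm(survey ~ month + post + time_since, data = toy)

# Post-interruption trend = pre-period slope + change in slope; its standard
# error comes from the coefficient covariance matrix
b <- coef(fit)
V <- vcov(fit)
trend <- unname(b["month"] + b["time_since"])
se    <- sqrt(V["month", "month"] + V["time_since", "time_since"] +
              2 * V["month", "time_since"])
tval  <- trend / se
pval  <- 2 * pt(abs(tval), df = df.residual(fit), lower.tail = FALSE)
ci    <- trend + c(-1, 1) * qt(0.975, df.residual(fit)) * se

data.frame(effect = trend, se = se, t = tval, p = pval,
           lower = ci[1], upper = ci[2])
```

The simulated post-interruption trend is 0.10 - 0.25 = -0.15 by construction, so the estimated effect should land close to that value.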


Prediction plot of treatment and control groups for DID and ITS models

Description

Provides partial prediction plots for treatment and control groups from differences-in-differences (DID) and interrupted time series (ITS) models. The graph draws lines for the treatment/intervention and control groups to give a visual representation of the regression coefficients. The treatment/intervention group is represented with a blue line, the control group is represented with a red line, and the counterfactual line, when available, is a dashed line.

Usage

## S3 method for class 'assess'
plot(x, y, xlim = NULL, ylim = NULL, add.legend = NULL, ...)

Arguments

x

assess object. Either a differences-in-differences or an interrupted time series model with no covariate adjustment.

y

type of model, specify either 'DID' (differences-in-differences) or 'ITS' (interrupted time series). Will not accept other models.

xlim

specify plot's x-axis limits with a 2 value vector.

ylim

specify plot's y-axis limits with a 2 value vector.

add.legend

add a legend by selecting the location as "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "topright", "right", "center". No legend if nothing selected.

...

additional arguments.

Value

plot of partial predictions for treatment and control groups.

Examples

am2 <- assess(formula= los ~ ., data=hosprog, intervention = "program",
topcode =NULL, int.time="month", regression="none", treatment= 5,
interrupt=c(5,9), did="many", its="two", newdata=TRUE, propensity=NULL)
plot(am2, "DID", add.legend="bottomleft", ylim=c(2, 8))  #DID model
plot(am2, "ITS", add.legend="top", ylim=c(2, 8))         #ITS model

Plot of variable importance ranked by partial chi-square statistic

Description

Plots an importance class object. Produces a dot chart that places the predictor variable with the highest partial chi-square (Wald chi-square for individual coefficients) at the bottom. The plotted metric is the partial chi-square minus its degrees of freedom (Harrell, 2016). Predictor variables with significant p-values at the 0.05 alpha level are highlighted in red. Consider graphical parameters of mar=c(4.2, 2, 3.5, 3) and oma = c(0, 0, 0, 3).

Usage

## S3 method for class 'importance'
plot(x, y, ...)

Arguments

x

importance object.

y

not currently used.

...

additional arguments.

Value

plot of variable importance, significant variables highlighted in red.

References

Harrell, F. E., Jr. (2016). Regression Modeling Strategies. Springer International Publishing. ISBN: 978-3-319-19424-0.

Examples

# OLS regression
plot(importance(assess(mpg ~ hp + wt + cyl, data=mtcars, regression= "ols")$model))

# logistic regression
plot(importance(assess(vs~mpg+wt+hp, data=mtcars, regression= "logistic")$model))

Print alpha results

Description

Formats alpha class results to display summary statistics of scale information. These include the overall alpha, scale mean and standard deviation, item statistics, scale statistics if an item is removed from the scale, and the total sample size.

Usage

## S3 method for class 'alpha'
print(x, ...)

Arguments

x

alpha object from Cronbach's alpha calculation.

...

Additional arguments.

Value

formatted alpha results.

Examples

print(alpha(items=c("i1","i2","i3","i4","i5"), data=cas))


Print interpret object

Description

Formats interpretations from interpret class objects. Provides simple interpretations of regression coefficients and Cronbach's alpha. Print specific model interpretations (or run all), returned in sentence and paragraph formats.

Usage

## S3 method for class 'interpret'
print(x, ...)

Arguments

x

interpret object.

...

Additional arguments.

Value

formatted interpret object results.

Examples


#Cronbach's alpha
print(interpret(alpha(items=c("i1","i2","i3","i4","i5"), data=cas)))

# interpret a standard linear (OLS) regression
hos1 <- assess(formula=survey ~ program + month, data=hosprog, regression= "ols")
print(interpret(hos1)$model)

# interpret a differences-in-differences model
hos2 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", treatment = 5, did="two", newdata=TRUE)
interpret(hos2)$did

# interpret an interrupted time series model
hos3 <- assess(formula=survey ~ ., data=hosprog, intervention = "program",
int.time="month", its="two", interrupt = 5)
interpret(hos3)$its