--- title: "Example : emhawkes package" author: "Kyungsub Lee" date: "`r Sys.Date()`" output: rmarkdown::html_vignette vignette: > %\VignetteEncoding{UTF-8} %\VignetteIndexEntry{Example : emhawkes package} %\VignetteEngine{knitr::rmarkdown} editor_options: markdown: wrap: sentence --- ```{r setup, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ## Basic Hawkes model ### Univariate Hawkes process ```{r} library(emhawkes) ``` This subsection outlines the steps for constructing, running simulations, and estimating a univariate Hawkes model. To begin, create an `hspec` object, which defines the Hawkes model. The S4 class `hspec` contains slots for the model parameters: `mu`, `alpha`, `beta`, `dimens`, `rmark`, and `impact`. In a univariate model, the basic parameters of the model---`mu`, `alpha`, and `beta`---can be given as numeric values. If numeric values are provided, they will be converted to matrices. Below is an example of a univariate Hawkes model without a mark. ```{r} mu1 <- 0.3; alpha1 <- 1.2; beta1 <- 1.5 hspec1 <- new("hspec", mu = mu1, alpha = alpha1, beta = beta1) show(hspec1) ``` The function `hsim` implements simulation where the input arguments are `hspec`, `size`, and the initial values of the intensity component process, `lambda_component0`, and the initial values of the Hawkes processes, `N0`. More precisely, the intensity process of the basic univariate Hawkes model is represented by $$ \lambda(t) = \mu + \int_{-\infty}^t \alpha e^{-\beta (t-s)} d N(s) = \mu + \lambda_c(0) e^{-\beta t} + \int_0^t \alpha e^{-\beta (t-s)} d N(s) $$ where the `lambda_component0` denotes $$ \lambda_c(0) = \int_{-\infty}^0 \alpha e^{\beta s} d N(s). $$ If `lambda_component0` is not provided, the internally determined initial values for the intensity process are used. If `size` is sufficiently large, the exact value of `lambda_component0` may not be important. The default initial value of the counting process, `N0`, is zero. ```{r, warning=FALSE} set.seed(1107) res1 <- hsim(hspec1, size = 1000) summary(res1) ``` The results of `hsim` is an S3 class `hreal`, which consists of `hspec`, `inter_arrival`, `arrival`, `type`, `mark`, `N`, `Nc`, `lambda`, `lambda_component`, `rambda`, `rambda_component`. - `hspec` is the model specification. - `inter_arrival` is the inter-arrival time of every event. - `arrival` is the cumulative sum of `inter_arrival`. - `type` is the type of events, i.e., $i$ for $N_i$, and is used for a multivariate model. - `mark` is a numeric vector that represents additional information for the event. - `lambda` represents $\lambda$, which is the left continuous and right limit version. - The right continuous version of intensity is `rambda`. - `lambda_component` represents $\lambda_{ij}$, and `rambda_component` is the right continuous version. `inter_arrival`, `type`, `mark`, `N`, and `Nc` start at zero. Using the `summary()` function, one can print the first 20 elements of `arrival`, `N`, and `lambda`. The `print()` function can also be used. By definition, we have `lambda == mu + lambda_component`. ```{r} # first and third columns are the same cbind(res1$lambda[1:5], res1$lambda_component[1:5], mu1 + res1$lambda_component[1:5]) ``` For all rows except the first, `rambda` equals `lambda + alpha` in this model. ```{r} # second and third columns are the same cbind(res1$lambda[1:5], res1$rambda[1:5], res1$lambda[1:5] + alpha1) ``` Additionally, verify that the exponential decay is accurately represented in the model. ```{r} # By definition, the following two are equal: res1$lambda[2:6] mu1 + (res1$rambda[1:5] - mu1) * exp(-beta1 * res1$inter_arrival[2:6]) ``` The log-likelihood function is calculated using the `logLik` method. In this context, the inter-arrival times and `hspec` are provided as inputs to the function. ```{r, warning=FALSE} logLik(hspec1, inter_arrival = res1$inter_arrival) ``` The likelihood estimation is performed using the `hfit` function. The specification of the initial parameter values, `hspec0`, is required. Note that only `inter_arrival` is needed for this univariate model. For more accurate simulation, it is recommended to specify `lambda0`, the initial value of the lambda component. If `lambda0` is not provided, the function uses internally determined initial values. By default, the BFGS method is employed for numerical optimization. ```{r, warning=FALSE} # initial value for numerical optimization mu0 <- 0.5; alpha0 <- 1.0; beta0 <- 1.8 hspec0 <- new("hspec", mu = mu0, alpha = alpha0, beta = beta0) # the intial values are provided through hspec mle <- hfit(hspec0, inter_arrival = res1$inter_arrival) summary(mle) ``` ### Bivariate Hawkes model The intensity process of a basic bivariate Hawkes model is defined by $$ \lambda_1(t) = \mu_1 + \int_{-\infty}^t \alpha_{11} e^{-\beta_{11}(t-s)} d N_1(s) + \int_{-\infty}^t \alpha_{12} e^{-\beta_{12}(t-s)} d N_2(s), $$ $$ \lambda_2(t) = \mu_2 + \int_{-\infty}^t \alpha_{21} e^{-\beta_{21}(t-s)} d N_1(s) + \int_{-\infty}^t \alpha_{22} e^{-\beta_{22}(t-s)} d N_2(s). $$ In a bivariate model, the parameters within the slots of `hspec` are matrices. Specifically, `mu` is a 2-by-1 matrix, while `alpha` and `beta` are 2-by-2 matrices. $$ \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}, \quad \alpha = \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix}, \quad \beta = \begin{bmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \end{bmatrix} $$ `rmark` is a random number generating function for marks and is not used in non-mark models. `lambda_component0` is a 2-by-2 matrix that represents the initial values of `lambda_component`, which includes the set of values `lambda11`, `lambda12`, `lambda21`, and `lambda22`. The intensity processes are represented by $$ \lambda_1(t) = \mu_1 + \lambda_{11}(t) + \lambda_{12}(t), $$ $$ \lambda_2(t) = \mu_2 + \lambda_{21}(t) + \lambda_{22}(t). $$ The terms $\lambda_{ij}$ are referred to as lambda components, and `lambda0` represents \$\lambda\_{ij}(0)`. The parameter`lambda_component0\` can be omitted in this model, in which case internally determined initial values will be used. ```{r} mu2 <- matrix(c(0.2), nrow = 2) alpha2 <- matrix(c(0.5, 0.9, 0.9, 0.5), nrow = 2, byrow = TRUE) beta2 <- matrix(c(2.25, 2.25, 2.25, 2.25), nrow = 2, byrow = TRUE) hspec2 <- new("hspec", mu=mu2, alpha=alpha2, beta=beta2) print(hspec2) ``` To perform a simulation, use the `hsim` function. ```{r} set.seed(1107) res2 <- hsim(hspec2, size=1000) summary(res2) ``` In multivariate models, `type` is crucial as it represents the type of event. ```{r} # Under bi-variate model, there are two types, 1 or 2. res2$type[1:10] ``` In multivariate models, the column names of `N` are `N1`, `N2`, `N3`, and so on. ```{r} res2$N[1:3, ] ``` Similarly, the column names of `lambda` are `lambda1`, `lambda2`, `lambda3`, and so on. ```{r} res2$lambda[1:3, ] ``` The column names of `lambda_component` are `lambda_component11`, `lambda_component12`, `lambda_component13`, and so on. ```{r} res2$lambda_component[1:3, ] ``` By definition, the following two expressions are equivalent: ```{r} mu2[1] + rowSums(res2$lambda_component[1:5, c("lambda11", "lambda12")]) res2$lambda[1:5, "lambda1"] ``` From the results, we obtain vectors of realized `inter_arrival` and `type`. A bivariate model requires both `inter_arrival` and `type` for estimation. ```{r} inter_arrival2 <- res2$inter_arrival type2 <- res2$type ``` The log-likelihood is computed using the `logLik` function. ```{r} logLik(hspec2, inter_arrival = inter_arrival2, type = type2) ``` Maximum log-likelihood estimation is performed using the `hfit` function. In this process, the parameter values in `hspec0`, such as `mu`, `alpha`, and `beta`, serve as starting points for the numerical optimization. For illustration purposes, we set `hspec0 <- hspec2`. Since the true parameter values are unknown in practical applications, these initial guesses are used. The realized `inter_arrival` and `type` data are utilized for estimation. ```{r, warning=FALSE} hspec0 <- hspec2 mle <- hfit(hspec0, inter_arrival = inter_arrival2, type = type2) summary(mle) ``` ```{r} coef(mle) ``` ```{r} miscTools::stdEr(mle) ``` Also see the [extended vignette on GitHub](https://ksublee.github.io/emhawkes/articles/extended_example.html).