---
title: "Temporal Empirical Dynamic Modeling"
date: |
  | Last update: 2025-08-25
  | Last run: 2025-08-25
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{tEDM}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


## 1. Introduction to the `tEDM` package

The `tEDM` package provides a suite of tools for exploring and quantifying causality in time series using Empirical Dynamic Modeling (EDM). It implements four fundamental EDM-based causal discovery methods:

- [**Convergent Cross Mapping (CCM)**][1]

- [**Partial Cross Mapping (PCM)**][2]

- [**Cross Mapping Cardinality (CMC)**][3]

- [**Multispatial Convergent Cross Mapping (MultispatialCCM)**][4]

These methods enable researchers to:

- **Identify** potential causal interactions without assuming a predefined model structure.

- **Distinguish** between direct causation and indirect (mediated or confounded) influences.

- **Reconstruct** underlying causal dynamics from replicated univariate time series observed across multiple spatial units.

## 2. Example data in the `tEDM` package

### Hong Kong Air Pollution and Cardiovascular Admissions

A daily time series dataset(from 1995-3 to 1997-11) for Hong Kong that includes cardiovascular hospital admissions and major air pollutant concentrations.

**File**: `cvd.csv`

**Columns**:

| Column | Description                                                 |
| ------ | ----------------------------------------------------------- |
| `cvd`  | Daily number of cardiovascular-related hospital admissions. |
| `rsp`  | Respirable suspended particulates (μg/m³).                  |
| `no2`  | Nitrogen dioxide concentration (μg/m³).                     |
| `so2`  | Sulfur dioxide concentration (μg/m³).                       |
| `o3`   | Ozone concentration (μg/m³).                                |

**Source**: Data adapted from [PCM article][2].

---

### US County-Level Carbon Emissions Dataset

A panel dataset covering U.S. county-level temperature and carbon emissions across time.

**File**: `carbon.csv.gz`

**Columns**:

| Column   | Description                                                              |
| -------- | ------------------------------------------------------------------------ |
| `year`   | Observation year (1981–2017).                                            |
| `fips`   | County FIPS code (5-digit Federal Information Processing Standard code). |
| `tem`    | Mean annual temperature (in Kelvin).                                     |
| `carbon` | Total carbon emissions per year (in kilograms of CO₂).                   |

**Source**: Data adapted from [FsATE article][5].

---

### COVID-19 Infection Counts in Japan

A spatio-temporal dataset capturing the number of confirmed COVID-19 infections across Japan’s 47 prefectures over time.

**File**: `covid.csv`

**Structure**:

* Each **column** represents one of the 47 Japanese prefectures (e.g., `Tokyo`, `Osaka`, `Hokkaido`).
* Each **row** corresponds to a time step (daily).

**Source**: Data adapted from [CMC article][3].

## 3. Case studies of the `tEDM` package

Install the stable version:

```r
install.packages("tEDM", dep = TRUE)
```

or dev version:

```r
install.packages("tEDM",
                 repos = c("https://stscl.r-universe.dev",
                           "https://cloud.r-project.org"),
                 dep = TRUE)
```

### Air Pollution and Cardiovascular Health in Hong Kong

Employing PCM to investigate the causal relationships between various air pollutants and cardiovascular diseases:


``` r
library(tEDM)

cvd = readr::read_csv(system.file("case/cvd.csv",package = "tEDM"))
## Rows: 1032 Columns: 5
## ── Column specification ─────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (5): cvd, rsp, no2, so2, o3
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(cvd)
## # A tibble: 6 × 5
##     cvd   rsp   no2   so2    o3
##   <dbl> <dbl> <dbl> <dbl> <dbl>
## 1   214  73.7  74.5  19.1 17.4 
## 2   203  77.6  80.9  18.8 39.4 
## 3   202  64.8  67.1  13.8 56.4 
## 4   182  68.8  74.7  30.8  5.6 
## 5   181  49.4  62.3  23.1  3.6 
## 6   129  67.4  63.6  17.4  6.73
```


``` r
cvd_long = cvd |>
  tibble::rowid_to_column("id") |>
  tidyr::pivot_longer(cols = -id,
                      names_to = "variable", values_to = "value")

fig_cvds_ts = ggplot2::ggplot(cvd_long, ggplot2::aes(x = id, y = value, color = variable)) +
  ggplot2::geom_line(linewidth = 0.5) +
  ggplot2::labs(x = "Days (from 1995-3 to 1997-11)", y = "Concentrations or \nNO. of CVD admissions", color = "") +
  ggplot2::theme_bw() +
  ggplot2::theme(legend.direction = "horizontal",
                 legend.position = "inside",
                 legend.justification = c("center","top"),
                 legend.background = ggplot2::element_rect(fill = "transparent", color = NA))
fig_cvds_ts
```

![**Figure 1**. Time series of air pollutants and confirmed CVD cases in Hong Kong from March 1995 to November 1997.](../man/figures/edm/fig_cvds_ts-1.png)

Determining optimal embedding dimension:


``` r
tEDM::fnn(cvd,"cvd",E = 2:50,eps = stats::sd(cvd$cvd))
##       E:1       E:2       E:3       E:4       E:5       E:6       E:7       E:8 
## 0.8275862 0.4882729 0.3026188 0.2880698 0.2133851 0.2065955 0.1920466 0.1871969 
##       E:9      E:10      E:11      E:12      E:13      E:14      E:15      E:16 
## 0.1842871 0.1862270 0.1794374 0.1823472 0.1765276 0.1687682 0.1619787 0.1619787 
##      E:17      E:18      E:19      E:20      E:21      E:22      E:23      E:24 
## 0.1503395 0.1483996 0.1629486 0.1513094 0.1542192 0.1658584 0.1610087 0.1610087 
##      E:25      E:26      E:27      E:28      E:29      E:30      E:31      E:32 
## 0.1619787 0.1833172 0.1610087 0.1697381 0.1668283 0.1755577 0.1668283 0.1755577 
##      E:33      E:34      E:35      E:36      E:37      E:38      E:39      E:40 
## 0.1648885 0.1668283 0.1668283 0.1580989 0.1464597 0.1571290 0.1590689 0.1639185 
##      E:41      E:42      E:43      E:44      E:45      E:46      E:47      E:48 
## 0.1580989 0.1580989 0.1590689 0.1610087 0.1542192 0.1590689 0.1707081 0.1736178 
##      E:49 
## 0.1677983
```

Starting at $E = 11$, the FNN ratio stabilizes near 0.18; thus, embedding dimension E and neighbor number k are chosen from 11 onward for subsequent self-prediction parameter selection.


``` r
tEDM::simplex(cvd,"cvd","cvd",E = 11:25,k = 12:26)
## The suggested E and k for variable cvd is 11 and 12
tEDM::simplex(cvd,"rsp","rsp",E = 11:25,k = 12:26)
## The suggested E and k for variable rsp is 11 and 12
tEDM::simplex(cvd,"no2","no2",E = 11:25,k = 12:26)
## The suggested E and k for variable no2 is 11 and 12
tEDM::simplex(cvd,"so2","so2",E = 11:25,k = 12:26)
## The suggested E and k for variable so2 is 11 and 16
tEDM::simplex(cvd,"o3","o3",E = 11:25,k = 12:26)
## The suggested E and k for variable o3 is 11 and 13

s1 = tEDM::simplex(cvd,"cvd","cvd",E = 11,k = 12:26)
s2 = tEDM::simplex(cvd,"rsp","rsp",E = 11,k = 12:26)
s3 = tEDM::simplex(cvd,"no2","no2",E = 11,k = 12:26)
s4 = tEDM::simplex(cvd,"so2","so2",E = 11,k = 12:26)
s5 = tEDM::simplex(cvd,"o3","o3",E = 11,k = 12:26)

list(s1,s2,s3,s4,s5)
## [[1]]
## The suggested E and k for variable cvd is 11 and 12 
## 
## [[2]]
## The suggested E and k for variable rsp is 11 and 12 
## 
## [[3]]
## The suggested E and k for variable no2 is 11 and 12 
## 
## [[4]]
## The suggested E and k for variable so2 is 11 and 16 
## 
## [[5]]
## The suggested E and k for variable o3 is 11 and 13

simplex_df = purrr::map2_dfr(list(s1,s2,s3,s4,s5),
                             c("cvd","rsp","no2","so2","o3"),
                             \(.list,.name) dplyr::mutate(.list$xmap,variable = .name))
ggplot2::ggplot(data = simplex_df) +
  ggplot2::geom_line(ggplot2::aes(x = k, y = rho, color = variable))
```

![**Figure 2**. Variation of prediction skill with the number of nearest neighbors.](../man/figures/edm/fig_simplex_cvd-1.png)

To investigate the causal influences of air pollutants on the incidence of cardiovascular diseases, we performed PCM analysis using an embedding dimension of 11 and 12 nearest neighbors per variable pair.


``` r
vars = c("cvd", "rsp", "no2", "so2", "o3")
res = list()
var_pairs = combn(vars, 2, simplify = FALSE)

for (pair in var_pairs) {
  var1 = pair[1]
  var2 = pair[2]
  conds = setdiff(vars, pair)
  key = paste0(var1, "_", var2)
  res[[key]] = tEDM::pcm(data = cvd,
                         cause = var2,
                         effect = var1,
                         conds = conds,
                         libsizes = seq(12, 1012, 100),
                         E = 11, k = 12,
                         progress = FALSE)
}
```


The PCM results are shown in the figure below:


``` r
fig_cvd_rsp = plot(res[["cvd_rsp"]], partial = FALSE,
                   xlimits = c(0,1030), ylimits = c(-0.01,0.2),
                   ybreaks = seq(0,0.2,by = 0.05),
                   legend_texts = c("CVD xmap RSP, P = 0",
                                    "RSP xmap CVD, P = 0.001"))
fig_cvd_rsp_p = plot(res[["cvd_rsp"]], xlimits = c(0,1030),
                     ylimits = c(-0.01,0.2), ybreaks = seq(0,0.2,by = 0.05),
                     legend_texts = c("CVD xmap RSP | NO2 & SO2 & O3, P = 0",
                                      "RSP xmap CVD | NO2 & SO2 & O3, P = 0.0179"))

fig_cvd_no2 = plot(res[["cvd_no2"]], partial = FALSE,
                   xlimits = c(0,1030), ylimits = c(0,0.35),
                   ybreaks = seq(0,0.35,by = 0.05),
                   legend_texts = c("CVD xmap NO2, P = 0",
                                    "NO2 xmap CVD, P = 0"))
fig_cvd_no2_p = plot(res[["cvd_no2"]], xlimits = c(0,1030),
                     ylimits = c(0,0.35), ybreaks = seq(0,0.35,by = 0.05),
                     legend_texts = c("CVD xmap NO2 | RSP & SO2 & O3, P = 0",
                                      "NO2 xmap CVD | RSP & SO2 & O3, P = 0"))

fig_cvd_so2 = plot(res[["cvd_so2"]], partial = FALSE,
                   xlimits = c(0,1030), ylimits = c(0,0.25),
                   ybreaks = seq(0,0.25,by = 0.05),
                   legend_texts = c("CVD xmap SO2, P = 0",
                                    "SO2 xmap CVD, P = 0"))
fig_cvd_so2_p = plot(res[["cvd_so2"]], xlimits = c(0,1030),
                     ylimits = c(0,0.25), ybreaks = seq(0,0.25,by = 0.05),
                     legend_texts = c("CVD xmap SO2 | RSP & NO2 & O3, P = 0",
                                      "SO2 xmap CVD | RSP & NO2 & O3, P = 0.003"))

fig_cvd_o3 = plot(res[["cvd_o3"]], partial = FALSE,
                  xlimits = c(0,1030), ylimits = c(-0.05,0.35),
                  ybreaks = seq(-0.05,0.35,by = 0.1),
                  legend_texts = c("CVD xmap O3, P = 0.495",
                                   "O3 xmap CVD, P = 0.002"))
fig_cvd_o3_p = plot(res[["cvd_o3"]], xlimits = c(0,1030),
                    ylimits = c(-0.05,0.35), ybreaks = seq(-0.05,0.35,by = 0.1),
                    legend_texts = c("CVD xmap O3 | RSP & NO2 & SO2, P = 0.402",
                                     "O3 xmap CVD | RSP & NO2 & SO2, P = 0.028"))

fig_rsp_no2 = plot(res[["rsp_no2"]], partial = FALSE,
                   xlimits = c(0,1030), ylimits = c(0.2,0.9),
                   ybreaks = seq(0.2,0.9,by = 0.1),
                   legend_texts = c("RSP xmap NO2, P = 0",
                                    "NO2 xmap RSP, P = 0"))
fig_rsp_no2_p = plot(res[["rsp_no2"]], xlimits = c(0,1030),
                     ylimits = c(0.2,0.9), ybreaks = seq(0.2,0.9,by = 0.1),
                     legend_texts = c("RSP xmap NO2 | CVD & SO2 & O3, P = 0",
                                      "NO2 xmap RSP | CVD & SO2 & O3, P = 0"))

fig_rsp_so2 = plot(res[["rsp_so2"]], partial = FALSE,
                   xlimits = c(0,1030), ylimits = c(0.05,0.55),
                   ybreaks = seq(0.05,0.55,by = 0.1),
                   legend_texts = c("RSP xmap SO2, P = 0",
                                    "SO2 xmap RSP, P = 0"))
fig_rsp_so2_p = plot(res[["rsp_so2"]], xlimits = c(0,1030),
                     ylimits = c(0.05,0.55), ybreaks = seq(0.05,0.55,by = 0.1),
                     legend_texts = c("RSP xmap SO2 | CVD & NO2 & O3, P = 0",
                                      "SO2 xmap RSP | CVD & NO2 & O3, P = 0"))

fig_rsp_o3 = plot(res[["rsp_o3"]], partial = FALSE,
                  xlimits = c(0,1030), ylimits = c(0.05,0.65),
                  ybreaks = seq(0.05,0.65,by = 0.1),
                  legend_texts = c("RSP xmap O3, P = 0",
                                   "O3 xmap RSP, P = 0"))
fig_rsp_o3_p = plot(res[["rsp_o3"]], xlimits = c(0,1030),
                    ylimits = c(0.05,0.65), ybreaks = seq(0.05,0.65,by = 0.1),
                    legend_texts = c("RSP xmap O3 | CVD & SO2 & NO2, P = 0",
                                     "O3 xmap RSP | CVD & SO2 & NO2, P = 0"))

fig_no2_so2 = plot(res[["no2_so2"]], partial = FALSE,
                   xlimits = c(0,1030), ylimits = c(0.15,0.75),
                   ybreaks = seq(0.15,0.75,by = 0.1),
                   legend_texts = c("NO2 xmap SO2, P = 0",
                                    "SO2 xmap NO2, P = 0"))
fig_no2_so2_p = plot(res[["no2_so2"]], xlimits = c(0,1030),
                     ylimits = c(0.15,0.75), ybreaks = seq(0.15,0.75,by = 0.1),
                     legend_texts = c("NO2 xmap SO2 | CVD & RSP & O3, P = 0",
                                      "SO2 xmap NO2 | CVD & RSP & O3, P = 0"))

fig_no2_o3 = plot(res[["no2_o3"]], partial = FALSE,
                  xlimits = c(0,1030), ylimits = c(0.05,0.65),
                  ybreaks = seq(0.05,0.65,by = 0.1),
                  legend_texts = c("NO2 xmap O3, P = 0",
                                   "O3 xmap NO2, P = 0"))
fig_no2_o3_p = plot(res[["no2_o3"]], xlimits = c(0,1030),
                    ylimits = c(0.05,0.65), ybreaks = seq(0.05,0.65,by = 0.1),
                    legend_texts = c("NO2 xmap O3 | CVD & RSP & SO2, P = 0",
                                     "O3 xmap NO2 | CVD & RSP & SO2, P = 0"))

fig_so2_o3 = plot(res[["so2_o3"]], partial = FALSE,
                  xlimits = c(0,1030), ylimits = c(-0.05,0.45),
                  ybreaks = seq(-0.05,0.45,by = 0.1),
                  legend_texts = c("SO2 xmap O3, P = 0",
                                   "O3 xmap SO2, P = 0.094"))
fig_so2_o3_p = plot(res[["so2_o3"]], xlimits = c(0,1030),
                    ylimits = c(-0.05,0.45), ybreaks = seq(-0.05,0.45,by = 0.1),
                    legend_texts = c("SO2 xmap O3 | CVD & RSP & NO2, P = 0",
                                     "O3 xmap SO2 | CVD & RSP & NO2, P = 0.131"))

fig_pcm = cowplot::plot_grid(fig_cvd_rsp, fig_cvd_rsp_p, fig_cvd_no2, fig_cvd_no2_p,
                             fig_cvd_so2, fig_cvd_so2_p, fig_cvd_o3, fig_cvd_o3_p,
                             fig_rsp_no2, fig_rsp_no2_p, fig_rsp_so2, fig_rsp_so2_p,
                             fig_rsp_o3, fig_rsp_o3_p, fig_no2_so2, fig_no2_so2_p,
                             fig_no2_o3, fig_no2_o3_p, fig_so2_o3, fig_so2_o3_p,
                             ncol = 4, label_fontfamily = 'serif',
                             labels = letters[1:20],
                             label_x = -0.005, label_y = 1)
fig_pcm
```

![**Figure 3**. Partial cross mapping results between different air pollutants and cardiovascular diseases.](../man/figures/edm/fig_pcm-1.png)

From Figure 3, we can infer the following causal links:

<div class="figure">
<img src="../man/figures/edm/cvds_hk.png" alt="**Figure 4**. Causal interactions between air pollutants and cardiovascular diseases in Hong Kong." width="600px" />
<p class="caption">**Figure 4**. Causal interactions between air pollutants and cardiovascular diseases in Hong Kong.</p>
</div>

### US County Carbon Emissions and Temperature Dynamics

To examine whether a causal relationship exists between annual mean temperature and total annual CO₂ emissions, we implement the CMC method across counties.


``` r
library(tEDM)

carbon = readr::read_csv(system.file("case/carbon.csv.gz",package = "tEDM"))
## Rows: 113627 Columns: 4
## ── Column specification ─────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (4): year, fips, tem, carbon
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(carbon)
## # A tibble: 6 × 4
##    year  fips   tem     carbon
##   <dbl> <dbl> <dbl>      <dbl>
## 1  1981  1001  17.4 192607687.
## 2  1982  1001  18.4 187149414.
## 3  1983  1001  16.9 191584445.
## 4  1984  1001  17.8 199157579.
## 5  1985  1001  17.9 205207564.
## 6  1986  1001  18.5 218446030.

carbon_list = dplyr::group_split(carbon, by = fips)
length(carbon_list)
## [1] 3071
```

Using the 100th county as an example, we determine the appropriate embedding dimension by applying the FNN method.


``` r
tEDM::fnn(carbon_list[[100]],"carbon",E = 2:10,eps = stats::sd(carbon_list[[100]]$carbon))
##        E:1        E:2        E:3        E:4        E:5        E:6        E:7 
## 0.29729730 0.02702703 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 
##        E:8        E:9 
## 0.00000000 0.00000000
```

When E equals 3, the FNN ratio begins to drop to zero; therefore, we select $E = 3$ as the embedding dimension for the CMC analysis.


``` r
res = carbon_list |>
  purrr::map_dfr(\(.x) {
    g = tEDM::cmc(.x,"tem","carbon",E = 3,k = 20,dist.metric = "L2",progressbar = FALSE)
    return(g$xmap)
  })
head(res)
##   neighbors x_xmap_y_mean x_xmap_y_sig x_xmap_y_upper x_xmap_y_lower
## 1        20       0.17625 1.441152e-06      0.3079181     0.04458194
## 2        20       0.15625 5.526258e-08      0.2802472     0.03225277
## 3        20       0.15625 5.809091e-08      0.2804509     0.03204912
## 4        20       0.20125 3.147621e-05      0.3419196     0.06058040
## 5        20       0.20875 7.165149e-05      0.3525117     0.06498830
## 6        20       0.13125 2.408340e-10      0.2453769     0.01712311
##   y_xmap_x_mean y_xmap_x_sig y_xmap_x_upper y_xmap_x_lower
## 1       0.06375 4.154966e-26      0.1446512              0
## 2       0.06250 1.381916e-28      0.1398111              0
## 3       0.06250 4.667439e-26      0.1437169              0
## 4       0.06625 2.180138e-25      0.1478975              0
## 5       0.06875 2.677461e-23      0.1537491              0
## 6       0.05875 1.135327e-30      0.1338684              0

res_carbon = res |>
  dplyr::select(neighbors,
                carbon_tem = x_xmap_y_mean,
                tem_carbon = y_xmap_x_mean) |>
  tidyr::pivot_longer(c(carbon_tem, tem_carbon),
                      names_to = "variable", values_to = "value")
head(res_carbon)
## # A tibble: 6 × 3
##   neighbors variable    value
##       <dbl> <chr>       <dbl>
## 1        20 carbon_tem 0.176 
## 2        20 tem_carbon 0.0638
## 3        20 carbon_tem 0.156 
## 4        20 tem_carbon 0.0625
## 5        20 carbon_tem 0.156 
## 6        20 tem_carbon 0.0625
```


``` r
res_carbon$variable = factor(res_carbon$variable,
                             levels = c("carbon_tem", "tem_carbon"),
                             labels = c("carbon → tem", "tem → carbon"))
fig_case2 = ggplot2::ggplot(res_carbon,
                            ggplot2::aes(x = variable, y = value, fill = variable)) +
  ggplot2::geom_boxplot() +
  ggplot2::theme_bw() +
  ggplot2::scale_x_discrete(name = "") +
  ggplot2::scale_y_continuous(name = "Causal Strength",
                              expand = c(0,0),
                              limits = c(0,0.3),
                              breaks = seq(0,0.3,by = 0.05)) +
  ggplot2::theme(legend.position = "none")
fig_case2
```

![**Figure 5**. Causal strength scores between annual mean temperature and total annual CO₂ emissions across US counties, with embedding dimension E set to 6 and number of neighbors set to 20.](../man/figures/edm/fig_case2-1.png)

### COVID-19 Spread Across Japanese Prefectures

We examine the COVID-19 transmission between Tokyo and other prefectures by applying CCM to identify the underlying causal dynamics of the epidemic spread


``` r
library(tEDM)

covid = readr::read_csv(system.file("case/covid.csv",package = "tEDM"))
## Rows: 334 Columns: 47
## ── Column specification ─────────────────────────────────────────────────────────
## Delimiter: ","
## dbl (47): Hokkaido, Aomori, Iwate, Miyagi, Akita, Yamagata, Fukushima, Ibarak...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(covid)
## # A tibble: 6 × 47
##   Hokkaido Aomori Iwate Miyagi Akita Yamagata Fukushima Ibaraki Tochigi Gunma
##      <dbl>  <dbl> <dbl>  <dbl> <dbl>    <dbl>     <dbl>   <dbl>   <dbl> <dbl>
## 1        0      0     0      0     0        0         0       0       0     0
## 2        0      0     0      0     0        0         0       0       0     0
## 3        0      0     0      0     0        0         0       0       0     0
## 4        0      0     0      0     0        0         0       0       0     0
## 5        0      0     0      0     0        0         0       0       0     0
## 6        0      0     0      0     0        0         0       0       0     0
## # ℹ 37 more variables: Saitama <dbl>, Chiba <dbl>, Tokyo <dbl>, Kanagawa <dbl>,
## #   Niigata <dbl>, Toyama <dbl>, Ishikawa <dbl>, Fukui <dbl>, Yamanashi <dbl>,
## #   Nagano <dbl>, Gifu <dbl>, Shizuoka <dbl>, Aichi <dbl>, Mie <dbl>,
## #   Shiga <dbl>, Kyoto <dbl>, Osaka <dbl>, Hyogo <dbl>, Nara <dbl>,
## #   Wakayama <dbl>, Tottori <dbl>, Shimane <dbl>, Okayama <dbl>,
## #   Hiroshima <dbl>, Yamaguchi <dbl>, Tokushima <dbl>, Kagawa <dbl>,
## #   Ehime <dbl>, Kochi <dbl>, Fukuoka <dbl>, Saga <dbl>, Nagasaki <dbl>, …
```

The data are first differenced:


``` r
covid = covid |>
  dplyr::mutate(dplyr::across(dplyr::everything(),
                              \(.x) c(NA,diff(.x))))
```

Using Tokyo's COVID-19 infection data to test the optimal embedding dimension.


``` r
tEDM::fnn(covid,"Tokyo",E = 2:30,eps = stats::sd(covid$Tokyo)/10)
##        E:1        E:2        E:3        E:4        E:5        E:6        E:7 
## 0.79452055 0.16842105 0.01246106 0.00000000 0.00000000 0.00000000 0.00000000 
##        E:8        E:9       E:10       E:11       E:12       E:13       E:14 
## 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 
##       E:15       E:16       E:17       E:18       E:19       E:20       E:21 
## 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 
##       E:22       E:23       E:24       E:25       E:26       E:27       E:28 
## 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 0.00000000 
##       E:29 
## 0.00000000
```

Since the FNN ratio begins to approach zero when E equals 4, embedding dimensions from 4 onward are evaluated, and the pair of E and k yielding the highest self-prediction accuracy is selected for the CCM procedure.


``` r
tEDM::simplex(covid,"Tokyo","Tokyo",E = 4:50,k = 5:60)
## The suggested E and k for variable Tokyo is 4 and 7
```


``` r
res = names(covid)[-match("Tokyo",names(covid))] |>
  purrr::map_dfr(\(.l) {
    g = tEDM::ccm(covid,"Tokyo",.l,E = 4,k = 7,progressbar = FALSE)
    res = dplyr::mutate(g$xmap,x = "Tokyo",y = .l)
    return(res)
  })
head(res)
##   libsizes x_xmap_y_mean x_xmap_y_sig x_xmap_y_upper x_xmap_y_lower
## 1      331    0.70879268 4.221499e-52     0.65090130      0.7584883
## 2      331    0.15540844 4.475224e-03     0.04874688      0.2585655
## 3      331    0.61181618 1.410377e-35     0.53983767      0.6749128
## 4      331    0.51213248 1.160928e-23     0.42822713      0.5872831
## 5      331    0.02065942 7.071982e-01    -0.08700953      0.1278513
## 6      331    0.65485497 3.697190e-42     0.58882105      0.7122050
##   y_xmap_x_mean y_xmap_x_sig y_xmap_x_upper y_xmap_x_lower     x        y
## 1     0.7236820 2.936572e-55      0.6681758      0.7711770 Tokyo Hokkaido
## 2     0.4371497 5.653496e-17      0.3459266      0.5201856 Tokyo   Aomori
## 3     0.7574997 2.893418e-63      0.7076344      0.7998568 Tokyo    Iwate
## 4     0.7214586 8.962398e-55      0.6655925      0.7692846 Tokyo   Miyagi
## 5     0.2884528 8.400987e-08      0.1867672      0.3840232 Tokyo    Akita
## 6     0.5021635 1.126202e-22      0.4172045      0.5784216 Tokyo Yamagata

df1 = res |>
    dplyr::select(x,y,y_xmap_x_mean,y_xmap_x_sig)|>
    purrr::set_names(c("cause","effect","cs","sig"))
df2 = res |>
    dplyr::select(y,x,x_xmap_y_mean,x_xmap_y_sig) |>
    purrr::set_names(c("cause","effect","cs","sig"))
res_covid = dplyr::bind_rows(df1,df2)|>
  dplyr::filter(cause == "Tokyo") |>
  dplyr::arrange(dplyr::desc(cs))
head(res_covid,10)
##    cause   effect        cs           sig
## 1  Tokyo    Osaka 0.9340924 4.139602e-150
## 2  Tokyo Kanagawa 0.9299503 7.009751e-146
## 3  Tokyo  Saitama 0.9170512 3.296461e-134
## 4  Tokyo    Chiba 0.9148243 2.186195e-132
## 5  Tokyo    Hyogo 0.9074071 1.163021e-126
## 6  Tokyo    Aichi 0.9053216 3.880799e-125
## 7  Tokyo  Ibaraki 0.8765329 3.896926e-107
## 8  Tokyo     Nara 0.8746084 4.261884e-106
## 9  Tokyo Shizuoka 0.8659047 1.327148e-101
## 10 Tokyo    Kyoto 0.8604876  5.796571e-99
```

Using `0.90` as the threshold (rounded to two decimal places), we map the causal responses in the spread of COVID-19 from Tokyo for those with a causal strength greater than `0.90`.


``` r
res_covid = res_covid |>
  dplyr::mutate(cs = round(res_covid$cs,2)) |>
  dplyr::filter(cs >= 0.90)
res_covid
##   cause   effect   cs           sig
## 1 Tokyo    Osaka 0.93 4.139602e-150
## 2 Tokyo Kanagawa 0.93 7.009751e-146
## 3 Tokyo  Saitama 0.92 3.296461e-134
## 4 Tokyo    Chiba 0.91 2.186195e-132
## 5 Tokyo    Hyogo 0.91 1.163021e-126
## 6 Tokyo    Aichi 0.91 3.880799e-125

if (!requireNamespace("rnaturalearth")) {
  install.packages("rnaturalearth")
}
## Loading required namespace: rnaturalearth
jp = rnaturalearth::ne_states(country = "Japan")

if (!requireNamespace("tidygeocoder")) {
  install.packages("tidygeocoder")
}
## Loading required namespace: tidygeocoder
jpp = tibble::tibble(name = c("Tokyo",res_covid$effect)) |>
  dplyr::mutate(type = factor(c("source",rep("target",6)),
                              levels = c("source","target"))) |>
  tidygeocoder::geocode(state = name, method = "arcgis",
                        long = "lon", lat = "lat")
## Passing 7 addresses to the ArcGIS single address geocoder
## Query completed in: 6.4 seconds

fig_case3 = ggplot2::ggplot() +
  ggplot2::geom_sf(data = jp, fill = "#ffe7b7", color = "grey", linewidth = 0.45) +
  ggplot2::geom_curve(data = jpp[-1,],
                      ggplot2::aes(x = jpp[1,"lon",drop = TRUE],
                                   y = jpp[1,"lat",drop = TRUE],
                                   xend = lon, yend = lat),
                      curvature = 0.2,
                      arrow = ggplot2::arrow(length = ggplot2::unit(0.2, "cm")),
                      color = "#6eab47", linewidth = 1) +
  ggplot2::geom_point(data = jpp,
                      ggplot2::aes(x = lon, y = lat, color = type),
                      size = 1.25, show.legend = FALSE) +
  ggrepel::geom_text_repel(data = jpp,
                           ggplot2::aes(label = name, x = lon, y = lat, color = type),
                           show.legend = FALSE) +
  ggplot2::scale_color_manual(values = c(source = "#2c74b7",
                                         target = "#cf574b")) +
  ggplot2::coord_sf(xlim = range(jpp$lon) + c(-0.45,0.45),
                    ylim = range(jpp$lat) + c(-0.75,0.75)) +
  ggplot2::labs(x = "", y = "") +
  ggplot2::theme_bw() +
  ggplot2::theme(panel.background = ggplot2::element_rect(fill = "#9cd1fd", color = NA))

fig_case3
```

![**Figure 6**. The prefectures most affected by Tokyo, Osaka, Kanagawa, Saitama, Chiba, Hyogo, and Aichi, are located on the map.](../man/figures/edm/fig_case3-1.png)

## Reference

Sugihara, G., May, R., Ye, H., Hsieh, C., Deyle, E., Fogarty, M., Munch, S., 2012. Detecting Causality in Complex Ecosystems. Science 338, 496–500. [https://doi.org/10.1126/science.1227079][1].

Leng, S., Ma, H., Kurths, J., Lai, Y.-C., Lin, W., Aihara, K., Chen, L., 2020. Partial cross mapping eliminates indirect causal influences. Nature Communications 11. [https://doi.org/10.1038/s41467-020-16238-0][2].

Tao, P., Wang, Q., Shi, J., Hao, X., Liu, X., Min, B., Zhang, Y., Li, C., Cui, H., Chen, L., 2023. Detecting dynamical causality by intersection cardinal concavity. Fundamental Research. [https://doi.org/10.1016/j.fmre.2023.01.007][3].

Clark, A.T., Ye, H., Isbell, F., Deyle, E.R., Cowles, J., Tilman, G.D., Sugihara, G., 2015. Spatial convergent cross mapping to detect causal relationships from short time series. Ecology 96, 1174–1181. [https://doi.org/10.1890/14-1479.1][4].

Gan, T., Succar, R., Macrì, S., Marín, M.R., Porfiri, M., 2025. Causal discovery from city data, where urban scaling meets information theory. Cities 162, 105980. [https://doi.org/10.1016/j.cities.2025.105980][5].

&nbsp;

[1]: https://doi.org/10.1126/science.1227079
[2]: https://doi.org/10.1038/s41467-020-16238-0
[3]: https://doi.org/10.1016/j.fmre.2023.01.007
[4]: https://doi.org/10.1890/14-1479.1
[5]: https://doi.org/10.1016/j.cities.2025.105980