---
title: "Demo with scanpy"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Demo with scanpy}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

With anndata for R, interacting with other anndata-based Python packages such as scanpy becomes straightforward.

```{r check_on_cran, message=FALSE, warning=FALSE, echo=FALSE}
on_cran <- !identical(Sys.getenv("NOT_CRAN"), "true")

if (on_cran) {
  knitr::opts_chunk$set(eval = FALSE)
  knitr::asis_output(paste0(
    "<span style=\"color: red;\">**WARNING:** The outputs of this vignette are not rendered on CRAN due to package size limitations. ",
    "Please check the [Demo with scanpy](https://anndata.dynverse.org/articles/scanpy_demo.html) ",
    "vignette in the package documentation. </span>"
  ))
}
```

## Set up

To use another Python package (e.g. `scanpy`), you need to make sure that it is installed in the same ephemeral Python environment that `anndata` uses. 
You can let `reticulate` handle this for you by using the `py_require()` function:

```{r}
library(anndata)
library(reticulate)
py_require("scanpy")
```
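
As a quick sanity check, you can ask `reticulate` whether the module can actually be imported. This is only a sketch and not part of the original workflow; it assumes a reticulate version that provides `py_require()`, and it initialises Python as a side effect.

```{r}
library(reticulate)

# Declare the requirement, then check that the module is importable.
py_require("scanpy")
scanpy_ok <- py_module_available("scanpy")
scanpy_ok
```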

**TIP**: Check out the vignette on setting up Python package environments with reticulate: <https://rstudio.github.io/reticulate/articles/python_packages.html>.

## Download and load dataset

Let's use a dataset from the 10x Genomics website.
You can download it and load it into an anndata object with scanpy as follows:

```{r}
sc <- import("scanpy")

url <- "https://cf.10xgenomics.com/samples/cell-exp/6.0.0/SC3_v3_NextGem_DI_CellPlex_CSP_DTC_Sorted_30K_Squamous_Cell_Carcinoma/SC3_v3_NextGem_DI_CellPlex_CSP_DTC_Sorted_30K_Squamous_Cell_Carcinoma_count_sample_feature_bc_matrix.h5"

ad <- sc$read_10x_h5("dataset.h5", backup_url = url)

ad
```
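
To get a feel for what was loaded, the sketch below peeks at the cell and gene annotations. It assumes the `ad` object created in the chunk above; which columns appear in `ad$var` depends on the 10x file, so no particular output is implied.

```{r}
# Cell barcodes (observations) and gene names (variables)
head(ad$obs_names)
head(ad$var_names)

# Gene-level annotations stored alongside the count matrix
head(ad$var)
```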

## Preprocessing dataset

The resulting dataset is a wrapper for the Python class but behaves very much like an R object:

```{r}
ad[1:5, 3:5]
dim(ad)
```

But you can still call scanpy functions on it, for example to perform preprocessing:

```{r}
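# Drop cells with fewer than 200 detected genes
sc$pp$filter_cells(ad, min_genes = 200)
# Drop genes detected in fewer than 3 cells
sc$pp$filter_genes(ad, min_cells = 3)
# Normalize each cell's total counts to the median across cells
sc$pp$normalize_total(ad)
# Log-transform: X = log(X + 1)
sc$pp$log1p(ad)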
```
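
The scanpy calls modify `ad` in place, so the effect is visible from R straight away. As a rough check, the sketch below assumes the `ad` object from the chunks above and relies on scanpy's behaviour of recording `n_genes` in `ad$obs` (from `filter_cells`) and `n_cells` in `ad$var` (from `filter_genes`).

```{r}
# Dimensions after filtering (cells x genes)
dim(ad)

# Per-cell and per-gene counts recorded by the filtering steps
head(ad$obs[, "n_genes", drop = FALSE])
head(ad$var[, "n_cells", drop = FALSE])
```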

## Analysing your dataset in R

You can seamlessly switch back to using your dataset with other R functions, for example
by calculating the `rowMeans` of the expression matrix, which gives the mean expression per cell because rows are cells:

```{r}
library(Matrix)
rowMeans(ad$X[1:10, ])
```
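
Because `ad$X` stores cells as rows and genes as columns, `rowMeans()` summarises cells while `colMeans()` summarises genes. The sketch below illustrates the difference on a small made-up sparse matrix, so it runs without Python; `toy` and its values are hypothetical stand-ins for `ad$X`.

```{r}
library(Matrix)

# Toy stand-in for ad$X: 3 cells (rows) x 4 genes (columns), made-up values
toy <- sparseMatrix(
  i = c(1, 1, 2, 3, 3),
  j = c(1, 3, 2, 1, 4),
  x = c(2, 4, 6, 1, 3),
  dims = c(3, 4),
  dimnames = list(paste0("cell", 1:3), paste0("gene", 1:4))
)

cell_means <- rowMeans(toy) # one value per cell
gene_means <- colMeans(toy) # one value per gene

cell_means
gene_means
```

The same two calls apply to `ad$X` itself, or to any subset of it such as `ad$X[1:10, ]`.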