
Differential analysis of proximity scores
DifferentialProximityAnalysis.Rd
Runs a differential analysis (Wilcoxon rank-sum test) on proximity scores calculated
from Proximity Network Assay (PNA) data generated with the Pixelator data processing pipeline.
Usage
DifferentialProximityAnalysis(object, ...)
# S3 method for class 'data.frame'
DifferentialProximityAnalysis(
object,
contrast_column,
reference,
targets = NULL,
group_vars = NULL,
proximity_metric = "join_count_z",
metric_type = c("all", "self", "co"),
backend = c("dplyr", "data.table"),
min_n_obs = 0,
p_adjust_method = c("bonferroni", "holm", "hochberg", "hommel", "BH", "BY", "fdr"),
verbose = TRUE,
...
)
# S3 method for class 'Seurat'
DifferentialProximityAnalysis(
object,
contrast_column,
reference,
targets = NULL,
assay = NULL,
group_vars = NULL,
proximity_metric = "join_count_z",
metric_type = c("all", "self", "co"),
min_n_obs = 0,
p_adjust_method = c("bonferroni", "holm", "hochberg", "hommel", "BH", "BY", "fdr"),
verbose = TRUE,
...
)
Arguments
- object
An object containing PNA proximity scores
- ...
Not yet implemented
- contrast_column
The name of the column where the group labels are stored. This column must include the target and reference groups.
- reference
The name of the reference group
- targets
The names of the target groups. These groups will be compared to the reference group. If the value is set to NULL (default), all groups available in contrast_column except reference will be compared to the reference group.
- group_vars
An optional character vector with column names to group the tests by.
- proximity_metric
The proximity metric to use. Any numeric data column in the proximity score table can be selected. The default is "join_count_z".
- metric_type
One of "all", "self" or "co". If "all", all pairwise comparisons are considered. If "self", only protein pairs of the same type are considered. If "co", only protein pairs of different types are considered.
- backend
One of "dplyr" or "data.table". The latter requires the dtplyr package to be installed.
- min_n_obs
Minimum number of observations allowed in a group. Target groups with fewer observations than min_n_obs will be skipped.
- p_adjust_method
One of "bonferroni", "holm", "hochberg", "hommel", "BH", "BY" or "fdr" (see ?p.adjust for details).
- verbose
Print messages
- assay
Name of assay to use
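The options for p_adjust_method are the method names accepted by stats::p.adjust. As a rough illustration of how the choice affects the result, the sketch below adjusts a small, made-up vector of p-values (p_raw is hypothetical) with two of the methods. It does not show which set of tests DifferentialProximityAnalysis adjusts across; it only demonstrates the base R function.

```r
# Hypothetical raw p-values from five tests (illustrative only)
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.2)

# Bonferroni multiplies each p-value by the number of tests (capped at 1)
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate and is less conservative
p_bh <- p.adjust(p_raw, method = "BH")

# "fdr" is an alias for "BH" in stats::p.adjust
p_fdr <- p.adjust(p_raw, method = "fdr")

data.frame(p_raw, p_bonf, p_bh)
```

Bonferroni is the most conservative of the listed methods, so it is a cautious choice when many marker pairs are tested.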
Details
If you are working with a Seurat object created with pixelatorR, the proximity scores
can be accessed with ProximityScores.
The input object should contain a contrast_column (character vector or factor)
that includes information about the groups to compare. A typical example is a column
with sample labels, for instance: "control", "stimulated1", "stimulated2". If the input
object is a Seurat object, the contrast_column should be available in
the meta.data slot. For those familiar with FindMarkers from Seurat,
contrast_column is equivalent to the group.by parameter.
The targets parameter specifies a character vector with the names of the groups
to compare with reference. targets can be a single group name or a vector of
group names, while reference can only refer to a single group. Both targets
and reference should be present in the contrast_column. These parameters
are similar to the ident.1 and ident.2 parameters in FindMarkers.
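To make the target versus reference comparison concrete, here is a minimal sketch of a single Wilcoxon rank-sum test using base R's wilcox.test. The score vectors are simulated and their names (scores_reference, scores_target) are hypothetical. This is not the package's internal implementation, only an illustration of the kind of test described above for one marker pair.

```r
# Simulated proximity scores for one marker pair (hypothetical values)
set.seed(1)
scores_reference <- rnorm(50, mean = 0, sd = 10)  # e.g. the "control" group
scores_target <- rnorm(50, mean = 5, sd = 10)     # e.g. the "stimulated1" group

# Two-sided Wilcoxon rank-sum (Mann-Whitney) test: target vs reference
res <- wilcox.test(x = scores_target, y = scores_reference)
res$p.value

# Difference in medians as a simple summary of the shift between groups
median(scores_target) - median(scores_reference)
```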
Additional groups
The test is always computed between targets and reference, but it is possible
to add additional grouping variables with group_vars. If group_vars is used,
each comparison is split into the groups defined by group_vars. For instance, if we
have annotated cells into cell type populations and saved these annotations in a meta.data
column called "cell_type", we can set group_vars = "cell_type" to run the
tests separately for each cell type.
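Conceptually, splitting by a grouping variable means running the same target versus reference test once per group. The sketch below imitates this with base R on a small simulated table. The data frame toy and its values are made up, and the code is an assumption about the general idea rather than the function's actual implementation.

```r
# Simulated table with a contrast column, a grouping column and a score column
set.seed(2)
toy <- data.frame(
  sampleID = rep(c("control", "stimulated1"), each = 40),
  cell_type = rep(c("B", "T"), times = 40),
  join_count_z = rnorm(80, sd = 10)
)

# Split by the grouping variable, then test target vs reference within each split
per_group <- lapply(split(toy, toy$cell_type), function(d) {
  wilcox.test(
    x = d$join_count_z[d$sampleID == "stimulated1"],
    y = d$join_count_z[d$sampleID == "control"]
  )$p.value
})

# One p-value per cell type
p_values <- unlist(per_group)
p_values
```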
Types of comparisons
Consider a scenario where we have a Seurat object (seurat_object) with Proximity Network Assay
(PNA) data. seurat_object contains a meta.data column called "sampleID"
that records which sample each component originated from. This column could have
three sample IDs: "control", "stimulated1" and "stimulated2". In addition, we have a column
called "cell_type" that holds the cell type identity of each component.
If we want to compare the "stimulated1" group to the "control" group:
dp_markers <- DifferentialProximityAnalysis(
  object = seurat_object,
  contrast_column = "sampleID",
  reference = "control",
  targets = "stimulated1"
)
If we want to compare the "stimulated1" and "stimulated2" groups to the "control" group:
dp_markers <- DifferentialProximityAnalysis(
  object = seurat_object,
  contrast_column = "sampleID",
  reference = "control",
  targets = c("stimulated1", "stimulated2")
)
If we want to compare the "stimulated1" and "stimulated2" groups to the "control" group, and split the tests by cell type:
dp_markers <- DifferentialProximityAnalysis(
  object = seurat_object,
  contrast_column = "sampleID",
  reference = "control",
  targets = c("stimulated1", "stimulated2"),
  group_vars = "cell_type"
)
Examples
# TODO: Update examples with real data
library(dplyr)
example_data <- tidyr::expand_grid(
  marker_1 = c("HLA-ABC", "B2M", "CD4", "CD8", "CD20", "CD19", "CD45", "CD43") %>%
    rep(each = 50),
  marker_2 = c("HLA-ABC", "B2M", "CD4", "CD8", "CD20", "CD19", "CD45", "CD43") %>%
    rep(each = 50)
) %>%
  mutate(
    join_count_z = rnorm(n(), sd = 10)
  )
example_data <- example_data %>%
  mutate(sampleID = "ctrl") %>%
  bind_rows(
    example_data %>%
      mutate(join_count_z = join_count_z + 1) %>%
      mutate(sampleID = "treatment")
  )
# Compute statistics
dp_results <- DifferentialProximityAnalysis(
  example_data,
  contrast_column = "sampleID",
  reference = "ctrl",
  proximity_metric = "join_count_z",
  metric_type = "self"
)
#> ℹ Computing Running Wilcoxon rank-sum test for each marker pair across the following comparisons:
#>
#> • treatment vs ctrl