
Approximate nearest neighbors using Annoy
FindAnnoyNeighbors.Rd
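This function takes a numeric matrix and returns the approximate nearest neighbors of each row item, together with their distances, using an Annoy index built with n_trees trees and queried for n_nn neighbors per item. Distances are Euclidean by default; see annoy_alg for the other available metrics.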
Usage
FindAnnoyNeighbors(
  x,
  cells = NULL,
  n_trees = 50L,
  n_nn = 10L,
  search_k = NULL,
  annoy_alg = c("euclidean", "angular", "manhattan", "hamming")
)
Arguments
- x
A numeric matrix containing the data in which to find nearest neighbors. Rows are cells, and columns are features.
- cells
A character vector with cell names to find nearest neighbors for. If NULL, all cells are used.
- n_trees
An integer with the number of trees to build in the Annoy index.
- n_nn
An integer with the number of nearest neighbors to find.
- search_k
An integer with the number of nodes to search in the Annoy index. Default is n_trees * n_nn.
- annoy_alg
A character string specifying which distance metric to use. Default is AnnoyEuclidean ("euclidean"). Available options are "euclidean", "angular", "manhattan", and "hamming".
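The arguments above map closely onto an Annoy index. The following is only a rough sketch of that mapping, assuming the RcppAnnoy package is installed; `annoy_neighbors_sketch` is a hypothetical helper written for illustration and is not the source of FindAnnoyNeighbors. It builds a Euclidean index, applies the documented search_k default, and queries the neighbors of the first row only.

```r
# Sketch only: assumes RcppAnnoy; hypothetical helper, not the package's code.
annoy_neighbors_sketch <- function(x, n_trees = 50L, n_nn = 10L, search_k = NULL) {
  # Mirror the documented default for search_k.
  if (is.null(search_k)) search_k <- n_trees * n_nn

  # One index dimension per feature (column).
  index <- methods::new(RcppAnnoy::AnnoyEuclidean, ncol(x))
  index$setSeed(42L)  # fixed seed so the trees are reproducible

  # Annoy items are zero-based, so row i is stored as item i - 1.
  for (i in seq_len(nrow(x))) index$addItem(i - 1L, x[i, ])
  index$build(n_trees)

  # Neighbors of the first row (item 0), returned with their distances.
  index$getNNsByItemList(0L, n_nn, search_k, TRUE)
}

if (requireNamespace("RcppAnnoy", quietly = TRUE)) {
  x <- matrix(rnorm(1000), ncol = 10)
  nn_first <- annoy_neighbors_sketch(x)
  str(nn_first)  # a list with `item` (zero-based indices) and `distance`
}
```

More trees and a larger search_k generally trade build and query time for accuracy, which is why the default search budget scales with both n_trees and n_nn.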
Value
A tibble with the following columns:
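id: The name of the query cell
index: The zero-based index of the query cell in the Annoy index
item: The zero-based index of the neighbor in the Annoy index
distance: The distance (Euclidean by default) from the query cell to the neighbor
nn: The rank of the neighbor, from closest to farthest; rank 1 is typically the query cell itself, at distance 0
neighbor: The name of the neighbor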
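To make the column layout concrete, here is a minimal base R sketch that computes exact (not approximate) Euclidean neighbors and arranges them in the same six columns. `exact_neighbors` is a hypothetical helper for illustration: it returns a data.frame rather than a tibble, uses brute-force distances instead of Annoy, and assumes the zero-based index and item convention visible in the example output.

```r
# Exact Euclidean neighbors in base R, shaped like the documented columns.
# Hypothetical reference helper; not how FindAnnoyNeighbors is implemented.
exact_neighbors <- function(x, n_nn = 10L) {
  ids <- rownames(x)
  if (is.null(ids)) ids <- as.character(seq_len(nrow(x)))

  d <- as.matrix(dist(x))  # full pairwise Euclidean distance matrix

  rows <- lapply(seq_len(nrow(x)), function(i) {
    # Closest rows first; row i itself comes first with distance 0.
    ord <- order(d[i, ])[seq_len(n_nn)]
    data.frame(
      id = ids[i],
      index = i - 1L,              # zero-based position of the query row
      item = ord - 1L,             # zero-based position of each neighbor
      distance = unname(d[i, ord]),
      nn = seq_len(n_nn),          # rank, 1 = closest
      neighbor = ids[ord],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

set.seed(1)
x <- matrix(rnorm(1000), ncol = 10)
res <- exact_neighbors(x, n_nn = 10L)
head(res, 3)
```

On a small matrix like this one, comparing such an exact result with the Annoy output is a simple way to judge whether n_trees and search_k are large enough.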
Examples
x <- matrix(rnorm(1000), ncol = 10)
FindAnnoyNeighbors(x, n_trees = 50, n_nn = 10)
#> # A tibble: 1,000 × 6
#>    id    index  item distance    nn neighbor
#>    <chr> <dbl> <int>    <dbl> <int> <chr>
#>  1 1         0     0     0        1 1
#>  2 1         0    52     2.13     2 53
#>  3 1         0     6     2.34     3 7
#>  4 1         0    16     2.81     4 17
#>  5 1         0    11     3.02     5 12
#>  6 1         0    60     3.03     6 61
#>  7 1         0    40     3.06     7 41
#>  8 1         0    68     3.10     8 69
#>  9 1         0    53     3.14     9 54
#> 10 1         0    36     3.22    10 37
#> # ℹ 990 more rows