iss_patcher.patch

iss_patcher.patch(iss, gex, min_genes=3, obs_to_take=None, cont_obs_to_take=None, nanmean=False, round_counts=True, chunk_size=100000, computation='annoy', neighbours=15, obsm_fraction=False, obsm_pbs=False)

Identify the nearest neighbours of low dimensionality observations in related higher dimensionality data, then approximate features absent from the low dimensionality data as the means of their high dimensionality neighbours. The data is log-normalised and z-scored prior to KNN inference.
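
The procedure described above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the package's implementation: the helper names `log_normalise_zscore` and `patch_sketch`, the toy matrices, the per-cell normalisation target of 1e4, z-scoring each dataset separately, and the brute-force neighbour search (standing in for annoy, PyNNDescent or cKDTree) are all choices made for the example.

```python
import numpy as np

def log_normalise_zscore(counts, target_sum=1e4):
    """Log-normalise raw counts per cell, then z-score each gene (assumed preprocessing)."""
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    logged = np.log1p(counts / totals * target_sum)
    mean = logged.mean(axis=0)
    std = logged.std(axis=0)
    std[std == 0] = 1.0  # constant genes stay at zero after centring
    return (logged - mean) / std

def patch_sketch(iss_counts, gex_counts, shared_idx, neighbours=3):
    """Impute all gex genes for each iss cell as the mean of its gex neighbours.

    iss_counts: (n_iss, n_shared) raw counts over the shared genes
    gex_counts: (n_gex, n_genes) raw counts over all genes
    shared_idx: columns of gex_counts that match the iss genes, in order
    """
    iss_z = log_normalise_zscore(iss_counts.astype(float))
    gex_z = log_normalise_zscore(gex_counts[:, shared_idx].astype(float))
    # Brute-force squared Euclidean distances, shape (n_iss, n_gex)
    d2 = ((iss_z[:, None, :] - gex_z[None, :, :]) ** 2).sum(axis=2)
    # Indices of the `neighbours` closest gex cells for each iss cell
    nn = np.argsort(d2, axis=1)[:, :neighbours]
    # Mean raw counts of those neighbours, over every gex gene
    return gex_counts[nn].mean(axis=1), nn

# Toy data: 50 gex cells with 8 genes, 10 iss cells measuring 3 of those genes
rng = np.random.default_rng(0)
gex_counts = rng.poisson(5, size=(50, 8))
shared_idx = [0, 2, 5]
iss_counts = rng.poisson(5, size=(10, 3))
imputed, nn = patch_sketch(iss_counts, gex_counts, shared_idx, neighbours=3)
```

Here `imputed` has one row per iss cell and one column per gex gene, including the genes the iss data never measured. A brute-force search is only practical for small inputs, which is presumably why the package offers dedicated KNN backends.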

Input

iss : AnnData

The low dimensionality data object, with raw counts in .X.

gex : AnnData

The high dimensionality data object, with raw counts in .X.

min_genes : int, optional (default: 3)

Passed to scanpy.pp.filter_cells(), which is run on the shared feature space of iss and gex.

obs_to_take : str or list of str, optional (default: None)

If provided, will report the most common value of the specified gex.obs column(s) for the neighbours of each iss cell. Discrete metadata only.
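
Reporting the most common value amounts to a majority vote over each cell's neighbours. The sketch below is illustrative only: the helper `most_common_label` and the labels are hypothetical, and how the package itself resolves ties is not stated in this documentation.

```python
from collections import Counter

def most_common_label(neighbour_labels):
    """Return the most frequent label among one cell's neighbours."""
    return Counter(neighbour_labels).most_common(1)[0][0]

# Hypothetical labels of the gex neighbours of three iss cells
neighbour_labels = [
    ["T cell", "T cell", "B cell"],
    ["B cell", "B cell", "B cell"],
    ["NK cell", "T cell", "NK cell"],
]
calls = [most_common_label(labels) for labels in neighbour_labels]
```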

cont_obs_to_take : str or list of str, optional (default: None)

If provided, will report the average of the values of the specified gex.obs column(s) for the neighbours of each iss cell. Continuous metadata only.

nanmean : bool, optional (default: False)

If True, will also compute an equivalent of np.nanmean() for each cont_obs_to_take.
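
The difference between a plain average and an np.nanmean()-style average only shows up when some neighbours lack a value. A small sketch with hypothetical metadata values (the array below is invented for illustration):

```python
import numpy as np

# Hypothetical continuous metadata for the 3 neighbours of 2 iss cells
neighbour_values = np.array([
    [1.0, 2.0, 3.0],
    [4.0, np.nan, 8.0],
])
plain_mean = neighbour_values.mean(axis=1)       # a NaN neighbour makes the result NaN
nan_mean = np.nanmean(neighbour_values, axis=1)  # NaN neighbours are ignored
```

For the second cell the plain mean is NaN, while the NaN-aware mean averages the two remaining neighbours.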

round_counts : bool, optional (default: True)

If True, will round the computed counts to the nearest integer.

chunk_size : int, optional (default: 100000)

If round_counts is True, will compute iss profiles this many observations at a time and round them to reduce RAM use. A larger value means fewer matrix operations (i.e. quicker run time) at the cost of more memory.
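
The memory trade-off can be illustrated with a chunked loop. This is a sketch under assumptions rather than the package's code: the helper `chunked_rounded_means`, the dense arrays, and the random neighbour indices are hypothetical, and np.rint rounds exact halves to the nearest even integer, which may differ from the package's rounding.

```python
import numpy as np

def chunked_rounded_means(gex_counts, nn, chunk_size=4):
    """Average neighbour counts chunk by chunk, rounding each chunk to integers."""
    out = np.empty((nn.shape[0], gex_counts.shape[1]), dtype=np.int64)
    for start in range(0, nn.shape[0], chunk_size):
        stop = start + chunk_size
        # Only this chunk's floating-point means are held in memory at once
        chunk = gex_counts[nn[start:stop]].mean(axis=1)
        out[start:stop] = np.rint(chunk).astype(np.int64)
    return out

rng = np.random.default_rng(1)
gex_counts = rng.poisson(3, size=(20, 6))
nn = rng.integers(0, 20, size=(10, 5))  # hypothetical neighbour indices
rounded = chunked_rounded_means(gex_counts, nn, chunk_size=4)
```

The result does not depend on the chunk size; only the peak size of the intermediate floating-point array and the number of loop iterations change.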

computation : str, optional (default: "annoy")

The package supports KNN inference via annoy (specify "annoy"), PyNNDescent (specify "pynndescent") and scipy's cKDTree (specify "cKDTree"). Annoy identifies approximate neighbours and runs quicker; cKDTree identifies exact neighbours and is a bit slower.

neighbours : int, optional (default: 15)

How many neighbours in gex to identify for each iss cell.

obsm_fraction : bool, optional (default: False)

If True, will report the full fraction distribution of each obs_to_take in .obsm of the resulting object.
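
A fraction distribution of this kind can be pictured as one row per iss cell and one column per category. The sketch below is an assumption about the general idea only: the helper `label_fractions`, the category order, and the labels are hypothetical, and the exact layout the package writes to .obsm is not specified here.

```python
from collections import Counter

def label_fractions(neighbour_labels, categories):
    """Fraction of neighbours carrying each category, in a fixed column order."""
    counts = Counter(neighbour_labels)
    total = len(neighbour_labels)
    return [counts[c] / total for c in categories]

categories = ["B cell", "NK cell", "T cell"]
# Hypothetical labels of the 4 gex neighbours of two iss cells
neighbour_labels = [
    ["T cell", "T cell", "B cell", "T cell"],
    ["NK cell", "B cell", "NK cell", "B cell"],
]
fractions = [label_fractions(labels, categories) for labels in neighbour_labels]
```

Each row sums to 1, and the second cell shows why the full distribution can be more informative than a single most common value: its neighbours are split evenly between two categories.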

obsm_pbs : bool, optional (default: False)

If True, will store the identified gex neighbours for each iss cell in .obsm['pbs']. A corresponding vector of gex.obs_names will be stored in .uns['pbs_gex_obs_names'].