spapros.se.select_reference_probesets
- spapros.se.select_reference_probesets(adata, n, genes_key='highly_variable', obs_key='celltype', methods=['PCA', 'DE', 'HVG', 'random'], seeds=[0], verbosity=2, save_dir=None)
Select reference probesets with basic selection methods.
- Parameters:
adata (AnnData) – Data with log-normalised counts in adata.X.
n (int) – Number of selected genes.
genes_key (Optional[str]) – adata.var key for the subset of preselected genes to run the selections on (typically ‘highly_variable’). Set to None to run the selections on all genes.
obs_key (str) – Column name of adata.obs for which marker scores are calculated. Only required for method ‘DE’.
methods (Union[List[str], Dict[str, Dict]]) –
Methods used for the selections. The supported methods, which are also the default, are [‘PCA’, ‘DE’, ‘HVG’, ‘random’]. To specify hyperparameters for the methods, provide a dictionary instead of a list, e.g.:
{'DE': {}, 'PCA': {'n_pcs': 30}, 'HVG': {}, 'random': {}}
seeds (List[int]) – List of random seeds. For each seed, one random gene set is selected if ‘random’ is in methods.
verbosity (int) – Verbosity level.
save_dir (Optional[str]) – Directory path where all results are saved.
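A call that passes per-method hyperparameters might look like the following sketch. It assumes spapros is installed and that `adata` already holds log-normalised counts, a boolean `highly_variable` column in `adata.var`, and a `celltype` column in `adata.obs`. The helper names `build_methods` and `select_baselines` and the choice of `n=50` are illustrative and are not part of the library.

```python
def build_methods(n_pcs=30):
    """Return the dictionary form of `methods`, setting one PCA hyperparameter."""
    # An empty dict means "use the method's default hyperparameters".
    return {"DE": {}, "PCA": {"n_pcs": n_pcs}, "HVG": {}, "random": {}}


def select_baselines(adata, n=50, seeds=(0,)):
    """Hypothetical wrapper around spapros.se.select_reference_probesets."""
    # Imported lazily so this sketch can be loaded without spapros installed.
    import spapros

    return spapros.se.select_reference_probesets(
        adata,
        n,
        genes_key="highly_variable",  # adata.var column used to subset genes
        obs_key="celltype",           # adata.obs column, only used by 'DE'
        methods=build_methods(),
        seeds=list(seeds),
        verbosity=2,
        save_dir=None,
    )
```

Passing a plain list such as `['PCA', 'DE']` instead of the dictionary runs those methods with their default hyperparameters.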
- Returns:
Dictionary with one entry for each method. The key is the selection method name and the value is a DataFrame with the same index as adata.var and at least one boolean column called ‘selection’ representing the selected probeset. For some methods, additional information is provided in other columns.
- Return type:
Dict[str, DataFrame]
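Because each returned DataFrame shares its index with adata.var and carries a boolean ‘selection’ column, the selected gene names can be collected per method. The sketch below assumes `probesets` is the dictionary returned by this function; the helper name `selected_genes` is hypothetical.

```python
def selected_genes(probesets):
    """Map each method name to the list of genes flagged in its 'selection' column."""
    genes = {}
    for method, df in probesets.items():
        # df.index holds the gene names (same index as adata.var);
        # df["selection"] is True for genes that belong to the probeset.
        genes[method] = [g for g, keep in zip(df.index, df["selection"]) if keep]
    return genes
```

With pandas, `df.index[df["selection"]].tolist()` is an equivalent one-liner for a single method.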