Profiling Clusters#
After clustering single-cell RNA-seq data into discrete populations, the next essential step is to profile each cluster by identifying its defining molecular characteristics. This notebook demonstrates how to use the exprmat package for comprehensive cluster profiling, including the identification of cluster-specific marker genes through differential expression testing, visualization of marker gene expression patterns across clusters, and functional annotation of cluster signatures. These profiling approaches facilitate biological interpretation of clustering results, enabling the assignment of cell type identities and the discovery of novel or rare subpopulations within heterogeneous tissues.
The following methods can be used to describe and characterize cluster features:
Finding marker genes, which serves as the foundation for all other methods. These are genes that are particularly highly or lowly expressed in a cluster of interest. After obtaining marker genes, you can examine them directly and look for signatures. For well-known cell type classifications, manual identification using prior knowledge is often the most convenient and quickest approach. However, prior knowledge carries the risk of error.
Gene set enrichment analysis, which uses gene sets associated with known functions to determine whether marker genes appear more or less frequently in a gene set than expected by random chance, indicating that the cluster’s characteristics are related to the functional annotation of that gene set. This provides an automated annotation method, but requires ensuring that the gene sets meet the necessary criteria. Such methods include ORA, GSEA, GSVA, and basic rank-score methods.
Homology-based annotation, if the species you are studying lacks well-annotated gene sets, or if the cell type to be identified is not well-defined, you may be able to borrow knowledge from other species. Due to the complexity of evolutionary relationships, sequence-homologous genes may no longer perform their original functions.
We will load the data directly from the integrated dataset:
[1]:
%load_ext autoreload
%autoreload 2
[2]:
import exprmat as em
# set working directory
em.setwd('../../../data')
ver = em.version()
[i] exprmat 0.2.66 / exprmat-db 0.2.66
[i] os: posix (linux) platform version: 6.8.0-90-generic
[i] loaded configuration from /home/data/yangz/.exprmatrc
[i] current working directory: /home/data/yangz/packages/exprmat/data
[i] current database directory: /home/data/yangz/packages/database (0.2.66)
[i] resident memory: 774.93 MiB
[i] virtual memory: 5.95 GiB
[3]:
expm = em.load_experiment('expm/scrna', load_samples = False, load_subset = 'mono-neutro')
[!] samples are not dumped in the experiment directory.
[5]:
print(expm)
annotated data of size 9754 × 19651
subset mono-neutro of size 9754 × 19651
contains modalities: rna
modality [rna]
obs : sample <cat> <c/sample> batch <cat> <c/batch> group <cat> <c> modality <cat> <c/modality>
taxa <cat> <c/taxa> barcode <o> <o> ubc <o> <o> n.umi <f64> <i> n.genes <i64> <i>
n.mito <f64> <f> n.ribo <f64> <f> pct.mito <f64> <f> pct.ribo <f64> <f>
filter <bool> <bool> score.doublet <f64> <f> score.doublet.se <f64> <f>
is.doublet <bool> <bool> qc <bool> <bool/qc> leiden <cat> <c> sc3.5 <cat> <c>
sc3.10 <cat> <c> sc3.20 <cat> <c> sc3.30 <cat> <c> cell.type <cat> <c>
kde.umap <f64> <f/kde> psbulk <cat>
var : chr <cat> <c/chromosome> start <i64> <i> end <i64> <i> strand <cat> <c/strand> id <o> <o>
subtype <cat> <c/gsubtype> gene <cat> <o/gene> tlen <f64> <i/tlen> cdslen <i64> <i/cdslen>
assembly <cat> <c> uid <o> <o/ugene> vst.hvg <bool> <bool/hvg> vst.all.means <f64> <f>
vst.all.vars <f64> <f> vst.all.vars.norm <f64> <f> vst.all.hvg.rank <f32> <f>
vst.all.hvg <bool> <bool>
layers : counts <f32> <i/counts> norm <f32> <f>
obsm : cnmf.10 <df> <f/embedding/usage> harmony <arr:f32(35)> <f> knn <arr:i32(100)> <i/knni>
knn.d <arr:f32(100)> <f/knnd> pca <arr:f64(35)> <f/embedding/pca>
umap <arr:f32(2)> <f/embedding>
varm : cnmf.10 <arr:f64(10)> <f/weights> cnmf.coef.10 <arr:f64(10)> <f/usage-coef>
pca <arr:f64(35)> <f/weights>
obsp : connectivities <csr:f32> <f/connectivity> distances <csr:f32> <f/distance>
uns : cell.type.colors cell.type_colors cnmf <cnmf> cnmf.args <o>
cnmf.density.10 <cnmf-density> cnmf.dist.10 <f/connectivity> cnmf.stats <cnmf-stats>
commands <system> kde.umap <kde-stats> leiden <o> leiden.colors <o> markers <markers>
neighbors <knn> pca <dict> sc3.10.colors <o> sc3.20.colors <o> sc3.30.colors <o>
sc3.5.colors <o> slots <system> umap <o>
[*] samples not loaded from disk.
[6]:
fig = expm.rna.plot_multiple_embedding(
basis = 'umap', features = [
'Ly6a', 'F13a1', 'Flt3', 'Irf8',
'Csf1r', 'Vcan', 'Ly6g', 'cell.type'
], ncols = 4,
sort = True, figsize = (10, 5), dpi = 100, legend = False,
annotate_style = 'text', annotate_fontsize = 8, ptsize = 2
)
[7]:
fig = expm.rna.plot_embedding(
basis = 'umap', color = 'cell.type',
legend = False,
annotate = True, annotate_style = 'text', annotate_fontsize = 8,
contour_plot = False,
sort = True, figsize = (2.5, 2.5), dpi = 100,
run_on_splits = True, split_key = 'sample', split_selection = ['normal', 'niche']
)
Finding marker genes#
The markers subroutine can be used to obtain differentially expressed genes of a subpopulation relative to others (or another subpopulation).
[8]:
expm.rna.markers(
groupby = 'cell.type',
mask_var = None,
groups = ['Neu'],
reference = 'rest',
n_genes = None, rankby_abs = False, pts = True,
key_added = 'deg.c7',
method = 't-test',
corr_method = 'benjamini-hochberg',
tie_correct = False,
gene_symbol = 'gene',
layer = 'X'
)
[11]:
expm.rna.get_markers(
slot = 'deg.c7',
min_pct = 0.5,
max_pct_reference = 0.75,
max_q = 0.05,
min_lfc = 1.0, max_lfc = 25,
remove_zero_pval = False
)[['names', 'lfc', 'q', 'pct', 'pct.reference', 'log10.q', 'gene']]
[i] fetched diff `Neu` over `rest` (302 genes)
[11]:
| names | lfc | q | pct | pct.reference | log10.q | gene | |
|---|---|---|---|---|---|---|---|
| 0 | rna:mmu:g34454 | 5.000824 | 0.000000e+00 | 0.968536 | 0.358268 | 300.000000 | Mmp9 |
| 2 | rna:mmu:g1350 | 5.083699 | 0.000000e+00 | 0.917238 | 0.192585 | 300.000000 | Cxcr2 |
| 3 | rna:mmu:g48106 | 3.973207 | 0.000000e+00 | 0.974351 | 0.429462 | 300.000000 | Mxd1 |
| 4 | rna:mmu:g33362 | 4.648287 | 0.000000e+00 | 0.925291 | 0.252953 | 300.000000 | Hdc |
| 5 | rna:mmu:g31928 | 4.757430 | 0.000000e+00 | 0.780644 | 0.116142 | 300.000000 | Dhrs9 |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 544 | rna:mmu:g25070 | 1.034380 | 6.239966e-111 | 0.637041 | 0.597441 | 110.204818 | Pim1 |
| 559 | rna:mmu:g1820 | 1.089561 | 1.149853e-104 | 0.634954 | 0.393701 | 103.939358 | Agap1 |
| 584 | rna:mmu:g57010 | 1.001317 | 3.344159e-99 | 0.560394 | 0.471129 | 98.475713 | Nfat5 |
| 609 | rna:mmu:g4166 | 1.012059 | 6.729340e-94 | 0.520131 | 0.377625 | 93.172028 | Hsd11b1 |
| 629 | rna:mmu:g60743 | 1.384094 | 1.305682e-89 | 0.681032 | 0.672572 | 88.884163 | Ltf |
302 rows × 7 columns
Over-representation analysis#
Over-representation analysis uses contingency table tests to determine whether a gene set appears more frequently in the marker gene list than expected by random chance.
[12]:
expm.rna.enrich_ora(
taxa = 'mmu',
de_slot = 'deg.c7', group_name = None,
use_abs_lfc = True, min_abs_lfc = 1, max_abs_lfc = 25,
key_added = 'ora.c7',
gene_sets = 'bp',
identifier = 'entrez', # the bp database contains gene names as ENTREZ
opa_cutoff = 0.05,
)
[i] fetched diff `Neu` over `rest` (14078 genes)
[i] fetched 10675 genes differentially expressed.
[i] with a background of 18040 observed genes.
[15]:
fig = expm.rna.plot_ora_dotplot(
slot = 'ora.c7', max_fdr = 1, max_p = 0.05,
top_term = 10, terms = None, # draw all terms
colour = 'fdr', cmap = 'wyj', figsize = (6, 3), cutoff = 1, ptsize = 5,
# customizing the formatting rule of the y axis
formatter = lambda x: x.replace('GOBP_', '').replace('_', ' ').capitalize(),
title = 'GO Biological Process Enrichment (ORA)'
)
[i] retreived 10 terms for plotting.
Gene set enrichment analysis#
Using the log fold change of all genes between two groups, enrichment analysis can be performed via a rank-based scoring method.
[16]:
expm.rna.enrich_gsea(
taxa = 'mmu',
de_slot = 'deg.c7', group_name = None,
key_added = 'gsea.c7',
gene_sets = 'kegg',
identifier = 'entrez'
)
[i] fetched diff `Neu` over `rest` (14078 genes)
[i] fetched 14078 preranked genes by logfc.
2026-05-11 22:17:24,363 [WARNING] Duplicated values found in preranked stats: 0.01% of genes
The order of those genes will be arbitrary, which may produce unexpected results.
[17]:
expm.rna.get_gsea(slot = 'gsea.c7')
[17]:
| name | es | nes | p | fwerp | fdr | tag | |
|---|---|---|---|---|---|---|---|
| 11 | Taurine and hypotaurine metabolism | 0.689581 | 1.708751 | 0.045714 | 0.197 | 0.111763 | 4/6 |
| 10 | Taste transduction | 0.394177 | 1.501916 | 0.040816 | 0.493 | 0.120216 | 11/29 |
| 38 | Retinol metabolism | 0.452057 | 1.502266 | 0.027778 | 0.493 | 0.180089 | 9/21 |
| 17 | Virion - Human immunodeficiency virus | -0.855095 | -1.733915 | 0.009357 | 0.153 | 0.180526 | 6/7 |
| 2 | Virion - Flavivirus and Alphavirus | -0.809305 | -1.629260 | 0.020457 | 0.545 | 0.466613 | 6/7 |
| 0 | Alcoholism | -0.511439 | -1.477416 | 0.000000 | 0.988 | 0.569829 | 51/137 |
| 40 | Basal cell carcinoma | -0.595072 | -1.505842 | 0.022175 | 0.962 | 0.601898 | 10/29 |
| 3 | Cytoskeleton in muscle cells | -0.517264 | -1.481652 | 0.002008 | 0.985 | 0.603226 | 45/129 |
| 16 | Hedgehog signaling pathway | -0.586859 | -1.517746 | 0.014644 | 0.949 | 0.613821 | 9/35 |
| 35 | Breast cancer | -0.526433 | -1.489142 | 0.001004 | 0.978 | 0.630820 | 32/94 |
| 9 | DNA replication | -0.551276 | -1.431964 | 0.044421 | 1.000 | 0.641675 | 24/34 |
| 37 | Proteoglycans in cancer | -0.498521 | -1.435232 | 0.002002 | 1.000 | 0.671578 | 45/147 |
| 6 | Transcriptional misregulation in cancer | -0.499654 | -1.443439 | 0.003000 | 0.997 | 0.672297 | 34/140 |
| 23 | Biosynthesis of unsaturated fatty acids | -0.625432 | -1.524325 | 0.021459 | 0.937 | 0.680898 | 8/21 |
| 21 | ECM-receptor interaction | -0.550174 | -1.450612 | 0.025694 | 0.995 | 0.683717 | 28/43 |
| 7 | Cytokine-cytokine receptor interaction | -0.472292 | -1.365499 | 0.001001 | 1.000 | 0.687468 | 68/158 |
| 19 | Renal cell carcinoma | -0.497755 | -1.360844 | 0.044898 | 1.000 | 0.687874 | 9/61 |
| 24 | Virion - Ebolavirus, Lyssavirus and Morbillivirus | -0.700165 | -1.541392 | 0.038976 | 0.895 | 0.699155 | 6/12 |
| 41 | Focal adhesion | -0.480401 | -1.372642 | 0.009027 | 1.000 | 0.704115 | 30/140 |
| 5 | Hippo signaling pathway | -0.488351 | -1.378083 | 0.017051 | 1.000 | 0.704133 | 20/96 |
| 25 | Systemic lupus erythematosus | -0.553177 | -1.564574 | 0.000000 | 0.820 | 0.706464 | 56/103 |
| 32 | EGFR tyrosine kinase inhibitor resistance | -0.498124 | -1.366051 | 0.032587 | 1.000 | 0.713456 | 17/69 |
| 4 | PPAR signaling pathway | -0.516196 | -1.380970 | 0.041879 | 1.000 | 0.720114 | 17/48 |
| 26 | Ras signaling pathway | -0.478279 | -1.382024 | 0.004008 | 1.000 | 0.750822 | 35/162 |
| 8 | Gastric cancer | -0.494137 | -1.387209 | 0.009082 | 1.000 | 0.755987 | 29/91 |
| 39 | PI3K-Akt signaling pathway | -0.453286 | -1.331012 | 0.002000 | 1.000 | 0.763638 | 51/231 |
| 29 | Cell adhesion molecules | -0.472574 | -1.325629 | 0.029029 | 1.000 | 0.772963 | 52/100 |
| 13 | Rap1 signaling pathway | -0.461357 | -1.334247 | 0.008000 | 1.000 | 0.798483 | 33/161 |
| 33 | Pathways in cancer | -0.416376 | -1.237094 | 0.008000 | 1.000 | 0.800584 | 123/386 |
| 12 | Calcium signaling pathway | -0.448758 | -1.291423 | 0.018018 | 1.000 | 0.807508 | 58/153 |
| 14 | Ribosome | -0.444357 | -1.259575 | 0.043043 | 1.000 | 0.836399 | 105/128 |
| 28 | MAPK signaling pathway | -0.421247 | -1.228006 | 0.027000 | 1.000 | 0.839601 | 48/224 |
[19]:
fig = expm.rna.plot_gsea_dotplot(
slot = 'gsea.c7', max_fdr = 1, max_p = 0.05,
top_term = 10, terms = None, # draw all terms
colour = 'p', cmap = 'turbo', figsize = (6, 3), cutoff = 1, ptsize = 5,
# customizing the formatting rule of the y axis
formatter = lambda x: x,
title = 'KEGG Enrichment (GSEA)'
)
[i] retreived 10 terms for plotting.
[22]:
fig = expm.rna.plot_gsea_leading_edge(
slot = 'gsea.c7',
terms = 'Systemic lupus erythematosus',
figsize = (4, 4),
title = None,
)
Single-cell gene set scoring#
Single-cell scoring functions can be used to assess the enrichment level of specific gene sets in individual cells. score_genes is a scanpy-compatible version, while other algorithmic implementations include aucell, ulm, and gsva. These functions generate obs columns named score.{geneset}.
[27]:
expm.rna.score_genes(
taxa = 'mmu',
gene_sets = {
'neu': ['S100a8', 'S100a9', 'Mpo'],
},
identifier = 'gene', # can be 'gene', 'uppercase', 'entrez', and 'ugene'
lognorm = 'X',
random_state = 42,
)
[29]:
expm['rna'].obs[['score.neu']]
[29]:
| score.neu | |
|---|---|
| distal:2 | 3.103644 |
| distal:3 | 3.853550 |
| distal:4 | 4.015083 |
| distal:8 | 3.794824 |
| distal:9 | 3.864962 |
| ... | ... |
| normal:4657 | 0.530043 |
| normal:4658 | 4.420076 |
| normal:4660 | 4.017634 |
| normal:4661 | 4.039845 |
| normal:4662 | 4.271536 |
9754 rows × 1 columns
[31]:
fig = expm.rna.plot_embedding(
basis = 'umap', color = 'score.neu',
sort = True, figsize = (3, 3), dpi = 100, legend = False,
annotate_style = 'text', annotate_fontsize = 8, ptsize = 2
)
[12]:
expm.rna.score_ulm(
taxa = 'mmu',
gene_sets = {
'neu': ['S100a8', 'S100a9', 'Mpo'],
},
identifier = 'gene', # can be 'gene', 'uppercase', 'entrez', and 'ugene'
lognorm = 'X',
tmin = 0, # for small gene sets
)
[10]:
expm['rna'].obsm['score.ulm']
[10]:
| neu | |
|---|---|
| distal:2 | 14.599044 |
| distal:3 | 19.084107 |
| distal:4 | 21.918440 |
| distal:8 | 19.078472 |
| distal:9 | 20.233251 |
| ... | ... |
| normal:4657 | 4.259452 |
| normal:4658 | 22.427184 |
| normal:4660 | 19.939915 |
| normal:4661 | 20.791861 |
| normal:4662 | 20.143090 |
9754 rows × 1 columns
[14]:
print(expm)
annotated data of size 9754 × 19651
subset mono-neutro of size 9754 × 19651
contains modalities: rna
modality [rna]
obs : sample <cat> <c/sample> batch <cat> <c/batch> group <cat> <c> modality <cat> <c/modality>
taxa <cat> <c/taxa> barcode <o> <o> ubc <o> <o> n.umi <f64> <i> n.genes <i64> <i>
n.mito <f64> <f> n.ribo <f64> <f> pct.mito <f64> <f> pct.ribo <f64> <f>
filter <bool> <bool> score.doublet <f64> <f> score.doublet.se <f64> <f>
is.doublet <bool> <bool> qc <bool> <bool/qc> leiden <cat> <c> sc3.5 <cat> <c>
sc3.10 <cat> <c> sc3.20 <cat> <c> sc3.30 <cat> <c> cell.type <cat> <c>
kde.umap <f64> <f/kde> psbulk <cat> <o> score.neu <f64> <f/coordinate/score>
var : chr <cat> <c/chromosome> start <i64> <i> end <i64> <i> strand <cat> <c/strand> id <o> <o>
subtype <cat> <c/gsubtype> gene <cat> <o/gene> tlen <f64> <i/tlen> cdslen <i64> <i/cdslen>
assembly <cat> <c> uid <o> <o/ugene> vst.hvg <bool> <bool/hvg> vst.all.means <f64> <f>
vst.all.vars <f64> <f> vst.all.vars.norm <f64> <f> vst.all.hvg.rank <f32> <f>
vst.all.hvg <bool> <bool>
layers : counts <f32> <i/counts> norm <f32> <f>
obsm : cnmf.10 <df> <f/embedding/usage> harmony <arr:f32(35)> <f> knn <arr:i32(100)> <i/knni>
knn.d <arr:f32(100)> <f/knnd> pca <arr:f64(35)> <f/embedding/pca>
umap <arr:f32(2)> <f/embedding> score.ulm <df> <f/score-matrix>
padj.ulm <df> <f/score-pval>
varm : cnmf.10 <arr:f64(10)> <f/weights> cnmf.coef.10 <arr:f64(10)> <f/usage-coef>
pca <arr:f64(35)> <f/weights>
obsp : connectivities <csr:f32> <f/connectivity> distances <csr:f32> <f/distance>
uns : cell.type.colors <o> cell.type_colors <o> cnmf <cnmf> cnmf.args <o>
cnmf.density.10 <cnmf-density> cnmf.dist.10 <f/connectivity> cnmf.stats <cnmf-stats>
commands <system> kde.umap <kde-stats> leiden <o> leiden.colors <o> markers <markers>
neighbors <knn> pca <dict> sc3.10.colors <o> sc3.20.colors <o> sc3.30.colors <o>
sc3.5.colors <o> slots <system> umap <o> ulm <dict/scoring/score-ulm>
[*] samples not loaded from disk.
ULM scoring achieves well to produce scores in small gene set
[13]:
fig = expm.rna.plot_embedding(
basis = 'umap', color = 'score.neu',
sort = True, figsize = (3, 3), dpi = 100, legend = False,
annotate_style = 'text', annotate_fontsize = 8, ptsize = 2
)
GSVA is less robust on such small geneset.
[15]:
expm.rna.score_gsva(
taxa = 'mmu',
gene_sets = {
'neu': ['S100a8', 'S100a9', 'Mpo'],
},
identifier = 'gene', # can be 'gene', 'uppercase', 'entrez', and 'ugene'
lognorm = 'X',
tmin = 0, # for small gene sets
)
[16]:
fig = expm.rna.plot_embedding(
basis = 'umap', color = 'score.neu',
sort = True, figsize = (3, 3), dpi = 100, legend = False,
annotate_style = 'text', annotate_fontsize = 8, ptsize = 2
)
[17]:
expm.rna.score_aucell(
taxa = 'mmu',
gene_sets = {
'neu': ['S100a8', 'S100a9', 'Mpo'],
},
identifier = 'gene', # can be 'gene', 'uppercase', 'entrez', and 'ugene'
lognorm = 'X',
tmin = 0, # for small gene sets
)
[ ]:
fig = expm.rna.plot_embedding(
basis = 'umap', color = 'score.neu',
sort = True, figsize = (3, 3), dpi = 100, legend = False,
annotate_style = 'text', annotate_fontsize = 8, ptsize = 2
)
Differential gene expression between groups#
Data can be split based on two categorical variables to visualize group differences in gene expression.
[23]:
fig = expm.rna.plot_expression_bar(
gene = 'S100a8', slot = 'X', group = 'cell.type', split = 'sample',
selected_groups = None, selected_splits = ['niche', 'normal'], palette = ['red', 'black'],
figsize = (5, 2), dpi = 100, style = 'violin',
violin_kwargs = { 'split': True, 'inner': None }
)
[i] Neu, p = 0.000, D niche over normal
[i] MDP, p = 0.360, U niche over normal
[i] MM, p = 0.000, D niche over normal
[i] Mo, p = 0.113, U niche over normal
[i] DCp, p = 0.053, D niche over normal
[i] Prog, p = 0.726, U niche over normal
[i] iMac, p = 0.674, U niche over normal
[24]:
fig = expm.rna.plot_expression_bar(
gene = 'S100a8', slot = 'X', group = 'cell.type', split = 'sample',
selected_groups = None, selected_splits = ['niche', 'normal'], palette = ['red', 'black'],
figsize = (5, 2), dpi = 100, style = 'box',
violin_kwargs = { 'split': True, 'inner': None }
)
[i] Neu, p = 0.000, D niche over normal
[i] MDP, p = 0.360, U niche over normal
[i] MM, p = 0.000, D niche over normal
[i] Mo, p = 0.113, U niche over normal
[i] DCp, p = 0.053, D niche over normal
[i] Prog, p = 0.726, U niche over normal
[i] iMac, p = 0.674, U niche over normal
Saving the dataset#
Finally, save the changes we made.
[25]:
print(expm)
annotated data of size 9754 × 19651
subset mono-neutro of size 9754 × 19651
contains modalities: rna
modality [rna]
obs : sample <cat> <c/sample> batch <cat> <c/batch> group <cat> <c> modality <cat> <c/modality>
taxa <cat> <c/taxa> barcode <o> <o> ubc <o> <o> n.umi <f64> <i> n.genes <i64> <i>
n.mito <f64> <f> n.ribo <f64> <f> pct.mito <f64> <f> pct.ribo <f64> <f>
filter <bool> <bool> score.doublet <f64> <f> score.doublet.se <f64> <f>
is.doublet <bool> <bool> qc <bool> <bool/qc> leiden <cat> <c> sc3.5 <cat> <c>
sc3.10 <cat> <c> sc3.20 <cat> <c> sc3.30 <cat> <c> cell.type <cat> <c>
kde.umap <f64> <f/kde> psbulk <cat>
var : chr <cat> <c/chromosome> start <i64> <i> end <i64> <i> strand <cat> <c/strand> id <o> <o>
subtype <cat> <c/gsubtype> gene <cat> <o/gene> tlen <f64> <i/tlen> cdslen <i64> <i/cdslen>
assembly <cat> <c> uid <o> <o/ugene> vst.hvg <bool> <bool/hvg> vst.all.means <f64> <f>
vst.all.vars <f64> <f> vst.all.vars.norm <f64> <f> vst.all.hvg.rank <f32> <f>
vst.all.hvg <bool> <bool>
layers : counts <f32> <i/counts> norm <f32> <f>
obsm : cnmf.10 <df> <f/embedding/usage> harmony <arr:f32(35)> <f> knn <arr:i32(100)> <i/knni>
knn.d <arr:f32(100)> <f/knnd> pca <arr:f64(35)> <f/embedding/pca>
umap <arr:f32(2)> <f/embedding>
varm : cnmf.10 <arr:f64(10)> <f/weights> cnmf.coef.10 <arr:f64(10)> <f/usage-coef>
pca <arr:f64(35)> <f/weights>
obsp : connectivities <csr:f32> <f/connectivity> distances <csr:f32> <f/distance>
uns : cell.type.colors cell.type_colors cnmf <cnmf> cnmf.args <o>
cnmf.density.10 <cnmf-density> cnmf.dist.10 <f/connectivity> cnmf.stats <cnmf-stats>
commands <system> kde.umap <kde-stats> leiden <o> leiden.colors <o> markers <markers>
neighbors <knn> pca <dict> sc3.10.colors <o> sc3.20.colors <o> sc3.30.colors <o>
sc3.5.colors <o> slots <system> umap <o> deg.c7 <markers> ora.c7 <ora> gsea.c7 <gsea>
[*] samples not loaded from disk.
[26]:
em.memory()
[i] resident memory: 1.94 GiB
[i] virtual memory: 18.14 GiB