Ligand-receptor interaction analysis#
Several computational tools have emerged for inferring cell-cell communication from single-cell transcriptomics. These tools can be divided into two categories: those that only predict cell-cell communication interactions, commonly referred to as ligand-receptor inference methods, and those that additionally estimate intracellular activity induced by cell-cell communication. Both types of tools use gene expression information as a proxy for protein abundance and typically require clustering cells into biologically meaningful groups. These cell-cell communication tools infer interactions between cell groups, where one group serves as the source of the communication event and the other as the receiver. Thus, cell-cell communication events are typically represented as interactions between proteins from the source and receiver cell groups.
Information about interacting proteins is generally extracted from prior knowledge resources. In ligand-receptor methods, interactions can also be represented through heteromeric protein complexes, as different subunit combinations can induce different responses, and incorporating protein complex information has been shown to reduce false positive rates. On the other hand, methods that model intracellular signaling also utilize functional information from the receiver cell type, and therefore require additional data such as intracellular protein-protein interaction networks and/or gene regulatory interactions.
[1]:
%load_ext autoreload
%autoreload 2
[2]:
import exprmat as em
# set working directory
em.setwd('../../../data')
ver = em.version()
[i] exprmat 0.2.66 / exprmat-db 0.2.66
[i] os: posix (linux) platform version: 6.8.0-90-generic
[i] loaded configuration from /home/data/yangz/.exprmatrc
[i] current working directory: /home/data/yangz/packages/exprmat/data
[i] current database directory: /home/data/yangz/packages/database (0.2.66)
[i] resident memory: 775.17 MiB
[i] virtual memory: 5.95 GiB
[3]:
expm = em.load_experiment('expm/scrna', load_samples = False, load_subset = 'mono-neutro')
[!] samples are not dumped in the experiment directory.
[4]:
print(expm)
annotated data of size 9754 × 19651
subset mono-neutro of size 9754 × 19651
contains modalities: rna
modality [rna]
obs : sample <cat> <c/sample> batch <cat> <c/batch> group <cat> <c> modality <cat> <c/modality>
taxa <cat> <c/taxa> barcode <o> <o> ubc <o> <o> n.umi <f64> <i> n.genes <i64> <i>
n.mito <f64> <f> n.ribo <f64> <f> pct.mito <f64> <f> pct.ribo <f64> <f>
filter <bool> <bool> score.doublet <f64> <f> score.doublet.se <f64> <f>
is.doublet <bool> <bool> qc <bool> <bool/qc> leiden <cat> <c> sc3.5 <cat> <c>
sc3.10 <cat> <c> sc3.20 <cat> <c> sc3.30 <cat> <c> cell.type <cat> <c>
kde.umap <f64> <f/kde> psbulk <cat>
var : chr <cat> <c/chromosome> start <i64> <i> end <i64> <i> strand <cat> <c/strand> id <o> <o>
subtype <cat> <c/gsubtype> gene <cat> <o/gene> tlen <f64> <i/tlen> cdslen <i64> <i/cdslen>
assembly <cat> <c> uid <o> <o/ugene> vst.hvg <bool> <bool/hvg> vst.all.means <f64> <f>
vst.all.vars <f64> <f> vst.all.vars.norm <f64> <f> vst.all.hvg.rank <f32> <f>
vst.all.hvg <bool> <bool>
layers : counts <f32> <i/counts> norm <f32> <f>
obsm : cnmf.10 <df> <f/embedding/usage> harmony <arr:f32(35)> <f> knn <arr:i32(100)> <i/knni>
knn.d <arr:f32(100)> <f/knnd> pca <arr:f64(35)> <f/embedding/pca>
umap <arr:f32(2)> <f/embedding>
varm : cnmf.10 <arr:f64(10)> <f/weights> cnmf.coef.10 <arr:f64(10)> <f/usage-coef>
pca <arr:f64(35)> <f/weights>
obsp : connectivities <csr:f32> <f/connectivity> distances <csr:f32> <f/distance>
uns : cell.type.colors cell.type_colors cnmf <cnmf> cnmf.args <o>
cnmf.density.10 <cnmf-density> cnmf.dist.10 <f/connectivity> cnmf.stats <cnmf-stats>
commands <system> kde.umap <kde-stats> leiden <o> leiden.colors <o> markers <markers>
neighbors <knn> pca <dict> sc3.10.colors <o> sc3.20.colors <o> sc3.30.colors <o>
sc3.5.colors <o> slots <system> umap <o>
[*] samples not loaded from disk.
[6]:
fig = expm.rna.plot_embedding(
basis = 'umap', color = 'cell.type', annotate_style = 'text', legend = False,
figsize = (3, 3), dpi = 100, ptsize = 4, contour_plot = False, slot = 'magic'
)
Ligand-receptor communication analysis#
For human and mouse, comprehensive database resources are already available for ligand-receptor analysis.
[7]:
expm.rna.ligand_receptor(
flavor = 'ra',
taxa_source = 'mmu',
taxa_dest = 'mmu',
gene_symbol = None,
groupby = 'cell.type',
use_raw = False,
min_cells = 5,
expr_prop = 0.1,
n_perms = 500,
seed = 42,
de_method = 't-test',
resource_name = 'consensus',
verbose = True,
key_added = 'lr',
n_jobs = 50
)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 500 / 500 (00:17 < 00:00)
[i] running cellphonedb
━━━━━━━━━━━━━━━━━━━━━━━━━━ 500 / 500 (00:01 < 00:00)
[i] running connectome
[i] running logfc
[i] running natmi
[i] running scseqcomm
[i] running singlecellsignalr
[8]:
expm.rna.get_lr(
source_labels = 'Neu',
target_labels = 'MDP',
filter_fun = lambda x: (
(x['cellchat.p'] < 0.05) &
(x['lr.means'] > 0.5) &
(x['magnitude'] >= 0.05)
),
orderby = 'magnitude'
)[[
'source', 'ligand.complex', 'target', 'receptor.complex',
'lr.probs', 'lr.means', 'cellchat.p', 'specificity', 'magnitude'
]]
[8]:
| source | ligand.complex | target | receptor.complex | lr.probs | lr.means | cellchat.p | specificity | magnitude | |
|---|---|---|---|---|---|---|---|---|---|
| 7744 | Neu | Timp2 | MDP | Itgb1 | 0.024075 | 0.941626 | 0.000 | 0.055215 | 0.053848 |
| 7570 | Neu | C3 | MDP | Lrp1 | 0.020522 | 0.906975 | 0.000 | 0.076983 | 0.069470 |
| 7549 | Neu | App | MDP | Lrp10 | 0.018705 | 0.896630 | 0.000 | 0.154539 | 0.075968 |
| 7533 | Neu | Alcam | MDP | Nrp1 | 0.016223 | 0.892363 | 0.000 | 0.096843 | 0.078477 |
| 7545 | Neu | App | MDP | Aplp2 | 0.017906 | 0.890198 | 0.000 | 0.154539 | 0.079706 |
| 7745 | Neu | Tln1 | MDP | Itgb5 | 0.008847 | 0.956893 | 0.000 | 0.154539 | 0.084781 |
| 7561 | Neu | Arpc5 | MDP | Ldlr | 0.008132 | 1.054119 | 0.000 | 0.154539 | 0.102947 |
| 7618 | Neu | H2-K1 | MDP | Aplp2 | 0.016434 | 0.857278 | 0.000 | 1.000000 | 0.103778 |
| 7555 | Neu | App | MDP | Tnfrsf21 | 0.015277 | 0.852304 | 0.000 | 0.304696 | 0.107374 |
| 7551 | Neu | App | MDP | Ncstn | 0.013254 | 0.809421 | 0.000 | 0.154539 | 0.147716 |
| 7735 | Neu | Thbs1 | MDP | Cd47 | 0.006768 | 0.993382 | 0.000 | 0.362101 | 0.151609 |
| 7511 | Neu | Actr2 | MDP | Ldlr | 0.006697 | 0.901519 | 0.000 | 0.154539 | 0.154032 |
| 7576 | Neu | Camp | MDP | P2rx7 | 0.010242 | 1.554048 | 0.000 | 0.154539 | 0.162357 |
| 7740 | Neu | Thbs1 | MDP | Ptprj | 0.006284 | 0.937751 | 0.000 | 0.270331 | 0.179992 |
| 7515 | Neu | Adam10 | MDP | Il6ra | 0.006512 | 0.743551 | 0.006 | 0.198188 | 0.227656 |
| 7535 | Neu | Anxa2 | MDP | Tlr2 | 0.008042 | 1.133962 | 0.000 | 0.154539 | 0.232450 |
| 7660 | Neu | Ltb | MDP | Tnfrsf1a | 0.013051 | 0.708727 | 0.000 | 0.154539 | 0.290506 |
| 7639 | Neu | Inhba | MDP | Actr2 | 0.004787 | 0.926168 | 0.000 | 0.207423 | 0.303735 |
| 7571 | Neu | C3 | MDP | Nrp1 | 0.011390 | 0.701406 | 0.000 | 0.154539 | 0.304305 |
| 7738 | Neu | Thbs1 | MDP | Lrp1 | 0.004713 | 0.763979 | 0.000 | 0.055215 | 0.311579 |
| 7736 | Neu | Thbs1 | MDP | Itga4 | 0.004659 | 0.756301 | 0.000 | 0.320136 | 0.320231 |
| 7573 | Neu | Calr | MDP | Lrp1 | 0.004574 | 0.697135 | 0.000 | 1.000000 | 0.331443 |
| 7590 | Neu | Ceacam1 | MDP | Havcr2 | 0.002897 | 0.510467 | 0.000 | 0.154539 | 0.337120 |
| 7617 | Neu | Grn | MDP | Tnfrsf1b | 0.011414 | 0.678940 | 0.000 | 1.000000 | 0.357966 |
| 7627 | Neu | Hsp90b1 | MDP | Lrp1 | 0.004263 | 0.701827 | 0.000 | 1.000000 | 0.368501 |
| 7529 | Neu | Adam17 | MDP | Itgb1 | 0.002927 | 0.642050 | 0.000 | 0.382621 | 0.436023 |
| 7711 | Neu | Sema4a | MDP | Plxnb2 | 0.009535 | 0.643411 | 0.000 | 0.055215 | 0.449164 |
| 7520 | Neu | Adam10 | MDP | Tspan14 | 0.003671 | 0.520230 | 0.000 | 0.386216 | 0.466068 |
| 7527 | Neu | Adam17 | MDP | Il6ra | 0.003121 | 0.663994 | 0.000 | 0.263618 | 0.472090 |
| 7565 | Neu | C3 | MDP | C3ar1 | 0.005675 | 0.628210 | 0.000 | 0.059432 | 0.495488 |
| 7588 | Neu | Cd177 | MDP | Pecam1 | 0.003012 | 0.755761 | 0.000 | 0.186533 | 0.501230 |
| 7712 | Neu | Sema4a | MDP | Plxnd1 | 0.005959 | 0.561873 | 0.000 | 0.055215 | 0.514707 |
| 7634 | Neu | Hspa8 | MDP | Ldlr | 0.003245 | 0.562931 | 0.000 | 1.000000 | 0.547963 |
| 7550 | Neu | App | MDP | Lrp6 | 0.004149 | 0.727385 | 0.000 | 0.193251 | 0.552410 |
| 7538 | Neu | Apoe | MDP | Lrp1 | 0.003123 | 0.711129 | 0.000 | 1.000000 | 0.576276 |
| 7764 | Neu | Vcl | MDP | Itgb5 | 0.003792 | 0.560963 | 0.000 | 0.154539 | 0.610284 |
| 7681 | Neu | Pi16 | MDP | Tnfrsf21 | 0.007785 | 0.583856 | 0.000 | 0.164999 | 0.635763 |
| 7611 | Neu | Gnas | MDP | Adcy7 | 0.008156 | 0.581460 | 0.000 | 1.000000 | 0.643255 |
| 7635 | Neu | Il16 | MDP | Ccr5 | 0.003853 | 0.553870 | 0.000 | 0.066918 | 0.647193 |
| 7715 | Neu | Sema4d | MDP | Plxnb2 | 0.007355 | 0.554517 | 0.000 | 0.154539 | 0.667702 |
| 7562 | Neu | B2m | MDP | Hfe | 0.006598 | 1.166142 | 0.000 | 1.000000 | 0.672097 |
| 7593 | Neu | Copa | MDP | Cd74 | 0.002609 | 0.720646 | 0.000 | 0.878713 | 0.693046 |
| 7596 | Neu | Entpd1 | MDP | Adora2b | 0.002240 | 0.628008 | 0.000 | 0.063459 | 1.000000 |
[9]:
fig = expm.rna.plot_lr_heatmap(
lr_key = 'lr', groupby = 'cell.type',
figure_size = (4.5, 4)
)
[12]:
table = expm.rna.get_lr(
source_labels = None,
target_labels = 'MDP',
filter_fun = lambda x: (
(x['cellchat.p'] < 0.05) &
(x['lr.means'] > 0.5) &
(x['magnitude'] >= 0.05)
),
orderby = 'magnitude'
)[[
'source', 'ligand.complex', 'target', 'receptor.complex',
'lr.probs', 'lr.means', 'cellchat.p', 'specificity', 'magnitude'
]]
# sankey plots on custom adata object is not exported. a hack here.
from exprmat.reader.rna import rna_plot_sankey
table.index = table.index.astype(str)
fig = rna_plot_sankey(em.ad.AnnData(obs = table), None, None, obs1 = 'source', obs2 = 'target')
[22]:
fig = expm.rna.plot_lr_dotplot(
source_labels = ['Mo', 'Neu', 'MDP', 'iMac'],
target_labels = ['Mo', 'Neu'],
filter_fun = lambda x: (
(x['cellchat.p'] < 0.05) &
(x['lr.means'] > 0.5) &
(x['magnitude'] >= 0.7)
),
colour = 'magnitude',
size = 'lr.means',
lr_key = 'lr',
orderby = 'lr.means',
figure_size = (3, 2.8),
size_ratio = 3,
cmap = 'reds/r',
# paired from source_labels, dest_labels
paired = False
)
Using databases from other species#
For other species, we may wish to transfer existing databases using homologous genes. For example, here we use the human database and map it to mouse (although this is unnecessary and introduces additional interpretability challenges).
[23]:
expm.rna.ligand_receptor(
flavor = 'ra',
# use human database
taxa_source = 'hsa',
taxa_dest = 'mmu',
gene_symbol = None,
groupby = 'cell.type',
use_raw = False,
min_cells = 5,
expr_prop = 0.1,
n_perms = 500,
seed = 42,
de_method = 't-test',
# select a database that is only available to humans
resource_name = 'cellphonedb',
verbose = True,
key_added = 'lr.hsa',
n_jobs = 50
)
━━━━━━━━━━━━━━━━━━━━━━━━━━ 500 / 500 (00:05 < 00:00)
[i] running cellphonedb
━━━━━━━━━━━━━━━━━━━━━━━━━━ 500 / 500 (00:00 < 00:00)
[i] running connectome
[i] running logfc
[i] running natmi
[i] running scseqcomm
[i] running singlecellsignalr
[25]:
expm.rna.get_lr(
source_labels = 'Neu',
target_labels = 'MDP',
filter_fun = lambda x: (
(x['cellchat.p'] < 0.05) &
(x['lr.means'] > 0.5) &
(x['magnitude'] >= 0.05)
),
orderby = 'magnitude',
slot = 'lr.hsa'
)[[
'source', 'ligand.complex', 'target', 'receptor.complex',
'lr.probs', 'lr.means', 'cellchat.p', 'specificity', 'magnitude'
]]
[25]:
| source | ligand.complex | target | receptor.complex | lr.probs | lr.means | cellchat.p | specificity | magnitude | |
|---|---|---|---|---|---|---|---|---|---|
| 730 | Neu | Grn | MDP | Tnfrsf1b | 0.011414 | 0.678940 | 0.0 | 1.000000 | 0.392257 |
| 740 | Neu | Ltb | MDP | Ltbr | 0.006450 | 0.557579 | 0.0 | 0.154963 | 0.433573 |
| 727 | Neu | Entpd1 | MDP | Adora2b | 0.002240 | 0.628008 | 0.0 | 0.052947 | 1.000000 |
[ ]:
expm.save()
Intracellular receptor signaling activity inference#
NicheNet can prioritize the inference of intrinsic receptor activity by incorporating intracellular expression regulatory networks of target signals, regardless of whether the ligand is captured in the current dataset. We first need to identify differentially expressed genes between two cell groups, and then perform inference based on these DE genes.
For example, if we want to understand which signaling pathway activation is most likely responsible for the genes that change during neutrophil maturation from myeloid cells, we obtain the upregulated DE genes in Neu relative to MM.
[26]:
expm.rna.markers(
groupby = 'cell.type',
mask_var = None,
groups = ['Neu'],
reference = 'MM',
n_genes = None, rankby_abs = False, pts = True,
key_added = 'deg',
method = 't-test',
corr_method = 'benjamini-hochberg',
tie_correct = False,
gene_symbol = 'gene',
layer = 'X'
)
[28]:
upreg = expm.rna.get_markers(
slot = 'deg',
min_pct = 0.0,
max_pct_reference = 0.75,
max_q = 0.05,
min_lfc = 0.5, max_lfc = 25,
remove_zero_pval = False
)
upreg
[i] fetched diff `Neu` over `MM` (2671 genes)
[28]:
| names | scores | lfc | p | q | gene | pct | log10.p | log10.q | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | rna:mmu:g23254 | 88.062660 | 5.080193 | 0.000000 | 0.000000 | Retnlg | 0.981360 | 300.000000 | 300.000000 |
| 1 | rna:mmu:g1350 | 80.203453 | 3.973848 | 0.000000 | 0.000000 | Cxcr2 | 0.917238 | 300.000000 | 300.000000 |
| 2 | rna:mmu:g9641 | 78.552460 | 3.968456 | 0.000000 | 0.000000 | Slfn1 | 0.893976 | 300.000000 | 300.000000 |
| 3 | rna:mmu:g39854 | 78.243988 | 4.206313 | 0.000000 | 0.000000 | Hacd4 | 0.793170 | 300.000000 | 300.000000 |
| 4 | rna:mmu:g31928 | 77.418846 | 4.272014 | 0.000000 | 0.000000 | Dhrs9 | 0.780644 | 300.000000 | 300.000000 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3160 | rna:mmu:g53172 | 2.240569 | 0.755317 | 0.025224 | 0.048289 | Trim3 | 0.020131 | 1.598182 | 1.316156 |
| 3162 | rna:mmu:g4427 | 2.234726 | 0.842012 | 0.025622 | 0.049002 | Adgrg6 | 0.019535 | 1.591385 | 1.309782 |
| 3164 | rna:mmu:g24653 | 2.231622 | 18.894960 | 0.025673 | 0.049095 | Zfp13 | 0.000746 | 1.590526 | 1.308965 |
| 3166 | rna:mmu:g14160 | 2.227726 | 1.291074 | 0.026020 | 0.049725 | Gm9745 | 0.005965 | 1.584690 | 1.303425 |
| 3167 | rna:mmu:g36406 | 2.225970 | 19.748264 | 0.026049 | 0.049776 | Tigd4 | 0.000746 | 1.584204 | 1.302981 |
2671 rows × 9 columns
[29]:
expm.rna.nichenet(
taxa = 'mmu',
receiver = ['Neu'],
sender = None,
identity = 'cell.type',
foreground = upreg['gene'].tolist(),
background = None,
expression_threshold = 0.001,
ncpus = 20,
key_added = 'nichenet'
)
[i] background gene set constructed with size 15993
[i] using user-specified foreground gene set of size 2671
[i] using 1200(all) potential ligands
[i] using 581 expressed receptors in receivers
━━━━━━━━━━━━━━━━━━ predicting ligand activities 1200 / 1200 (09:56 < 00:00)
[30]:
expm['rna'].uns['nichenet']['activities']
[30]:
| ligand | auroc | aupr | aupr.adj | pearson | |
|---|---|---|---|---|---|
| 941 | Csf2 | 0.595752 | 0.214262 | 0.078202 | 0.137546 |
| 234 | Clec4g | 0.596799 | 0.212559 | 0.076500 | 0.130687 |
| 405 | Selp | 0.599250 | 0.209284 | 0.073225 | 0.128783 |
| 788 | Ifng | 0.602421 | 0.208928 | 0.072869 | 0.126478 |
| 431 | Il4 | 0.605878 | 0.207946 | 0.071887 | 0.126003 |
| ... | ... | ... | ... | ... | ... |
| 943 | Cntn5 | 0.534744 | 0.151258 | 0.015199 | 0.038510 |
| 226 | Cd6 | 0.533018 | 0.150995 | 0.014936 | 0.037346 |
| 336 | Nptx2 | 0.527072 | 0.147502 | 0.011442 | 0.030330 |
| 931 | Elfn1 | 0.513617 | 0.139674 | 0.003615 | 0.009316 |
| 681 | Cklf | 0.502508 | 0.136178 | 0.000118 | 0.001379 |
1200 rows × 5 columns
[31]:
expm.rna.nichenet_infer_targets(
key_nichenet = 'nichenet',
ligands = 10,
n_targets = 200
)
[32]:
fig = expm.rna.plot_nichenet_ligands(
key_nichenet = 'nichenet',
figsize = (10, 2.2),
width_ratios = [0.2, 8]
)