Multiplex IF and CODEX
This notebook demonstrates the loading, visualization, and analysis of multiplex immunofluorescence (mIF) data, with a focus on CODEX (CO-Detection by indEXing) imaging. CODEX enables simultaneous detection of dozens of protein markers on a single tissue section through cyclic staining and imaging, providing spatially resolved proteomic data at single-cell resolution. We cover the complete workflow from reading multi-channel TIFF images, exploring attached image data, and inspecting the metadata, to preparing for downstream cell segmentation and quantification steps.
[1]:
%load_ext autoreload
%autoreload 2
[2]:
import exprmat as em
ver = em.version()
[i] exprmat 0.2.66 / exprmat-db 0.2.66
[i] os: posix (linux) platform version: 6.8.0-90-generic
[i] loaded configuration from /home/data/yangz/.exprmatrc
[i] current working directory: /home/data/yangz/packages/exprmat/docs/source/spatial
[i] current database directory: /home/data/yangz/packages/database (0.2.66)
[i] resident memory: 776.41 MiB
[i] virtual memory: 5.95 GiB
[3]:
em.setwd('../../../data')
[4]:
meta = em.metadata(
locations = [
'codex/codex/A-10.tif',
'codex/codex/A-11.tif',
],
modality = ['mif'] * 2,
default_taxa = ['hsa'] * 2,
names = ['a-10', 'a-11'],
batches = ['b1', 'b2'],
groups = ['-', '-'],
)
[6]:
meta.dataframe
[6]:
| | location | sample | batch | group | modality | taxa |
|---|---|---|---|---|---|---|
| 0 | codex/codex/A-10.tif | a-10 | b1 | - | mif | hsa |
| 1 | codex/codex/A-11.tif | a-11 | b2 | - | mif | hsa |
[7]:
! eza --group-directories-first --tree --long codex
drwxrwxr-x - yangz 4 May 10:24 codex
drwxrwxr-x - yangz 4 May 12:17 ├── codex
.rw-r--r-- 1.6G yangz 4 May 10:23 │ ├── A-10.tif
.rw-r--r-- 1.6G yangz 4 May 10:23 │ ├── A-11.tif
.rw-r--r-- 388 yangz 4 May 12:17 │ └── channelnames.txt
drwxrwxr-x - yangz 4 May 10:23 └── he
.rw-rw-r-- 362M yangz 4 May 10:23 ├── A-10.tif
.rw-rw-r-- 362M yangz 4 May 10:23 └── A-11.tif
As the listing shows, for CODEX data we recommend placing a channelnames.txt file in the same directory as the images. The names it contains are then assigned to the image channels automatically.
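The naming convention can be sketched roughly as follows. This is only an illustration, not exprmat's actual loader: the helper `read_channel_names` is hypothetical, and it assumes that channelnames.txt holds one channel name per line, in the same order as the image channels.

```python
from pathlib import Path
import tempfile


def read_channel_names(image_dir, n_channels=None):
    """Read channel names from <image_dir>/channelnames.txt.

    Hypothetical helper: assumes one name per line, in channel order.
    """
    path = Path(image_dir) / "channelnames.txt"
    # Drop blank lines and surrounding whitespace.
    names = [line.strip() for line in path.read_text().splitlines() if line.strip()]
    # Optionally check that the file matches the number of image channels.
    if n_channels is not None and len(names) != n_channels:
        raise ValueError(
            f"expected {n_channels} channel names, found {len(names)} in {path}"
        )
    return names


# Demonstrate on a throwaway directory with three channels.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "channelnames.txt").write_text("dapi\ncd45\ncd11c\n")
    demo_names = read_channel_names(tmp, n_channels=3)
```

Checking the number of names against the channel count is a cheap way to catch a file that has drifted out of sync with the images.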
[8]:
expm = em.experiment(meta, dump = 'expm/codex')
[i] reading sample a-10 [mif] ...
[!] no ome metadata found; falling back to tifffile series.
[i] reading sample a-11 [mif] ...
[!] no ome metadata found; falling back to tifffile series.
A multiplex immunofluorescence dataset loaded from images is initialized, by default, as an empty spatial-cell modality. The observation table is filled only after cell segmentation is complete. Until then, you work on the raw image data in order to segment the cells properly.
[9]:
print(expm)
[!] dataset not integrated.
[*] composed of samples:
a-10 spatial-cell hsa batch b1 of size 0 × 59
a-11 spatial-cell hsa batch b2 of size 0 × 59
For spatial modalities, the accessor’s summary() function provides a quick way to inspect the attached images. These images are later used for plotting and segmentation, and can be manipulated with the image tools.
[11]:
expm.spatial_cell.summary(run_on_samples = True)
a-10
└── origin of shape 3727 ✗ 3726 ✗ 59
[ 0] dapi [ 1] cd45 [ 2] cd11c [ 3] bcl2 [ 4] cd90
[ 5] foxp3 [ 6] egfr [ 7] p16 [ 8] pd1 [ 9] cd206
[10] cd45ro [11] il10 [12] cd56 [13] cd11b [14] cd31
[15] cd163 [16] cd21 [17] cd8 [18] pnad [19] cd20
[20] cxcr5 [21] ki67 [22] lag3 [23] cd73 [24] cd16
[25] asma [26] icos [27] cd25 [28] coliv [29] pdgfrb
[30] cd4 [31] cd68 [32] cd34 [33] vimentin [34] podoplanin
[35] hladr [36] cxcl12 [37] cd3 [38] fap [39] cd138
[40] tbet [41] periostin [42] spp1 [43] s100a8a9 [44] clec9a
[45] cd45ra [46] caix [47] gzmb [48] bcat [49] sox2
[50] pdl1 [51] mmp9 [52] tcrgd [53] cd38 [54] cd69
[55] cd15 [56] ido1 [57] mct1 [58] panck
a-11
└── origin of shape 3733 ✗ 3732 ✗ 59
[ 0] dapi [ 1] cd45 [ 2] cd11c [ 3] bcl2 [ 4] cd90
[ 5] foxp3 [ 6] egfr [ 7] p16 [ 8] pd1 [ 9] cd206
[10] cd45ro [11] il10 [12] cd56 [13] cd11b [14] cd31
[15] cd163 [16] cd21 [17] cd8 [18] pnad [19] cd20
[20] cxcr5 [21] ki67 [22] lag3 [23] cd73 [24] cd16
[25] asma [26] icos [27] cd25 [28] coliv [29] pdgfrb
[30] cd4 [31] cd68 [32] cd34 [33] vimentin [34] podoplanin
[35] hladr [36] cxcl12 [37] cd3 [38] fap [39] cd138
[40] tbet [41] periostin [42] spp1 [43] s100a8a9 [44] clec9a
[45] cd45ra [46] caix [47] gzmb [48] bcat [49] sox2
[50] pdl1 [51] mmp9 [52] tcrgd [53] cd38 [54] cd69
[55] cd15 [56] ido1 [57] mct1 [58] panck
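When working with the raw channels, it is often handy to look them up by marker name rather than by position. The snippet below is a small standalone sketch of that idea, not an exprmat API: `select_channels` is a hypothetical helper, and the list contains only the first five channel names copied from the summary above.

```python
# First five channel names, copied from the summary output.
channel_names = ["dapi", "cd45", "cd11c", "bcl2", "cd90"]

# Map each marker name to its channel index.
channel_index = {name: i for i, name in enumerate(channel_names)}


def select_channels(markers, index=channel_index):
    """Return channel indices for the requested markers, in the given order."""
    missing = [m for m in markers if m not in index]
    if missing:
        raise KeyError(f"unknown markers: {missing}")
    return [index[m] for m in markers]


selected = select_channels(["dapi", "cd45"])
print(selected)  # [0, 1]
```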
[12]:
expm.save()
[i] saving individual samples. (pass `save_samples = False` to skip)
━━━━━━━━━━━━━━━━━━━━━━━ modality [spatial-cell] 2 / 2 (00:00 < 00:00)
For mIF datasets, the next step is to manipulate the images from the loaded dump and perform segmentation to fill the actual cell matrix. Even when the data is already segmented, you can rerun segmentation at any time, provided the raw high-resolution image is available.
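To make "filling the cell matrix" concrete, here is a minimal sketch of the quantification step. It assumes that a cell's value for a marker is the mean pixel intensity inside that cell's segmentation mask, which is a common convention but not necessarily what exprmat does. The helper `mean_intensity_per_cell` is hypothetical, is written in plain Python to stay self-contained, and runs on made-up toy values; real images would normally be handled with array libraries.

```python
from collections import defaultdict


def mean_intensity_per_cell(image, labels):
    """Average each channel over the pixels of every labelled cell.

    image:  rows x cols x channels nested lists of intensities
    labels: rows x cols nested lists; 0 is background, k > 0 is cell k
    Returns {cell_id: [mean of channel 0, mean of channel 1, ...]}.
    """
    sums = {}
    counts = defaultdict(int)
    for image_row, label_row in zip(image, labels):
        for pixel, cell in zip(image_row, label_row):
            if cell == 0:
                continue  # skip background pixels
            if cell not in sums:
                sums[cell] = [0.0] * len(pixel)
            for c, value in enumerate(pixel):
                sums[cell][c] += value
            counts[cell] += 1
    return {cell: [s / counts[cell] for s in total] for cell, total in sums.items()}


# Toy 2 x 3 image with two channels and two cells (values are invented).
toy_image = [
    [[10, 0], [20, 0], [0, 5]],
    [[30, 0], [0, 0], [0, 15]],
]
toy_labels = [
    [1, 1, 2],
    [1, 0, 2],
]
cell_matrix = mean_intensity_per_cell(toy_image, toy_labels)
```

Each key of `cell_matrix` corresponds to one row of the observation table (a cell), and each list entry to one column (a marker), which is why the samples above report a size of 0 × 59 before any segmentation has been run.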