We introduce the SynthMT dataset to study the readiness of automated microtubule analysis with state-of-the-art foundation models.
Studying microtubules (MTs) and their mechanical properties is central to understanding intracellular
transport, cell division, and drug action, yet experts still spend many hours manually segmenting these
filamentous structures.
The suitability of state-of-the-art models for this task cannot be systematically assessed, because
large-scale labeled datasets are missing.
We address this gap by presenting the synthetic dataset SynthMT, produced by tuning a novel image
generation pipeline on unlabeled, real-world interference reflection microscopy (IRM) frames of
in vitro reconstituted microtubules.
In our benchmark, we systematically test nine models in both zero-shot and hyperparameter optimization
(HPO)-based few-shot settings.
In both settings, classical methods and current foundation models still struggle to reach the accuracy
required for downstream biological analysis on in vitro MT IRM images that look visually simple to humans.
However, a notable exception is the recently introduced SAM3 model.
After HPO on only ten random SynthMT images, its text-prompt version SAM3Text
achieves near-perfect and in some cases super-human performance on unseen real data.
This result indicates that fully automated MT segmentation has become feasible when model
configuration is effectively guided through synthetic data.
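
The few-shot HPO idea can be illustrated with a minimal sketch. Everything here is a toy assumption: the images are generated on the fly, `segment` is a hypothetical one-parameter thresholding model, and plain random search stands in for whatever optimizer the benchmark actually uses. It only shows the pattern of tuning a configuration on ten labeled synthetic images and then checking it on unseen ones.

```python
import random

def make_synthetic_pair(rng, size=16):
    """Toy stand-in for a labeled synthetic sample: one dark horizontal filament plus noise."""
    row = rng.randrange(size)
    mask = [[1 if r == row else 0 for _ in range(size)] for r in range(size)]
    image = [[(0.3 if mask[r][c] else 0.7) + rng.gauss(0, 0.05) for c in range(size)]
             for r in range(size)]
    return image, mask

def segment(image, threshold):
    """Hypothetical model with a single hyperparameter: darker than `threshold` is foreground."""
    return [[1 if px < threshold else 0 for px in row] for row in image]

def iou(pred, mask):
    """Pixel-wise intersection over union of two binary masks."""
    inter = sum(p & m for pr, mr in zip(pred, mask) for p, m in zip(pr, mr))
    union = sum(p | m for pr, mr in zip(pred, mask) for p, m in zip(pr, mr))
    return inter / union if union else 1.0

def mean_iou(threshold, samples):
    return sum(iou(segment(img, threshold), m) for img, m in samples) / len(samples)

def random_search(samples, n_trials=50, seed=0):
    """Random search over the threshold (an assumed, simple HPO strategy)."""
    rng = random.Random(seed)
    best_t, best_score = None, -1.0
    for _ in range(n_trials):
        t = rng.uniform(0.0, 1.0)
        score = mean_iou(t, samples)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

rng = random.Random(42)
tuning_set = [make_synthetic_pair(rng) for _ in range(10)]  # ten labeled synthetic images
held_out = [make_synthetic_pair(rng) for _ in range(10)]    # unseen images for evaluation

best_threshold, tuning_score = random_search(tuning_set)
default_score = mean_iou(0.9, tuning_set)                   # an untuned default setting
held_out_score = mean_iou(best_threshold, held_out)
print(best_threshold, default_score, tuning_score, held_out_score)
```

In this toy setup the tuned threshold separates filament from background far better than the untuned default. A real model would expose several hyperparameters and a more expensive evaluation.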
We release SynthMT, a synthetic dataset with instance masks for MTs, judged by domain experts for
biological plausibility.
We benchmark nine models on SynthMT, zero-shot and with HPO, so you know what actually works.
SAM3Text + simple HPO on just 10 synthetic images from our pipeline yields near-perfect, sometimes
superhuman segmentation on real data. Fully automated MT analysis is here!
SKIoU (Skeleton IoU) measures how well predicted segmentations match ground-truth microtubule shapes. It is the core metric for filament segmentation.
Count measures the absolute difference in the number of detected filaments compared to the ground truth.
Length KL and Curvature KL capture how well the model preserves biologically meaningful properties. Lower = predictions match ground-truth MT distributions better.
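
As a rough illustration of the Count and KL-style metrics described above, the following sketch compares per-filament lengths. The binning, the smoothing constant `eps`, and the direction of the divergence (ground truth relative to prediction) are assumptions made for this example and may differ from the definitions used in the paper.

```python
import math

def count_error(pred_lengths, true_lengths):
    """Absolute difference in the number of detected filaments."""
    return abs(len(pred_lengths) - len(true_lengths))

def histogram(values, bin_edges, eps=1e-6):
    """Normalized histogram with additive smoothing so that no bin is exactly zero."""
    counts = [0.0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (is_last and v == bin_edges[-1]):
                counts[i] += 1.0
                break
    smoothed = [c + eps for c in counts]
    total = sum(smoothed)
    return [c / total for c in smoothed]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions defined over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def length_kl(pred_lengths, true_lengths, bin_edges):
    """KL divergence of the ground-truth length histogram from the predicted one."""
    p = histogram(true_lengths, bin_edges)
    q = histogram(pred_lengths, bin_edges)
    return kl_divergence(p, q)

# Hypothetical filament lengths (arbitrary units) for one image.
true_lengths = [4.0, 5.5, 7.0, 12.0]
pred_lengths = [4.2, 5.1, 7.3]
edges = [0, 5, 10, 15]

print(count_error(pred_lengths, true_lengths))       # 1
print(length_kl(pred_lengths, true_lengths, edges))  # positive: the long filament was missed
```

The same histogram comparison applies to curvature values. Identical distributions give a divergence of zero, which is why lower is better.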
[...] FIESTA baseline, and microscopy-specific models often outperform
general ones on biological tasks. A notable exception is SAM3Text.
SAM3Text enables automation: With HPO, it achieves human-grade
performance on real data, showing that fully automated MT analysis is feasible.
Full results with all models and metrics are available in the paper.
SynthMT

| Model | SKIoU ↑ | Length KL ↓ | Curvature KL ↓ |
|---|---|---|---|
| Baseline | | | |
| FIESTA | 0.12 | 5.03 | 0.997 |
| FIESTA + HPO | 0.24 | 3.74 | 0.706 |
| Microscopy Foundation Models | | | |
| TARDIS | 0.45 | 0.56 | 0.019 |
| TARDIS + HPO | 0.48 | 0.41 | 0.031 |
| µSAM | 0.02 | 0.88 | 0.130 |
| µSAM + HPO | 0.66 | 1.24 | 0.132 |
| CellSAM | 0.56 | 0.19 | 0.021 |
| CellSAM + HPO | 0.59 | 0.21 | 0.031 |
| Cellpose-SAM | 0.26 | 0.12 | 0.019 |
| Cellpose-SAM + HPO | 0.65 | 0.12 | 0.012 |
| General Purpose Foundation Models | | | |
| SAM | 0.37 | 3.90 | 0.700 |
| SAM + HPO | 0.16 | 5.45 | 0.912 |
| SAM3Text | 0.85 | 0.07 | 0.063 |
| SAM3Text + HPO | 0.93 | 0.02 | 0.069 |

| Model | Count ↓ | Length KL ↓ | Curvature KL ↓ |
|---|---|---|---|
| Microscopy Foundation Models | | | |
| CellSAM | 2.62 | 0.07 | 0.087 |
| CellSAM + HPO | 7.18 | 0.08 | 0.140 |
| Cellpose-SAM | 11.64 | 0.05 | 0.178 |
| Cellpose-SAM + HPO | 10.91 | 0.07 | 0.090 |
| General Purpose Foundation Models | | | |
| SAM | 4.64 | 0.79 | 0.157 |
| SAM + HPO | 84.35 | 1.39 | 0.152 |
| SAM3Text | 8.92 | 0.21 | 0.249 |
| SAM3Text + HPO | 1.17 | 0.09 | 0.140 |
| Inter-annotator | 1.47 | 0.06 | 0.110 |

From structural masks to photorealistic microscopy images in 5 steps:

1. Generate MT masks with realistic geometry.
2. Apply optical system simulation.
3. Add realistic imaging artifacts.
4. Inject sensor and shot noise.
5. Apply intensity and contrast variations.

Optimizing θ aligns synthetic image distributions with real, unlabeled microscopy data.
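
The five steps above can be sketched in a few lines of plain Python. This is a toy illustration and not the released pipeline: the parameter names collected in `theta`, the box filter standing in for the optical model, the linear illumination gradient used as an artifact, and the Gaussian noise model are all assumptions chosen for brevity.

```python
import math
import random

def draw_filament_mask(size, rng):
    """Step 1: rasterize one straight filament between two random points."""
    mask = [[0.0] * size for _ in range(size)]
    x0, y0, x1, y1 = (rng.uniform(0, size - 1) for _ in range(4))
    steps = 4 * size
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        mask[y][x] = 1.0
    return mask

def box_blur(img, radius):
    """Step 2: crude optical blur (a box filter standing in for a real point spread function)."""
    size = len(img)
    out = [[0.0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(size, y + radius + 1))
                    for i in range(max(0, x - radius), min(size, x + radius + 1))]
            out[y][x] = sum(vals) / len(vals)
    return out

def render(mask, theta, rng):
    """Steps 2 to 5: turn a binary mask into a noisy grayscale image."""
    size = len(mask)
    blurred = box_blur(mask, theta["blur_radius"])
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            # Filaments darken a uniform background (toy contrast model).
            value = theta["background"] - theta["contrast"] * blurred[y][x]
            # Step 3: imaging artifact, here a linear illumination gradient.
            value += theta["gradient"] * (x / (size - 1) - 0.5)
            # Step 4: signal-dependent (shot-like) plus constant sensor noise.
            sigma = math.sqrt(theta["shot"] * max(value, 0.0) + theta["read"] ** 2)
            value += rng.gauss(0.0, sigma)
            row.append(value)
        image.append(row)
    # Step 5: global intensity and contrast variation, then clip to [0, 1].
    gain = rng.uniform(*theta["gain_range"])
    offset = rng.uniform(*theta["offset_range"])
    return [[min(1.0, max(0.0, gain * v + offset)) for v in row] for row in image]

# Hypothetical parameter set; tuning such values against real frames is the role of theta.
theta = {"blur_radius": 1, "background": 0.6, "contrast": 0.4, "gradient": 0.1,
         "shot": 0.001, "read": 0.01, "gain_range": (0.9, 1.1),
         "offset_range": (-0.05, 0.05)}

rng = random.Random(0)
mask = draw_filament_mask(32, rng)
image = render(mask, theta, rng)
```

Because the mask is generated before rendering, every synthetic image comes with a pixel-accurate label for free. That is what makes the resulting data usable for benchmarking and for HPO.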
TBA: Citation will be available soon.
Our work is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) Project-ID 528483508 - FIP 12. We would like to thank Dominik Fachet and Gil Henkin from the Reber lab for providing data, and also thank the further study participants Moritz Becker, Nathaniel Boateng, and Miguel Aguilar. The Reber lab thanks staff at the Advanced Medical Bioimaging Core Facility (Charité, Berlin) for imaging support and the Max Planck Society for funding. Furthermore, we thank Kristian Hildebrand and Chaitanya A. Athale (IISER Pune, India) and his lab for helpful discussions, and the authors of the SCAM project page for their template.