ProPicker: Promptable segmentation for particle picking in cryogenic electron tomography

Cryo-electron tomography (cryo-ET) is gaining popularity due to its unique ability to image biological macromolecules in their native environments (Turk and Baumeister, 2020, Hylton and Swulius, 2021). An ambitious goal of cryo-ET is to obtain an atlas of the cell with all of its constituent macromolecules mapped in their native environment. This would revolutionize our understanding of essential protein interactions and has the potential to provide breakthroughs in modern medicine spanning cell biology to drug discovery (Bodakuntla et al., 2023).

In this paper, we focus on particle picking, which is the task of finding all instances of a particle of interest in 3D volumes, called tomograms, obtained with cryo-ET. Particle picking is an essential step and often a bottleneck in important cryo-ET analysis pipelines (Genthe et al., 2023).

Particle picking is a 3D object detection problem and is particularly challenging for various reasons. Due to the fundamental limitations of data acquisition in cryo-ET, tomograms have a very low signal-to-noise ratio and exhibit strong artifacts. Moreover, tomograms are often large (200 × 1000 × 1000 voxels and larger), and cryo-ET datasets can consist of hundreds of tomograms, making their analysis computationally demanding (Genthe et al., 2023, Zeng et al., 2023). Finally, due to the significant diversity in protein types within the cell, there is a vast array of unique object classes to be detected, many of which only differ subtly, rendering differentiation challenging. For instance, the human body is estimated to contain more than 20,000 unique proteins (Li and Buck, 2021).
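To make the scale concrete, the following back-of-the-envelope sketch estimates the raw memory footprint of a tomogram of the size quoted above. The 4 bytes per voxel (32-bit floats) and the count of 100 tomograms are illustrative assumptions, not figures from the cited works.

```python
def tomogram_bytes(shape, bytes_per_voxel=4):
    """Raw size in bytes of a dense 3D volume.

    bytes_per_voxel=4 assumes 32-bit floating-point voxels (an assumption).
    """
    depth, height, width = shape
    return depth * height * width * bytes_per_voxel


single = tomogram_bytes((200, 1000, 1000))  # one tomogram of the quoted size
dataset = 100 * single                      # hypothetical dataset of 100 tomograms

print(f"one tomogram:  {single / 1e9:.1f} GB")
print(f"100 tomograms: {dataset / 1e9:.1f} GB")
```

Under these assumptions, a single tomogram occupies 0.8 GB and a dataset of 100 such tomograms occupies 80 GB before any intermediate network activations are counted.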

A particle picking method should be fast and flexible, i.e., it should be able to accurately pick a wide range of particles using no or little additional data for model training or fine-tuning. Existing methods for particle picking are either slow or not flexible. Most state-of-the-art methods (Moebel et al., 2021, De Teresa-Trueba et al., 2023, Liu et al., 2024) are based on deep learning models, which can detect only a small, fixed set of particles and require training on large amounts of labeled data, which is particularly difficult to obtain in the cryo-ET domain.

Here, we propose ProPicker, a Promptable particle Picker that can target a broad range of particles via a versatile prompting mechanism. ProPicker is trained on a large, diverse synthetic dataset and leverages a 3D segmentation network to segment particles of interest in tomograms and to accurately locate their positions. The promptable design enables the user to select which single particle class is targeted by the segmentation model (prompt-based picking). Particle-specific fine-tuning can be used to improve on the out-of-the-box performance of ProPicker. The promptable design of ProPicker is inspired by methods for the segmentation of (natural) 2D images like the Segment Anything Model (SAM) (Kirillov et al., 2023) and CLIPSeg (Lüddecke and Ecker, 2022).
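The step from a segmentation to particle positions can be illustrated with a minimal sketch. It assumes the network output has already been thresholded into a binary mask, and it uses connected-component labeling followed by centroid computation, which is one common choice and not necessarily the post-processing used by ProPicker. The function name `pick_centroids` and the `min_voxels` filter are hypothetical, and the pure-Python nested lists stand in for the array library one would use in practice.

```python
from collections import deque


def pick_centroids(mask, min_voxels=1):
    """Return (z, y, x) centroids of 6-connected foreground components.

    mask is a 3D nested list of 0/1 values (a binary segmentation).
    Components with fewer than min_voxels voxels are discarded.
    """
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    seen = set()
    centroids = []
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                # Breadth-first search collects one connected component.
                queue = deque([(z, y, x)])
                seen.add((z, y, x))
                voxels = []
                while queue:
                    cz, cy, cx = queue.popleft()
                    voxels.append((cz, cy, cx))
                    for dz, dy, dx in steps:
                        n = (cz + dz, cy + dy, cx + dx)
                        inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                        if inside and mask[n[0]][n[1]][n[2]] and n not in seen:
                            seen.add(n)
                            queue.append(n)
                if len(voxels) >= min_voxels:
                    count = len(voxels)
                    centroids.append(
                        tuple(sum(v[i] for v in voxels) / count for i in range(3))
                    )
    return centroids


# Toy 4x4x4 mask with one two-voxel blob and one isolated voxel.
mask = [[[0] * 4 for _ in range(4)] for _ in range(4)]
mask[0][0][0] = mask[0][0][1] = 1
mask[3][3][3] = 1
print(pick_centroids(mask))
print(pick_centroids(mask, min_voxels=2))
```

A size filter such as `min_voxels` is a simple way to suppress small spurious components. In this sketch it removes the isolated voxel while keeping the larger blob.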

ProPicker can pick particles unseen during training in synthetic tomograms with high F1 scores from a single prompt (see Section 4.2.1). Experiments on five challenging real-world datasets show that prompt-based picking can detect large particles and particles that produce strong contrast, e.g., ribosomes and apoferritin.
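For readers unfamiliar with how an F1 score applies to picking, the sketch below computes precision, recall, and F1 from predicted and ground-truth coordinates. The matching rule (greedy nearest-neighbor assignment within a distance tolerance `max_dist`) is an assumption made for illustration. The evaluation protocol actually used in the experiments may differ, and the function name `picking_f1` is hypothetical.

```python
import math


def picking_f1(predicted, ground_truth, max_dist):
    """Precision, recall and F1 for picked particle coordinates.

    A prediction is a true positive if its nearest still-unmatched
    ground-truth particle lies within max_dist (greedy matching).
    """
    unmatched = list(ground_truth)
    true_pos = 0
    for p in predicted:
        nearest = min(unmatched, key=lambda g: math.dist(p, g), default=None)
        if nearest is not None and math.dist(p, nearest) <= max_dist:
            unmatched.remove(nearest)  # each ground-truth particle matches once
            true_pos += 1
    false_pos = len(predicted) - true_pos
    false_neg = len(unmatched)
    precision = true_pos / (true_pos + false_pos) if predicted else 0.0
    recall = true_pos / (true_pos + false_neg) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


predicted = [(0, 0, 0), (10, 10, 10), (50, 50, 50)]
ground_truth = [(0, 0, 1), (10, 10, 12), (30, 30, 30), (40, 40, 40)]
precision, recall, f1 = picking_f1(predicted, ground_truth, max_dist=3)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy example two of three predictions match, and two of four ground-truth particles are missed, giving F1 = 2·TP / (2·TP + FP + FN) = 4/7.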

We also encountered situations where prompt-based picking with ProPicker and other flexible baselines, i.e., TomoTwin (Rice et al., 2023) and CryoSAM (Zhao et al., 2024), does not give satisfactory results. In Sections 4.2.2 (Prompt-based picking in real-world tomograms) and 4.3 (Improving over prompt-based picking with fine-tuning), we discuss such scenarios and demonstrate that fine-tuning ProPicker on little data (even ≤25% of a tomogram) can improve the F1 score by factors of up to 4, depending on the particle.

Furthermore, we show that fine-tuning ProPicker requires less data than training the state-of-the-art particle-specific method, DeepETPicker (Liu et al., 2024), to achieve comparable picking performance.

Finally, ProPicker is the fastest among flexible pickers capable of detecting particles based on a prompt, being up to an order of magnitude faster than the state-of-the-art TomoTwin (Rice et al., 2023) (see Section 4.2.1).

Our findings illustrate the promising capabilities of ProPicker, and highlight the need for large, diverse, and extensively annotated real-world training datasets to unlock its full potential (see Section 5).
