ZeroCostDL4Mic: an open platform to simplify access and use of Deep-Learning in Microscopy

Lucas von Chamier, Johanna Jukkala, Christoph Spahn, Martina Lerche, Sara Hernández-Pérez, Pieta Mattila, Eleni Karinou, Seamus Holden, Ahmet Can Solak, Alexander Krull, Tim-Oliver Buchholz, Florian Jug, Loic Alain Royer, Mike Heilemann, Romain F. Laine, Guillaume Jacquemet, Ricardo Henriques

Posted on: 24 March 2020

Preprint posted on 20 March 2020

Article now published in Nature Communications at http://dx.doi.org/10.1038/s41467-021-22518-0

Deep Learning for everyone: ZeroCostDL4Mic, a platform that gives everyone access to multiple novel and powerful resources for image analysis.

Selected by Mariana De Niz

Categories: bioinformatics, cell biology

Background

Deep Learning methods are recognised as powerful analytical tools with growing potential for image analysis in microscopy. However, a current challenge for the widespread use of deep learning is the set of technological, resource, and knowledge barriers that separate microscopy users who are highly proficient with computational platforms for image analysis from novice users with limited knowledge of such tools. To bridge this gap, von Chamier et al present ZeroCostDL4Mic (1), a platform based on Google Colab, which simplifies deployment, access and use of deep learning tools (Figure 1).

Key findings and developments

Overall developments

ZeroCostDL4Mic is a collection of self-explanatory Jupyter Notebooks for Google Colab, which provides the free, cloud-based computational resources needed to run them.
ZeroCostDL4Mic provides a single simple interface for users at all levels of expertise to install, test, train, and use the popular deep learning networks U-net, Stardist, CARE, Noise2Void, and Label-free prediction.
- U-net was designed by Ronneberger et al in 2015 (2), and is a deep learning architecture originally developed for segmentation of EM images.
- Stardist was designed by Schmidt et al in 2018 (3) and is a deep learning method designed to segment cell nuclei in microscopy images.
- CARE is a deep learning method designed by Weigert et al in 2018 (4), and is capable of image restoration from corrupted bio-images (e.g. corrupted by noise, artefacts or low resolution). The network allows image denoising and resolution improvement in 2D and 3D images, using supervised training.
- Noise2Void is a deep learning method designed by Krull et al in 2019 (5) to perform denoising on microscopy images, using an unsupervised training approach.
- Label-free prediction (fnet) is a deep learning method designed by Ounkomol et al in 2018 (6) as a tool for predicting labels from unannotated brightfield and EM images.
ZeroCostDL4Mic promotes the acquisition of knowledge and dexterity in the use of these networks. In their work, the authors provide training datasets for each of the networks used.
While ZeroCostDL4Mic provides a friendly and easy-to-use interface for users with little coding experience, the underlying code remains accessible, allowing advanced users to explore and edit the programmatic structure of the notebooks.
For access to and use of ZeroCostDL4Mic, no extra resources beyond a web browser and a Google Drive account are needed.
ZeroCostDL4Mic provides access to Deep Learning to run tasks of image segmentation, denoising, restoration, and artificial labelling.
Beyond its current uses, the authors discuss the potential of this tool for the future, to aid in the rapid dissemination of novel technologies, allowing users of all levels of expertise to use multiple tools for deep-learning-based image analysis in a reproducible and testable manner.

Notes by authors on limitations and further considerations.

The Google Colab platform offers free and straightforward access to a GPU or TPU, which significantly lowers the entry barrier for new users of Deep Learning methods. However, this access comes with some drawbacks, which the authors carefully explain. These include:
- Limited free Google Drive storage, with a maximum of 15 GB freely accessible by Google Colab notebooks. However, additional storage space can be purchased.
- A 12.72 GB RAM limit. Exceeding this RAM limit can cause the notebook to crash or show an error.
- A 12-hour time-out, and a log-out after 30-90 min of idle time, after which data loaded into the network is deleted. If training has not been completed, any unsaved progress may be lost.
- Google Colab does not guarantee access to a GPU, as the number of users of the service may be larger than the number of available devices.
- Google Colab uses different GPUs which currently include Nvidia K80, P4 and P100. The user cannot decide which GPU will be available when using the notebook. This may affect the speed at which networks can be trained and used.
While assessing these limitations, the authors offer a detailed discussion on how these limitations can be mitigated.
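As a rough illustration of working within these limits (not the authors' own mitigation code), the sketch below estimates whether an image stack would fit in the quoted RAM limit and reports which GPU, if any, has been assigned to the session. The function names, the 50% headroom margin, and the example dataset dimensions are all assumptions made for this illustration; sizes are computed in binary gigabytes (1024³ bytes).

```python
import shutil
import subprocess

# RAM figure quoted in the list above; the real limit may change over time.
COLAB_RAM_LIMIT_GB = 12.72


def dataset_size_gb(n_images, height, width, channels=1, bytes_per_value=4):
    """Estimate the in-memory size of an image stack held as a dense array.

    bytes_per_value=4 assumes 32-bit floats, a common choice for training data.
    """
    return n_images * height * width * channels * bytes_per_value / 1024**3


def fits_in_ram(size_gb, limit_gb=COLAB_RAM_LIMIT_GB, headroom=0.5):
    """Return True if the data uses at most `headroom` of the RAM limit.

    The headroom fraction is an arbitrary safety margin (an assumption),
    leaving room for the model, augmented copies and the runtime itself.
    """
    return size_gb <= limit_gb * headroom


def assigned_gpu():
    """Return the GPU name reported by nvidia-smi, or None if none is visible."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    names = result.stdout.strip().splitlines()
    return names[0] if result.returncode == 0 and names else None


if __name__ == "__main__":
    # Hypothetical dataset: 2000 single-channel 1024 x 1024 images.
    size = dataset_size_gb(n_images=2000, height=1024, width=1024)
    print(f"Estimated dataset size: {size:.2f} GB")
    print("Fits comfortably in RAM:", fits_in_ram(size))
    print("Assigned GPU:", assigned_gpu() or "none detected")
```

Running a check like this at the start of a session makes it clear early on whether the data needs to be cropped, downsampled, or loaded in batches, and whether the assigned GPU is one of the slower models.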
The authors include a supplementary discussion emphasizing the importance of re-training. They note that many labs process imaging data with pre-trained network models. However, pre-trained models, although very powerful, can also be very specific to the microscopes and sample types used in their training. This may lead to erroneous or artefactual results when they are applied to datasets that differ widely from those on which they were trained. The authors emphasize the importance of training the models with one's own specific data, to produce high-fidelity and reliable results.

What I like about this preprint

The main point I like about this preprint is that it hugely promotes open science. Significant barriers exist that prevent even experienced microscopists from having access to the deep-learning-based tools that are revolutionizing the field of image analysis. This work endeavours to give everyone, regardless of level of expertise, access to the latest advances in image analysis. Furthermore, it encourages scientists with different areas of expertise to continue contributing to ZeroCostDL4Mic. Moreover, beyond the knowledge barrier being addressed, the video tutorials and other training material are very user-friendly and freely accessible. It is my belief that the microscopy community (with all levels of image analysis expertise) will greatly benefit from this important resource.

Open questions

*Note: all questions with answers are shown at the end of this highlight.

In your discussion on the future perspectives of ZeroCostDL4Mic, you mention that you expect to grow the number of networks available. Will it be possible to compare the output of multiple networks so as to define the most suitable for specific analyses?
You discuss in your work the need to re-train models, and to use one’s own specific data. Large imaging repositories are not yet a reality, but if there were, could you incorporate this to address your discussion point on pre-trained models, and to build altogether stronger models for multiple types of data?
Is there a way ZeroCostDL4Mic can join efforts with resources such as BIAFLOWS, as the purpose of accessibility and training is shared?
One of your purposes is that ZeroCostDL4Mic grows in terms of number of networks available. Following from the question above, have you considered the possibility that ZeroCostDL4Mic guides users on the choice of network, based on input regarding the type of image (e.g. super-resolution, time-lapse, etc.), and the expected type of analysis?
I might have asked this to various different authors, but wouldn’t an image repository be of great use for resources such as yours and those of others? And in general, for the scientific community?

References

1. von Chamier et al, ZeroCostDL4Mic: an open platform to simplify access and use of Deep-Learning in Microscopy, bioRxiv, 2020.
2. Ronneberger et al, U-net: convolutional networks for biomedical image segmentation, International Conference on Medical Image Computing and Computer-Assisted Intervention, 234-241, Springer, 2015.
3. Schmidt et al, Cell detection with star-convex polygons, International Conference on Medical Image Computing and Computer-Assisted Intervention, 265-273, 2018.
4. Weigert et al, Content-aware image restoration: pushing the limits of fluorescence microscopy, Nature Methods, 15(12), 1090-1097, 2018.
5. Krull et al, Noise2Void - learning denoising from single noisy images, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2129-2137, 2019.
6. Ounkomol et al, Label-free prediction of three-dimensional fluorescence images from transmitted-light microscopy, Nature Methods, 15(11), 917-920, 2018.

Acknowledgements

I thank Ricardo Henriques for his input and engagement, and Mate Palfy for his helpful suggestions.

doi: https://doi.org/10.1242/prelights.17760

Author's response

Ricardo Henriques shared

Open questions

1. In your discussion on the future perspectives of ZeroCostDL4Mic, you mention that you expect to grow the number of networks available. Will it be possible to compare the output of multiple networks so as to define the most suitable for specific analyses?

Reply: Likely yes. Our main goal is to give researchers freedom and choices to find the best path to analyse their data. At the start, we will focus on adding networks that provide new analytical features rather than overlap on the type of outputs they generate.

2. You discuss in your work the need to re-train models, and to use one’s own specific data. Large imaging repositories are not yet a reality, but if there were, could you incorporate this to address your discussion point on pre-trained models, and to build altogether stronger models for multiple types of data?

Reply: Training models based on an extensive image database may be beyond the computational resources everyday researchers have. I believe the optimal solution likely lies in having users grab pre-trained models and increment these with their data via transfer learning. This approach will enable researchers to benefit from models trained in large libraries that are then fine-tuned to their data. Something we intend to get to in the future.

3. Is there a way ZeroCostDL4Mic can join efforts with resources such as BIAFLOWS, as the purpose of accessibility and training is shared?

Reply: Yes! This is something we’re very much looking forward to. We want ZeroCostDL4Mic to be as inclusive and as compatible with other efforts as possible.

4. One of your purposes is that ZeroCostDL4Mic grows in terms of number of networks available. Following from the question above, have you considered the possibility that ZeroCostDL4Mic guides users on the choice of network, based on input regarding the type of image (e.g. super-resolution, time-lapse, etc.), and the expected type of analysis?

Reply: We are actively developing our wiki to provide guidance empowering researchers to make the best choices on how to analyse their data (correctly!). We are trying to be careful not to oversimplify the information we expose, to the point where users make erroneous decisions because they don’t know enough about what is happening in the workflow. We’re hoping to strike the right balance between making things easy and requiring researchers to learn what is needed to achieve high-fidelity, trustworthy results.

5. I might have asked this to various different authors, but wouldn’t an image repository be of great use for resources such as yours and those of others? And in general, for the scientific community?

Reply: Yes, and there are a few community efforts in this direction, for example the Image Data Resource (https://idr.openmicroscopy.org/). Although we don’t exploit them directly in the manuscript, we make a large pool of example data available in Zenodo so that users can have them as examples. We hope these datasets are useful to others, the same way we also use datasets from others to develop and validate our networks.