Abstract
Our understanding of animal collectives is limited by our ability to track each individual. We describe an algorithm and software that extract all trajectories from video, with high identification accuracy for collectives of up to 100 individuals. idtracker.ai uses two convolutional networks: one detects when animals touch or cross, and the other identifies each animal. The tool is trained with a protocol that adapts to the video conditions and tracking difficulty.
Code availability
idtracker.ai is open-source and free software (license GPL v.3). The source code and the instructions for its installation are available at https://gitlab.com/polavieja_lab/idtrackerai. A quick-start user guide and a detailed explanation of the GUI can be found at http://idtracker.ai/. The software is also provided as Supplementary Software.
Data availability
Processed data that can be used to reproduce all figures and tables can be found at http://idtracker.ai/. Lossless compressed videos can be downloaded from the same page. Raw videos are available from the corresponding author upon reasonable request. A library of single-individual zebrafish images for use in testing identification methods can also be found at http://idtracker.ai/. Two example videos, one of 8 adult zebrafish and one of 100 juvenile zebrafish, are also included as part of the quick-start user guide.
Acknowledgements
We thank A. Groneberg, A. Laan and A. Pérez-Escudero for discussions; J. Baúto, R. Ribeiro, P. Carriço, T. Cruz, J. Couceiro, L. Costa, A. Certal and I. Campos for assistance in software, arena design and animal husbandry; and A. Bruce (Monash University, Melbourne, Australia), N. Blüthgen (Technische Universität Darmstadt, Darmstadt, Germany), C. Ferreira, A. Laan and M. Iglesias-Julios (Champalimaud Foundation, Lisbon, Portugal) for videos of ants, flies and zebrafish fights. This study was supported by Congento LISBOA-01-0145-FEDER-022170, NVIDIA (M.G.B., F.H. and G.G.d.P.), PTDC/NEU-SCC/0948/2014 (G.G.d.P.) and Champalimaud Foundation (G.G.d.P.). F.R.-F. acknowledges an FCT PhD fellowship.
Author information
Contributions
F.R.-F., M.G.B. and G.G.d.P. devised the project and algorithms and analyzed data. F.R.-F. and M.G.B. wrote the code with help from F.H. M.G.B. managed the code architecture and GUI. F.R.-F. managed testing procedures. R.H. built setups and conducted experiments with help from F.R.-F. G.G.d.P. supervised the project. M.G.B. wrote the supplementary material with help from F.R.-F., R.H., F.H. and G.G.d.P., and G.G.d.P. wrote the main text with help from F.R.-F., M.G.B. and F.H.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Integrated supplementary information
Supplementary Figure 1 Training dataset of individual images.
(a) Holding grid used to record 184 juvenile zebrafish (TU strain, 31 dpf) in separated chambers (60-mm-diameter Petri dishes). (b) Sample frame showing the individuals used to create the dataset and the individuals used as social context (n = 46 videos corresponding to n = 184 different individuals; ~18,000 frames per individual). (c) Summary of the individual-images dataset. The dataset is composed of a total of ~3,312,000 uncompressed, grayscale, labeled images (52 × 52 pixels).
Supplementary Figure 2 Single-image identification accuracy for different group sizes and different variations of the identification network.
Each network is trained from scratch using 3,000 temporally uncorrelated images per animal (90% for training and 10% for validation) and then tested with 300 new temporally uncorrelated images to compute the single-image identification accuracy (Supplementary Notes). We train and test each network five times. For every repetition, the individuals of the group and the images of each individual are selected randomly. Images are extracted from videos of 184 different animals recorded in isolation (Supplementary Fig. 1). Colored lines with markers represent single-image accuracies (mean ± s.d., n = 5) for network architectures with different numbers of convolutional layers (a; see Supplementary Table 2 for the architectures) and different sizes and numbers of fully connected layers (b; see Supplementary Table 3 for the architectures). The black solid line with diamond markers shows the accuracy for the network used to identify images in idtracker.ai (see Supplementary Table 1, identification convolutional neural network).
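The per-animal split and the accuracy metric above can be sketched as follows (a minimal numpy sketch; function names and the toy predictions are ours, not part of the software):

```python
import numpy as np

def split_per_animal(n_images=3000, train_frac=0.9, seed=0):
    # Shuffle the image indices for one animal and split them
    # 90% training / 10% validation, as in the evaluation protocol above.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_images)
    n_train = int(train_frac * n_images)
    return idx[:n_train], idx[n_train:]

def single_image_accuracy(predicted, true):
    # Fraction of held-out images assigned to the correct individual.
    predicted = np.asarray(predicted)
    true = np.asarray(true)
    return float((predicted == true).mean())

train_idx, val_idx = split_per_animal()          # 2,700 / 300 indices
acc = single_image_accuracy([0, 1, 1, 2], [0, 1, 2, 2])  # toy labels
```

The 300 fresh test images per animal are drawn the same way from frames not used for training, so the reported accuracy reflects generalization to unseen images.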
Supplementary Figure 3 Experimental setup for recording zebrafish videos.
(a) Front view of the experimental setup used to record zebrafish in groups and in isolation. (b) Side view of the same setup with the light diffuser rolled up. (c) Close-up view of the custom-made circular tank used to record the groups of 10, 60 and 100 juvenile zebrafish. (d) Sample frame from a video of 60 animals (n = 3 videos of 10 zebrafish, n = 3 videos of 60 zebrafish, and n = 3 videos of 100 zebrafish).
Supplementary Figure 4 Experimental setup used to record fruit fly videos.
(a) Exterior view of the setup used to record flies in groups. (b) Top view of the same setup with the diffuser rolled up. (c) Close-up view of one of the two arenas used (arena 1). (d) Sample frame from a video of 100 flies (n = 1 group of 38 flies, n = 2 groups of 60 flies, n = 1 group of 72 flies, n = 2 groups of 80 flies, and n = 3 groups of 100 flies; all animals were different for each group).
Supplementary Figure 5 Automatic estimation of identification accuracy.
Comparison between the accuracy estimated automatically by idtracker.ai and the accuracy computed by human validation of the videos (Supplementary Notes). The estimated accuracy is computed over the validated portion of the video. Blue dots represent the videos referenced in Supplementary Tables 5–7.
Supplementary Figure 6 Accuracy as a function of the minimum number of images in the first global fragment used for training.
To study the effect of the minimum number of images per individual in the first global fragment used to train the identification network, we created synthetic videos using images of 184 individuals recorded in isolation (Supplementary Fig. 1). Each synthetic video consists of 10,000 frames, where the number of images in every individual fragment was drawn from a gamma distribution, and the crossing fragments lasted for three frames (Supplementary Notes). The parameters were set as follows: θ = [2000, 1000, 500, 250, 100], k = [0.5, 0.35, 0.25, 0.15, 0.05], number of individuals = [10, 60, 100]. For every combination of these parameters we ran three repetitions. In total, we computed both the cascade of training and identification protocols and the residual identification for 225 synthetic videos. (a) Identification accuracy for simulated (empty markers) and real videos (color markers) as a function of the minimum number of images in the first global fragment. The number next to each color marker indicates the number of animals in the video. The accuracy of the real videos was obtained by manual validation (Supplementary Tables 5–7). In some videos, animals are almost immobile for long periods of time because of low-humidity conditions. Potentially, the individual fragments acquired during these periods encode less information that is useful for identifying the animals. To account for this, we corrected the number of images in the individual fragments by considering only frames in which the animals were moving with a speed of at least 0.75 BL/s. We observed that idtracker.ai was more likely to have higher accuracy when the minimum number of images in the first global fragment used for training was > 30. (b) Distributions of the number of images per individual fragment for real videos of zebrafish, and their fits to a gamma distribution. (c) Distributions of speeds of zebrafish and fruit fly videos.
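Drawing fragment lengths from a gamma distribution with scale θ and shape k, as in the synthetic videos above, can be sketched with numpy (a sketch under our assumptions; we round to integers and enforce a minimum of one image per fragment):

```python
import numpy as np

def synthetic_fragment_lengths(theta, k, n_fragments, rng):
    # Number of images per individual fragment, drawn from a gamma
    # distribution with scale theta and shape k (mean = k * theta).
    lengths = rng.gamma(shape=k, scale=theta, size=n_fragments)
    return np.maximum(1, np.round(lengths)).astype(int)

rng = np.random.default_rng(42)
lengths = synthetic_fragment_lengths(theta=500, k=0.25, n_fragments=1000, rng=rng)
```

Small shape parameters (k ≤ 0.5) produce the heavy-tailed mix of many short fragments and a few long ones seen in the real-video fits in panel (b).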
Supplementary Figure 7 Performance as a function of resolution.
Human-validated accuracy of tracking results obtained at six different resolutions. Pixels per animal are indicated here at the identification stage. There are fewer pixels per animal at the segmentation stage: approximately 25 and 300 pixels per animal, compared with 100 and 600 at the identification stage, respectively.
Supplementary Figure 8 Performance after application of Gaussian blurring.
Human-validated accuracy of tracking results obtained at seven different values of the s.d. of a Gaussian filtering of the video.
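The degradation applied in this test, Gaussian filtering of each frame with a given s.d., can be sketched in pure numpy with a separable kernel (a minimal sketch; the software itself processes video with OpenCV, and this toy uses a single impulse frame):

```python
import numpy as np

def gaussian_blur(frame, sigma):
    # Separable Gaussian filtering of a grayscale frame: build a 1D kernel
    # truncated at 3 sigma, then convolve along rows and columns.
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, frame)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, rows)

frame = np.zeros((9, 9))
frame[4, 4] = 1.0                    # unit impulse at the center
out = gaussian_blur(frame, sigma=1.0)
```

Because the kernel is normalized, blurring preserves total intensity while spreading each animal's blob over more pixels, which is what stresses the identification network as sigma grows.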
Supplementary Figure 9 Performance with inhomogeneous light conditions.
Background image corresponding to two different experiments with 60 zebrafish (n = 1 experiment for each condition). Left: our standard setup. Right: the same setup after switching off the IR LEDs in two walls and covering the light diffuser on the same side with a black cloth. Human-validated accuracy of tracking results is given below the images. The background image is computed as the average of equally spaced frames along the video, with a period of 100 frames.
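The background estimate described above, the average of one frame every 100 frames, can be sketched as (function name is ours; the toy stack below stands in for decoded video frames):

```python
import numpy as np

def compute_background(frames, period=100):
    # Average equally spaced frames (one every `period` frames) to estimate
    # the static background, as in the caption above. `frames` is an array
    # of shape (n_frames, height, width).
    sampled = frames[::period]
    return np.mean(sampled, axis=0)

# Toy video: 300 constant frames with values 0, 1, ..., 299.
frames = np.stack([np.full((4, 4), i, dtype=float) for i in range(300)])
bg = compute_background(frames)
```

Sampling sparsely keeps the computation cheap while still averaging out moving animals, since each animal occupies any given pixel in only a small fraction of the sampled frames.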
Supplementary Figure 10 Attack score over time for seven pairs of fish staged to fight.
Each colored line represents the attack score of an individual (see the Methods for the definition of ‘attack score’).
Supplementary Figure 11 Correlation between the average distance to the center of the tank and the average speed for two milling groups of 100 juvenile zebrafish.
(a) Probability density of the location in the tank of three representative individuals depicted in (b) as gray markers. (b) Average speed along the video as a function of the average distance to the center of the tank for all the fish in the group. Each black dot represents an individual; the gray markers are the individuals depicted in (a). The blue dashed line is the line of best fit to the data (R² = 0.5686, Pearson's r and P = 10⁻¹⁹, two-sided P value using Wald test with t-distribution of the test statistic). (c) Same as in (a) for a different video. (d) Same as in (b) for a different video (R² = 0.6934, Pearson's r and P = 7 × 10⁻²⁷, two-sided P value using Wald test with t-distribution of the test statistic).
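The statistics quoted in (b) and (d), a best-fit line with R², Pearson's r, and a two-sided P value from a Wald test with a t-distributed statistic, are exactly what `scipy.stats.linregress` reports. A sketch on synthetic data (the variable names and the simulated trend are ours, not the published measurements):

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
distance = rng.uniform(0, 25, size=100)            # average distance to tank center
speed = 0.3 * distance + rng.normal(0, 1, 100)     # average speed with a noisy linear trend

fit = linregress(distance, speed)  # slope, intercept, rvalue, pvalue, stderr
r_squared = fit.rvalue ** 2        # R^2 as reported in the figure panels
```

`fit.rvalue` is Pearson's r and `fit.pvalue` is the two-sided P value for the null hypothesis of zero slope, matching the test named in the caption.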
Supplementary information
Supplementary Text and Figures
Supplementary Figs. 1–11, Supplementary Tables 1–12 and Supplementary Note 1
Supplementary Software
Supplementary_software.zip contains two folders: (1) idtrackerai-1.0.3-alpha, which is the code for the idtracker.ai software at the time of publication (see https://gitlab.com/polavieja_lab/idtrackerai.git for the latest version), and (2) idtracker.ai_Figures_and_Tables_code, which includes the code to reproduce the panels in Figs. 1 and 2, as well as Supplementary Figures and Supplementary Tables
About this article
Cite this article
Romero-Ferrero, F., Bergomi, M.G., Hinz, R.C. et al. idtracker.ai: tracking all individuals in small or large collectives of unmarked animals. Nat Methods 16, 179–182 (2019). https://doi.org/10.1038/s41592-018-0295-5