Classification of dinosaur footprints using machine learning
Posted on: 13 November 2024
Preprint posted on 18 July 2024
Stepping back in time: using machine learning to classify dinosaur footprints 🐾🦖
Selected by Ryan Harrison
Categories: paleontology
Background
Dinosaur footprints not only tell researchers when certain dinosaurs roamed different parts of the world, but also offer clues about how different species interacted. The shape of the footprints and their arrangement within a trackway can reveal even more information, such as how fast a dinosaur was moving.
This preprint looks at footprints of two types of tridactyl (three-toed) dinosaurs, ornithopods and theropods. You’re more likely to be familiar with theropods, the meat-eating dinosaurs, as these include Tyrannosaurus rex and are the group from which birds evolved. The other type of footprint this preprint discusses was made by ornithopods, a group of herbivorous dinosaurs (an ornithopod that you might have heard of is Iguanodon). Non-avian dinosaurs existed for a whopping 167 million years and went extinct only 66 million years ago.
In this preprint, the authors investigate the ability of six different machine learning models to classify footprints as either ornithopod or theropod. In doing so, the authors aim to provide a method that easily distinguishes between theropod and ornithopod footprints, which is a challenging problem!
Main Findings
Mapping out the footprints
Working with image data can be difficult because it introduces sources of error, such as a rotated footprint or a footprint placed off-centre in the image. To avoid these problems, the authors assigned 20 landmarks, or reference points, to each theropod and ornithopod footprint, following a 2020 research publication by Lallensack and colleagues: landmarks 1-3 were placed at the tips of the digits, landmark 4 at the rear of the foot, and landmarks 5-20 around the foot (as in Figure 1).
Figure 1: Landmarks of a Footprint | Figure depicts a footprint, with the 20 landmarks shown (as in Lallensack et al. (2020)). Red dots represent the landmarks the authors identified as informative while the others are shown in green. The black spot marks the centre of the footprint, and the arrows show how distance was measured to input into the machine learning models.
To feed these data into the machine learning models, the authors found the centre-point of the footprint and, starting at landmark 4 and moving in an anti-clockwise direction, measured the distance from the centre to each landmark (shown by the arrows in Figure 1). They then plotted these distances on a graph. It turned out that distance, rather than (x, y) co-ordinates, was the more useful metric for the machine learning models.
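The distance-from-centre encoding described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code. It assumes the centre is the mean of the landmark coordinates (the preprint may define it differently), that the y-axis points upwards so that increasing angle means anti-clockwise, and that the visiting order can be recovered by sorting landmarks by angle around the centre. The function name `centre_distances` and the four-point `toy` footprint are hypothetical.

```python
import math

def centre_distances(landmarks, start_index=3):
    """Return centre-to-landmark distances, ordered anti-clockwise.

    landmarks: list of (x, y) tuples.
    start_index: position of the landmark to start from
                 (3 is landmark 4 when counting from 1).
    """
    n = len(landmarks)
    # Assumption: the centre is the mean of all landmark coordinates.
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n

    def angle(p):
        # Angle of a landmark around the centre (y-axis pointing up).
        return math.atan2(p[1] - cy, p[0] - cx)

    start = angle(landmarks[start_index])

    def offset(p):
        # Anti-clockwise offset from the starting landmark, in [0, 2*pi).
        return (angle(p) - start) % (2 * math.pi)

    ordered = sorted(landmarks, key=offset)
    return [math.hypot(x - cx, y - cy) for x, y in ordered]

# Toy footprint (made-up coordinates): middle toe tip, left, right, rear.
toy = [(0.0, 2.0), (-1.0, 1.0), (1.0, 1.0), (0.0, -2.0)]
distances = centre_distances(toy, start_index=3)
```

Because each value is a distance to the footprint's own centre, the resulting list does not change when the whole footprint is shifted or rotated in the image, which may be part of why distances worked better than raw (x, y) co-ordinates.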
Why the models incorrectly classify some footprints
Some footprints were misclassified by all or most of the machine learning models. One factor behind this was the direction of the middle toe (e.g. if the middle toe pointed left or right rather than straight ahead). In addition, some tracks had characteristics of the opposite class (e.g. a theropod track with a relatively short and wide middle toe, a feature associated with ornithopod tracks), which led to incorrect classification by a number of models. It is important to note that these footprints were also deemed challenging to identify manually. In some cases, they can be identified manually only because hand prints are present, which are sometimes found with ornithopod tracks but not with theropod tracks.
Which method performs best?
The six models used in this study were Logistic Regression (LR), Multi-Layer Perceptron (MLP), Random Forest (RF), Support Vector Machine (SVM), Multivariate Adaptive Regression Splines (MARS), and Linear Discriminant Analysis (LDA). For LR, the authors identified 8 landmarks that were expected to be informative locations, which included the top of the middle digit (2), the bottom of the foot (4), the landmarks representing the width of the footprint (10 and 5), and the landmarks representing the width of the middle toe (17 and 20) (identified as red spots in Figure 1).
Of all the models, MLP performed best, achieving a recall of ~90% for both theropods and ornithopods after the model's parameters were optimised. In other words, this model correctly identified about 90% of the footprints belonging to each group. One particularly interesting result was that LDA was very successful at correctly classifying theropods (<95%), but not so much at identifying ornithopods (~65%).
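Recall is calculated separately for each class: of all the footprints that truly belong to a group, it is the fraction the model assigned to that group. A minimal sketch of that calculation follows, using made-up labels rather than data from the preprint; the helper name `recall_per_class` is hypothetical.

```python
from collections import Counter

def recall_per_class(y_true, y_pred):
    """Return {class: recall}, where recall = correct predictions / true members."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Made-up labels for illustration only (10 footprints per class).
y_true = ["theropod"] * 10 + ["ornithopod"] * 10
y_pred = (["theropod"] * 9 + ["ornithopod"] * 1      # 9 of 10 theropods correct
          + ["ornithopod"] * 7 + ["theropod"] * 3)   # 7 of 10 ornithopods correct
recalls = recall_per_class(y_true, y_pred)
```

Reporting recall per class rather than a single accuracy figure makes an imbalance like the LDA result visible: a model that leans towards calling tracks "theropod" can score highly on theropods while missing many ornithopods.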
Why I am highlighting this preprint
Coming from a developmental and cell biology background, I was interested to see how machine learning was used in different disciplines of science. I also really enjoy reading about palaeontology (dinosaurs are really cool) and thought that this preprint using machine learning to classify different dinosaur footprints was fascinating.
Questions for the authors
- I am assuming the footprints of a Tyrannosaurus rex and an Archaeopteryx (both theropods) are quite different. Is there any way to account for this when training a machine learning model to classify them both correctly?
- How applicable are your findings to other fossilized footprints? Could your work be adapted to identify other types of footprints?
- Could you extend this analysis to look at 3D maps of dinosaur footprints?
Bibliography
Lallensack, J.N., Engler, T. and Barthel, H.J. (2020), Shape variability in tridactyl dinosaur footprints: the significance of size and function. Palaeontology, 63: 203-228. https://doi.org/10.1111/pala.12449
Lallensack, J.N., Romilio, A. and Falkingham, P.L. (2022), A machine learning approach for the discrimination of theropod and ornithischian dinosaur tracks. Journal of the Royal Society Interface, 19(196). https://doi.org/10.1098/rsif.2022.0588