Classification of dinosaur footprints using machine learning

Michael Jones, Jens N. Lallensack, Ian Jarman, Peter Falkingham, Ivo Siekmann

Posted on: 13 November 2024

Preprint posted on 18 July 2024

Stepping back in time: using machine learning to classify dinosaur footprints 🐾🦖

Selected by Ryan Harrison

Background

Dinosaur footprints can tell researchers not only when certain dinosaurs roamed different parts of the world, but also how different species may have interacted. The shape of the footprints and their arrangement within a trackway can reveal even more, such as how fast a dinosaur was moving.

This preprint looks at the footprints of two groups of tridactyl (three-toed) dinosaurs: ornithopods and theropods. You’re more likely to be familiar with theropods, the meat-eating dinosaurs, as these include T. rex and are the group from which birds evolved. The other footprints discussed in this preprint were made by ornithopods, a group of herbivorous dinosaurs (one that you might have heard of is Iguanodon). Dinosaurs existed for a whopping 167 million years and went extinct only 66 million years ago.

In this preprint, the authors investigate the ability of six different machine learning models to classify footprints as either ornithopod or theropod. In doing so, the authors aim to provide a method that easily distinguishes between theropod and ornithopod footprints, which is a challenging problem!

Main Findings

Mapping out the footprints

Working with image data can be difficult because it introduces sources of error, such as a rotated footprint or a footprint placed off-centre in the image. To avoid these problems, the authors described each theropod and ornithopod footprint with 20 landmarks, or reference points, assigned as in a 2020 research publication by Lallensack and colleagues: landmarks 1-3 were placed at the tips of the digits, landmark 4 at the rear of the foot, and landmarks 5-20 around the foot (as in Figure 1).

Figure 1: Landmarks of a Footprint | Figure depicts a footprint, with the 20 landmarks shown (as in Lallensack et al. (2020)). Red dots represent the landmarks the authors identified as informative while the others are shown in green. The black spot marks the centre of the footprint, and the arrows show how distance was measured to input into the machine learning models.

To input this data into machine learning models, the authors found the centre-point of the footprint, and, starting at landmark 4, measured the distance from the centre to each landmark (shown by the arrows in Figure 1) in an anti-clockwise direction. They then plotted these distances on a graph. It turned out that distance, rather than (x, y) co-ordinates, was a more useful metric for the machine learning models.
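
To make the distance-from-centre idea concrete, here is a minimal Python sketch. It is not the authors' code. It assumes that the centre is the mean of the landmark coordinates, that "anti-clockwise" can be approximated by sorting landmarks by their angle around that centre, and that the y axis points upwards. The six coordinates and all function names are made up for illustration.

```python
import math

# Made-up coordinates for a simplified six-landmark footprint (y axis pointing up).
# The preprint uses 20 landmarks per footprint; these values are purely illustrative.
toy_footprint = {
    1: (-2.0, 3.0),   # tip of one outer digit
    2: (0.0, 5.0),    # tip of the middle digit
    3: (2.0, 3.5),    # tip of the other outer digit
    4: (0.0, -3.0),   # rear of the foot
    5: (-1.5, 0.0),   # extra point on the foot (hypothetical)
    6: (1.5, 0.5),    # extra point on the foot (hypothetical)
}


def centre_of(landmarks):
    """Assumed definition of the centre: the mean of all landmark coordinates."""
    xs = [x for x, _ in landmarks.values()]
    ys = [y for _, y in landmarks.values()]
    return sum(xs) / len(xs), sum(ys) / len(ys)


def centre_distance_profile(landmarks, start=4):
    """Return (landmark, distance to centre) pairs, anti-clockwise from `start`."""
    cx, cy = centre_of(landmarks)

    def angle(point):
        return math.atan2(point[1] - cy, point[0] - cx)

    start_angle = angle(landmarks[start])

    def anticlockwise_offset(item):
        # Angle swept anti-clockwise from the start landmark, in [0, 2*pi).
        return (angle(item[1]) - start_angle) % (2 * math.pi)

    ordered = sorted(landmarks.items(), key=anticlockwise_offset)
    return [(label, math.hypot(x - cx, y - cy)) for label, (x, y) in ordered]


def rotate_and_shift(landmarks, angle, dx, dy):
    """Rotate every landmark about the origin by `angle` radians, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return {k: (c * x - s * y + dx, s * x + c * y + dy)
            for k, (x, y) in landmarks.items()}


profile = centre_distance_profile(toy_footprint)
moved_profile = centre_distance_profile(rotate_and_shift(toy_footprint, 0.7, 10.0, -4.0))
for (label, d), (_, d_moved) in zip(profile, moved_profile):
    print(f"landmark {label}: {d:.3f} (original) vs {d_moved:.3f} (rotated and shifted)")
```

The two printed columns should agree: distances to the centre do not change when the whole footprint is rotated or moved within the image, which is one plausible reason why they are more convenient model inputs than raw (x, y) co-ordinates.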

Why the models incorrectly classify some footprints

Some footprints were misclassified by all or most of the machine learning models. One factor behind this was the direction of the middle toe (e.g. if the middle toe pointed left or right rather than straight ahead). In addition, some tracks had characteristics of the opposite class (e.g. a theropod track with a relatively short and wide middle toe, a feature associated with ornithopod tracks), which led a number of models to classify them incorrectly. It is important to note that these footprints were also deemed challenging to identify manually. In some cases, they can be identified manually only because handprints are present, which are sometimes found with ornithopod tracks but not with theropod tracks.

Which method performs best?

The six models used in this study were Logistic Regression (LR), Multi-Layer Perceptron (MLP), Random Forest (RF), Support Vector Machine (SVM), Multivariate Adaptive Regression Splines (MARS), and Linear Discriminant Analysis (LDA). For LR, the authors identified 8 landmarks that were expected to be informative locations, which included the top of the middle digit (2), the bottom of the foot (4), the landmarks representing the width of the footprint (10 and 5), and the landmarks representing the width of the middle toe (17 and 20) (identified as red spots in Figure 1).

Of all the models, MLP performed best, achieving a recall of ~90% for both theropods and ornithopods after the model’s parameters were optimised. In other words, the model correctly identified about 90% of the theropod footprints and about 90% of the ornithopod footprints. One particularly interesting result was that LDA was very successful at correctly classifying theropods (<95%), but much less so at identifying ornithopods (~65%).
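
Recall is computed separately for each class: of all the footprints that truly belong to a group, it is the fraction the model assigned to that group. The short Python sketch below shows the calculation on made-up labels. These are not results from the preprint, and the function name is hypothetical. It also shows how a model can score well on one class and poorly on the other.

```python
def recall_per_class(true_labels, predicted_labels):
    """Fraction of each true class that was predicted as that class."""
    recalls = {}
    for cls in sorted(set(true_labels)):
        # Predictions made for the footprints that truly belong to `cls`.
        predictions = [p for t, p in zip(true_labels, predicted_labels) if t == cls]
        recalls[cls] = sum(p == cls for p in predictions) / len(predictions)
    return recalls


# Made-up example: 10 theropod and 10 ornithopod footprints.
true_labels = ["theropod"] * 10 + ["ornithopod"] * 10
predicted_labels = (
    ["theropod"] * 9 + ["ornithopod"] * 1      # 9 of 10 theropods recognised
    + ["ornithopod"] * 6 + ["theropod"] * 4    # 6 of 10 ornithopods recognised
)

print(recall_per_class(true_labels, predicted_labels))
```

Reporting recall for each class, rather than a single overall accuracy, makes an imbalance like the one described for LDA visible.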

Why I am highlighting this preprint

Coming from a developmental and cell biology background, I was interested to see how machine learning was used in different disciplines of science. I also really enjoy reading about palaeontology (dinosaurs are really cool) and thought that this preprint using machine learning to classify different dinosaur footprints was fascinating.

Questions for the authors

I am assuming the footprints of a Tyrannosaurus rex and an Archaeopteryx (both theropods) are quite different. Is there any way to account for this when training a machine learning model to classify them both correctly?
How applicable are your findings to other fossilized footprints? Could your work be adapted to identify other types of footprints?
Could you extend this analysis to look at 3D maps of dinosaur footprints?

Bibliography

Lallensack, J.N., Engler, T. and Barthel, H.J. (2020), Shape variability in tridactyl dinosaur footprints: the significance of size and function. Palaeontology, 63: 203-228. https://doi.org/10.1111/pala.12449

Lallensack, J.N., Romilio, A. and Falkingham, P.L. (2022), A machine learning approach for the discrimination of theropod and ornithischian dinosaur tracks. Journal of the Royal Society Interface, 19(196). https://doi.org/10.1098/rsif.2022.0588

Tags: classification, dinosaurs, footprints, machine learning, ornithopods, theropods

Author's response

The author team shared

1) I am assuming the footprints of a Tyrannosaurus rex and an Archaeopteryx (both theropods) are quite different. Is there any way to account for this when training a machine learning model to classify them both correctly?

T. rex and Archaeopteryx obviously are completely different beasts! But if you ignore the vast difference in size, even the shapes of very different theropods are a lot more similar than one might expect. The idea of our approach is that the shapes are sufficiently different from ornithopod footprints so that they can be distinguished. Now, Archaeopteryx, specifically, is a bit mysterious because tracks of this famous bird-like dinosaur have not been found so far. But when it comes to ensuring that even very different footprints belonging to the same class are correctly classified, this can be supported in training of the machine learning models by using data that captures both theropods and ornithopods in all their variability. Only then can we be sure that a somewhat unusual ornithopod is not accidentally classified as a theropod.

2) How applicable are your findings to other fossilized footprints? Could your work be adapted to identify other types of footprints?

This is a good question! Our method is based on representing footprints using a system of landmarks. But… these landmarks were really developed with three-toed feet in mind! It is all about determining how, say, the tip of the middle toe of one footprint is located relative to the tip of the middle toe of another footprint. So, it would not be so easy to adapt this method to sauropods, for example, because their feet have different shapes and therefore the landmarks would have to be redefined for sauropod footprints. This means that our method cannot be used to look at three-toed ornithopods and theropods as well as sauropods at the same time. But… it would be a fun project to apply the approach to birds and see if they are similar to ornithopods or theropods – modern birds are actually the descendants of theropods!

3) Could you extend this analysis to look at 3D maps of dinosaur footprints?

This is another really interesting question… you might even get different answers if you ask different co-authors of this preprint! There are definitely pros to being able to work directly on 3D maps of footprints: there are a lot of sources of error and subjective decisions in going from a three-dimensional footprint fossil to a 2D photograph, to identifying the boundary of the footprint, and eventually to placing the landmarks. Note that co-authors of this preprint have already developed a classification method based on black and white “silhouettes” of 2D representations of footprints (Lallensack et al., 2022). Directly working on 3D footprints seems advantageous because subjectivity and errors would be reduced to a minimum. But there are challenges with this approach, too: much less 3D footprint data are available. At the same time, 3D data would most likely require much more computationally intensive machine learning methods than those used in this preprint, and, unfortunately, it is exactly these methods that would need a lot of data, probably at least 10 times as much as the models trained in this preprint. We actually only required a data set of relatively modest size, around 300 samples, for our study.