Ancient origins of complex neuronal genes

Matthew J. McCoy, Andrew Z. Fire

Preprint posted on 16 August 2023

Making sense(s) through the lens of evolution (pun intended) - old genes may drive brain evolution.

Selected by Preethi Krishnaraj, Chee Kiang Ewe


How did the nervous system transition from the loose neuron networks of cnidarians to the complex connectomes found in vertebrates? Intriguingly, the nervous systems of many species exhibit an abundance of large genes [1]. This observation has sparked the compelling hypothesis that the expansion of gene size might be a contributing factor in the evolution of more complex brains.

During the course of evolution, the relative sizes of genes and the proteins they encode have remained largely stable (see more below). However, in more complex organisms, a substantial proportion of the expanded genome consists of non-coding DNA. This includes introns, which play vital roles in regulating gene expression and generating different protein isoforms [2].

In this preprint, McCoy and Fire describe a group of genes that are conserved across species and potentially pivotal in driving the evolution of the nervous system. Importantly, these genes appear to have increased significantly in size and to have acquired a greater number of isoforms during evolution. This observation suggests that the expansion of gene size, particularly among ancient genes under high selective constraint, provides substrates for natural selection during the evolution of the nervous system.

Figure 1: Gain in gene architecture complexity drives the evolution of the nervous system.   

Main findings

Harmony in gene size – Relative gene size remains stable across diverse species

Previous studies mostly compared aggregate gene size measures across species, whereas the authors of this preprint took a gene-by-gene approach to investigate gene size variation during evolution. They compared orthologous gene sizes* across diverse eukaryotes and asked whether gene sizes in one species correlated with those in distantly related species. Despite significant differences in absolute gene size**, they found relative gene size** to be remarkably consistent across species. This finding indicates a macroevolutionary trend in which gene sizes evolve together, regardless of absolute variations.

As an example, the authors compared orthologous genes in humans and the nematode C. elegans. Although CDS size was consistent between orthologs in the two species, the largest human genes were found to be 100 times larger than those in the nematode. Additionally, within both genomes, CDS size strongly correlated with gene size, indicating a close relationship between protein and gene size on a macroevolutionary scale.
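To make the distinction between absolute and relative gene size concrete, here is a minimal Python sketch of a rank-based comparison between two species. It is not the authors' pipeline: the gene names, sizes, and the helper functions `relative_ranks` and `relative_size_correlation` are hypothetical, and the sketch assumes there are no tied gene sizes. Each gene is ranked within its own genome, and the ranks of ortholog pairs are then correlated (with every gene paired and no ties, this is equivalent to a Spearman correlation).

```python
from typing import Dict, List, Tuple


def relative_ranks(sizes: Dict[str, int]) -> Dict[str, float]:
    """Rank genes by size within one genome, scaled to (0, 1]. Assumes no ties."""
    ordered = sorted(sizes, key=sizes.get)
    n = len(ordered)
    return {gene: (i + 1) / n for i, gene in enumerate(ordered)}


def pearson(x: List[float], y: List[float]) -> float:
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def relative_size_correlation(
    sizes_a: Dict[str, int],
    sizes_b: Dict[str, int],
    orthologs: List[Tuple[str, str]],
) -> float:
    """Correlate within-genome size ranks across one-to-one ortholog pairs."""
    ranks_a = relative_ranks(sizes_a)
    ranks_b = relative_ranks(sizes_b)
    xs = [ranks_a[a] for a, _ in orthologs]
    ys = [ranks_b[b] for _, b in orthologs]
    return pearson(xs, ys)


# Hypothetical toy data (base pairs): absolute sizes differ ~100-fold,
# but the ordering of orthologs is the same in both genomes.
human_sizes = {"hA": 2_000_000, "hB": 150_000, "hC": 30_000, "hD": 5_000}
worm_sizes = {"wA": 20_000, "wB": 3_000, "wC": 1_200, "wD": 600}
ortholog_pairs = [("hA", "wA"), ("hB", "wB"), ("hC", "wC"), ("hD", "wD")]

r = relative_size_correlation(human_sizes, worm_sizes, ortholog_pairs)
print(f"rank correlation of relative gene size: {r:.3f}")
```

In this toy case the absolute sizes differ greatly between species while the rank correlation is close to 1, which is the pattern described above.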

Neurological Sovereignty – Large genes often have neuronal functions

Several studies, including previous work of the preprint authors [1], have demonstrated a correlation between the substantial size of certain genes and their expression in the brain. In this preprint, the authors show that the top 10% largest genes include more brain-enriched genes than genes of other sizes, highlighting the brain’s unique gene expression patterns. Conversely, smaller genes were found to be expressed in tissues such as the testis and skin.
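As an illustration of what a "top 10%" comparison involves, the following Python sketch computes the fraction of brain-enriched genes among the largest decile and among the remaining genes. The function name `top_decile_enrichment` and the toy data are hypothetical, and the preprint's actual statistical treatment may differ.

```python
from typing import List, Tuple


def top_decile_enrichment(genes: List[Tuple[int, bool]]) -> Tuple[float, float]:
    """Return (fraction brain-enriched in top 10% by size, fraction in the rest).

    genes: list of (gene_size_bp, is_brain_enriched) tuples.
    """
    if len(genes) < 2:
        raise ValueError("need at least two genes to split into top decile and rest")
    ordered = sorted(genes, key=lambda g: g[0], reverse=True)
    cutoff = max(1, len(ordered) // 10)  # size of the top decile, at least one gene
    top, rest = ordered[:cutoff], ordered[cutoff:]

    def fraction(group: List[Tuple[int, bool]]) -> float:
        return sum(flag for _, flag in group) / len(group)

    return fraction(top), fraction(rest)


# Hypothetical toy data: 20 genes of increasing size; the two largest and
# three of the smaller ones are flagged as brain-enriched.
toy_genes = [(1000 * (i + 1), i >= 18 or i in (2, 7, 11)) for i in range(20)]

top_frac, rest_frac = top_decile_enrichment(toy_genes)
print(f"brain-enriched in top decile: {top_frac:.2f}, in the rest: {rest_frac:.2f}")
```

A real analysis would follow this with a significance test on the two groups; the sketch only shows how the size-based split is made.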

The authors then investigated gene size distributions for specific functional categories and found that large genes often had a neuronal function, such as neuron recognition, presynaptic membrane assembly, and neuron cell-cell adhesion. These findings suggest that there are distinct types of genes: (i) those benefiting from small, condensed gene sizes including highly expressed and rapidly responsive genes, (ii) those benefiting from expanded gene sizes such as neuronal genes with multiple isoforms, and (iii) potentially a third class of genes whose sizes are influenced by unknown factors.

Timeless treasures: Most large neuronal genes are ancient

Older genes are often larger, undergo stronger purifying selection#, and evolve more slowly than newer genes. Here, the authors present an analysis focused on genes of specific ages and sizes. They found that most large genes are ancient and highly conserved, with the top 10% largest human genes averaging an age of 953 million years, while shorter genes average around 62 million years.

Further, when examining large brain-enriched genes, the authors found that they are conserved not only among animals but also in basal organisms such as sponges and choanoflagellates, even though these organisms lack nervous tissue. These genes, which are critical for nervous systems, therefore have ancient origins that predate dedicated neuronal cell types.

Expanding the horizons: New isoforms evolving from large genes

In organisms with expanded genomes, the authors noted that ancient, highly conserved genes were still evolving by acquiring new isoforms, and that this was more pronounced in larger genes. To quantify these changes, they counted isoforms and compared the counts between genes with one-to-one orthologous relationships. Interestingly, a group of large, ancient genes present in various species has grown even larger and more complex in vertebrates. These results indicate that large, ancient genes can actively incorporate new sequences that contribute to evolution.
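The isoform comparison between one-to-one orthologs can be sketched roughly as follows. This is an illustrative Python example under stated assumptions, not the authors' method: the record fields, the size cutoff, and the function `isoform_gain_by_size` are all hypothetical, and the numbers are invented toy values.

```python
from statistics import mean
from typing import Dict, List, Tuple


def isoform_gain_by_size(
    pairs: List[Dict[str, int]], size_cutoff_bp: int
) -> Tuple[float, float]:
    """Return mean isoform gain (human minus other species) for large and small genes.

    Each record describes one one-to-one ortholog pair. Both size classes must be
    non-empty, otherwise statistics.mean raises StatisticsError.
    """
    large, small = [], []
    for pair in pairs:
        gain = pair["human_isoforms"] - pair["other_isoforms"]
        if pair["human_size_bp"] >= size_cutoff_bp:
            large.append(gain)
        else:
            small.append(gain)
    return mean(large), mean(small)


# Hypothetical toy ortholog pairs (sizes in base pairs, isoform counts per gene).
toy_pairs = [
    {"human_size_bp": 1_200_000, "human_isoforms": 12, "other_isoforms": 3},
    {"human_size_bp": 800_000, "human_isoforms": 8, "other_isoforms": 2},
    {"human_size_bp": 20_000, "human_isoforms": 2, "other_isoforms": 1},
    {"human_size_bp": 5_000, "human_isoforms": 1, "other_isoforms": 1},
]

large_gain, small_gain = isoform_gain_by_size(toy_pairs, size_cutoff_bp=100_000)
print(f"mean isoform gain: large genes {large_gain}, small genes {small_gain}")
```

Restricting the comparison to one-to-one orthologs, as described above, avoids confounding isoform gain with gene duplication.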

What we liked about this preprint

This thought-provoking preprint proposes the interesting hypothesis that an increase in absolute gene size may drive the evolution of gene structures and regulatory elements, which in turn contribute to the diversification of nervous systems. The bioinformatic analyses are elegant and the paper is very well written. We very much enjoyed reading it!

Questions for the authors

  1. We wonder if you have performed GO analysis specifically on the most ancient large genes (the 71 genes conserved between humans and sponges, for example)? Do they tend to perform a certain neuronal function?
  2. Your results seemingly contradict previous findings that endodermal genes tend to be older than ectodermal genes [3]. Would you be able to comment on this?


  1. Matthew J McCoy, Andrew Z Fire. Intron and gene size expansion during nervous system evolution. BMC Genomics. 2020 May 14;21(1):360. doi: 10.1186/s12864-020-6760-4.
  2. Warren R Francis, Gert Wörheide. Similar Ratios of Introns to Intergenic Sequence across Animal Genomes. Genome Biol Evol. 2017 Jun 1;9(6):1582-1598. doi: 10.1093/gbe/evx103.
  3. Tamar Hashimshony, Martin Feder, Michal Levin, Brian K Hall, Itai Yanai. Spatiotemporal transcriptomics reveals the evolutionary history of the endoderm germ layer. Nature. 2015 Mar 12;519(7542):219-222. doi: 10.1038/nature13996.

Footnotes and definitions

*How did the authors define gene size? It is the length of the gene from the first to the last annotated exon, excluding untranslated regions.

**How was the gene size measured? The authors measured the size in two ways: absolute (in base pairs) and relative (ranking compared to other genes in the same genome).

What did the CDS size represent? It represented the nucleotide span within an mRNA transcript, excluding introns and untranslated regions.

How was the protein size determined? It was determined by the number of amino acids.

#Purifying selection: selection that removes potentially deleterious mutations from a population (also known as negative selection).



Posted on: 23 October 2023

