Borgs are giant extrachromosomal elements with the potential to augment methane oxidation

Basem Al-Shayeb, Marie C. Schoelmerich, Jacob West-Roberts, Luis E. Valentin-Alvarado, Rohan Sachdeva, Susan Mullen, Alexander Crits-Christoph, Michael J. Wilkins, Kenneth H. Williams, Jennifer A. Doudna, Jillian F. Banfield

Preprint posted on July 10, 2021

Beware the Borg! Linear extrachromosomal DNA assimilated from Archaea

Selected by Kerryn Elliott


Methane is the second most abundant greenhouse gas after carbon dioxide, and although it is comparatively short-lived in the atmosphere, it is almost 30 times more potent at warming the earth than carbon dioxide. Methane is produced by microbiological processes, and one gigaton of methane is produced annually by methanogenic (methane-producing) archaea. Methanotrophs, on the other hand, are methane-oxidising microorganisms with the ability to reduce methane levels in the atmosphere, and therefore limit global warming. One such methanotroph is an anaerobic methanotrophic archaeon (ANME) known as Methanoperedens, which is capable of directly coupling methane oxidation to the reduction of iron, nitrate or manganese.

Previous work has shown that different factors can impact the rate of methane oxidation. To investigate factors, such as viruses, that have been suggested to influence the methane oxidation rate of Methanoperedens, the authors sampled soil from wetlands in California where methane oxidation and Methanoperedens are known to occur. In this preprint, they report the identification of novel extrachromosomal elements, which they term ‘Borgs’, with the potential to augment the methane oxidation capacity of Methanoperedens. They hypothesise that Borgs could play a role in the mitigation of greenhouse gas emissions in the future.

Main findings

The authors first sampled soils from a number of saturated vernal pool (wetland) locations in California, and performed Illumina sequencing on the bulk environmental DNA extracted from both these samples and sediment samples from an aquifer in Colorado. A variety of tools were used to assemble the reads de novo into longer sequences, which were then compared to the existing databases UniProt and ggKbase to determine their taxonomy. Hundreds of kilobases of assembled sequences (contigs) were flagged as extrachromosomal elements as they did not fit any taxonomy at the domain level (Bacteria, Archaea, Eukaryotes).

The authors manually curated these large contigs to generate four complete genomes, as well as an additional partial genome. The four genomes ranged in size from 661,708 to 918,293 bp and all contained a large inverted repeat (about 1.5 kbp) at the ends of the linear genome, as well as interspersed regions of tandem repeats throughout the genomes. All genomes consisted of two replichores, which were unequal in length and carried all genes on one strand. Although the majority of the identified genes were novel, 21% of the predicted proteins had database matches, and the majority of those had a best match to Methanoperedens. Interestingly, although the Borg genomes had a roughly 10% lower GC content than previously reported Methanoperedens species, certain gene cluster regions showed clear increases in GC content, up to the GC content expected of Methanoperedens species. These regions also displayed much greater similarity to Methanoperedens at the protein sequence level, suggesting that these pieces of DNA had been acquired by lateral gene transfer from Methanoperedens. The genomes lacked key ribosomal and other single-copy genes required for archaeal life, and therefore represented a novel extrachromosomal element, which the authors termed “Borgs” due to their ability to assimilate genes from organisms such as Methanoperedens.
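To make the GC content argument concrete, here is a minimal Python sketch of a sliding-window GC scan that picks out GC-rich stretches in an otherwise AT-rich sequence. It is only an illustration, not the authors' pipeline: the function names, the window size, the threshold and the toy sequence are all assumptions chosen for this example.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def gc_windows(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start, GC fraction) for each full window along seq."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def elevated_windows(seq: str, window: int, step: int, threshold: float) -> list[int]:
    """Start positions of windows whose GC fraction reaches the threshold."""
    return [start for start, gc in gc_windows(seq, window, step) if gc >= threshold]


# Toy sequence (hypothetical): AT-rich background with a GC-rich insert.
toy = "ATATATTA" * 5 + "GCGCGGCC" * 2 + "ATTATAAT" * 5
print(elevated_windows(toy, window=8, step=8, threshold=0.75))  # [40, 48]
```

On real genomes the windows would be kilobases wide, and the threshold would be set relative to the genome-wide average rather than fixed by hand.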

Using the manually curated genomes, the authors then compiled criteria to search for additional Borg genomes, and detected a total of 19 different Borg genomes in their samples. There were more Borg sequences found in deeper soil, and there was no clear relationship between the abundance of Borgs and that of coexisting Methanoperedens species, with Borgs being detected up to 8x more frequently than the Archaea. Borg genomes also shared regions of similarity throughout their genomes, and indeed there was one pair which shared 100% nucleotide identity for an 11 kb region, suggesting a recent recombination event.
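For readers unfamiliar with the term, nucleotide identity over a region is simply the percentage of matching positions between two aligned sequences. The short Python sketch below shows the calculation for two sequences that are assumed to be already aligned and of equal length; the function name and the toy sequences are made up for illustration, and real comparisons would first need an alignment tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Toy examples (hypothetical 8-base regions).
print(percent_identity("ACGTACGT", "ACGTACGT"))  # 100.0
print(percent_identity("ACGTACGT", "ACGTACGA"))  # 87.5
```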

What do the Borgs encode?

The Borg genomes encoded a variety of proteins, the majority of which have unknown function. All Borg genomes encoded the ribosomal protein rpL11, and some contained other ribosomal proteins. Many Borg genomes encoded mobile element defense systems, including RNA-targeting type III-A CRISPR-Cas systems. The most represented functions for the encoded Borg proteins were DNA or RNA manipulation, energy metabolism and glycosyltransferases. Intriguingly, all Borgs contained FtsZ-tubulin homologs, possibly involved in membrane remodeling or division. They also contained proteins that resemble Major Vault Proteins and the TEP1-like TROVE domain protein, which together form a ribonucleoprotein known as the eukaryotic vault organelle, which may be involved in drug resistance.

The prevalence of genes with redox or respiratory functions varied across different Borg genomes. The Black Borg encoded genes involved in methane oxidation, which appeared to be derived recently from Methanoperedens based on GC content. Eight Borgs encoded genes for the biosynthesis of tetrahydromethanopterin, a coenzyme used in methanogenesis, as well as ferredoxin proteins which could be the electron carriers. The Lilac Borg, which was unique in its association with a certain species of Methanoperedens, encoded the methyl-CoM reductase complex, which is central to methane processing. This complex in the Lilac Borg genome shares 75-88% amino acid sequence identity with that of the associated Methanoperedens genome. This complex is also encoded by the Steel Borg. This is the first time methane metabolism genes have been detected on extrachromosomal elements, and it could represent a mechanism for dispersal of such genes. As with all Borgs, the Lilac Borg lacks the capacity for independent existence. The authors postulate that the Lilac Borg instead lives within its host Methanoperedens cells, and may provide important metabolic genes to the host cells.


The extrachromosomal Borg elements represent something very different from anything that has been identified before. They are larger than other known archaeal viruses, and do not show similarity to anything that has been previously reported. Although they appear to have some association with Methanoperedens species, the exact connection to the Archaea is unclear. The fact that they carry metabolic genes, and could even regulate methane levels, giving the surrounding species an advantage, is very intriguing. More broadly speaking, Borgs have the potential to influence methane mitigation in the future, with implications for climate change. I look forward to future research into the role of the Borg.

Why I chose this paper

The entire concept is very intriguing! A giant extrachromosomal element which has the potential to influence metabolism? I had to learn more about it 🙂

I love the idea of basic research, in this case collecting mud samples, leading to important findings for the future.


Questions to the Authors

Tell us a bit more about how you named them “Borg” and the inspiration for giving them colour names.

Can you explain a bit more about how you compile a new chromosome from the environmental DNA? For example, how do you make new contigs from 150 to 250 bp reads? How much overlap is there? Are there multiple copies of these sequences which allows a new contig to be assembled with confidence?

Is it possible to use long read sequencing like nanopore or PacBio on the environmental DNA?

You state: “We can neither prove that they are archaeal viruses or plasmids or mini-chromosomes, nor can we prove that they are not.” Why is this? What would be required to determine what they actually are?


Posted on: 12th August 2021, updated on: 13th August 2021



Author's response

Basem Al-Shayeb shared

Can you explain a bit more about how you compile a new chromosome from the environmental DNA? For example, how do you make new contigs from 150 to 250 bp reads? How much overlap is there? Are there multiple copies of these sequences which allows a new contig to be assembled with confidence?

It is a combination of overlaps in sequencing reads, as well as sequencing coverage, overall GC content, tetranucleotide frequency, and other factors that should be consistent throughout a single genome.
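As a rough illustration of one of the signals mentioned in this answer, the Python sketch below computes a tetranucleotide frequency profile for a sequence and compares profiles by Euclidean distance: fragments of the same genome are expected to have similar profiles. This is not the authors' pipeline. The function names and toy sequences are assumptions, and for simplicity the sketch does not merge reverse-complement 4-mers or normalise for sequence length, which dedicated binning tools typically handle.

```python
from collections import Counter
from itertools import product
import math

# All 256 possible 4-mers over A/C/G/T, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_profile(seq: str) -> list[float]:
    """Relative frequency of each 4-mer; windows with other characters are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


def euclidean(p: list[float], q: list[float]) -> float:
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


# Toy fragments (hypothetical): a and b share composition, c does not.
frag_a = tetra_profile("ACGT" * 50)
frag_b = tetra_profile("ACGT" * 30)
frag_c = tetra_profile("AATT" * 50)
print(euclidean(frag_a, frag_b) < euclidean(frag_a, frag_c))  # True
```

In practice such a profile would be one feature among several, alongside the read overlaps, sequencing coverage and GC content listed in the answer.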

Is it possible to use long read sequencing like nanopore or PacBio on the environmental DNA?


You state: “We can neither prove that they are archaeal viruses or plasmids or mini-chromosomes, nor can we prove that they are not.” Why is this? What would be required to determine what they actually are?

We would need to see the structural proteins that make up a virus, or the replication and other proteins that constitute a megaplasmid or minichromosome. Since we couldn’t identify anything that would suggest this, we can’t assign them to any of those classes, but at the same time the majority of genes are unknown, so we can’t say they aren’t an extremely novel virus or plasmid.
