Analysis of science journalism reveals gender and regional disparities in coverage

Natalie R. Davidson, Casey S. Greene

Preprint posted on 9 December 2021

Read all about it! How are scientists and different countries represented in the news?

Selected by Helen Robertson


Science journalism has a significant role to play not only in how we understand science and new research, but also in our perception of scientists. Since the onset of the COVID-19 pandemic, science has hit the headlines more consistently than at any other time in recent memory. With that in mind, thinking about how science reporting reflects the demographics of the scientific community is perhaps more important now than ever before.

During research for a story, journalists typically interview a number of sources to learn about their findings and opinions, and to identify additional sources. For science stories, these sources are commonly scientists working in the field of interest. Their work might be cited in the resulting story, or the interviewee might be directly quoted to draw attention to a point. The scientists who are interviewed therefore have a role in shaping the portrayal of science and scientists in the final article. However, the selection of sources by the journalist is open to biases, whether as a result of the repeated selection of ‘media-friendly’ scientists or of the approaches taken by journalists to identify new sources. This is compounded by the biases that we know exist across STEM demographics.

Two previous studies have shown that male scientists were preferentially represented in print media (1985-2015) and in print and broadcast media (1995-2015). However, little data exists on representation in the media with regard to name origins, or on how different countries are framed in the context of science stories. In this preprint, the authors investigate the gender and country-of-origin of scientists represented in online media in two different publications.

Study design

The authors chose two outlets that cater for different demographics for their investigation: articles published in The Guardian, and non-research news articles published in Nature, both covering the period between 2005 and 2020.

Nature news articles were scraped from the Nature website and assigned to one of several categories. News text, news citations, journalist names and metadata related to the paper were also collected. The final Nature News dataset included 22,001 articles.

The Guardian website was queried for 100 science articles published per month between 2005 and 2020. Text was scraped as for Nature News articles, with the exception of citation information. In total, 15,747 science-related articles from The Guardian were included for analysis.

Text from both sets of news articles was processed to identify names of individuals and countries. Extracted names were later classified as belonging to quoted, mentioned, or cited individuals.

As a comparison for gender and country-of-origin representation in scientific literature, a random set of 200 scientific research papers published in Springer Nature journals in each month of the study period was included for analysis. Author names, positions and affiliations were stored for each research publication.

The names of all quoted and/or cited individuals, and the authors of research papers, were assigned a gender using the genderizeR package and pronouns (when used in quotes/in-text references). The authors note that non-binary names could not be identified in their pipeline. Name origin prediction was carried out using Wiki-2019LSTM, which assigns names to one of ten possible name origins. The highest-probability origin for each name was chosen as the resultant assignment.
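The final step described above, keeping only the highest-probability origin for each name, can be sketched in a few lines. This is a minimal illustration and not the authors' pipeline: the function name `assign_origin`, the label spellings and the scores are all made up for the example, and the model that would produce the scores is not shown.

```python
def assign_origin(probabilities):
    """Return the label with the highest predicted probability.

    `probabilities` maps a name-origin label to a score for one name.
    Returns None when no scores are available.
    """
    if not probabilities:
        return None
    return max(probabilities, key=probabilities.get)


# Hypothetical scores for a single name (illustrative values only).
example_scores = {"CelticEnglish": 0.62, "European": 0.25, "EastAsian": 0.13}
predicted_origin = assign_origin(example_scores)
print(predicted_origin)
```

One consequence of this choice is that a name with two nearly equal scores is still forced into a single category, so the assignments are best read as estimates at the population level rather than statements about individuals.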

Country mentions (by way of organisations, countries, states or provinces) were quantified for each article that included a minimum of two unique country-associated terms. For the citation rate of a country, all author affiliations on research articles were processed.
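The two-term threshold can be illustrated with a short sketch. It assumes that each article has already been reduced to a list of extracted terms, and it reads the threshold as applying per country (two unique terms pointing to the same country); the summary does not spell this out, so that reading is an assumption. The lookup table `TERM_TO_COUNTRY` and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical lookup from a country-associated term to a country.
TERM_TO_COUNTRY = {
    "brazil": "Brazil",
    "sao paulo": "Brazil",
    "japan": "Japan",
    "university of tokyo": "Japan",
}


def count_country_mentions(articles, min_unique_terms=2):
    """Count, per country, the articles that mention it.

    An article counts towards a country only if it contains at least
    `min_unique_terms` distinct terms associated with that country.
    """
    counts = Counter()
    for terms in articles:
        unique_terms_by_country = {}
        for term in terms:
            key = term.lower()
            country = TERM_TO_COUNTRY.get(key)
            if country is not None:
                unique_terms_by_country.setdefault(country, set()).add(key)
        for country, unique_terms in unique_terms_by_country.items():
            if len(unique_terms) >= min_unique_terms:
                counts[country] += 1
    return counts


# Three toy articles, each given as a list of extracted terms.
toy_articles = [
    ["Brazil", "Sao Paulo", "Japan"],   # Brazil passes, Japan does not
    ["Japan", "Japan"],                 # repeated term, still one unique term
    ["University of Tokyo", "Japan"],   # Japan passes
]
mention_counts = count_country_mentions(toy_articles)
print(dict(mention_counts))
```

Requiring two distinct terms filters out articles where a country appears only in passing, at the cost of missing short articles that name a country once.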

Main findings

Men are represented more frequently as quoted speakers and primary research authors in Nature

The authors pulled a total of 177,134 quotes from science news articles, of which 157,955 contained a gender prediction for the speaker (quoted speakers were analysed as a proportion of total quotes, not as uniquely quoted individuals). Gender representation in news stories was compared to gender representation in research paper authorship.
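The distinction between counting quotes and counting unique speakers matters for the percentages that follow. A minimal sketch, using invented speakers and assuming each quote carries a predicted gender of "male", "female" or None (no prediction), shows how the two measures can differ:

```python
def male_quote_percentage(quotes):
    """Percentage of quotes attributed to male speakers.

    `quotes` is a list of (speaker, predicted_gender) pairs. Quotes
    without a gender prediction are excluded from the denominator.
    """
    genders = [gender for _, gender in quotes if gender is not None]
    if not genders:
        return None
    return 100.0 * sum(g == "male" for g in genders) / len(genders)


def male_speaker_percentage(quotes):
    """Same measure, but counting each unique speaker once."""
    speakers = {s: g for s, g in quotes if g is not None}
    if not speakers:
        return None
    return 100.0 * sum(g == "male" for g in speakers.values()) / len(speakers)


# Hypothetical quotes: one speaker is quoted twice, one has no prediction.
toy_quotes = [
    ("A. Smith", "male"),
    ("A. Smith", "male"),
    ("B. Jones", "female"),
    ("C. Lee", None),
]
per_quote = male_quote_percentage(toy_quotes)
per_speaker = male_speaker_percentage(toy_quotes)
print(per_quote, per_speaker)
```

In this toy data the per-quote figure is higher than the per-speaker figure because the repeatedly quoted speaker is counted each time, which is the convention the summary describes.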

Perhaps unsurprisingly, male percentage representation in news quotes and in first and last research author position was higher than female representation in either category (see figure). However, whilst gender proportions of authorship remained stable over the study period, the authors found that male-attributed quotes in news articles declined from 2005 to 2020. This was more evident in Nature news stories (87.09% to 68.86%) than in The Guardian news stories (84.01% to 75.94%). On closer inspection, the authors found that Nature’s ‘Career Feature’ article type was largely responsible for the observed trend in quotes in Nature News stories.

Male percentage of quotes and research authorship

Names predicted to be of East Asian origin are under-represented in news stories

To investigate representation with regard to name origin, the authors looked again at names of quoted speakers, cited authors, and authors of research papers.

Names predicted to be of Celtic/English or European origin were cited most frequently in news articles, representing an over-enrichment of Celtic/English names compared to predicted research author name origins, but on par with research author names predicted to be of European origin. The authors found East Asian predicted name origins to be the third highest proportion of cited names, representing a slight under-representation compared to authors of research papers.

The authors found this gap to widen even more when it came to attributed quotations. Celtic/English origin names were over-enriched compared to citation patterns, which was particularly true for news stories published in The Guardian. East Asian name origins were significantly under-represented in quotes in both Nature and The Guardian. Interestingly, the predicted name origin of the journalist appeared to have some association with gathering sources for quotes, but only outside of authors whose work was directly cited in the news story.

Representation of countries in science news

To conclude, the authors looked at how different countries were represented with regard to science in the media. They attributed each research paper included for analysis to one or more countries listed in the author affiliations. The US was the country most commonly affiliated with at least one author of the papers included, followed by the UK, Germany and France. But how did this compare to representation of these countries in news stories?

To answer this, the authors looked at content related to 1) when a given country was the subject of the article or 2) when the research coming from a given country was the subject of the article (i.e., cited research). From this, the authors grouped the dataset into countries with a high relative citation-to-mention rate (including Sweden, Singapore, Israel and Australia) and those with a low relative citation-to-mention rate (including the US, UK, India and Brazil). The authors found that words more strongly associated with mentioned countries were often environmental or space-related (including ‘rice’, ‘amazon’, ‘astronauts’, ‘forest’). Conversely, highly cited countries were more commonly linked to words related to science or research (such as ‘quantum’, ‘tumours’, ‘therapy’ and ‘dna’).
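As a rough illustration of what a citation-to-mention rate could look like, the sketch below divides the number of news articles citing research from a country by the number of articles mentioning that country. This is one plausible reading only: the exact definition used in the preprint is not given in this summary, and the function name, country labels and counts are all invented.

```python
def citation_to_mention_ratio(citations, mentions):
    """Ratio of citation counts to mention counts, per country.

    Countries with zero mentions are skipped to avoid dividing by zero.
    """
    ratios = {}
    for country, n_mentions in mentions.items():
        if n_mentions > 0:
            ratios[country] = citations.get(country, 0) / n_mentions
    return ratios


# Hypothetical counts of articles (not data from the preprint).
toy_citations = {"Country A": 30, "Country B": 10}
toy_mentions = {"Country A": 20, "Country B": 100, "Country C": 0}

ratios = citation_to_mention_ratio(toy_citations, toy_mentions)
# Sort from most "cited relative to mentioned" to least.
ranked = sorted(ratios, key=ratios.get, reverse=True)
print(ranked, ratios)
```

Under this reading, a country near the top of the ranking appears in the news mostly through its research, while one near the bottom appears mostly as a topic.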

Why is this study important?

This study confirms the previous finding that male scientists are frequently over-represented in science news compared to female scientists. This is disappointing, but the authors' findings also indicate that the gap between male and female representation is (slowly) narrowing, which can only be a good thing for promoting equity and inclusion in STEM, and for shaping who the public perceive to be scientists.

However, their findings also show that representation goes much further than gender. Amplified voices are largely those of Celtic/English descent, despite researchers with names of other origins featuring commonly as authors on published research papers. The authors find a similar discrepancy in how countries themselves are represented in news around science. Although public interest is clearly important in story selection, addressing these biases in science news should be a consideration for journalists and news outlets to make sure that regional science, and the people doing the science, are fairly represented to the public.


Posted on: 23 December 2021, updated on: 6 January 2022



Authors' response

Natalie and Casey shared

What led you to choose these two outlets for your study? Did you consider including a tabloid publication as a comparison to a broadsheet (The Guardian) and specialized publication (Nature)?

We first chose Nature because it was a news source that contained all the information we were interested in (quotes and citations) that was also straightforward to attain the text and analyze. We chose the Guardian because we wanted to replicate our findings in another news outlet that had a different intended audience. While the Guardian didn’t have citations to analyze, it was the only news source we found with an easy-to-use API that allowed us to quickly attain a text corpus ready for analysis.

You highlight a lack of regional diversity in scientific reporting. Do you perceive this to be a problem that should be addressed primarily by editors and publications themselves as opposed to journalists?

We believe that this would be best addressed by editors and publications. They could help by attaining leads from diverse regions or working with journalists located in the countries that are being covered. However, journalists can also help by remaining cognizant of regional biases in sourcing.
