Citation needed? Wikipedia and the COVID-19 pandemic
Preprint posted on 17 May 2021 https://www.biorxiv.org/content/10.1101/2021.03.01.433379v4
Article now published in GigaScience at http://dx.doi.org/10.1093/gigascience/giab095
Wikipedia was launched in 2001 and today has a monthly readership of approximately 495 million people. With over 155,000 medical-related articles viewed more than 4.88 billion times in 2013 alone, Wikipedia is one of the most viewed medical resources in the world. To maintain high-quality material, Wikipedia has strict editorial guidelines, and medical professionals make up 50% of those editing medical-related articles.
In 2020, the world became gripped by a global pandemic caused by the SARS-CoV-2 virus. In response to the pandemic, science experienced a cultural shift in how articles were shared and disseminated. There was also an “infodemic” of disinformation, with specific hijacking of the scientific literature by many different groups including conspiracy groups and right-wing politicians. However, to date nobody has investigated how this phenomenon has impacted Wikipedia articles on the pandemic.
To look into this, the authors of this preprint investigated the role of popular media and academic sources used as citations on Wikipedia articles related to the COVID-19 pandemic.
Wikipedia sources are highly selective
The authors examined 1695 COVID-19 related articles and found that citations in Wikipedia came from a variety of sources, including scientific articles and popular journalism. The majority of the scientific articles that were cited were published in Nature, Science, The Lancet and The New England Journal of Medicine (Figure 1 in the preprint). Surprisingly, only 0.42% of all academic papers on COVID-19 were cited on Wikipedia. In addition, the papers cited tended to have a higher Altmetric score, meaning that they were generally more widely shared. Encouragingly, almost one third of papers cited on Wikipedia were open access, although few were preprints. Traditionally highly regarded journals are cited in preference to preprints because of the underlying editorial requirements for health articles on Wikipedia.
Over 80% of references used in COVID-19 articles were not academic, and instead came from news media or websites. The highly selective nature of citations was also observed with non-academic sources, with more respected news organisations, including the BBC and Reuters, being cited more often. Moreover, the World Health Organization accounted for a significant share of cited content.
Technical articles had a higher “scientific score”
To investigate the role of scientific articles compared to popular media, the authors created a scientific score by calculating the ratio of academic to non-academic references for each Wikipedia article. Those Wikipedia articles that had high scientific scores (closest to 1) were mostly highly scientific topics such as “cytokine”, “Macrophage-1 antigen” and “Tetrandrine”. In contrast, those with the lowest scientific score (closest to 0) were mostly those articles focussed on social aspects such as “COVID-19 pandemic in North America”, “Boris Johnson” and “Impact of the COVID-19 pandemic on the arts and cultural heritage”.
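As a rough illustration of how such a score could be computed, here is a minimal sketch. It assumes the score is the share of academic references among all references in an article, which is one reading consistent with values running from 0 to 1; the function name and the reference counts are hypothetical and are not taken from the preprint.

```python
def scientific_score(academic_refs: int, non_academic_refs: int) -> float:
    """Share of an article's references that are academic (0 to 1).

    Assumption: the score is academic / (academic + non-academic).
    The preprint's exact definition may differ.
    """
    if academic_refs < 0 or non_academic_refs < 0:
        raise ValueError("reference counts must be non-negative")
    total = academic_refs + non_academic_refs
    if total == 0:
        raise ValueError("article has no references to score")
    return academic_refs / total


# Hypothetical counts, for illustration only.
technical_article = scientific_score(academic_refs=45, non_academic_refs=5)
social_article = scientific_score(academic_refs=3, non_academic_refs=97)
print(technical_article, social_article)
```

Under this reading, an article citing only journal papers scores 1 and an article citing only news media scores 0.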
The authors next examined how the coverage of COVID-19 developed over time. They looked at 231 articles and mapped them to their respective dates of creation, from 2001, when Wikipedia was created, to May 2020. From the beginning of the pandemic, the total number of Wikipedia articles referencing COVID-19 doubled. Articles created during 2020 had a lower scientific score (0.14) than those created before 2020 (0.48), i.e. articles on general coronaviruses or behaviour that were applicable to COVID-19. The authors reasoned that staying up to date with COVID-19 developments came at a cost to the articles' scientific score.
Networks of COVID-19 Knowledge
Next, the authors investigated how Wikipedia articles were connected through their shared academic sources. They found that six prominent topics emerged that shared multiple citations with other Wikipedia pages. These six topics were termed nodes and included ‘Coronavirus’, ‘Coronavirus disease 2019’, ‘COVID-19 drug development’ and ‘COVID-19 pandemic’. Two of these nodes were locked for editing by the public to try to prevent the spread of misinformation.
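The idea of linking articles through shared academic sources can be sketched as follows. This is only an illustration under stated assumptions: each article is represented by the set of identifiers of the papers it cites, and two articles are linked when those sets overlap. The function name, the `min_shared` parameter and the placeholder identifiers are hypothetical; the preprint's actual network construction is not described here and may differ.

```python
from itertools import combinations


def shared_citation_edges(
    article_refs: dict[str, set[str]], min_shared: int = 1
) -> dict[tuple[str, str], int]:
    """Link pairs of articles that cite at least `min_shared` common sources.

    Returns a mapping from (article_a, article_b) to the number of
    shared sources, with article names in alphabetical order.
    """
    edges = {}
    for a, b in combinations(sorted(article_refs), 2):
        shared = len(article_refs[a] & article_refs[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges


# Placeholder source identifiers, for illustration only.
refs = {
    "Coronavirus": {"paper-1", "paper-2", "paper-3"},
    "COVID-19 pandemic": {"paper-2", "paper-3", "paper-4"},
    "COVID-19 drug development": {"paper-3", "paper-5"},
    "Boris Johnson": set(),
}
edges = shared_citation_edges(refs)
for pair, weight in edges.items():
    print(pair, weight)
```

Articles that appear in many pairs would be the prominent, highly connected topics, while an article with no academic references stays unconnected.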
Why we chose this preprint
This preprint covers an interesting topic and may be able to help us generate a tool to bring complicated scientific topics to a general audience. It may also be a starting point for how social media outlets, such as Facebook and Twitter, can help to prevent misinformation. The COVID-19 pandemic has highlighted that public engagement and outreach are essential to help prevent the spread of misinformation, and Wikipedia could be a tool to do this.
Posted on: 5 July 2021