Wikidata:WikiProject Zika Corpus
Welcome to WikiProject Zika Corpus. This is a WikiProject dedicated to the creation of a rich corpus on all scholarly knowledge related to the Zika virus.
About
In February 2016, the World Health Organization declared a public health emergency over the Zika virus outbreak and its links (then suspected, by now confirmed) to microcephaly and Guillain-Barré syndrome. By that time, around 150 scholarly articles had been published about the virus since its discovery in 1947, and the majority of these articles had already been assigned Wikidata items.
Since then, the literature on the topic has grown by an order of magnitude (see Timelines), and the Wikidata coverage has mostly kept pace, with a typical time lag of less than a week. While not complete, this corpus covers most PubMed-indexed English-language articles reporting or reviewing original research about the Zika virus and the infections it can cause in mosquitoes, humans and animal models, as well as about approaches to prevention, diagnostics, therapy, or surveillance.
The Zika corpus served as a nucleus for creating a citation graph on Wikidata and for exploring co-author networks and similar information on Wikidata. It is now slowly expanding to encompass literature about related subjects, e.g., Flaviviridae and mosquito-borne diseases more broadly, epidemiological modeling, or data sharing in public health emergencies.
Definition
We define the Zika Corpus as consisting of:
- all items linking to Zika virus (Q202864) or to entities that are part of it
- topics that co-occur with Zika virus (Q202864) as main subject (P921) of creative works
- authors of these works, and institutions or organizations they are affiliated with or employed by
- venues and publishers through which these works have been published
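The first two parts of this definition can be approximated with SPARQL queries against the Wikidata Query Service. The following is a minimal sketch rather than the project's actual tooling: it only builds query strings and request URLs, and the function names (`works_about`, `co_occurring_topics`, `query_url`) and the default limits are illustrative assumptions. Only the identifiers Q202864 and P921 come from the definition above.

```python
from urllib.parse import urlencode

ZIKA_VIRUS = "Q202864"   # Zika virus, as cited in the definition above
MAIN_SUBJECT = "P921"    # main subject, as cited in the definition above

# Public Wikidata Query Service endpoint; this sketch never contacts it.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def works_about(topic_qid: str = ZIKA_VIRUS, limit: int = 100) -> str:
    """Build a SPARQL query listing works whose main subject is `topic_qid`."""
    return (
        "SELECT ?work ?workLabel WHERE {\n"
        f"  ?work wdt:{MAIN_SUBJECT} wd:{topic_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def co_occurring_topics(topic_qid: str = ZIKA_VIRUS, limit: int = 50) -> str:
    """Build a SPARQL query counting other main subjects of the same works."""
    return (
        "SELECT ?topic ?topicLabel (COUNT(DISTINCT ?work) AS ?works) WHERE {\n"
        f"  ?work wdt:{MAIN_SUBJECT} wd:{topic_qid} ;\n"
        f"        wdt:{MAIN_SUBJECT} ?topic .\n"
        f"  FILTER(?topic != wd:{topic_qid})\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        "GROUP BY ?topic ?topicLabel\n"
        "ORDER BY DESC(?works)\n"
        f"LIMIT {limit}"
    )


def query_url(sparql: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": sparql, "format": "json"})


if __name__ == "__main__":
    print(works_about(limit=10))
    print(query_url(co_occurring_topics(limit=10)))
```

Authors, affiliations, venues and publishers could be reached by extending the same pattern with further properties; they are left out here to keep the sketch short.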
Objectives
- Curate the corpus by
- enriching the items in the corpus
- using the works included in the corpus to reference Wikidata statements
- Create a demonstrator for WikiCite: a consistent/interesting/visualizable dataset
- Prototype data visualization/storytelling ideas for exploring the corpus
Overview of publications related to the Zika virus
Scholia
- http://scholia.toolforge.org/topic/Q202864
- http://scholia.toolforge.org/topic/Q202864/missing
- http://scholia.toolforge.org/taxon/Q202864
Timelines
The timeline of Zika publications indexed in Wikidata can be visualized in various ways, e.g.
- by publication date (P577) of the work
- by date of creation of the Wikidata item for the work
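As a rough illustration of the first variant, the sketch below pairs a SPARQL query that groups works by year of publication date (P577) with a small helper that does the same bucketing locally on date strings. The helper name `publications_per_year` and the sample dates are made-up placeholders, not data from the corpus. The second variant (item creation date) is not covered here.

```python
from collections import Counter

# Query sketch: number of works about Zika virus (Q202864) per publication year.
TIMELINE_QUERY = """\
SELECT ?year (COUNT(DISTINCT ?work) AS ?works) WHERE {
  ?work wdt:P921 wd:Q202864 ;
        wdt:P577 ?date .
  BIND(YEAR(?date) AS ?year)
}
GROUP BY ?year
ORDER BY ?year
"""


def publications_per_year(dates):
    """Count ISO-style date strings (e.g. '2016-02-01T00:00:00Z') per year."""
    years = Counter()
    for value in dates:
        # Strip an optional leading '+', then read the four-digit year.
        years[int(value.lstrip("+")[:4])] += 1
    return dict(sorted(years.items()))


if __name__ == "__main__":
    # Placeholder dates for illustration only.
    sample = [
        "2016-02-01T00:00:00Z",
        "2016-05-10T00:00:00Z",
        "2015-11-20T00:00:00Z",
    ]
    print(publications_per_year(sample))
```

Grouping by year keeps the result small enough to plot directly; a finer grain (month or week) would only need a different key in the helper.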
Recent changes
A good overview of ongoing activity in curating the corpus is provided by the 100 most recent changes related to the list of items about the Zika virus.
Target Audiences
- Sociologists of science (including STS, information science, bibliometrics and other social sciences; or should we describe multiple separate groups here?)
- democratizing access to datasets that have traditionally been controlled by a small group of academic players
- which topics were the current authors of Zika research previously studying?
- The general public
- public understanding of research on Zika and how this research evolved, e.g. timelines of when the news media first covered the virus and when it became public knowledge, compared with when the papers were published, with social media coverage, and with the geographic spread of the virus and cases over time
- Journalists
- how much is Zika research costing? Where is the funding coming from? Is the funding coming from tax dollars, and is the research coming from government organizations? This matters because the ability of our representatives and institutions to respond to global health crises depends on budget.
- what treatments are currently available? Are there advances that may provide treatment in the near future?
- how public opinion understands or potentially distorts trustworthy information on the topic
- personal stories
- Zika virus first described by
- get inspiration from the Zika hashtag
- get inspiration from the Zika dashboard
- Primary Researchers
- 1034 distinct ZIKV nucleotide sequences can be found by searching here: https://www.ncbi.nlm.nih.gov/genome/viruses/variation/
- A citation can be found here: https://www.ncbi.nlm.nih.gov/pubmed/27899678
To Do
- Define a property to help set the boundaries of the bibliographic corpus
- We could just define an item "Zika corpus (v1)" and set relevant items as "part of" that corpus
- Better: tag all items in the corpus with on focus list of Wikimedia project (P5008): WikiProject Zika Corpus (Q54439832)
- Tagging is ongoing
- Extract and store author affiliations
- Extend coverage of statements supported by specific sources
- Add funder organizations from the Crossref Funder Registry to Wikidata. It is CC0 and "a unique taxonomy of grant-giving organizations". Downloadable as RDF or CSV.
- Add funder information for papers
- PMC API
- NLP may help:
- Councill, Isaac G., C. Lee Giles, Hui Han, and Eren Manavoglu. "Automatic acknowledgement indexing: expanding the semantics of contribution in the CiteSeer digital library." In Proceedings of the 3rd international conference on Knowledge capture, pp. 19-26. ACM, 2005. Q30046394
- Giles, C. Lee, and Isaac G. Councill. "Who gets acknowledged: Measuring scientific contributions through automatic acknowledgment indexing." Proceedings of the National Academy of Sciences of the United States of America 101, no. 51 (2004): 17599-17604. Q30046493
- Khabsa, Madian, Pucktada Treeratpituk, and C. Lee Giles. "Ackseer: a repository and search engine for automatically extracted acknowledgments from digital libraries." In Proceedings of the 12th ACM/IEEE-CS joint conference on Digital Libraries, pp. 185-194. ACM, 2012. Q30050797
- Khabsa M., Koppman S., Giles C.L. (2012) Towards Building and Analyzing a Social Network of Acknowledgments in Scientific and Academic Documents. In: Yang S.J., Greenberg A.M., Endsley M. (eds) Social Computing, Behavioral-Cultural Modeling and Prediction. SBP 2012. Lecture Notes in Computer Science, vol 7227. Springer, Berlin, Heidelberg
- Add paper topics
- Extract and add MeSH (Q199897) terms
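For the tagging item in the list above, one possible way to prepare batches is to generate tab-separated statements of the form item, property, value, which is the layout assumed here for a QuickStatements-style import. This is a sketch under that assumption, not a description of how the tagging is actually being done; the function name `tagging_statements` is hypothetical, and Q4115189 (the Wikidata sandbox item) stands in for real corpus items.

```python
import re

FOCUS_LIST_PROPERTY = "P5008"      # on focus list of Wikimedia project
ZIKA_CORPUS_PROJECT = "Q54439832"  # WikiProject Zika Corpus

# Item identifiers: 'Q' followed by a positive integer without leading zeros.
_QID = re.compile(r"^Q[1-9][0-9]*$")


def tagging_statements(item_qids):
    """Return one 'item<TAB>P5008<TAB>Q54439832' line per distinct item."""
    lines = []
    seen = set()
    for raw in item_qids:
        qid = raw.strip()
        if not _QID.match(qid):
            raise ValueError(f"not a Wikidata item ID: {raw!r}")
        if qid in seen:
            continue  # skip duplicates so each item is tagged once
        seen.add(qid)
        lines.append("\t".join((qid, FOCUS_LIST_PROPERTY, ZIKA_CORPUS_PROJECT)))
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder input: the sandbox item, repeated to show de-duplication.
    print(tagging_statements(["Q4115189", " Q4115189 "]))
```

Validating the identifiers before emitting anything keeps a malformed input list from producing a partially wrong batch.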
Subpages
- WikiProject Zika Corpus/Items
- WikiProject Zika Corpus/Listeria
- WikiProject Zika Corpus/Listeria/All
- WikiProject Zika Corpus/Listeria/All without papers
- WikiProject Zika Corpus/Listeria/All without papers and people
- WikiProject Zika Corpus/Listeria/Author name strings on works citing works about Zika virus
- WikiProject Zika Corpus/Listeria/Common words in titles of works on the topic
- WikiProject Zika Corpus/Listeria/Long author name strings
- WikiProject Zika Corpus/Listeria/Missing authors
- WikiProject Zika Corpus/Listeria/Taxa
- WikiProject Zika Corpus/Listeria/Well-cited authors with no employer or affiliation statement
- WikiProject Zika Corpus/Listeria/Works not tagged for a particular topic but for which citing and cited works are tagged with that topic
- WikiProject Zika Corpus/Participants
- WikiProject Zika Corpus/Properties
- WikiProject Zika Corpus/Queries
- WikiProject Zika Corpus/Tabs
See also
- Wikidata:WikiProject Medicine/Zika
- Wikidata:WikiProject Source MetaData/Wikidata lists/Items about Zika virus or fever
- WikiCite 2016: Report: the Zika corpus
- WikiCite 2017:
- ShEx manifest for the Zika corpus
- Wikidata:WikiProject COVID-19
Participants
The participants listed below can be notified using the following template in discussions: {{Ping project|Zika Corpus}}