Wikipedia:Wikipedia Signpost/2023-03-09/Recent research
"Wikipedia's Intentional Distortion of the Holocaust" in Poland and "self-focus bias" in coverage of global events
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
"Wikipedia's Intentional Distortion of the Holocaust"
- Reviewed by Nathan TeBlunthuis
English-language Wikipedia, so influential in shaping collective memory in today's world, has been presenting systematically misleading information about Nazi Germany’s genocide of the European Jews, by "whitewash[ing] the role of Polish society in the Holocaust and bolster[ing] stereotypes about Jews." Showing this is the important contribution of "Wikipedia's Intentional Distortion of the History of the Holocaust,"[1] a scholarly essay by Jan Grabowski and Shira Klein published in The Journal of Holocaust Research. In the past few weeks, this publication has already sparked a response including media coverage and a new arbitration case. This review's purpose is to summarize the essay and its contributions and to reflect on its merits and significance, and it will not engage the widespread debates in this area more than necessary (see also coverage in this and the previous issue of The Signpost).
Grabowski and Klein's central claim is twofold. First, Wikipedia articles often support a narrative of Holocaust distortion (not denial) with four elements: (1) overstating the suffering of Poles in comparison to Jews during World War II, (2) understating Polish antisemitism and Nazi collaboration while overemphasizing the rescue of Jews by Poles, (3) insinuating that Jews "bear responsibility for their own persecution" because of their communism and/or greed, and (4) exaggerating the role of Jewish-Nazi collaboration. The result misrepresents the Polish nation's role in the Holocaust and contradicts mainstream historiography, as Grabowski and Klein show by citing prior scholarship.
Grabowski and Klein provide very strong support for this first claim, that Wikipedia bolsters each form of distortion. They offer myriad examples where articles ranging from Stawiski, Warsaw Concentration Camp, Naliboki massacre, History of the Jews in Poland, Collaboration with the Axis Powers, to Rescue of Jews by Poles during the Holocaust, and Polish Righteous among the Nations have supported the distortion narrative by including claims backed by dubious sources or overemphasizing facts aligned with the distortion narrative while ignoring or underemphasizing facts that do not support it. Many of the errors Grabowski and Klein identify, and their role in the narrative, are not obvious to non-experts, and so an important contribution of this scholarship is to make the pattern of distortion clear.
Wikipedia's distorted coverage is harmful, Grabowski and Klein persuasively argue, because "Wikipedia plays a critical role in informing the public about the Holocaust in Poland." It is important that Wikipedia not reproduce the distortion narrative, because misremembering the Holocaust can increase the risk of future antisemitic violence and genocide. Many Poles believe elements of the distortion narrative which Poland's current government has taken legal and administrative steps (e.g., creating monuments for apocryphal Poles who rescued Jews) to popularize. To be clear, critiques of distortion do not blame Poles for the Holocaust. No one disputes that Nazi Germany is at fault. Still, Grabowski and Klein cite evidence that Polish antisemitism was common during, before, and after WWII, and that Poles (without direct Nazi coercion) committed atrocities against Jews during the war as well as afterward when Jews returned to Poland and attempted to reclaim their stolen property. Although the authors are not entirely clear about why distortion is popular, this juxtaposition suggests that it relieves a sense of national guilt.
The second part of Grabowski and Klein’s thesis is that a small group of committed Wikipedians "with a Polish nationalist bent" have persistently and successfully defended both the distortion narrative's claims and sources advancing it. The essay argues that these editors are substantially responsible for the observed distortion pattern, citing article diffs, excerpts from on-wiki discussions, and edit counts. It also relies on interviews with some of the editors that it describes as "distortionists", their opponents, and involved Wikipedia administrators.
Grabowski and Klein persuasively argue that these editors heavily worked on Wikipedia articles that (typically in versions from early 2022) included the four types of distortion, and in doing so often cited non-credible sources that contradict historical scholarship. These editors surface again and again throughout the topic area and its controversies, defending the source-validity of dubious authors while attacking "well-known experts on Holocaust history" who contradict them. In a striking quantitative description of the distortionist editors' outsized influence, Grabowski and Klein argue that Wikipedia cites two authors they view as distortionist (Richard C. Lukas and M. J. Chodakiewicz) much more than the mainstream experts (Doris Bergen, Samuel Kassow, Zvi Gitelman, Debórah Dwork, Nechama Tec) even though the former have far fewer academic citations than the latter according to Google Scholar.
Two of the editors criticized as distortionists, Piotrus and Volunteer Marek, have defended themselves by pointing to the essay's omissions and alleged errors, only some of which are actual errors. One notable inaccuracy is that the method for counting citations using Google Scholar is imprecise and today surfaces many more citations to Richard C. Lukas than Grabowski and Klein reported. Yet even this inaccuracy does not change the broader conclusion that Wikipedia relies too heavily on Lukas' work (also, Klein has uploaded a table with updated numbers (.csv) which continue to support the original conclusion). The title of his most-cited work, The Forgotten Holocaust, refers to the suffering of Poles under Nazi occupation. The Nazis indeed had a murderous colonial policy to "Germanize" Poland (see [supp 1]), but this is distinct from the Holocaust, which refers to the genocide of European Jews. Lukas' title thus insinuates a false equivalence between Polish and Jewish suffering. Arguably, Wikipedia should not reference this at all, at least not without blinding clarity about how it contradicts mainstream sources.
From these editors' defensive responses, it is clear Grabowski and Klein have interpreted their actions unsympathetically to the extent that they overlooked their many valuable contributions to Wikipedia, some of which involved removing distortion. This omission is mostly understandable. A thorough account of these editors' Wikipedia careers (spanning more than 18 and 17 years, respectively) would have distracted from identifying and accounting for the Holocaust distortion on Wikipedia. In this reviewer's view, even if we take these defenses on board, Grabowski and Klein's possible errors are small relative to their abundant evidence that this group, comprising around a dozen or so editors, helped secure a foothold for the Holocaust distortion in Wikipedia articles.
That said, we should recognize how this case surfaces some of Wikipedia's more fundamental problems. At its core, this was a conflict about which Holocaust narratives belong on Wikipedia exemplified by questions such as: "Should Wikipedia include elements of Polish heroism?" and "How should facts about Poles rescuing Jews from the Holocaust be sourced, emphasized or positioned relative to facts about Polish atrocities or complicity in the Holocaust?" These questions are broad, complex, and require subject-matter knowledge and historiographic consideration to answer.
In their essay's final and most thought-provoking section, Grabowski and Klein describe how Wikipedia administrators and arbitration committee (ArbCom) members responded to the conflict. They are sharply critical of ArbCom members who "don't do the homework it takes to recognize distortion" and "wish to avoid fights in this area." It is standard practice on Wikipedia for administrators to avoid questions like those above by bracketing them as content disputes (which community members are normally supposed to resolve on their own) rather than misconduct (which administrators are normally empowered to address). This practice means that transforming a broad conflict about a content area into a series of narrow misconduct cases can be an effective strategy for winning (or at least dragging out) the conflict about content. Many times, administrators dismissed reports about the distortionists for being about content not conduct. On three occasions reports resulted in arbitration cases and even sanctions such as topic bans on distortionists and a discretionary "reliable-source consensus" requirement (WP:APLRS) intended to empower administrators to intervene against controversial sources. Efforts to enforce such sanctions, however, were themselves dismissed as content disputes and the topic bans were ultimately reversed (once ahead of schedule).
Emerging from this administrivia is a picture of Wikipedia's highest institutions straining under the complexity of this case. Strikingly, steps taken to simplify administrators' tasks shift the burden of proof onto the parties of a conflict. Short word-limits in case statements were too constraining for defenders of historical accuracy to be able to explain to non-experts the problems with distortion in the articles (indeed, it takes Grabowski and Klein most of 50 pages), but provided enough space for distortionists to deflect the accusations. Thus advantaged, the authors argue, distortionists skilled in wikilawyering effectively steered the content-dispute-averse administrators away from the fundamental conflict over historical narratives and toward the particular conduct of individual editors, which is easier for the ArbCom to address.
As noted above, Grabowski and Klein may have made errors, yet these barely undermine their central argument. An audience of Wikipedia scholars is more likely to feel underwhelmed by the essay's sparse engagement with the existing Wikipedia research literature beyond the amount needed to demonstrate Wikipedia's influence and importance to collective memory. Better positioning this case study within Wikipedia scholarship could have shed new light on Wikipedia's fundamental limitations. Past scholarship has discussed systematic flaws in Wikipedia's dispute resolution processes[supp 2] (cf. our review: "Critique of Wikipedia's dispute resolution procedures") and the damage when disagreements about article content turn into conflicts about bureaucratic process and individual conduct [supp 3]. In the Gamergate Controversy, for example, the ArbCom's decision to punish editors who were defending against a coordinated anti-feminist brigade similarly reveals how Wikipedia administrators' myopic focus on civil conduct and procedural fairness can distract from a fundamental conflict about content—and even become an effective tool for disingenuous actors[supp 4]. Yet other research finds that Wikipedia can be remarkably resilient to partisan misinformation because conflicting partisans hold each other to the same policies[supp 5] (cf. our review: "Politically diverse editors and article quality"). We might ask: What (if anything) was special about this Holocaust case such that it reveals Wikipedia’s limitations so starkly? Or: How (if at all) should Wikipedia's institutions for dealing with content disputes evolve? This case presents an important opportunity to consider such questions. Grabowski and Klein, content to draw attention to this case and document it in great detail, have left this to future work.
"Let's Work Together! Wikipedia Language Communities' Attempts to Represent Events Worldwide"
- Reviewed by Piotr Konieczny
The paper[2] addresses the issue of systemic bias, and focuses on the English, Chinese, Arabic and Spanish Wikipedias. The authors study the production of seven years of news on these projects (from the "In the news" (ITN) section on the Main Page and its equivalents), and conclude that while there is an indication of self-focus bias, there is also strong evidence of a global representation of events. Self-focus, here, refers to focusing on one's home region or culture, and past studies found that about a quarter of most Wikipedias are about "self-focused topics".
The authors ended up with a dataset of a total of 6730 articles... 2064 in English, 1379 in Arabic, 1527 in Chinese and 1760 in Spanish, which correspond to 2064 events, 172 in Arabic-speaking countries, 115 in Chinese-speaking areas, 114 in Spanish-speaking regions, 445 in the US, 472 in other English-speaking countries and 746 in [other] areas. The events were also coded by topic covered, which resulted in 192 events classified as Science & Nature, 714 in Notable Person, 337 in Sports, 299 in Politics, 231 in Man-made Incidents, and 291 as Other categories. To compare Wikipedia's coverage to global media coverage, the authors also associated their dataset with that of the GDELT Project.
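As a quick consistency check on the figures quoted above, the following minimal Python sketch sums each breakdown and compares it with the stated totals. The dictionary and function names are illustrative assumptions, not taken from the paper or its data repository; only the counts themselves come from the text.

```python
# Counts as quoted in this review; all variable names are hypothetical.
articles_by_language = {"English": 2064, "Arabic": 1379, "Chinese": 1527, "Spanish": 1760}
events_by_region = {
    "Arabic-speaking countries": 172,
    "Chinese-speaking areas": 115,
    "Spanish-speaking regions": 114,
    "US": 445,
    "Other English-speaking countries": 472,
    "Other areas": 746,
}
events_by_topic = {
    "Science & Nature": 192,
    "Notable Person": 714,
    "Sports": 337,
    "Politics": 299,
    "Man-made Incidents": 231,
    "Other": 291,
}

total_articles = sum(articles_by_language.values())
total_events_by_region = sum(events_by_region.values())
total_events_by_topic = sum(events_by_topic.values())


def percentage_shares(counts):
    """Return each category's share of the total, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * value / total, 1) for name, value in counts.items()}


if __name__ == "__main__":
    print("articles:", total_articles)
    print("events (by region):", total_events_by_region)
    print("events (by topic):", total_events_by_topic)
    print(percentage_shares(events_by_region))
```

Both event breakdowns sum to the stated 2064 events, and the per-language article counts sum to the stated 6730 articles, so the quoted figures are internally consistent.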
Some specific findings suggest that English Wikipedia suffers from a slight under-representation of events in Arabic-speaking countries. The Arabic Wikipedia project, on the other hand, does not show much self-bias; instead it over-represents events that happen in English-speaking countries (but not the United States). The Chinese and Spanish Wikipedias, the authors argue, have a stronger self-focus bias than the Arabic and English projects, although still, over 90% of events covered by the news sections of these projects are about items not related to these countries. The authors also find, perhaps unsurprisingly, that larger Wikipedias will react to breaking news faster and update their news section more promptly.
Briefly
- See the page of the monthly Wikimedia Research Showcase for videos and slides of past presentations.
Other recent publications
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
- Compiled by Tilman Bayer
"Digital divides in the social construction of history: Editor representation in Wikipedia articles on African independence processes"
From the abstract:[3]
"The present study examines how [Wikipedia's] editor geography is reflected in the editing of articles (participation, impact and success) about the independence of former French colonies in Africa. The analysis is based on 354 Wikipedia articles; by geolocating 75% of the editors (N = 23,408), we show that the majority of edits are made by users located in France. This imbalance is also reflected in the overall share of text they contribute over time. However, when looking at the individual user level, we find that editors from France are only slightly more successful in maintaining their contributions visible to the reader, than editors from African successor states."
"A Wikipedia Narration of the GameStop Short Squeeze"
From the abstract:[4]
"This paper examines the usefulness of Wikipedia pageviews as indicator of the performance of stock prices. We examine the GameStop (GME) case, which drew the investors’ and scholars’ attention in 2021 due to the short squeeze, and its skyrocketing price increase since 2021. [...] The results show strong statistical evidence that increased number of Wikipedia pageviews for COVID-19, which represents the fear of the pandemic, has a negative impact on the GME performance. Moreover, the findings show that the increased interest in information regarding the short squeeze, as expressed by the increased number of pageviews of the relative Wikipedia page, is positively linked with the GME price. The econometric analysis shows that the interest indicator of GME has a positive coefficient, but it is not confirmed at significant statistical level."
References
- ^ Grabowski, Jan; Klein, Shira (2023-02-09). "Wikipedia's Intentional Distortion of the History of the Holocaust". The Journal of Holocaust Research. 0 (0): 1–58. doi:10.1080/25785648.2023.2168939. ISSN 2578-5648.
- ^ Li, Ang; Farzan, Rosta; López, Claudia (2022-12-03). "Let's Work Together! Wikipedia Language Communities' Attempts to Represent Events Worldwide". Interacting with Computers: iwac033. doi:10.1093/iwc/iwac033. ISSN 1873-7951. Data: https://github.com/LittleRabbitHole/WikipediaLanguageCommunity
- ^ Schlögl, Stephan; Bürger, Moritz; Schmid-Petri, Hannah (2022). "Digital divides in the social construction of history: Editor representation in Wikipedia articles on African independence processes". In Andreas M. Scheu; Thomas Birkner; Christian Schwarzenegger; Birte Fähnrich (eds.). Wissenschaftskommunikation und Kommunikationsgeschichte: Umbrüche, Transformationen, Kontinuitäten. Münster: Deutsche Gesellschaft für Publizistik- und Kommunikationswissenschaft e.V. pp. 1–12.
- ^ Vasileiou, Evangelos (2022-05-25), A Wikipedia Narration of the GameStop Short Squeeze, Rochester, NY, doi:10.2139/ssrn.4119961
- Supplementary references and notes:
- ^ "Polish Victims". encyclopedia.ushmm.org.
- ^ Ross, Sara (March 1, 2014). "Your Day in 'Wiki-Court': ADR, Fairness, and Justice in Wikipedia's Global Community". doi:10.2139/ssrn.2495196 – via papers.ssrn.com.
- ^ Arazy, Ofer; Yeo, Lisa; Nov, Oded (August 10, 2013). "Stay on the Wikipedia task: When task-related disagreements slip into personal and procedural conflicts". Journal of the American Society for Information Science and Technology. 64 (8): 1634–1648. doi:10.1002/asi.22869 – via DOI.org (Crossref).
- ^ Famiglietti, Andrew (October 31, 2015). "ADIEU WIKIPEDIA: UNDERSTANDING THE ETHICS OF WIKIPEDIA AFTER GAMERGATE". AoIR Selected Papers of Internet Research – via journals.uic.edu.
- ^ Shi, Feng; Teplitskiy, Misha; Duede, Eamon; Evans, James A. (April 10, 2019). "The wisdom of polarized crowds". Nature Human Behaviour. 3 (4): 329–336. doi:10.1038/s41562-019-0541-6 – via www.nature.com.
Discuss this story
<edits violating 500/30 policy (and others) removed>
<edits violating 500/30 policy (and others) removed>
Richard C. Lukas
It is worth noting that Richard C. Lukas' book The Forgotten Holocaust, along with another of his works, is part of the "Background Information" reading list provided on the United States Holocaust Memorial Museum (USHMM) website.
It is described on that site as follows: An account of the systematic persecution of the Polish nation and its residents by the German forces. Features endnotes, a bibliography, appendices including lists of Poles killed for assisting Jews, primary source documents, and an index.
I respectfully disagree with the review author's opinion that a work recommended on the USHMM website should not be suitable for citation in Wikipedia. --Andreas JN466 11:52, 9 March 2023 (UTC)
For the record
Since Groceryheist's review of Grabowski and Klein's "Wikipedia's Intentional Distortion of the Holocaust" has been featured *despite* objections from multiple uninvolved editors (other than me), and *despite* the fact that these editors pointed out both stylistic and factual errors in the review, I do feel the need to say that
Volunteer Marek 15:44, 9 March 2023 (UTC)
Google scholar discussion
I legit pity you for being unable to admit the "Jewish welcome banner" caption was shocking, upsetting, and hurtful Holocaust distortion. You must have so much hate and pride in your heart that you seem unable to spare even a drop of empathy. Levivich (talk) 16:10, 13 March 2023 (UTC)
This is as I said a relatively minor point in the overall scheme of things, but I do feel compelled to point out that Zvi Gitelman's main area is the History of Jews in Russia and Soviet Union so it's not surprising that he's not cited much in Wikipedia's articles on Holocaust in Poland. If he's undercited in the topic area Holocaust in Soviet Union then that should be raised with whoever is working on that. I have no idea why Grabowski and Klein decided to throw him in there, maybe to "inflate" the numbers or, since at least one of them is writing outside their area of expertise, due to ignorance. BTW, Gitelman's work on the Jewish Labor Bund is really good and I recommend this book he edited The Emergence Of Modern Jewish Politics: Bundism And Zionism In Eastern Europe (particularly his article) for anyone who wants to fix the under-cited-on-Wikipedia situation. Volunteer Marek 17:22, 10 March 2023 (UTC)
@HaeB: So can I write and publish a rebuttal or not? And don't tell me "submissions is that way". I'm not going to waste my time writing something just to have the rug pulled out from me by you (which I think under circumstances is a legitimate concern on my part). People can of course discuss and debate whatever I write but given that you just published what is basically a hit piece against strong consensus, I'd expect some leeway here. Volunteer Marek 20:06, 11 March 2023 (UTC)
So far as I know, there is no way to check Scholar's citation counts at a past moment. I also think this is irrelevant to the question, since the claim is that Scholar's citation counts show a problem in Wikipedia now, and to the (highly dubious) extent this is even a valid method of analysis there is no reason to not use the present counts. If the present counts don't support the hypothesis, then the hypothesis should be discarded. But I think this method of analysis is fundamentally invalid anyway.
But...it is totally impossible that Lukas' actual (as opposed to Scholar) citations jumped by more than a factor of 6 in 7 months, especially given that his most cited works are quite old. There must be another explanation. A clue can be gained from Gitelman's jump from 2367 to 3690. Looking at Gitelman's Scholar profile, we see 3693 citations but in the sidebar we see that only 115 of them were for all of 2022 and 2023. So Scholar is now saying that Gitelman's count at Aug 2022 was at least 3693-115=3578, much higher than 2367. (These numbers can change by the day.) There are multiple possible explanations: maybe the two searches were not made in exactly the same way, maybe the semantics of the search engine changed, maybe Scholar got better at identifying citations in sources, maybe Scholar got better at telling when two authors are the same person, maybe Scholar added a large number of additional sources in which to look for citations, maybe Scholar's algorithm is broken somehow. Zerotalk 08:33, 12 March 2023 (UTC)
Piotrus asked why G&K chose the particular scholars they did for their plot. Haeb claimed they gave a rationale: "Nechama Tec, Samuel Kassow, Doris Bergen, Deborah Dwork, or Zvi Gitelman, to name some well-known experts on Holocaust history". But that only defines the group and not the selection, so the question remains unanswered.
Looking at who was not chosen may help. Of those scholars they named approvingly in their article, Browning, Gross and Polonsky each have far more Wikipedia mentions than any of those they selected. In fact Christopher Browning, who they correctly describe as "one of the world's top Holocaust scholars", has more wiki-mentions than all of the five scholars they selected put together. Then there are other famous Holocaust scholars not named who could have been selected, such as Yehuda Bauer, David Cesarani, Efraim Zuroff and Yisrael Gutman, all of whom are mentioned in Wikipedia more times than any of those they selected. This is very strong evidence that the choice was made to fit the desired result. Bearing this in mind and reading carefully, G&K actually do give a rationale: "Wikipedia mentions Richard C. Lukas 82 times, more than it mentions ... to name some well-known experts on Holocaust history." In other words, as the numerical evidence indicates, these five people were selected because they are mentioned less than Lukas. What has the superficial appearance of a little statistical experiment is nothing of the sort. Zerotalk 15:24, 12 March 2023 (UTC)
Removal of comments from this talk page
@Piotrus and Volunteer Marek: Regarding your deletions of comments from this talk page here, here and [12]:
I appreciate the concern and your disagreement with these harsh criticisms of the "Distortion of the Holocaust" review (speaking as the editor of this Signpost section who supported its publication despite strenuous objections from some people). But WP:TPO sets a pretty high bar for deletion of comments and I think that as long as it doesn't reach the level of WP:NPA, we can deal with criticisms like that we are spreading "lies of Grabowski" or furthering "histeria [sic] introduced by Icewhiz and his Jewish friends", however factually wrong they may be.
And seeing that this review might be attracting considerable critical attention from a non-Wikimedian Polish audience, I would not like us/the Signpost/Wikipedia being accused of censorship, especially given that Volunteer Marek's cryptic rationale "500/30 policy" will not likely be intelligible to many. Regards, HaeB (talk) 15:51, 9 March 2023 (UTC)
{{reply to|Chess}} on reply) 21:03, 9 March 2023 (UTC)
ECP applied
Per WP:APLECP, extended confirmed protection has been applied to this page. This action has been logged at [17]. --Jayron32 16:58, 9 March 2023 (UTC)
The chronology on Wikipedia suggests that a progenitor/related concern was first acknowledged through a list of arbitration committee findings in 2009 (with indicators of the source issues and concerns going back to 2005). The current arbitration "revisits" and references a prior arbitration that occurred in 2021. "I know you are but what am I" or "My facts are more correct than your facts" does not negate the process and governance concerns which remained open and thus unaddressed for 13-19 years. I think the current arbitration committee might want to look back to 2009 and prior. Flibbertigibbets (talk) 01:41, 10 March 2023 (UTC)
A more trivial question
I don't intend to join in the intensely substantive discussion among my more informed colleagues, but is this sentence possibly missing a word or two? "Many Poles believe elements of the distortion narrative which Poland's current government has taken legal and administrative steps (e.g., creating monuments for apocryphal Poles who rescued Jews) to popularize." Jim.henderson (talk) 00:43, 11 March 2023 (UTC)
Zooming out
Right, this is my point exactly. I am coming at this from knowledge of some of the parties in another subject area deemed off topic, and editing experience in this one that is limited, pre-February 2023, to School of Paris, which only slightly overlaps. Therefore I have been reading, but not debating, the finer points of Polish historiography, which others seem to know much better than I do. But. Is there doubt in anyone's mind that if he told a student that it was ok to use a source for non-controversial matters, he was tactfully saying (to a student editor) that the source was not wonderful but met the reliable sources policy? Is this in any way behavior that should be sanctioned? If an editor is correctly implementing policy and the result is not considered ideal, perhaps the policy needs refining. If so, then I submit that if we let Grabowski determine our policies, why not just knuckle under to the Kremlin too? How we do things should start by determining what result *we* want, and that should be accuracy in *my* opinion at least. I very much share MVBW's concern about external vectors. Elinruby (talk) 23:41, 27 April 2023 (UTC)
Narrative and admins
I don't edit that much on controversial topics. But I rather dislike the concept of wikipedia articles being criticized for being consistent with a narrative. Such criticisms feel like they give people carte blanche to censor facts and content because it isn't in line with the "correct narrative to be". I feel as if editorial policies should constrain themselves to what is due and what accurately represents the sources and be very cautious about these "narrative" arguments. It all feels like it's part of a "misinformation creep" game, which defines "does not support every aspect of the viewpoint that I would like" as misinformation. I heard people trying to describe undue emphasis as "misinformation".
I also doubt that admins can or should deal with subtle content disagreements. It feels like this vague and fruitless hope that someone if you have enough authority and make your authority good enough it can solve all problems, whereas in reality the more powerful your authority is and the more subtle the issues it deals with the more inclined it is to be captured. Some things just have to play out in a haphazard way rather than be dealt with through process. Talpedia (talk) 00:25, 19 March 2023 (UTC)