Wikidata:Requests for comment/Non-free content
An editor has requested the community to provide input on "Non-free content" via the Requests for comment (RFC) process. This is the discussion page regarding the issue.
If you have an opinion regarding this issue, feel free to comment below. Thank you!
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- There seems to be no consensus to implement non-free content policy. — regards, Revi 13:39, 3 February 2020 (UTC)[reply]
This RfC concerns Wikidata's approach to non-free content. This is a follow-up to this discussion on the project chat and this discussion on the English Wikipedia.
Currently, Wikidata does not have any policies related to copyright, although local file uploads are disabled. Most content on Wikidata is either uncopyrightable in the US or clearly in the public domain. However, there are a number of edge cases – mainly in the use of properties with the monolingual text and musical notation datatypes – where it could be beneficial to set out criteria for the acceptable, limited use of non-free content under the fair use doctrine of US copyright law (or, in lieu of a new policy, more actively enforce the prohibition of such content by making this abundantly clear).
Wikidata already contains small amounts of fair use content. For example, the item Harry Potter and the Philosopher's Stone (Q43361) uses first line (P1922) to quote one sentence from a copyrighted book. Jc86035 (talk) 09:02, 10 April 2019 (UTC)[reply]
My non-lawyer reasoning for why the quotation is considered fair use
Edit: The previous conclusion of this lead section was incorrect, since per foundation:Resolution:Licensing policy, the lack of a non-free content policy on Wikidata means that all content in Wikidata must be either copyleft or public domain. I have updated the end of the second paragraph to reflect this. Jc86035 (talk) 18:13, 10 April 2019 (UTC)[reply]
- Oppose "fair use" in itself is a bad idea for an international project (for same reason stated in commons:Commons:Fair use) and I'm still not convinced that "fair use" applies to data or that Wikidata already contains "fair use" data. Plus, "users are effectively only limited by US copyright law" is clearly wrong (see en:LICRA v. Yahoo! for a well-known case of lex loci, users must respect *both* US and local laws). Cheers, VIGNERON (talk) 09:20, 10 April 2019 (UTC)[reply]
- @VIGNERON: Firstly, not having the policy effectively means allowing any fair use content, as it would be legal in the US. This is why Commons has a fair use policy at all – to prevent the uploading of copyrighted files, as this would conflict with its goals.
- For all intents and purposes, it would be detrimental to the projects to adhere absolutely to the copyright laws of every jurisdiction in which they are accessible. This is why the English Wikipedia only requires uploads to adhere to US copyright law (which is definitely necessary), even though they may still be copyrighted or copyrightable in other jurisdictions. As an example, the United Kingdom has a much lower threshold of originality than the United States. As Commons would also require users to adhere to British copyright law for works originating in the UK, several logos of British companies, such as w:en:File:BBC Information and Archives Logo.svg, are hosted on the English Wikipedia under the presumption that they are not copyrightable in the US, rather than under the fair use criteria. While I am not arguing that this is definitely acceptable for the countries in which the projects would be considered to be infringing on copyright, it is clearly seen as acceptable by other Wikimedia projects, and Wikidata would be just another project to take this approach. Jc86035 (talk) 09:38, 10 April 2019 (UTC)[reply]
- (edit conflict) I think that yes, it should, because this would remove ambiguity in the leeway contributors are afforded in using limited amounts of non-free content. If it were not possible for fair use to apply to "data", then it would be completely inapplicable on the Internet because most web servers rely on databases to begin with. A full paragraph would presumably be different to an integer or a URL, in this regard. Jc86035 (talk) 09:23, 10 April 2019 (UTC)[reply]
- Strong oppose; important material at creativecommons.org: CC0 FAQ and Frequently Asked Questions in general; both contain several relevant paragraphs.
In short: we would have to add non-free content markers to all fair use statements. It would be practically impossible to make this visible through the Query Service, unless data users write queries in a way that always explicitly requests this status for each and every statement they query. There are also lots of Wikipedia projects from jurisdictions that do not have a "fair use" law, which would have to complicate the code of their templates and modules that pull data from Wikidata to an extent which would render Wikidata completely unusable for them. There is no practical alternative to the current approach, which is "all content from Wikidata is free content". If there is doubt about existing content at Wikidata, this should be reviewed and eventually removed if it is non-free content. —MisterSynergy (talk) 11:07, 10 April 2019 (UTC)[reply]
- @MisterSynergy: Wikimedia projects are not "from" jurisdictions; they are all hosted in the United States even if most of their users are from particular jurisdictions. The "current approach" is not documented in existing policy, which is why it would be useful to have one to remove any ambiguity.
- The point of this section was to determine whether a policy should exist, regardless of its content. If a policy allowing the use of some fair use content would be desirable, then it is desirable to have a policy on the matter. If a policy preventing the use of any fair use content would be desirable, then it is also desirable to have a policy on the matter, because the current situation is unclear and users have quoted from copyrighted works as a direct result of this ambiguity. As I understand it, your opinion that Wikidata should not be allowing fair use content at all (as well as VIGNERON's opinion) would better belong in the section below.
- As I've noted elsewhere on this page, reuse of Wikidata may already be legally questionable in countries which recognize sui generis database rights. The apparent position "all content from Wikidata is free content" has not stopped users from mass-importing data and identifiers from non-CC0 databases (including the Wikimedia projects), so the position does not necessarily match the reality of how "free" Wikidata's content actually is, nor how universal the position is.
- Furthermore, preventing "fair use" content outright would prevent quotation (P1683) from being used to quote from reference material that is non-free. This is a problem if Wikidata needs to be able to cite from books, and will be a bigger problem if references from Wikipedias are ever imported (which may become a possibility within the next decade). Jc86035 (talk) 11:36, 10 April 2019 (UTC)[reply]
- The "current approach" is that we all agree to the conditions set by the CC0 license (which effectively is a license waiver). With CC0 as the only possible "license", there is currently legally no possibility to add non-free content to Wikidata. Okay, we do not have an explicit policy to state that, but the license and the Terms of Use are linked from the edit interface, thus all editors had to agree to those conditions at least once. In my opinion, there is no need for a policy that explains how to use the CC0 license; instead, I think all editors should read the license by themselves at least once.
Generally, I very much would like to keep it like that, with Wikidata being radically free even at the expense of some data which we might not be able to incorporate here. If we were to allow "fair use", we would only be a little step away from allowing statements with other licenses (such as CC-by-sa or even completely different ones). From data users' perspective, this would be highly undesirable. —MisterSynergy (talk) 17:31, 10 April 2019 (UTC) You may move my "vote" and the subsequent discussion to the other section if you think it is more appropriate there.[reply]
- As previously mentioned, the actual license terms do not prohibit the use of fair use content; they only waive the licensor (Affirmer)'s rights, and the licensor obviously cannot arbitrarily waive another entity's rights. In fact, section 4 includes a disclaimer that the onus is on the licensee/reuser to determine whether someone else's copyrighted material is included ("Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof"). The Terms of Use are not relevant to whether fair use is allowed, since they allow fair use on other wikis. Jc86035 (talk) 17:46, 10 April 2019 (UTC)[reply]
- Yeah sure we may add such content, but according to the CC FAQ we are supposed to add a clear marker (maybe a reference qualifier? there is no really suitable place for such a marker right now). However, data users who deliberately want to avoid using fair use content would have to write excessively complex queries, to an extent which makes Wikidata unusable. —MisterSynergy (talk) 18:55, 10 April 2019 (UTC)[reply]
- On the other hand, foundation:Resolution:Licensing policy would seem to indicate that if there is no "exemption doctrine policy" (EDP), which Wikidata probably does not have, then there must be no non-free content anywhere on the website. (I will be updating the statement up at the top, since it does not reflect this.) However, a strict interpretation could indicate that if Wikidata has been infringing EU sui generis database rights through large-scale imports of content from non-CC and non-PD databases, then it is possible that Wikidata would either need an EDP to address this or would be required to immediately remove any content which could potentially be non-free. Jc86035 (talk) 18:07, 10 April 2019 (UTC)[reply]
- Oppose This question is way premature. Before taking any steps along this path we need to understand the legal landscape much much better. What content are we talking about? What is the existing legal position? Before any detailed discussion, we should get clarification from WMF Legal on that; and before approaching WMF Legal, we should think in some detail about what we would want to ask. What might be the sort of particular statements with particular types of values that we think there potentially could be any issue with? Are there legal cases and principles from around the world that we would like WMF Legal to include in its assessment? Both of those ought to be worked up in some detail, before we would ask WMF Legal for its assessment and review; and only after a comprehensive assessment of the basis like that should we even begin to think about policy formation.
The proper question to ask, IMO, is whether there is anything that in any way we mislead people about by claiming that Wikidata is CC0. Is there any content that reusers would need to be specifically cautious about, either at the level of a single statement, or at the level of an aggregate of statements produced by a query?
Contrary to Jc86035 above, I don't believe there is any issue with us including the incipit of the first Harry Potter book -- or even of a 14-line poem. I believe that the first is de minimis, and even the latter would be entirely appropriate, as a regular form of identification.
Similarly, we might consider whether the titles of scientific papers or news articles contained in references represent a form of copyright taking. This of course has recently been in the news with Article 11 of the new EU copyright directive granting news publishers a limited-term related right equivalent to the existing rights held by copyright owners, and the suggestion by a German conciliation service that a taking of seven words from a title or the text of a news article might be considered a significant taking and not de minimis. Against that, the right of citation has basic protection under the Berne Convention, and I think we should take a stand that that means that the presentation of the basic information needed for identification is legitimate without limitation, and something to fight for if necessary.
A further area is our presentation here of the identifiers created by particular external organisations for particular things. One datapoint here is the "IMS Bricks" case, ruled on by the CJEU in April 2004 ("IMS Health vs NDC Health", commentary), where it was accepted by the parties that the segmentation of Germany into 1860 geographical 'bricks' with identifiers, over which IMS collates and publishes aggregated marketing data, was covered by copyright. On the other hand there was an argument made against Microsoft in the EU vs Microsoft antitrust case, in relation to identifier codes used in APIs, that where such codes are arbitrary, reflecting no particular creative design or purpose, then in such a case they should no more attract copyright than a choice of a combination for a resettable combination lock. There are two further factors. First, where such identifiers are being presented as part of a URL, to which page access is being encouraged, then in the absence of any indications to the contrary, it may be reasonable to take that as an implied license to allow indexing and dissemination of those identifiers. Second, in the United States, in some of Google's cases (eg Perfect 10, Authors' Guild), indexing and helping people to find documents has been identified as helping to further the very purposes that Copyright Law is instituted for, so something that Copyright Law should actively encourage.
There may be edge cases that I haven't thought of (and obviously we can't extract wholesale from anybody else's compilations or summaries that may be copyright serving a similar purpose); but overall I am not aware of a major copyright taking in anything we are presenting, or anything that should give pause or cause for concern for anybody else re-disseminating it, in whole or part, for whatever purpose. But if it turns out there is any legal uncertainty about anything we do here, then in my view our freedom to do what we do and to do more of it is a cause worth fighting for. Jheald (talk) 12:40, 10 April 2019 (UTC)[reply]
- @Jheald: On de minimis: As noted by Ghouston on the project chat, it might be difficult to argue that the inclusion of the statement is de minimis, because the statement was added to the item deliberately and manually (by Valentina.Anitnelav).
- @Jc86035: The key phrase in copyright law is whether there has been a "substantial taking". Yes, it is true that this is judged by the quality of what is taken, rather than quantitatively how much is taken: so that in the United States cumulative extracts of 300 to 400 words from the memoirs of Gerald Ford were held to be substantial, because they represented the 'heart of the work'. As one gloss puts it: what matters is "the importance or value of the copied parts in relation to the original work... A small part of the original work may be highly significant to the piece as a whole." But that is not the same as saying that anything that is worth taking is ipso facto substantial and therefore should be protected. The opening line of Harry Potter may be worth deliberately and manually recording as a first line, without necessarily being a substantial taking from the piece as a whole. Jheald (talk) 16:43, 10 April 2019 (UTC)[reply]
- @Jheald: For clarification, is the "substantial taking" test used to determine whether something is fair use/infringement or public domain/fair use? I can't tell from your description, although the sources seem to indicate that it is the former, in which case it may not necessarily be relevant. Regardless, it does not altogether prevent the introduction of fair use content on Wikidata.
- As I noted on the talk page: You could infringe someone's copyright in 1,500 characters, right? With LilyPond you could infringe the copyright of an entire song in much fewer than 1,500 characters, and the server could turn it into an audio file for you. Jc86035 (talk) 17:31, 10 April 2019 (UTC)[reply]
- The purpose of the example was to indicate that there is probably already fair use content on Wikidata, which would necessitate a policy to manage the use of said fair use content. It seems to have been misinterpreted; the entire text is only 121 characters long. A lot more copyrighted content could be contained in a single monolingual-text statement, and I didn't think it was worth mentioning that one could hypothetically use more than one statement to deliberately use as much text or music notation as could vaguely be allowed under fair use. Not everything deserves a policy, but that no one has bothered to test the limits of the system doesn't mean that limits should not be defined. Jc86035 (talk) 17:38, 10 April 2019 (UTC)[reply]
- @Jheald: Some of the above is not written with the correct implications; see my edit to the lead. Jc86035 (talk) 18:18, 10 April 2019 (UTC)[reply]
- On titles: While I did consider whether to mention titles in my initial project chat post, I decided against it because it could be argued that names and titles, being facts independent of the contents of works, are not copyrightable. I don't know whether they are relevant here. However, I think taking the same amount of words from the body of an article would be different.
- @Jc86035:. Advice from the U.S. Copyright Office is indeed that names and book titles are not generally considered copyrightable, being held to be merely 'short slogans', that must remain free to use as part of everyday language. Whether this is also true of news headlines and journal article titles, which are typically longer and more like distinct sentences, is harder to say. The German arbitration which came up with the suggestion of the seven word test was indeed intended to apply to and include newspaper headlines, that may well encapsulate the key point of the news story. Further EU law is Infopaq, which considered that a taking of eleven words could be partial reproduction for the purposes of the Infosoc Directive, "provided that the particular part in question consists of the expression of the author's intellectual creation". This perhaps might also be inferred to catch the first line of Harry Potter, in contrast to the stipulation that only the taking of a "substantial part" will be an infringement in legislation like s.16(3) of the UK CDPA [1]. Jheald (talk) 17:26, 10 April 2019 (UTC)[reply]
- A reconciliation of Infopaq with CDPA s.16(3) is that while Infopaq may read the Infosoc directive strictly to find that copyright subsists in any part of a work that "consists of the expression of the author's intellectual creation" (with Art 2 of the directive [2] containing no limitation that the part must be "significant"), nevertheless per the language of Art 8 of the directive ("Member States shall provide appropriate sanctions and remedies in respect of infringements... The sanctions thus provided for shall be effective, proportionate and dissuasive"), it may be that it is only proportionate that a copyright taking should be actionable if that taking is "significant". Just a thought. Jheald (talk) 00:08, 13 April 2019 (UTC)[reply]
- A bit more on copyright in titles. In England the High Court confirmed in the Meltwater case in 2011 that newspaper headlines could be protected by copyright, rejecting earlier Australian case-law [3]. Meltwater subsequently overturned a different part of the judgment on appeal at the UK Supreme Court, which found that end-users viewing such headlines via Meltwater's website would not be infringers (see eg [4] and our article on en-wiki), a reading that was also subsequently taken by the CJEU. But it looks to me that the point on the subsistence of copyright in the titles had previously been confirmed on appeal by the Court of Appeal [5] and may not have been challenged at the Supreme Court. Jheald (talk) 12:38, 13 April 2019 (UTC)[reply]
- A title could be anything. Imagine that somebody wants to break the world record for the longest title, and puts an entire novel in the title. Would that be uncopyrightable? Ghouston (talk) 00:52, 14 April 2019 (UTC)[reply]
- The Court of Appeal judgment in Meltwater [6] includes citations at paragraph 21 to earlier cases where it was ruled that headlines or titles could be copyright. Jheald (talk) 19:44, 14 April 2019 (UTC)[reply]
- I have sent another email to Legal to say, essentially, that the first email was premature and they should probably wait for the discussion to develop. I disagree that the actual question in the section header was premature, since this would be the logical step after the project chat discussion, which has not been edited for more than a day (excluding my notification of the RfC); and (as noted in my reply to MisterSynergy) the question is only "should such a policy exist", not "what should the policy contain". Commons has such a policy which mainly serves to explain that fair use content should not be uploaded. Jc86035 (talk) 14:14, 10 April 2019 (UTC)[reply]
- I think casting this as a "fair use" question, to be subject to a "fair use policy", may already be begging the question (as well as casting it in specifically U.S. terminology, that is probably unhelpful).
- Better I think, per User:VIGNERON on the talk page, would be first to ask to what extent titles, short text, or external identifiers can be quoted in Wikidata, when Wikidata is still to remain compatible with the letter of the CC0 license, and also the spirit of the licensing, namely that any element or combination of elements of the content here should be reusable by anyone for any purpose, without their needing to worry about any legal claims on it. Jheald (talk) 17:40, 10 April 2019 (UTC)[reply]
- Comment I'm not sure this is the right question - we do have a copyright policy here: Wikidata:Copyright. Does this policy need to be amended given edits that have already been included in Wikidata? Do we need to do more policing the way Commons does? Do we need to expand upon it somehow? I definitely prefer we maintain that everything in the main (and property and lexeme) space of Wikidata is CC0 one way or another. ArthurPSmith (talk) 18:03, 10 April 2019 (UTC)[reply]
- @ArthurPSmith: It's not really a policy, since it is not currently explicitly labelled a policy in any way. If the current content of Wikidata stays, then it may be necessary to create a policy to regulate the use of non-free content, since it can most likely be demonstrated to have existed for much of Wikidata's existence (the quote-from-references property has existed for several years). See also my edit to the lead. Jc86035 (talk) 18:17, 10 April 2019 (UTC)[reply]
- Oppose per above. Multichill (talk) 17:07, 11 April 2019 (UTC)[reply]
- Strong oppose per above. NMaia (talk) 11:46, 15 April 2019 (UTC)[reply]
- @NMaia: Which of the above? They all say different things. Jheald (talk) 12:18, 15 April 2019 (UTC)[reply]
- Am I the first person ever to vote support here? Nothing in the whole living world is "entirely free": being free in one respect means that you and everyone else must therefore be non-free in another. --Liuxinyu970226 (talk) 13:30, 23 April 2019 (UTC)[reply]
- Oppose any changes to "all content in Wikidata must be either copyleft or public domain" and support enforcement of this rule. I oppose proposing such deep changes without a careful review of the consequences, which is absent from this proposal. Mateusz Konieczny (talk) 12:39, 9 May 2019 (UTC)[reply]
- Weak oppose We are a knowledge base and we already have a copyright page. Perhaps this can be revisited in 2022-2023, as Wikidata does not have a lot of non-free content; almost every image is from Commons. – The preceding unsigned comment was added by Znotch190711 (talk • contribs) at 19:28, July 28, 2019 (UTC).
- Comment Note: I do not have a preference over what the outcome should be (whether this should be done or not); I am only stating facts here. Commons could not implement fair use criteria, even if they wanted to. See for example wmf:Resolution:Licensing policy, but there are other examples too. Wikidata, on the other hand, could implement fair use criteria. The licencing on Wikidata is not set in stone, as it is on Commons. As for the legal groundwork, take a look at the Berne Convention. It is definitely possible to make one on a global scale. Making non-free criteria based on USA legislation, rather than on the Berne Convention, is always going to have problems with re-usability on the WMF wikis.--Snaevar (talk) 19:51, 8 August 2019 (UTC)[reply]
- (edit conflict) In my opinion, if my reasoning on the re-use implications is correct, using the same rules as the English Wikipedia would be a good place to start, since it would allow acceptable uses of non-free content which would not substantially hinder US re-users. An alternate method would be to prohibit all fair use content altogether; I do not think this approach would be as good, as it would prevent or hinder certain analyses relying on quotation from works, and would prevent users from quoting from secondary sources. Jc86035 (talk) 09:23, 10 April 2019 (UTC)[reply]
- If anything, Wikidata should have a policy of forbidding "fair use" (like on Commons) but it seems to me that it is already clear with Wikidata:Copyright. Cheers, VIGNERON (talk) 11:34, 10 April 2019 (UTC)[reply]
- @VIGNERON: Wikidata:Copyright is not a policy. The CC0 license, as I understand it, does not prohibit the use of fair use material within works which use the license. In fact, the author/licensor does not make any guarantee that they have all of the rights to the work; they only waive their own rights (see section 4c). As such, to my understanding, it is legal and not prohibited by policy for Wikidata users to upload fair use content, and a policy would be required to disallow this. Jc86035 (talk) 11:42, 10 April 2019 (UTC)[reply]
- I think it's not entirely established what can be copyrighted and whether we care about legal jurisdictions apart from the USA. What seems needed is to 1) establish which legal jurisdictions we need to care about, and determine what content in Wikidata is copyrightable and not covered by a CC0 license 2) For such material, decide whether it's of little importance, and can be deleted. Perhaps the first and last lines of copyrighted works can be deleted: they can still be included for public domain works. However, if it was required to truncate titles to 6 words, that may be highly undesirable. Likewise, for deleting facts that have been imported in bulk from Wikipedia. Perhaps Wikipedia can waive its database rights, if it doesn't already. 3) If Wikidata needs to include copyrighted material, found in 1) and 2) then it will need to rely on fair use or the like. If it doesn't need to include copyrighted material, then it can be purely public domain, in principle (ignoring user errors). Ghouston (talk) 00:20, 11 April 2019 (UTC)[reply]
- @Ghouston: On content imported from Wikipedia specifically, depending on your reading of the Foundation's global licensing policy, without a local non-free content policy it's technically permissible for Wikidata to use any copyleft ("Free Content License") content even if the content is incompatible with the CC0 license and others own some of the rights, so we would not necessarily be required to delete Wikipedia-imported statements even if it's found that the imports would violate database rights. However, the resolution (from 2007) does predate even the existence of CC0, so it's possible that the Foundation did not consider this to be a possibility at the time. Jc86035 (talk) 10:26, 11 April 2019 (UTC)[reply]
I have sent an email to [email protected] asking if it would be possible to clarify some of the related legal issues. Jc86035 (talk) 10:21, 10 April 2019 (UTC)[reply]
For clarification, the first email asks for clarification in three areas: jurisdiction and the enforcement of non-US copyright law on US servers based solely on the availability of content served; whether sui generis database rights would prevent Wikidata's reuse in the EU; and whether the CC0 license prevents the use of fair use content within CC0 works. The second email states that the first email is premature and they should probably wait a while before actually investigating anything. Jc86035 (talk) 14:17, 10 April 2019 (UTC)[reply]
- IANAL, but CC0 is a clean approach, "fair use" is as unclear as a w:en:Project:WEASEL, and EU database rights would require CC-BY 4.0 or later. Maybe GPLv3 would also work, I forgot the details about CC×GPL, but they have nothing to do with fair use and databases. –84.46.52.142 22:05, 12 April 2019 (UTC)[reply]
- It looks like the consensus above is that Wikidata should not have a fair use policy. I think the consequence is that it can't include any copyrighted material, such as the first and last lines of copyrighted books (in general; there may be cases where they are uncopyrightable.) I think database rights are not really an issue for a project hosted in the USA, except as far as individual contributors are subject to the laws of their countries, although the Wikimedia Foundation has shown a willingness to submit to German laws. Ghouston (talk) 01:00, 16 April 2019 (UTC)[reply]