Create a user right that allows ignoring the spam blacklist
Closed, Resolved (Public)

Assigned To
None
Authored By
Rillke
Mar 3 2012, 11:39 AM
Referenced Files
None
Tokens
"Like" token, awarded by BethNaught."Like" token, awarded by Cyberpower678."Like" token, awarded by Xaosflux."Like" token, awarded by waldyrious."Dislike" token, awarded by zhuyifei1999.

Description

This list is being (ab)used more and more on Meta, which disrupts the work of Commons administrators who tag images as copyright violations (source). It is also very inflexible: URLs like http://www.google.de/url? are blacklisted (by \bgoogle\..*?\/url\?). At least established users therefore need a way to override it.
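To illustrate how broad that entry is, here is a minimal sketch that applies the quoted pattern to two URLs. It assumes the entry behaves like a plain case-insensitive regex search; the extension wraps blacklist entries in its own way, so treat this only as an approximation. The `q=` target and the helper name `is_blacklisted` are made up for the example.

```python
import re

# Pattern quoted in the task description. How the extension actually
# combines and anchors blacklist entries is not modelled here.
PATTERN = re.compile(r"\bgoogle\..*?\/url\?", re.IGNORECASE)


def is_blacklisted(url):
    """Return True if the URL matches the quoted blacklist entry."""
    return PATTERN.search(url) is not None


# A Google redirect URL of the kind an uploader might paste as an image source.
print(is_blacklisted("http://www.google.de/url?q=https://example.org/photo.jpg"))
# An unrelated URL is not affected.
print(is_blacklisted("https://commons.wikimedia.org/wiki/Main_Page"))
```

Any Google redirect link is caught regardless of where it points, which is why a copyright-violation tag that cites such a source cannot be saved.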

AbuseFilter and the Title blacklist are much more flexible in this area.

Thanks in advance.


Version: unspecified
Severity: normal
See Also:
T57794: Allow file sources to be exempted from spam blacklist

Event Timeline

DannyS712 subscribed.

This is honestly technically simple. If no user groups should be given the right by default, that's fine.

Change 552618 had a related patch set uploaded (by DannyS712; owner: DannyS712):
[mediawiki/extensions/SpamBlacklist@master] Add sboverride right to override the spam blacklist

https://gerrit.wikimedia.org/r/552618

The patch provided:

  • Adds a new user right, sboverride[1]
  • That, by default, is not given to anyone
  • That allows the user to override the spam blacklist
  • On both normal edits and file uploads

[1] Name inspired by tboverride allowing users to bypass title blacklist
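The behaviour listed above can be modelled roughly as follows. This is a sketch in Python rather than the extension's PHP, and everything except the right name `sboverride` is an assumption made for illustration: the `GROUP_RIGHTS` table, the `User` class, and `blocking_links` are hypothetical and are not the patch's code.

```python
import re
from dataclasses import dataclass, field

SBOVERRIDE = "sboverride"

# Hypothetical group -> rights table. The right exists, but by default
# no group holds it, mirroring the patch description.
GROUP_RIGHTS = {
    "user": {"edit"},
    "sysop": {"edit", "delete"},
    "bot": {"edit"},
}

# Stand-in blacklist with the single entry quoted in the task description.
BLACKLIST = [re.compile(r"\bgoogle\..*?\/url\?", re.IGNORECASE)]


@dataclass
class User:
    name: str
    groups: list = field(default_factory=list)

    def has_right(self, right):
        return any(right in GROUP_RIGHTS.get(g, set()) for g in self.groups)


def blocking_links(user, links):
    """Links that would block the save; the same check serves edits and uploads."""
    if user.has_right(SBOVERRIDE):
        return []
    return [link for link in links if any(p.search(link) for p in BLACKLIST)]


admin = User("ExampleAdmin", ["user", "sysop"])
links = ["http://www.google.de/url?q=https://example.org/source.jpg"]

print(blocking_links(admin, links))    # still blocked: nobody holds the right
GROUP_RIGHTS["sysop"].add(SBOVERRIDE)  # a wiki opts in for its sysops
print(blocking_links(admin, links))    # no longer blocked
```

The point of the sketch is that the right changes nothing until a wiki grants it to a group.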

I don't think it unreasonable that sysops should have it by default, and maybe bots.

As a bot operator, I would find this patch a big help in saving dead links and fixing usurped domains.

I prefer being blocked in normal operations, since it lets me see where a problem exists and fix it. But there are cases, such as replacing a usurped domain with archive URL versions (from before the usurpation), where the filter makes the fix impossible. This can affect thousands of pages at a time.

I don't think it unreasonable that sysops should have it by default, and maybe bots.

I agree, but configuration can be tweaked later - my first priority was getting the right created, so that sites can use it by manually granting it to admins.

@Cyberpower678 as far as WMF defaults go, may want to use this like the 'flood' flag. Create a standalone group for the permission, allow sysops to +/- the group, and perhaps bots to +self/-self from it.
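A rough model of that 'flood'-style arrangement, as a sketch only: the group name is invented (the thread does not propose one), and the two lookup tables are hypothetical stand-ins loosely modelled on how MediaWiki lets groups add or remove other groups, not actual configuration.

```python
# Hypothetical group name; no name is proposed in this thread.
OVERRIDE_GROUP = "spam-blacklist-override"

# The standalone group is the only holder of the right.
GROUP_RIGHTS = {OVERRIDE_GROUP: {"sboverride"}}

# Which groups may add/remove the override group for any account ...
MAY_CHANGE_OTHERS = {"sysop": {OVERRIDE_GROUP}}
# ... and which may only add/remove it for themselves.
MAY_CHANGE_SELF = {"bot": {OVERRIDE_GROUP}}


def can_toggle(actor_groups, target_is_self, group=OVERRIDE_GROUP):
    """Whether an actor may add or remove `group` for the target account."""
    for g in actor_groups:
        if group in MAY_CHANGE_OTHERS.get(g, set()):
            return True
        if target_is_self and group in MAY_CHANGE_SELF.get(g, set()):
            return True
    return False


print(can_toggle(["sysop"], target_is_self=False))  # sysop changing another account
print(can_toggle(["bot"], target_is_self=True))     # bot toggling itself
print(can_toggle(["bot"], target_is_self=False))    # bot changing another account
```

Keeping the right in its own group means it can be switched on for a specific cleanup run and switched off again afterwards, instead of being a permanent part of sysop or bot.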

@Anomie gave a -2 on gerrit, saying "The problem is concern over the impact when only some users are prevented and others are not." - this patch being accepted would have absolutely no impact on some users being prevented and others not, since it doesn't give the right to anyone.

  • For non-WMF wikis, this conversation doesn't need to be had, since it should be up to wiki operators to decide
  • For WMF wikis, this conversation should be had, but until it is had, the feature should be available to non-WMF wikis

By not assigning the right to anyone in the initial patch, I had hoped to avoid exactly this issue of requiring the discussion before implementing the technical framework

I don't understand the issue. We have abusefilters already that block additions of specific content by users matching specific criteria, and no one is complaining there. How is this different, albeit a bit more generalized?

The issue is that abuse filters can be tuned to target a specific user or group of users, and to simply log, warn about, or deny the specific action. A spam blacklist outright blocks any edit that introduces anything matching a pattern in the regex list. It can't be overridden and it can't be targeted at specific users.

There may be a problem: a user who holds the right adds text containing a spam link without knowing that the link is on the spam list. I think there should be an alert message when saving text that adds links contained in the spam list.

There probably should be, but perfection is the enemy of good. It's better to provide this functionality first and then make it pretty.

DannyS712 changed the task status from Open to Stalled. Edited Dec 10 2019, 9:00 PM

Patch is ready, but has -2 pending approval

I don't know why the right would be considered needless. I have a bot running a task to fix pages by null-editing whole pages on a non-WMF wiki, and there are always pages that cannot be edited because they include spam links that come from the Meta-Wiki blacklist.

And I have no choice but to manually add the domain to the local whitelist :(

I would really appreciate it if bots could have the right to override it.

Come on. This has been discussed for literally seven years. "Perfection is the enemy of the good" indeed.

Change 584712 had a related patch set uploaded (by Dbarratt; owner: Dbarratt):
[mediawiki/extensions/SpamBlacklist@master] Add user right to bypass blacklist

https://gerrit.wikimedia.org/r/584712

Whoops... created a duplicate of this task (as well as a duplicate patch). I noticed a discussion on Wikidata about this topic.

Change 584712 abandoned by Dbarratt:
Add user right to bypass blacklist

Reason:
Duplicate of I516dc68ec7a2dfaa82647feb67ec9bd264b8c380

https://gerrit.wikimedia.org/r/584712

So, to clarify the status of this task: I marked it as stalled because the patch has a -2 from @Anomie saying "Giving this a -2 until the concerns raised on the task are addressed...The problem is concern over the impact when only some users are prevented and others are not."
I believe that, since my patch does not grant the right to any users by default, this is not really an issue - end users (sites) can decide if they want to grant the right to specific user groups, and deal with the results accordingly.

Makes sense to me.

Why is this still stalled, even though a patch exists?

Aklapper changed the task status from Stalled to Open. Jun 6 2021, 9:13 AM

So, to clarify the status of this task: I marked it as stalled because the patch has a -2 from @Anomie saying "Giving this a -2 until the concerns raised on the task are addressed...The problem is concern over the impact when only some users are prevented and others are not."
I believe that, since my patch does not grant the right to any users by default, this is not really an issue - end users (sites) can decide if they want to grant the right to specific user groups, and deal with the results accordingly.

Just a reminder, as someone who has several times run upload projects with over 100,000 files, with names automatically created with sensible and published naming rules, Commons is routinely losing out on content because of bizarre matches to things like a repeated name in the title of a book in Latin; this happens so often I just never return to these skipped uploads. Even if the right was limited to users with over 100,000 edits, it would be a great improvement and avoid wasting significant volunteer time writing scripts to work around a badly designed or untested blacklist.

Another example: I run a deletion request archiver bot on Commons which once in a while gets limited by the spam filter. When working behind this manually, I'm unable to perform these edits even as a sysop, so I spend hours to find the relevant matches and "nowiki" them in pages of dozens of KB size. The cost of such manual work is hardly in good relation to the benefit.

In T36928#7136600, @Krd wrote:

Another example: I run a deletion request archiver bot on Commons which once in a while gets limited by the spam filter. When working behind this manually, I'm unable to perform these edits even as a sysop, so I spend hours to find the relevant matches and "nowiki" them in pages of dozens of KB size. The cost of such manual work is hardly in good relation to the benefit.

I fully agree, and furthermore since my patch *does not grant anyone the rights yet* pending actual discussions about who should have it, but rather just makes it possible to grant, I hope @Anomie will reconsider their -2

Also, it shouldn't take hours to find the matches: you can check the spamblacklist log to see which URL triggered it.

In T36928#7136599, @Fae wrote:

Just a reminder, as someone who has several times run upload projects with over 100,000 files, with names automatically created with sensible and published naming rules, Commons is routinely losing out on content because of bizarre matches to things like a repeated name in the title of a book in Latin; this happens so often I just never return to these skipped uploads. Even if the right was limited to users with over 100,000 edits, it would be a great improvement and avoid wasting significant volunteer time writing scripts to work around a badly designed or untested blacklist.

spam blacklist only checks urls, so your title names will still be an issue.

If you have issues with title blacklist you should talk to your community about an option to have something with the right tboverride

If the technical implementation in the patch works then I think it should be merged. It's up to local communities (and external projects) to grant this right or not. There's not a particularly good reason for why it shouldn't be technically possible for them to decide though.

I agree with Proc. This capability should be added, and communities and external projects should weigh the pros and cons themselves.

A couple of responses:

In T36928#7136599, @Fae wrote:

Just a reminder, as someone who has several times run upload projects with over 100,000 files, with names automatically created with sensible and published naming rules, Commons is routinely losing out on content because of bizarre matches to things like a repeated name in the title of a book in Latin; this happens so often I just never return to these skipped uploads. Even if the right was limited to users with over 100,000 edits, it would be a great improvement and avoid wasting significant volunteer time writing scripts to work around a badly designed or untested blacklist.

spam blacklist only checks urls, so your title names will still be an issue.

If you have issues with title blacklist you should talk to your community about an option to have something with the right tboverride

Fae does have tboverride on Commons because they are a templateeditor. They shouldn't experience these problems.

Re the March 30 2020 comment by @dbarratt re naming, perhaps spamblacklistoverride? I tend to agree with @DannyS712, I don't really care as long as it is approved.

@Aklapper this is really blocked by @Anomie's -2 rather than needing to improve the patch, I haven't rebased it and updated it since a -2 implies that it would never be accepted

Removing task assignee due to inactivity, as this open task has been assigned for more than two years. See the email sent to the task assignee on February 06th 2022 (and T295729).

Please assign this task to yourself again if you still realistically [plan to] work on this task - it would be welcome.

If this task has been resolved in the meantime, or should not be worked on ("declined"), please update its task status via "Add Action… 🡒 Change Status".

Also see https://www.mediawiki.org/wiki/Bug_management/Assignee_cleanup for tips on how to best manage your individual work in Phabricator.

Per above, this is really blocked by the -2 on the open patch. @Anomie I believe that the concerns that led to your -2 have been addressed: the patch does not grant this right to any group, so the question of allowing some users to bypass the blacklist while others cannot need not be evaluated now. It should be up to individual communities (including non-WMF wikis) to make that decision for themselves. If you don't agree, can you explain why not? If you do agree, please remove your -2.

I was going to raise this on Anomie's talk page, but it seems Oshwah beat me to it. Anomie explained here that they are physically unable to remove their -2. I am not very familiar with this process, so I am not sure what the appropriate course of action would be from here.

That -2 is on a patch though right, not on this task - a forked patch can just be submitted right?

From what I have gathered (this will be my second post ever on phab, so take my understanding with a grain of salt), the -2 is on the patch, not the task.

Semi-relevant aside, if the -2 was on this task, could we still create a new task (and not merge them as duplicates)?

I'll remove Anomie's -2 and see what I can do about re-opening their Gerrit account.

I fully agree, and furthermore since my patch *does not grant anyone the rights yet* pending actual discussions about who should have it, but rather just makes it possible to grant

This line of argumentation doesn't really make much sense to me, there is no point adding a feature if it won't get used. The comments here are pretty clear that the feature *will* get used, so trying to stand behind it saying it doesn't grant anyone the rights really doesn't make sense. I think it should be granted to sysop by default and then if wikis want it added to other groups like how tboverride is done, they can go through the normal channels for it.

I agree with the concerns about allowing some users to make some edits and not others, but I found T36928#5691865 convincing in that it's already possible to do this via AbuseFilter, so adding it to SpamBlacklist doesn't make the situation *that* much worse.

Change 552618 merged by jenkins-bot:

[mediawiki/extensions/SpamBlacklist@master] Add `sboverride` right to override the spam blacklist

https://gerrit.wikimedia.org/r/552618

If a user with the sboverride permission makes an edit that overrides a SpamBlacklist rule, will that edit be logged at Special:Log/spamblacklist? If not, I suggest logging it or adding a tag to the edit, so that the user knows they have overridden a rule.
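What such logging could look like, as a sketch of the suggestion only: this does not describe what the merged patch does, and `check_save`, `SaveResult`, and the log entry format are all hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass

# Stand-in blacklist with the single entry quoted in the task description.
BLACKLIST = [re.compile(r"\bgoogle\..*?\/url\?", re.IGNORECASE)]


@dataclass
class SaveResult:
    allowed: bool      # whether the save goes through
    matched: list      # links that hit a blacklist entry
    overridden: bool   # True when a hit was bypassed via the right


def check_save(links, has_sboverride, log):
    """Check added links; record an entry whenever a hit is overridden."""
    matched = [l for l in links if any(p.search(l) for p in BLACKLIST)]
    overridden = bool(matched) and has_sboverride
    if overridden:
        # Keep a trace so the user and reviewers can see what was bypassed.
        log.append({"action": "override", "links": matched})
    allowed = not matched or has_sboverride
    return SaveResult(allowed, matched, overridden)


log = []
result = check_save(["http://www.google.de/url?q=x"], has_sboverride=True, log=log)
print(result.allowed, result.overridden, len(log))
```

The returned `overridden` flag could equally drive a warning message or an edit tag, which would also cover the earlier request to alert users who do not realise a link is on the list.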