Help talk:CirrusSearch

Can't run UpdateSearchIndexConfig.php file

Summary by DCausse (WMF)

Solved by downgrading from php 8.4 to php 8.2

75.130.249.175 (talkcontribs)

MediaWiki 1.39:

When I run the command php UpdateSearchIndexConfig.php in the CirrusSearch/maintenance folder, I get the following error:

[930f130bf0cbf86ca7483c41] [no req] Error: Class "MediaWiki\Extension\AbuseFilter\Parser\RuleCheckerFactory" not found
Backtrace:
from /var/www/html/w/extensions/AbuseFilter/includes/ServiceWiring.php(113)
#0 /var/www/html/w/vendor/wikimedia/services/src/ServiceContainer.php(124): require()
#1 /var/www/html/w/includes/MediaWikiServices.php(447): Wikimedia\Services\ServiceContainer->loadWiringFiles()
#2 /var/www/html/w/includes/MediaWikiServices.php(285): MediaWiki\MediaWikiServices::newInstance()
#3 /var/www/html/w/includes/Hooks.php(174): MediaWiki\MediaWikiServices::getInstance()
#4 /var/www/html/w/includes/exception/MWExceptionHandler.php(807): Hooks::runner()
#5 /var/www/html/w/includes/exception/MWExceptionHandler.php(336): MWExceptionHandler::logError()
#6 /var/www/html/w/includes/AutoLoader.php(244): MWExceptionHandler::handleError()
#7 /var/www/html/w/includes/AutoLoader.php(244): require(string)
#8 /var/www/html/w/extensions/AbuseFilter/includes/ServiceWiring.php(113): AutoLoader::autoload()
#9 /var/www/html/w/vendor/wikimedia/services/src/ServiceContainer.php(124): require(string)
#10 /var/www/html/w/includes/MediaWikiServices.php(447): Wikimedia\Services\ServiceContainer->loadWiringFiles()
#11 /var/www/html/w/includes/MediaWikiServices.php(285): MediaWiki\MediaWikiServices::newInstance()
#12 /var/www/html/w/includes/Setup.php(322): MediaWiki\MediaWikiServices::getInstance()
#13 /var/www/html/w/maintenance/doMaintenance.php(83): require_once(string)
#14 /var/www/html/w/extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php(117): require_once(string)
#15 {main}


This file does exist in the AbuseFilter/includes/Parser folder. Does anyone know what's going on here?

PMiazga (WMF) (talkcontribs)

It's difficult to tell what could be causing this. Let me ask a couple of questions and offer some suggestions first:

  • AbuseFilter files are autoloaded automatically thanks to composer `AutoloadNamespaces`. Please check if you have the merge-plugin enabled - Composer#Using composer-merge-plugin
  • Did you update/install anything recently? Is it a new set-up/installing new extensions you're trying to finalise, or is it something that worked before but stopped working after update?
  • I assume you already have both AbuseFilter and CirrusSearch extensions enabled (by calling wfLoadExtension() in LocalSettings file).
  • Can you specify the exact version of MediaWiki? Are you on 1.39.10? I tried to run it and it worked for me, so it may be related to a specific version or a version mismatch.
Bawolff (talkcontribs)

To confirm, does MediaWiki work normally (e.g. during web views), and do maintenance scripts in MediaWiki core work fine (e.g. view.php)? What version of AbuseFilter do you have?

75.130.249.175 (talkcontribs)

- I'm on MediaWiki 1.39.5.

- Just enabled the merge plugin; all composer json files should be correct. It should be noted that I installed the plugin using the tarball file, as I can't figure out how to install it from git and have it be compatible with MediaWiki 1.39.

- Both AbuseFilter and CirrusSearch extensions are enabled in LocalSettings.php

- I haven't installed any new plugins recently - I just did a full php re-install to see if that was the problem and it wasn't

- My wiki works normally in web view; scripts like view.php don't work because they run into the same error from AbuseFilter

DCausse (WMF) (talkcontribs)

Could you check if you have multiple versions of AbuseFilter installed?

I wonder if, with multiple versions installed, an old one gets its classes loaded (likely <= 1.36) but the new one gets its ServiceWiring file executed.

Perhaps one way to investigate would be to print the location from which the AbuseFilter classes are loaded:

// Temporary debug output: shows which file the FilterUser class is actually loaded from
$reflector = new \ReflectionClass( 'MediaWiki\Extension\AbuseFilter\FilterUser' );
print("FilterUser class location: " . $reflector->getFileName() . "\n");

You could perhaps put this at the very beginning of /var/www/html/w/extensions/AbuseFilter/includes/ServiceWiring.php?

Bawolff (talkcontribs)

Just to close the loop, this user reports that the issue went away after they downgraded from php 8.4 -> php 8.2

Jonteemil (talkcontribs)

The page says that the search index is updated at least once a day. I've been trying to fix broken files over at Commons that have 0 x 0 px. I used the search fileh:0 filew:0 filetype:image -filemime:image/tiff to find them. However, files I fixed weeks ago are still listed in the results. When will they go away?

DCausse (WMF) (talkcontribs)

Thanks for reporting this. There seems to be a problem in the way CirrusSearch handles these edits; I filed Phab:T342562 to track and fix the issue.

Jonteemil (talkcontribs)

Okay, perfect.

Reply to "Search index update"

How to search the fields of the File information template on Commons?

Prototyperspective (talkcontribs)

Please see this thread. How to search for example for a specific string specifically in the source field?

Also how can one search for files from a specific uploader? (I'd like to check which of my video2commons uploads were imported below resolution at source.)

EBernhardson (WMF) (talkcontribs)

Unfortunately, the image description is simply an argument to a template. CirrusSearch doesn't do anything at that level and can't be that specific. Something like insource:kathmandu would require the wikitext source to have the word kathmandu in it, but it's not a great substitute.


Regarding filtering by uploader, I'm not too familiar with how the P170 there is structured, but with structured data available it seems plausible the appropriate information could be indexed. Today, though, P170 is indexed as a plain statement and does not include any context about it. The best workaround I can offer is that the Information template used on many images renders such that searching for "Author <name>", with the quotes, tends to bring up only pictures from that author.

Prototyperspective (talkcontribs)
  1. I don't know why but the results for insource:"kathmandu" don't seem to show the intended results
  2. The uploader username is not in the structured data
  3. The link you shared only shows original works by that username
  4. So I will create an issue for enabling showing uploads by a particular user (please let me know if this could/should be changed in a tool other than CirrusSearch)
  5. I think the best workaround currently would be to use insource with the field name first; for example, I searched for insource:"|source=[https://soundcloud.com to identify files for c:Category:Audio files from Soundcloud.com. I think easily searching fields of the File pages' Information template could be enabled by
    1. Developing some regex that searches for any content after e.g. |source=
    2. Creating some alias for it so instead of writing some complex regex query every time one can simply enter e.g. info-source:"soundcloud.com"
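The regex idea in the proposal above can be sketched offline as follows. This is only an illustration in Python that runs on a wikitext string you already have; it is not how CirrusSearch works internally, and the function name `extract_info_field` and the sample wikitext are hypothetical. It assumes each template field starts on its own line with `|name=`.

```python
import re
from typing import Optional


def extract_info_field(wikitext: str, field: str) -> Optional[str]:
    """Return the raw value of |field= from a template, or None if absent.

    Naive assumption: every field starts on its own line with "|name=",
    and the value ends at the next line starting with "|" or "}}".
    """
    pattern = re.compile(
        r"^\|\s*" + re.escape(field) + r"\s*=(.*?)(?=^\s*\||^\s*\}\})",
        re.IGNORECASE | re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(wikitext)
    return match.group(1).strip() if match else None


# Hypothetical file page wikitext, for illustration only.
sample = """{{Information
|description={{en|1=A recording}}
|source=[https://soundcloud.com/example/track SoundCloud]
|author=Example
}}"""

source = extract_info_field(sample, "source")
print(source)
print(source is not None and "soundcloud.com" in source)
```

A hypothetical alias such as info-source:"soundcloud.com" would then amount to checking whether the extracted value contains the given string. Nested templates that put a `|` at the start of a line would break this simple pattern.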
Keith D (talkcontribs)

{{user|keith_d}}

A problem with searching the Information template fields for things like author is that author also appears in the {{Credit line}} template, and the two could be different.

Prototyperspective (talkcontribs)

I first misunderstood what you were saying but understood it via your comment in your proposal. That may be an issue for other templates, but I think in that case it doesn't matter because it would also contain the same author name, so it would even be best if both fields are searched (actually it would be a problem if it doesn't search both fields).

Reply to "How to search the fields of the File information template on Commons?"

Searching talk pages that use Structured Discussions

HaeB (talkcontribs)

Is it possible to use CirrusSearch to search (the topic pages of) a particular talk page that uses Structured Discussions (like this one)? I.e. restrict search results to only topics from that talk page.

Pppery (talkcontribs)

No.

Tacsipacsi (talkcontribs)

Unfortunately, it’s not possible to search Structured Discussions pages using CirrusSearch at all, with or without constraining the search to a particular page. This is one of the many reasons for which Structured Discussions is deprecated and to be replaced with DiscussionTools.

Reply to "Searching talk pages that use Structured Discussions"

Abuse filter logs on plwikiquote

Ferien (talkcontribs)

I'm not really sure why this is occurring, but I'm pretty sure this isn't supposed to happen in abuse filter logs to this extent.

Tacsipacsi (talkcontribs)

I’m pretty sure plwikiquote shouldn’t block the account from being created (filter 3). I don’t think CirrusSearch is really at fault here – it just tries to create its account on first use. (Since it doesn’t succeed, the next time also counts as the first use. And the next one. And so on.)

Ferien (talkcontribs)

Thanks, I didn't know why it was occurring or what abuse filter it was relating to as I can't understand the language.

DCausse (WMF) (talkcontribs)

Thanks, I reported phab:T373778 to have a closer look into it.

Reply to "Abuse filter logs on plwikiquote"
Beland (talkcontribs)
Pppery (talkcontribs)

That's intentionally blank, as a result of an untidy refactoring in 2015 that's not worth fixing now. This page uses Structured Discussions, which doesn't have the concept of archiving, and instead uses an infinite scroll system.

Beland (talkcontribs)

Aha! Hmm, that seems somewhat poor. There's no indication in the UI that scrolling down to the bottom of the page and staying there will show more threads, and there's no apparent facility for searching the entire history of the page? The URL I loaded was:

https://www.mediawiki.org/wiki/Help_talk:CirrusSearch#Relevance_52070

It seems like that should take me to the thread if it's on the page, but I can't tell if it is or isn't, and searching on my username doesn't really work because some threads are collapsed. -- Beland (talk) 03:40, 25 June 2024 (UTC)

Pppery (talkcontribs)

That would be Topic:S8cojikw0xzel2u8 (found via your contributions). The URL seems to have dated from way back when LiquidThreads was involved, and stopped working in 2015 when this page was migrated.

The chance of this getting fixed, realistically, is zero, since both discussion systems are deprecated and going to be removed someday.

Beland (talkcontribs)

Ah, whew, I was a bit worried these were going to be spreading to other wikis. 8)

Reply to "Archive broken?"
2001:14BA:9CD6:4200:D43C:5ABA:9AD8:104 (talkcontribs)
2001:14BA:9CD6:4200:D43C:5ABA:9AD8:104 (talkcontribs)

That ping didn't work so I'll try again: User:JWBTH

JWBTH (talkcontribs)

Done, thanks for pointing that out.

Reply to "Update en-wiki"

Automatically jump to first result

Aschroet (talkcontribs)
TheDJ (talkcontribs)

Multiple keyword searches

Seeker1030 (talkcontribs)

Hi, how can I search using multiple keywords? For example, for "Libra ascendant born on 1965", how would we search for these parameters?

Speravir (talkcontribs)

Simply by typing libra ascendant born 1965 into the search form (I assume "on" is a so-called stop word). If there are dedicated categories for a topic, you could also use the filter keyword incategory, e.g. ascendant libra incategory:"1965 births".
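For anyone who wants to run such a query from a script rather than the search form, here is a minimal Python sketch. It only builds the query string and a request URL for the MediaWiki Action API (`action=query&list=search`), without sending anything. The helper names `build_query` and `search_url` are hypothetical, and the mediawiki.org endpoint is an assumption; substitute your own wiki's api.php.

```python
from urllib.parse import urlencode


def build_query(keywords, category=None):
    """Join plain keywords and optionally append an incategory: filter."""
    parts = list(keywords)
    if category is not None:
        # Quote the category name because it may contain spaces.
        parts.append(f'incategory:"{category}"')
    return " ".join(parts)


def search_url(query, api="https://www.mediawiki.org/w/api.php"):
    """Build (but do not send) a full-text search request URL.

    The default endpoint is an assumption for illustration only.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "format": "json",
    }
    return f"{api}?{urlencode(params)}"


query = build_query(["ascendant", "libra"], category="1965 births")
print(query)
print(search_url(query))
```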

Reply to "Multiple keyword searches"
Pirhayati (talkcontribs)

Hi. Is it possible to do a fuzzy search for two words with another word between them, while excluding the exact sequence of the two words? For example, I want "flowers for Algernon" to be in the results but not "flowers Algernon".

DCausse (WMF) (talkcontribs)

Hi,

Unfortunately no, but you could approximate it by using a negation: "flowers Algernon"~1 NOT "flowers Algernon".

The first part would find documents with flowers algernon or flowers for algernon and the second part would exclude documents matching flowers algernon.

In the end you might find pages that have occurrences of flowers for algernon, but not all of them. If a page has both forms, flowers for algernon and flowers algernon, it would be excluded.
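To make the limitation concrete, here is a simplified Python model of the two parts of that query. It is only a sketch: real CirrusSearch proximity matching involves analysis steps such as stemming and has richer slop semantics, and the function names are hypothetical. It treats "~1" as "at most one word between the two terms, in order".

```python
import re


def tokens(text):
    """Lowercase word tokens; a crude stand-in for the search analyzer."""
    return re.findall(r"\w+", text.lower())


def has_pair(words, first, second, max_gap):
    """True if `second` follows `first` with at most `max_gap` words between."""
    for i, word in enumerate(words):
        if word == first and second in words[i + 1 : i + 2 + max_gap]:
            return True
    return False


def matches_approximation(text):
    """Model of: "flowers Algernon"~1 NOT "flowers Algernon"."""
    words = tokens(text)
    near = has_pair(words, "flowers", "algernon", max_gap=1)
    exact = has_pair(words, "flowers", "algernon", max_gap=0)
    return near and not exact


print(matches_approximation("Flowers for Algernon is a novel"))
print(matches_approximation("A page about flowers Algernon"))
print(matches_approximation("Flowers for Algernon, also indexed as flowers Algernon"))
```

In this model the first page matches, the second does not, and the third is excluded even though it contains the wanted phrase, because the negation applies to the whole page.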

Pirhayati (talkcontribs)

Thank you. It works for me.

DCausse (WMF) (talkcontribs)

Hi, I saw that you contacted me on IRC but I responded too late.

We don't have immediate plans to improve this kind of query and implement the feature you need, so I would suggest filing a new ticket at https://phabricator.wikimedia.org/ (tagging CirrusSearch) to describe your use case.

Thanks!

Reply to "fuzzy search"