
Commons search seems to have stopped indexing statements since 30 October 2019
Closed, Resolved, Public

Event Timeline

Hmm, in all of the linked pages the version numbers in Elasticsearch match the latest revision ID of the page, so updates are making it through.

Using one of the example pages: https://commons.wikimedia.org/wiki/File:Group_of_Hetepheres_II_and_Meresankh_III-30.1456-IMG_4559-gradient.jpg

We can ask CirrusSearch to build a new document without writing it to Elasticsearch (essentially a test of the pipeline): https://commons.wikimedia.org/w/api.php?action=query&prop=cirrusbuilddoc&titles=File:Group_of_Hetepheres_II_and_Meresankh_III-30.1456-IMG_4559-gradient.jpg

Indeed, the built document does not contain statement_keywords, even though the file has P180=Q74377458. I will look closer into this and see where it has gone wrong.
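
For anyone wanting to repeat this check across several files, here is a minimal sketch in Python. It assumes the API is queried with format=json and that the built document appears under query.pages.<pageid>.cirrusbuilddoc; that response shape, the helper names, and the hand-made sample response are assumptions for illustration, not real API output.

```python
from urllib.parse import urlencode


def build_doc_url(api_url, title):
    """Return a cirrusbuilddoc query URL for one page title."""
    params = {
        "action": "query",
        "prop": "cirrusbuilddoc",
        "titles": title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def pages_missing_field(response, field="statement_keywords"):
    """List titles whose built document lacks `field` or has it empty.

    Assumes the document sits under query.pages.<id>.cirrusbuilddoc.
    """
    missing = []
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        doc = page.get("cirrusbuilddoc", {})
        if not doc.get(field):
            missing.append(page.get("title"))
    return missing


# Hand-made sample mimicking the broken state (not real API output).
sample = {
    "query": {
        "pages": {
            "1": {
                "title": "File:Example.jpg",
                "cirrusbuilddoc": {"title": "Example.jpg"},
            },
        }
    }
}

url = build_doc_url("https://commons.wikimedia.org/w/api.php", "File:Example.jpg")
missing = pages_missing_field(sample)
```

Fetching `url` with any HTTP client and passing the parsed JSON to `pages_missing_field` would flag files whose freshly built document has no statements, without touching the live index.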

Change 550578 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/CirrusSearch@master] Restore CirrusSearchBuildDocumentParse hook

https://gerrit.wikimedia.org/r/550578

Change 550578 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@master] Restore CirrusSearchBuildDocumentParse hook

https://gerrit.wikimedia.org/r/550578

Change 550769 had a related patch set uploaded (by EBernhardson; owner: EBernhardson):
[mediawiki/extensions/CirrusSearch@wmf/1.35.0-wmf.5] Restore CirrusSearchBuildDocumentParse hook

https://gerrit.wikimedia.org/r/550769

Change 550769 merged by jenkins-bot:
[mediawiki/extensions/CirrusSearch@wmf/1.35.0-wmf.5] Restore CirrusSearchBuildDocumentParse hook

https://gerrit.wikimedia.org/r/550769

Mentioned in SAL (#wikimedia-operations) [2019-11-14T00:36:47Z] <ebernhardson@deploy1001> Synchronized php-1.35.0-wmf.5/extensions/CirrusSearch/includes/BuildDocument/BuildDocument.php: T237849: Restore CirrusSearchBuildDocumentParse hook (duration: 00m 54s)

Mentioned in SAL (#wikimedia-operations) [2019-11-14T00:41:06Z] <ebernhardson> T237849 Start CirrusSearch forceSearchIndex.php commonswiki 2019-10-20T00:00:00 - 2019-11-14T01:00:00 pushing into jobqueue

The backfill has completed; this should be resolved.

I can confirm this is resolved. I just made some edits at https://commons.wikimedia.org/w/index.php?title=File:20120922-Collse_Watermolen-042.jpg&action=history and I see the statements in https://commons.wikimedia.org/w/index.php?title=File:20120922-Collse_Watermolen-042.jpg&action=cirrusdump :

statement_keywords
0 "P180=Q2117023"
1 "P7482=Q66458942"
2 "P6216=Q50423863"
3 "P275=Q18195572"
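
To check a dump like this programmatically rather than by eye, a small sketch along these lines might help. It only assumes that each entry is a property=value string as shown above; the helper names are hypothetical, and the list is copied from the dump.

```python
def parse_statement_keywords(keywords):
    """Split 'P180=Q2117023' style entries into (property, value) pairs."""
    pairs = []
    for entry in keywords:
        # Split on the first '=' only, so anything after it stays in the value.
        prop, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"not a property=value entry: {entry!r}")
        pairs.append((prop, value))
    return pairs


def has_statement(keywords, prop, value):
    """Return True if the given property/value pair is among the keywords."""
    return (prop, value) in parse_statement_keywords(keywords)


# Values copied from the cirrusdump output above.
keywords = [
    "P180=Q2117023",
    "P7482=Q66458942",
    "P6216=Q50423863",
    "P275=Q18195572",
]

depicts_indexed = has_statement(keywords, "P180", "Q2117023")
```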

This task can probably be closed. Thanks for fixing.