
Implement ability to get page title of a deleted revision by its rev_id (and use it!)
Closed, Resolved (Public)

Description

I'm working on a cross-WMF-wiki spam list with a few stewards. The data is badly corrupted because the original developer (Beetstra) stored it in the database as ISO Latin instead of plain UTF-8. We've been trying to correct it, but we've hit a problem: there is no way to get the page title of a revision that has been deleted. (End of "my life" message.)

In the user interface, Special:Undelete identifies a revision by timestamp rather than by rev ID and requires a page title; given both, it lets you view the revision's content and comment.

In the API, &list=deletedrevs provides similar information.

The problem is that some links carry nothing more than a rev_id (especially shortened ones, "/wiki/?oldid=12345" for instance).

It should be possible to get the page title and perhaps other properties of a deleted rev by providing only its id.
It could also prove useful to change Special:Undelete to use rev IDs instead of timestamps, since timestamps seem somewhat "messy" (two edits to the same page can always occur within the same second).


Version: unspecified
Severity: enhancement

Details

Reference
bz23489

Event Timeline

bzimport raised the priority of this task to Low. (Nov 21 2014, 10:59 PM)
bzimport set Reference to bz23489.
bzimport added a subscriber: Unknown Object (MLST).

This would also help sysops who get a link to a specific revision of a deleted page (or to a deleted revision) without a &title= in the URL. It's nearly impossible for that sysop to then restore that revision or page using the GUI.

For example the following URL:
http://commons.wikimedia.org/w/index.php?diff=40130742&oldid=40130724

How would I do that without knowing that the title is "Test12"?

Krinkle

ayg wrote:

There's no index on ar_rev_id. If this were added, the bug should be easy to fix.

Because the content is deleted, the page name should only be visible if the current user is a sysop.
In that case the page title can be queried from the 'archive' table by looking up ar_rev_id and/or ar_page_id.

Whenever a "diff" parameter is present, the row whose ar_rev_id matches "diff" should be used rather than the one matching "oldid", since the page could have been renamed between the two revisions.

Slightly, but not much, more complicated are URLs that (only) specify "curid".
WHERE ar_page_id=$wgRequest['curid'] ORDER BY ar_timestamp DESC would return the latest known revision and thus its ar_title.
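The lookup order described above (prefer "diff", fall back to "oldid", then to "curid") can be sketched roughly as follows. This is only an illustration against an in-memory SQLite table, not MediaWiki code: the table is a cut-down stand-in for 'archive', the function names are made up, and the sample rows are invented except for the rev IDs and the title "Test12" quoted in this thread. The permission check for sysops is left out.

```python
import sqlite3


def build_demo_db():
    """Create a tiny stand-in for the 'archive' table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archive ("
        "ar_rev_id INTEGER, ar_page_id INTEGER, ar_title TEXT, ar_timestamp TEXT)"
    )
    # The index on ar_rev_id that the earlier comment says is missing.
    conn.execute("CREATE INDEX ar_revid ON archive (ar_rev_id)")
    conn.executemany(
        "INSERT INTO archive VALUES (?, ?, ?, ?)",
        [
            # Page id 777 and the title "Test11" are invented for the demo.
            (40130724, 777, "Test11", "20100510120000"),
            (40130742, 777, "Test12", "20100510120500"),
        ],
    )
    return conn


def deleted_title(conn, diff=None, oldid=None, curid=None):
    """Return the archived title for a request, or None if nothing matches."""
    # Prefer "diff" over "oldid": the page may have been renamed in between.
    for rev_id in (diff, oldid):
        if rev_id is not None:
            row = conn.execute(
                "SELECT ar_title FROM archive WHERE ar_rev_id = ?", (rev_id,)
            ).fetchone()
            if row:
                return row[0]
    # Fall back to the latest archived revision of the page id.
    if curid is not None:
        row = conn.execute(
            "SELECT ar_title FROM archive WHERE ar_page_id = ? "
            "ORDER BY ar_timestamp DESC LIMIT 1",
            (curid,),
        ).fetchone()
        if row:
            return row[0]
    return None
```

With the demo rows, deleted_title(conn, diff=40130742, oldid=40130724) resolves to the title stored for the "diff" revision, which is the behaviour asked for above.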

I'm not sure, but I guess the "deletion / protection log" excerpt that is usually shown on deleted pages could then be shown as well.

ie. to make

all show the same as: http://commons.wikimedia.org/w/index.php?title=Test12 , if the current user has the appropriate undelete-related permissions.

Change 168646 had a related patch set uploaded by Anomie:
API: Split list=deletedrevs into prop=deletedrevisions and list=alldeletedrevisions

https://gerrit.wikimedia.org/r/168646

Change 168646 merged by jenkins-bot:
API: Split list=deletedrevs into prop=deletedrevisions and list=alldeletedrevisions

https://gerrit.wikimedia.org/r/168646
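
For anyone landing here with only a rev ID: as far as I understand, the prop=deletedrevisions module from the change above can be asked about deleted revisions by revision ID, which would cover the original request. A rough sketch of building such a query URL follows. The function name is made up, the api.php endpoint and the chosen drvprop values are assumptions on my part, and the request only returns data for accounts with the right to view deleted history.

```python
from urllib.parse import urlencode


def deleted_revision_query_url(api_base, rev_ids):
    """Build an action=query URL asking prop=deletedrevisions about rev IDs.

    api_base is the wiki's api.php endpoint (an assumption; adjust per wiki).
    """
    params = {
        "action": "query",
        "prop": "deletedrevisions",
        # Multiple values are pipe-separated in the MediaWiki API.
        "revids": "|".join(str(r) for r in rev_ids),
        # Assumed selection of properties; the page title comes back with the page entry.
        "drvprop": "ids|timestamp|user|comment",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


if __name__ == "__main__":
    # Rev IDs taken from the example URL earlier in this task.
    print(deleted_revision_query_url(
        "https://commons.wikimedia.org/w/api.php", [40130742, 40130724]
    ))
```

The sketch only builds the URL; actually sending it needs an authenticated session, which is out of scope here.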