Activate write API on production wikis
Closed, ResolvedPublic

Description

What does it take to activate it?


Version: unspecified
Severity: enhancement

Details

Reference
bz14210

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 10:08 PM)
bzimport set Reference to bz14210.
bzimport added a subscriber: Unknown Object (MLST).

Some favorable reports from testers trying it out on test.wikipedia would be nice.

Bryan.TongMinh wrote:

My basic edit test using mwclient http://mwclient.sourceforge.net/ works. See http://mwclient.svn.sourceforge.net/viewvc/mwclient/tests/basic_edit_test.py?view=log for the test routine.

This test routine contains:

  • Page creation
  • Old file deletion
  • File deletion
  • Page deletion

We need upload support I think before continuing.

(In reply to comment #2)

We need upload support I think before continuing.

I agree that upload support would be good to have, but I don't agree that we absolutely *need* it before going live: we can develop it later and disable it at WMF ($wgAPIModules['upload'] = 'ApiDummy'; where ApiDummy is a class that always throws an error message) until we've tested it.

As to your script: I can't really read Python, but judging by the messages it prints it looks like it does what you said.

I also think that an extra check should be used to ensure data integrity. For example, if you send the edit token early (i.e., before the actual body of the edit) and transmission of the body is interrupted for some reason, the truncated edit will still get committed. This doesn't happen in normal browser edits, as most browsers send POST arguments in the order in which they appear on the rendered page, and the edit token appears after the body; so, if transmission is interrupted, the edit token won't be sent, and the bad edit won't be committed.

... however, when it comes to the API, most edit requests will be coming from developers who *might* stick the edit token after the payload, but could realistically make it appear anywhere. It might be an idea to explicitly request that it be sent after the body; or, alternatively, add an extra argument that carries a checksum of the payload (e.g., an MD5 or SHA-1 hash) so that if the payload doesn't match the checksum, the edit is discarded with an error. The downside to the latter method would be that clients without MD5/SHA-1 support would have to be modified to integrate the hash algorithms before they could edit.

--slakr

(In reply to comment #4)

... however, when it comes to the API, most edit requests will be coming from
developers who *might* stick the edit token after the payload, but could
realistically make it appear anywhere.

If that's the case those developers should ensure that token= is sent after text=. It's really not that hard to come up with or to implement.
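As a rough illustration of the ordering suggested above, here is a minimal Python sketch that builds a POST body with token= after text=. The function name, page title, and token value are made up for the example; only the parameter names text and token come from this thread. It passes a list of pairs instead of a dict, so the order does not depend on how the language's associative array behaves.

```python
from urllib.parse import urlencode


def build_edit_body(title, text, token):
    """Return a form-encoded POST body with the token as the last field.

    Hypothetical helper for illustration; not part of mwclient or MediaWiki.
    """
    # A list of (name, value) pairs keeps the order explicit;
    # urlencode emits the fields in exactly this sequence.
    fields = [
        ("action", "edit"),
        ("format", "json"),
        ("title", title),
        ("text", text),
        ("token", token),  # last on purpose: a truncated body loses the token
    ]
    return urlencode(fields)


body = build_edit_body("Sandbox", "Hello, world", "abc123+\\")
print(body)
```

If the connection drops partway through such a body, the token is the part most likely to be missing, so the server has no valid token to act on.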

This is going off on a tangent, really, but I wonder if it wouldn't be too much trouble to require some type of checksum parameter (MD5? SHA? CRC?) in the edit API? That would immediately catch most types of edit corruption, even things like charset issues or broken proxies that the current checks don't, and wouldn't rely on details like parameter ordering. The downside, of course, would be that clients would have to compute the checksum, but surely any decent programming language these days has functions to compute a simple hash, right?

(In reply to comment #6)

This is going off on a tangent, really, but I wonder if it wouldn't be too much
trouble to require some type of checksum parameter (MD5? SHA? CRC?) in the edit
API? That would immediately catch most types of edit corruption, even things
like charset issues or broken proxies that the current checks don't, and
wouldn't rely on details like parameter ordering. The downside, of course,
would be that clients would have to compute the checksum, but surely any decent
programming language these days has functions to compute a simple hash, right?

Added md5 parameter in r35473
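For client authors, a minimal Python sketch of how the md5 parameter might be filled in. It assumes the hash is the hex MD5 digest of the UTF-8 encoded text; the helper names are hypothetical, and payload_matches only illustrates the kind of comparison a server could make, not MediaWiki's actual code.

```python
import hashlib
from urllib.parse import urlencode


def text_md5(text):
    """Hex MD5 digest of the text, assuming UTF-8 encoding on the wire."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_checked_edit_body(title, text, token):
    """Form-encoded edit request that carries a checksum of the text.

    Hypothetical helper for illustration only.
    """
    fields = [
        ("action", "edit"),
        ("title", title),
        ("text", text),
        ("md5", text_md5(text)),  # lets the receiver detect a damaged payload
        ("token", token),
    ]
    return urlencode(fields)


def payload_matches(received_text, claimed_md5):
    """Illustrative receiver-side check: recompute the hash and compare."""
    return text_md5(received_text) == claimed_md5


checked_body = build_checked_edit_body("Sandbox", "hello", "abc123")
print(checked_body)
```

With the checksum present, a truncated or re-encoded text no longer matches the claimed hash, regardless of where the token appears in the request.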

(In reply to comment #5)

(In reply to comment #4)

... however, when it comes to the API, most edit requests will be coming from
developers who *might* stick the edit token after the payload, but could
realistically make it appear anywhere.

If that's the case those developers should ensure that token= is sent after
text=. It's really not that hard to come up with or to implement.

methinks in some languages that's not trivial, as often it's not defined that the order in an associative array (hash) is maintained.

(In reply to comment #8)

methinks in some languages that's not trivial, as often it's not defined that
the order in an associative array (hash) is maintained.

Well at least we now have the md5 parameter (see comment #7), which should solve that issue.

matthew.britton wrote:

Any progress on getting API editing into a sufficiently usable state for Wikimedia use? Seems that this would reduce the bandwidth demands of many bots and editing tools.

(In reply to comment #10)

Any progress on getting API editing into a sufficiently usable state for
Wikimedia use? Seems that this would reduce the bandwidth demands of many bots
and editing tools.

It's being slowly worked upon. So far it's still very immature.

http://noc.wikimedia.org/conf/highlight.php?file=CommonSettings.php shows "#$wgEnableAPI = false;", i.e. the line is commented out, which means that it is using the setting in DefaultSettings.php, which is $wgEnableAPI = true;. Not enabled for private/fishbowl wikis.

Hence: RESOLVED.

mountainblueallah wrote:

*** Bug 14863 has been marked as a duplicate of this bug. ***