Author: richardg_uk
Description:
At present, the API exposes the number of jobs in the job queue but not the delay caused by queuing. The age of the oldest job would often be more helpful to know, at least for editors, who can be concerned or confused when they see categories unchanged for some time after pages are edited.
The API, through ApiQuerySiteinfo::appendStatistics() and SiteStats::jobs(), already exposes the estimated number of queued jobs:
http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=statistics
-> <api><query><statistics ... jobs="918518" /></query></api>
Since MediaWiki version 1.19, the job table has included a job_timestamp field. The field is already indexed. Therefore exposing MIN(job.job_timestamp) as an additional API output should be easy and efficient.
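To illustrate the lookup, here is a minimal sketch in Python, using an in-memory SQLite table as a stand-in for the real job table. The reduced column set, the index name and the helper names (make_demo_queue, oldest_job_timestamp) are assumptions for illustration only; MediaWiki itself would run the query through its own PHP database layer. It also assumes job_timestamp holds 14-digit YYYYMMDDHHMMSS strings, which sort chronologically as plain text.

```python
import sqlite3


def make_demo_queue(timestamps):
    """Build a throwaway in-memory table that loosely mimics the job table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE job ("
        " job_id INTEGER PRIMARY KEY,"
        " job_cmd TEXT NOT NULL,"
        " job_timestamp TEXT)"
    )
    # An index on the timestamp column lets MIN() be answered from the index.
    conn.execute("CREATE INDEX job_timestamp_idx ON job (job_timestamp)")
    conn.executemany(
        "INSERT INTO job (job_cmd, job_timestamp) VALUES ('refreshLinks', ?)",
        [(ts,) for ts in timestamps],
    )
    return conn


def oldest_job_timestamp(conn):
    """Return MIN(job_timestamp), or None when the queue is empty."""
    row = conn.execute("SELECT MIN(job_timestamp) FROM job").fetchone()
    return row[0]


if __name__ == "__main__":
    demo = make_demo_queue(["20121220110011", "20121219105959"])
    print(oldest_job_timestamp(demo))
```

An empty queue yields None here, so a caller would need to decide what to report when there is no oldest job.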
An alternative or additional new statistic would be the queue duration, i.e.:
time() - MIN(job.job_timestamp)
This relative measure would be more suitable for graphing, especially if the site statistics are aggressively cached (since it would typically be more stable than the absolute timestamp during a caching interval).
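The calculation above could be sketched as follows in Python. The function names are hypothetical, and the sketch assumes the timestamp is a 14-digit YYYYMMDDHHMMSS string in UTC. Returning 0 for an empty queue, and clamping negative ages that could arise from clock skew, are design assumptions rather than part of the request.

```python
from datetime import datetime, timezone


def parse_mw_timestamp(ts):
    """Parse a 14-digit YYYYMMDDHHMMSS timestamp as a UTC datetime."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def queue_duration_seconds(oldest_ts, now=None):
    """Return the age of the oldest job in whole seconds.

    oldest_ts is the MIN(job_timestamp) value, or None for an empty queue.
    now can be passed in explicitly, which keeps the function testable.
    """
    if oldest_ts is None:
        return 0  # assumption: an empty queue is reported as zero delay
    if now is None:
        now = datetime.now(timezone.utc)
    age = (now - parse_mw_timestamp(oldest_ts)).total_seconds()
    return max(0, int(age))  # clamp in case of clock skew


if __name__ == "__main__":
    sample_now = datetime(2012, 12, 20, 11, 0, 11, tzinfo=timezone.utc)
    print(queue_duration_seconds("20121219105959", sample_now))  # 86412
```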
The API could then return something like:
http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=statistics
-> <api><query><statistics ... jobs="918518" joboldesttime="2012-12-19T10:59:59Z" joboldestseconds="86412" /></query></api>
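For a sense of how a client (a graphing script, say) might consume such a response, here is a rough Python sketch. The joboldesttime and joboldestseconds attributes are the ones proposed in this request and do not exist in the current API, so the code treats them as optional; the function name read_queue_statistics is made up for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Sample response with only the attributes relevant here.
SAMPLE = (
    '<api><query><statistics jobs="918518" '
    'joboldesttime="2012-12-19T10:59:59Z" joboldestseconds="86412" />'
    "</query></api>"
)


def read_queue_statistics(xml_text):
    """Extract the job count and the proposed queue-age fields."""
    stats = ET.fromstring(xml_text).find("./query/statistics")
    if stats is None:
        raise ValueError("no <statistics> element in response")
    oldest = stats.get("joboldesttime")       # proposed, may be absent
    seconds = stats.get("joboldestseconds")   # proposed, may be absent
    return {
        "jobs": int(stats.get("jobs", 0)),
        "oldest": (
            datetime.strptime(oldest, "%Y-%m-%dT%H:%M:%SZ").replace(
                tzinfo=timezone.utc
            )
            if oldest
            else None
        ),
        "seconds": int(seconds) if seconds is not None else None,
    }


if __name__ == "__main__":
    print(read_queue_statistics(SAMPLE))
```

Against a server that lacks the new attributes, the sketch returns None for both queue-age fields instead of failing.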
(Incidentally, the queue duration might be a useful or at least interesting additional metric for Ganglia, since it would help to distinguish a pathological backlog from high throughput.)
Version: 1.21.x
Severity: enhancement