Wikidata:Property proposal/Imagehash difference hash
Imagehash difference hash
Originally proposed at Wikidata:Property proposal/Sister projects
Description | Imagehash difference hash is a hash that tells whether two images look nearly identical. |
---|---|
Represents | Imagehash difference hash (Q124969714) |
Data type | String |
Domain | mediainfo (Commons only) |
Allowed values | [a-z\d]{16} |
Example 1 | M68454019 → e1f1c6c4c4c0e9ca |
Example 2 | M68456558 → b0e47a8ac4c6c4c8 |
Example 3 | M68455617 → ecede1f06d23fc47 |
Example 4 | M68456184 → 92d8c1c491ccc0e8 |
Source | |
Planned use | First I would populate hash values for photos uploaded by user:FinnaUploadBot, but in general the hash could be added to all Commons files |
Number of IDs in source | Currently there are 100M files in Commons, and the checksum can be calculated for all photos |
Expected completeness | eventually complete (Q21873974) |
Robot and gadget jobs | The checksum should be generated by a bot |
See also |
|
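As a quick sanity check of the "Allowed values" row, the sketch below tests the four example values from the table against the listed pattern. It assumes the pattern is meant to match the whole value (hence `fullmatch`); the helper name `is_allowed` is hypothetical and not part of the proposal.

```python
import re

# Pattern copied from the "Allowed values" row of the proposal.
ALLOWED = re.compile(r"[a-z\d]{16}")

# Example values copied from the proposal table (MediaInfo ID -> hash).
EXAMPLES = {
    "M68454019": "e1f1c6c4c4c0e9ca",
    "M68456558": "b0e47a8ac4c6c4c8",
    "M68455617": "ecede1f06d23fc47",
    "M68456184": "92d8c1c491ccc0e8",
}


def is_allowed(value: str) -> bool:
    """Return True if the whole value matches the allowed pattern.

    Assumption: the constraint applies to the full string, not a substring.
    """
    return ALLOWED.fullmatch(value) is not None


# Map each MediaInfo ID to whether its example hash passes the check.
results = {media_id: is_allowed(value) for media_id, value in EXAMPLES.items()}
```

Note that the pattern as written also accepts the letters g to z, which a hexadecimal hash would never contain, so it is looser than the example values require.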
Motivation
Same as with the pHash proposal: I am using pHash and dHash checksums to detect duplicate photos in Commons. I am also using pHashes and dHashes to confirm whether photos in the Commons and Finna repositories are the same. However, it would be useful if the hashes could be shared so that any user could query them. Pre-generated perceptual hashes of files could also be fetched from SDC as a list without downloading the actual files. However, as there is slight wobbling in the hashes (because of scaling and compression), matching is much more robust when false negatives/false positives are filtered out with a second hash calculated using a different method, so I would add the dHash to the uploaded files too. --Zache (talk) 11:51, 18 March 2024 (UTC)
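The two-hash matching described above might look roughly like the following sketch. It assumes the stored values are 64-bit hashes written as 16 hexadecimal characters and that "nearly identical" means a small Hamming distance. The function names and the `max_bits` threshold are hypothetical illustrations, not values taken from the proposal or from the bot's actual code.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hexadecimal hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def probably_same_image(phash_a: str, phash_b: str,
                        dhash_a: str, dhash_b: str,
                        max_bits: int = 4) -> bool:
    """Treat two files as a match only if BOTH hashes are close.

    max_bits is an assumed tolerance for the "wobbling" caused by
    scaling and compression; a real bot would need to tune it.
    """
    return (hamming_distance(phash_a, phash_b) <= max_bits
            and hamming_distance(dhash_a, dhash_b) <= max_bits)
```

Requiring both distances to be small means that a chance near-collision in one hash alone is not enough to report a duplicate, which is the reason given above for storing the dHash alongside the pHash.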
Discussion
@Abbe98, Multichill, Jura1, Tinker Bell: pinging for attention, as you were interested in pHash. Regards, ZI Jony (Talk) 17:02, 27 March 2024 (UTC)
- Support Complements pHash for additional reliability in matching images. Ipr1 (talk) 21:16, 27 March 2024 (UTC)
- Support --Tinker Bell ★ ♥ 01:22, 28 March 2024 (UTC)
- @Zache, Ipr1, Tinker Bell: Done: Imagehash difference hash (P12563). Regards Kirilloparma (talk) 03:19, 30 March 2024 (UTC)
- Thanks! Ipr1 (talk) 03:22, 30 March 2024 (UTC)
- You're welcome! Regards Kirilloparma (talk) 15:58, 30 March 2024 (UTC)