TinEye: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 08:48, 24 December 2022

TinEye
Type of site: Image search engine
Available in: Multilingual
Owner: Idée, Inc.
URL: tineye.com
Commercial: Yes
Registration: Optional
Launched: May 6, 2008
Current status: Active

TinEye is a reverse image search engine developed and offered by Idée, Inc., a company based in Toronto, Ontario, Canada. According to the company, it is the first image search engine on the web to use image identification technology rather than keywords, metadata or watermarks.[1] Users search by submitting an image instead of typing keywords. Upon receiving an image, TinEye creates a "unique and compact digital signature or fingerprint" of it and matches that fingerprint against other indexed images.[1] This procedure can match even heavily edited versions of the submitted image, but it does not usually return images that are merely similar.[1]

History

Idée, Inc. was founded by Leila Boujnane and Paul Bloore in 1999. Idée launched the service on May 6, 2008, and it went into open beta in August of that year.[2][3] While computer vision and image identification research projects began as early as the 1980s,[4] the company claims that TinEye is the first web-based image search engine to use image identification technology. The service was created with copyright owners and brand marketers as the intended user base: the former to look up unauthorized use of their work, the latter to track where their brands appear.[5]

In June 2014, TinEye claimed to have indexed more than five billion images for comparisons.[6] However, this is a relatively small proportion of the total number of images available on the World Wide Web.[7]

As of June 2022, TinEye reported having over 54.3 billion images indexed for comparison.[8]

Technology

A user uploads an image to the search engine (the upload size is limited to 20 MB) or provides a URL for an image or for a page containing the image. The search engine looks up other uses of the image on the internet, including modified images based upon it, and reports the date and time at which they were posted. TinEye does not recognize outlines of objects or perform facial recognition; it recognizes the entire image and some altered versions of it, including smaller, larger, and cropped versions. TinEye has also shown itself capable of retrieving different images of the same subject, such as famous landmarks, from its database.[9]

TinEye can search for images in the JPEG, PNG, WebP, GIF, BMP and TIFF formats.[10]

Results generated from TinEye include the total number of matches in their database, a preview image, and the URL to each match. TinEye can sort results by best match, most changed, biggest image, newest, and oldest.

User registration is optional and offers storage of the user's previous queries. Other features include embeddable widgets and bookmarklets. TinEye has also released a commercial API.

Algorithm

Although TinEye does not disclose the algorithm used in its reverse image search, various methods for image matching exist. Perceptual hashing is a method of creating a hash that describes an image and is tolerant of changes to it. Several methods of creating such a hash exist, including radial variance, block mean value, Marr/Hildreth edge detection, and DCT-based hashing.[11] Below is a simple algorithm from The Hacker Factor Blog for average hashing, which works by comparing each pixel to the average color of the image:[12]

  1. Reduce size: The fastest way to remove high frequencies and detail is to shrink the image. In this case, shrink it to 8x8 so that there are 64 total pixels. Don't bother keeping the aspect ratio; just crush it down to fit an 8x8 square. This way, the hash will match any variation of the image, regardless of scale or aspect ratio.
  2. Reduce color: The tiny 8x8 picture is converted to grayscale. This reduces the data from 64 pixels with three components each (64 red, 64 green, and 64 blue) to 64 total values.
  3. Average the colors: Compute the mean value of the 64 colors.
  4. Compute the bits: Each bit is set based on whether the color value is above or below the mean.
  5. Construct the hash: Set the 64 bits into a 64-bit integer. The order does not matter, as long as you are consistent.
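
The five steps above can be sketched in Python roughly as follows. This is only an illustration of average hashing, not TinEye's undisclosed algorithm. Several details are assumptions made for the example rather than anything stated in the source: the function name average_hash, the input format (a list of rows of (r, g, b) tuples), box averaging as the shrinking method, the common 0.299/0.587/0.114 luminance weights for the grayscale conversion, treating values equal to the mean as 0 bits, and packing bits row by row with the first pixel as the most significant bit.

```python
def average_hash(pixels, size=8):
    """Return a (size*size)-bit average hash as an int.

    `pixels` is assumed to be a non-empty list of rows, each row a list of
    (r, g, b) tuples with components in 0..255 (a hypothetical input format).
    """
    height, width = len(pixels), len(pixels[0])
    gray = []
    for j in range(size):
        # Steps 1: shrink to size x size by averaging blocks, ignoring aspect ratio.
        y0 = j * height // size
        y1 = max((j + 1) * height // size, y0 + 1)
        for i in range(size):
            x0 = i * width // size
            x1 = max((i + 1) * width // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            n = len(block)
            r = sum(p[0] for p in block) / n
            g = sum(p[1] for p in block) / n
            b = sum(p[2] for p in block) / n
            # Step 2: reduce color (luminance weights are an assumption).
            gray.append(0.299 * r + 0.587 * g + 0.114 * b)

    # Step 3: average the 64 values.
    mean = sum(gray) / len(gray)

    # Steps 4 and 5: one bit per value, packed in a consistent order.
    bits = 0
    for value in gray:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


# Example: a 16x16 image whose left half is white and right half is black.
WHITE, BLACK = (255, 255, 255), (0, 0, 0)
example = [[WHITE] * 8 + [BLACK] * 8 for _ in range(16)]
print(f"{average_hash(example):016x}")  # prints f0f0f0f0f0f0f0f0
```

Because the image is crushed to a fixed 8x8 grid before any comparison is made, a stretched or enlarged copy of the example image (for instance 32 pixels wide and 24 high) produces the same hash in this sketch.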

The resulting hash should be something like 8f373714acfcf4d0. To compare two images, construct the hash from each image and count the number of bit positions that are different. This is a Hamming distance. A distance of zero indicates that it is likely a very similar picture or a variation of the same picture. A distance of 5 means a few things may be different, but they are probably still close enough to be similar. A distance of 10 or more is a probable indication that the images are different.
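
The comparison described above can be sketched as follows, assuming each hash is held as a non-negative Python integer. The names hamming_distance and describe_match are hypothetical. The thresholds of 0, 5, and 10 come from the paragraph above; the text gives no guidance for distances of 6 to 9, so the "inconclusive" label for that range is an assumption.

```python
def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    # XOR leaves a 1 exactly where the two hashes disagree.
    return bin(hash_a ^ hash_b).count("1")


def describe_match(distance: int) -> str:
    """Map a distance to the rough interpretation given in the text."""
    if distance == 0:
        return "likely the same picture or a variation of it"
    if distance <= 5:
        return "probably similar"
    if distance >= 10:
        return "probably different"
    return "inconclusive"  # 6 to 9: not covered by the text (assumption)


# Example using the sample hash from the text and a copy with 3 bits flipped.
original = 0x8F373714ACFCF4D0
altered = original ^ 0b10110
distance = hamming_distance(original, altered)
print(distance, describe_match(distance))  # prints: 3 probably similar
```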

Usage

TinEye's ability to search the web for specific images (and modifications of those images) makes it a potential tool for copyright holders of visual works to locate infringements of their copyright. It also offers a possible avenue for people who wish to use imagery classed as orphan works to find the copyright holders of that imagery. Because orphan works can be defined as "copyrighted works whose owners are difficult or impossible to identify and/or locate,"[13] the use of TinEye could potentially remove the orphan work status from online images that can be found in its database.

References

  1. "TinEye Reverse Image Search". tineye.com. Retrieved November 1, 2022.
  2. "Releases". tineye.com. Archived from the original on July 17, 2011. Retrieved February 21, 2013.
  3. Claburn, Thomas (August 18, 2008). "TinEye Image Search Finds Copyright Infringers". InformationWeek. Retrieved September 28, 2014.
  4. Szeliski, Richard (2010). Computer Vision: Algorithms and Applications. Springer Publishing. p. 832. ISBN 9781848829343.
  5. George-Cosh, David (n.d.). "Idée's TinEye next frontier in Web searches" (PDF). National Post. Retrieved February 11, 2010.
  6. "Retrieved 2014-07-01". tineye.com. Retrieved July 1, 2014.
  7. "Flickr hosts 5bn images as at Sep 10 – Retrieved 2011-04-06". Royal.pingdom.com. Archived from the original on July 12, 2018. Retrieved February 21, 2013.
  8. "TinEye Reverse Image Search". tineye.com. Retrieved June 5, 2022.
  9. Elias, Jean-Claude (December 11, 2009). Search by photo. The Jordan Times. Retrieved February 19, 2010, from Factiva database.
  10. "TinEye Developer Documentation". services.tineye.com. Retrieved June 5, 2022.
  11. Christoph Zauner (July 2010). Implementation and benchmarking of perceptual image hash functions.
  12. "Tools, Techniques, and Tangents". Dr. Neal Krawetz.
  13. Yeh, B. (February 1, 2010). "Orphan works" in copyright law. Congressional Research Service. Retrieved February 19, 2010, from Factiva database.