Client Hints

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Sohom Datta (talk | contribs) at 04:21, 22 September 2024. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Client Hints are a set of HTTP header fields and a JavaScript web application programming interface (API) for proactive content negotiation in the Hypertext Transfer Protocol (HTTP). The client can advertise information about itself through these fields so the server can determine which resources should be included in its response. Initially proposed in 2013 by engineers at Google, Client Hints were later presented as a privacy-preserving alternative to user-agent header strings as part of Google's Privacy Sandbox initiative. The initial design of Client Hints faced pushback from browser vendors due to various privacy concerns. As of May 2024, over 75% of all web users use browsers that support Client Hints. Despite this widespread adoption, privacy researchers have raised concerns that Client Hints are primarily being used by tracking scripts.

Background

In 1992, an extension to HTTP introduced the User-Agent header, which was sent from the client to the server and contained a simple string identifying the name of the client and its version. The header was meant purely for statistical purposes and for tracking down clients that violated the protocol. Since then, User-Agent strings have become increasingly complex and have started to contain significant granular information about the user. This information is often used in browser fingerprinting, allowing sites to passively track users across sites without having to load any JavaScript.[1]

History

The original draft for the client-hint specification was proposed in 2013 by engineers at Google. The specifications became an Internet Engineering Task Force (IETF) draft in November 2015. Subsequently, in 2021, the specification was upgraded to an experimental RFC. Around the same time, the specifications for handling HTTP client hints on the web were published as a draft in a W3C Community Group Report.[2]

In 2020, Google announced its intention to deprecate user-agent (UA) strings as part of its Privacy Sandbox initiative, citing client hints as a privacy-preserving alternative.[1] The initial client-hints proposal had been met with pushback from other browser vendors due to privacy concerns. In 2019, Brave raised concerns about the initial proposal, citing ways in which it could be used to track users on the internet.[3] Mozilla, the company that makes Firefox, initially classified the proposal as harmful, and Apple, the company that makes Safari, also took a negative stance against the proposal.[1] Despite these concerns, Chrome implemented support for HTTP Client Hints in August 2020. Although the deprecation of UA strings was delayed due to the COVID-19 pandemic, the process was completed in February 2023.[1]

Since their initial opposition, Mozilla has updated their stance to neutral and Brave has synchronized its implementation of client hints with that of Chrome.[1] As of May 2024, over 75% of all web users use browsers that support client hints.[2]

Mechanism

The Client Hints protocol defines two entities: a user agent (UA), typically a browser, and a server. These two entities communicate with each other to negotiate what kind of content should be served to the user.[4] The process begins with the server sending the UA a response with an Accept-CH HTTP header, containing a list of the Client Hint HTTP headers that it would like to receive. The UA is then expected to include the requested client hints in each subsequent request, provided it supports those hints. The server uses these headers to decide what kind of content to serve the UA.[2] If the UA does not understand or support a particular client hint, it ignores that hint. When a cacheable response depends on client hint values, the server must list the applicable client hint headers in a separate Vary header sent to the UA.[1] This ensures that caching mechanisms understand that responses can vary based on different client hint values.[5] For client hints that specifically identify a browser, additional random browser identifiers are included as grease in order to prevent protocol ossification around browser sniffing.[6]
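The exchange described above can be sketched from the server's side in Python. This is a minimal illustration under stated assumptions, not an implementation of any particular server: the function names (negotiation_headers, received_hints, brands) and the WANTED_HINTS list are hypothetical, Sec-CH-UA is the brand-list header defined in the User-Agent Client Hints draft,[6] and the regular expression is a simplified stand-in for a full structured-field parser.

```python
import re

# Hints this hypothetical server would like to receive.
# Viewport-Width is the hint used in the Example section; Sec-CH-UA carries the brand list.
WANTED_HINTS = ["Viewport-Width", "Sec-CH-UA"]


def negotiation_headers(wanted=WANTED_HINTS):
    """Build response headers that request hints and tell caches the response varies on them.

    Assumption: every requested hint influences the response, so all of them go into Vary.
    """
    joined = ", ".join(wanted)
    return {"Accept-CH": joined, "Vary": joined}


def received_hints(request_headers, wanted=WANTED_HINTS):
    """Return the requested hints that the UA actually sent.

    Header names are compared case-insensitively; hints the UA ignored are simply absent.
    """
    lowered = {name.lower(): value for name, value in request_headers.items()}
    return {hint: lowered[hint.lower()] for hint in wanted if hint.lower() in lowered}


# Matches one `"brand";v="version"` entry of a Sec-CH-UA value (simplified).
_BRAND_ENTRY = re.compile(r'"((?:[^"\\]|\\.)*)"\s*;\s*v="([^"]*)"')


def brands(sec_ch_ua):
    """Parse a Sec-CH-UA value into a {brand: version} mapping."""
    return dict(_BRAND_ENTRY.findall(sec_ch_ua))


response_headers = negotiation_headers()
hints = received_hints({"viewport-width": "1920", "Accept": "*/*"})
brand_list = brands('"Chromium";v="124", "Not)A;Brand";v="99"')
# Look up only brands the server knows; do not rely on entry order or reject unknown entries.
is_chromium = "Chromium" in brand_list
```

Looking up a known brand by name, rather than matching the whole header or a fixed position in the list, is what the grease entries are meant to encourage: the arbitrary extra brand is tolerated instead of breaking the check.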

For UAs that allow JavaScript, an additional option is available through the navigator.userAgentData JavaScript API. This API enables JavaScript to retrieve the same information as provided by the Client Hints headers.[1] The API separates the data it provides into two types: low-entropy data, such as the platform on which the browser is running and the brand of the browser, and high-entropy data, such as the exact version number of the browser and the model of the device the user is using. Low-entropy data is exposed directly as properties of the API object. High-entropy data, which can uniquely identify the user, must be explicitly requested by calling the API's getHighEntropyValues() function, which allows the browser to ask for user permission or to perform additional checks.[7]

Example

To initiate a content negotiation, an HTTP server appends the Accept-CH header to the response to an HTTP request:

HTTP/1.1 200 OK
...
Accept-CH: Viewport-Width
...

If the user-agent supports the viewport width client hint, the user-agent will append the Viewport-Width header to every subsequent request:

GET /gallery HTTP/1.1
...
Viewport-Width: 1920
...

The server can then use the information in the Viewport-Width header to make a decision about the kind of content to serve the user-agent. For example, if the server has a particular image that is extremely large, the server can be configured to return a smaller image if the original does not fit the viewport.[8]
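The image decision described above might look like the following Python sketch. The variant widths and paths, and the function name pick_variant, are hypothetical and chosen only for illustration; falling back to the largest image when no usable hint is present is likewise an assumption, not something the specification prescribes.

```python
# Hypothetical image variants as (width in CSS pixels, path), sorted by ascending width.
VARIANTS = [
    (480, "/img/photo-480.jpg"),
    (1024, "/img/photo-1024.jpg"),
    (2560, "/img/photo-2560.jpg"),
]


def pick_variant(request_headers, variants=VARIANTS):
    """Choose the widest variant that still fits the advertised viewport width."""
    largest = variants[-1][1]
    try:
        width = int(request_headers.get("Viewport-Width"))
    except (TypeError, ValueError):
        # Hint missing or malformed: serve the original, largest image.
        return largest
    if width <= 0:
        return largest
    fitting = [path for variant_width, path in variants if variant_width <= width]
    # If even the smallest variant is wider than the viewport, serve the smallest.
    return fitting[-1] if fitting else variants[0][1]
```

With the request from the example (Viewport-Width: 1920), this sketch would select the 1024-pixel variant, since the 2560-pixel image does not fit the viewport.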

Privacy concerns

When the client-hints proposal was originally published, it was met with significant privacy concerns. Browser vendors like Brave and Mozilla pointed out that a particular provision in the initial draft of the proposal allowed websites to instruct the browser to provide Client-Hint data to third-party domains. Third-party domains are domains that do not execute any JavaScript code, but rather load resources like images and script files.[3] The provision in the initial draft would allow third-party domains such as content delivery networks (CDNs) and cloud service providers like Cloudflare and Google Cloud (called TLS terminators) to track users across the web by instructing the browser to send client-hint information to their servers.[3][9] Concerns were also raised that the Client-Hint proposal was too permissive and explicitly allowed new privacy-compromising information, which could not be obtained by simply parsing HTTP headers, to be leaked to servers.[9] In addition, extensions that aim to preserve a user's privacy, such as NoScript, opposed the proposal on the grounds that it would make it significantly harder to prevent sites from exfiltrating privacy-compromising information about users.[3]

Since the adoption of Client Hints by major browsers like Google Chrome and Microsoft Edge, privacy researchers have raised concerns over their real-world use for tracking.[2] A 2023 study by researchers from KU Leuven and Radboud University found that out of a crawl of over 100,000 websites, 60% of the scripts accessed the Client Hints JavaScript APIs, with most being tracking and advertising scripts, many of which came from Google. Over 90% of these scripts exfiltrated the obtained data to tracking domains.[1] A subsequent study in May 2024 by researchers from the Hochschule Bonn-Rhein-Sieg University of Applied Sciences noted that while overall adoption of Client Hints amongst websites on the internet was low, a significant number of third-party domains known for tracking accessed HTTP Client Hints data.[2]

References

  1. ^ a b c d e f g h Senol, Asuman; Acar, Gunes (2023-11-26). "Unveiling the Impact of User-Agent Reduction and Client Hints: A Measurement Study". Proceedings of the 22nd Workshop on Privacy in the Electronic Society. ACM. pp. 91–106. doi:10.1145/3603216.3624965. ISBN 979-8-4007-0235-8. Archived from the original on 2024-06-26. Retrieved 2024-06-25.
  2. ^ a b c d e Wiefling, Stephan; Hönscheid, Marian; Iacono, Luigi Lo (2024-05-22), "A Privacy Measure Turned Upside Down? Investigating the Use of HTTP Client Hints on the Web", arXiv:2405.13744 [cs]
  3. ^ a b c d Cimpanu, Catalin (May 16, 2019). "Privacy concerns raised about upcoming Client-Hints web standard". ZDNET. Archived from the original on 2023-12-01. Retrieved 2024-06-02.
  4. ^ Grigorik, I.; Weiss, Y. (February 2021). HTTP Client Hints. IETF. doi:10.17487/RFC8942. RFC 8942. Retrieved February 11, 2021.
  5. ^ "HTTP Client hints". HTTP. MDN. 2024-03-05. Archived from the original on 2024-06-07. Retrieved 2024-06-02.
  6. ^ Taylor, Mike; Weiss, Yoav, eds. (1 April 2024). "User-Agent Client Hints § 6.2. GREASE-like UA Brand Lists". WICG. Archived from the original on 18 June 2024. Retrieved 26 June 2024.
  7. ^ "NavigatorUAData: getHighEntropyValues() method - Web APIs". Mozilla Developer Network. 2024-07-26. Retrieved 2024-09-21.
  8. ^ "Improving user privacy and developer experience with User-Agent Client Hints". Privacy & Security. Chrome for Developers. Archived from the original on 2024-06-02. Retrieved 2024-06-02.
  9. ^ a b "Brave's Concerns with the Client-Hints Proposal". Brave. 2019-05-09. Archived from the original on 2024-06-26. Retrieved 2024-06-02.