Content-Type and Status Code Leakage

This blog post explores content-type and status code leakage. It examines the meaning of HTTP status codes and their effect when combined with HTML attributes, with particular attention to the typemustmatch HTML attribute. It also explains how to prevent data leaks and emphasizes the importance of correct implementation.

In a bug bounty write-up published on Medium on March 20, the researcher known as 'terjanq' demonstrated that the response to a request for a resource varies with the authorization state of the user requesting it. As we explained in a previous blog post, referenced below, if the user is authorized to view the resource, the Content-Type header has the value 'text/html'. Otherwise, the response is returned without a Content-Type header, which is equivalent to the value 'text/plain'.

In this blog post, we use terjanq's research to analyze the role that the Content-Type header and HTTP status codes play in obtaining user data. We also suggest methods for preventing the scenario described in his write-up.

The Meaning of HTTP Status Codes

The responses returned by web pages vary depending on factors such as:

  • The user’s authorization and browser
  • The availability of the requested resource on the server
  • The relocation status of the resource

In HTTP, for example, if the user requests a resource that doesn't exist on the server, the 404 Not Found status code is returned.

Similarly, if the user is not authorized to view the resource, 403 Forbidden is returned. In some cases, the 401 Unauthorized status code is used instead to prompt the user for authorization credentials.

Likewise, servers use the Content-Type header to tell browsers how to render the contents of the page.
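The status codes described above can be sketched as a small decision function. This is a minimal illustration, not any particular server's logic: the function name chooseStatus and its three boolean inputs are assumptions made for this example, and real servers may order these checks differently.

```javascript
// Hypothetical helper: picks an HTTP status code from three facts about a request.
// The names and the order of the checks are assumptions for illustration only.
function chooseStatus({ exists, authenticated, authorized }) {
  if (!exists) return 404;        // Not Found: the resource is not on the server
  if (!authenticated) return 401; // Unauthorized: credentials are missing
  if (!authorized) return 403;    // Forbidden: credentials present, access denied
  return 200;                     // OK: the resource can be served
}
```

Because the result depends on the user's state, anyone who can observe the status code of a cross-site request learns something about that user.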

The typemustmatch HTML Attribute

The researcher initially focused on the typemustmatch HTML attribute to determine the user's authentication state. The typemustmatch attribute is a boolean attribute, which means that it takes effect whenever it is present on the HTML element.

The typemustmatch attribute ensures that a resource loaded by an object element is blocked if the value of the element's type attribute doesn't match the value indicated in the Content-Type header of the response. This leads to an interesting information leak. If you knew that a response for authenticated users always returns an application/json content type, and a response for an unauthenticated user always returns text/html, you could try to load the resource using an object tag with the typemustmatch attribute set. Then, if you set the type to application/json and the resource failed to load, you'd be able to determine that the user was not authenticated.

However, this attribute only works on Firefox.

The next question is, how will you know whether the browser blocked the resource from loading?

Usually, the onload and onerror event handlers of an HTML element are triggered when a resource loads successfully or fails to load. But object elements do not support these events. The fallback text of the object element was the next approach the researcher tried. In the code below, the text 'not_loaded' is displayed when the resource isn't loaded.

<object type= data= typemustmatch> not_loaded </object>

The Effect of HTTP Status Codes Used With HTML Attributes

Let's look next at the status codes. While the typemustmatch attribute prevents mismatched resources from loading, it also prevents a resource from loading if the response doesn't have the HTTP status code '200'. Detecting this through the fallback text of the object element is not possible, since there is no attribute that provides access to that text value.

Instead, the researcher used the clientHeight and clientWidth properties of the object element, whose values change depending on the loaded resource. These properties have a default value of '0', so a change in their values can be used to determine whether the resource has been loaded.

On their own, these properties are of limited use, because the object element has no onload or onerror events to signal that loading has finished. However, the researcher discovered that an object element that hasn't yet finished loading prevents the window object from firing its onload event. Once all the elements are loaded, the window object fires this event. By reading the clientHeight and clientWidth properties after the window's onload event, an attacker can infer user data from the Content-Type header and HTTP status code of the response.
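The measurement step described above can be sketched as follows. This is a simplified illustration based only on the description in this post, not terjanq's actual code: the function names resourceLoaded and probe, the callback shape, and the URL in the usage note are all assumptions. The sketch also assumes that probe is called before the page's own load event has fired, since that event fires only once.

```javascript
// True when the object element rendered something, i.e. its size is no
// longer the default of 0 described in the article.
function resourceLoaded(el) {
  return el.clientWidth !== 0 || el.clientHeight !== 0;
}

// Hypothetical probe: inserts an <object typemustmatch> element into the page
// and reports whether it loaded once the window's load event fires.
// Assumption: this runs before the window's load event has fired.
function probe(win, url, expectedType, onResult) {
  const doc = win.document;
  const el = doc.createElement('object');
  el.setAttribute('type', expectedType);
  el.setAttribute('typemustmatch', ''); // boolean attribute: presence is enough
  el.setAttribute('data', url);
  // The pending object element delays the load event, so by the time the
  // event fires the element has either loaded or been blocked.
  win.addEventListener('load', () => onResult(resourceLoaded(el)));
  doc.body.appendChild(el);
  return el;
}
```

In a browser, a call might look like probe(window, 'https://target.example/profile', 'application/json', loaded => console.log(loaded)), where the URL is a placeholder. As noted earlier, the attribute only works on Firefox, so the result is meaningful only there.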

For further information about terjanq's implementation of this concept, check the code on jsfiddle.

How to Prevent Data Leaks

It’s quite straightforward to prevent the type of information leak that arises as a result of a combination of the object element and the typemustmatch attribute. This is because such data leaks can be time-based, or Content-Type header and HTTP status code based.

  • If the Content-Type header isn’t set in the HTTP response, the browser must determine which type of content the response contains. This can lead to vulnerabilities such as Cross-site Scripting, so make sure that every response sets the Content-Type header explicitly.
  • In addition, you can serve custom error pages that return HTTP status code '200' in all circumstances, so that codes such as '404' and '403' never appear in the response.
  • Another prevention method is to check the Referer header in the request. Such attacks are made through requests from an attacker-controlled website, so checking the value of the Referer header can be very useful.
  • The attacker has to make the requests from your browser. The Same Origin Policy ensures that the attacker cannot arbitrarily change the Referer header when the requests are made from a different origin. However, if the Referer check is implemented incorrectly, the attacker may be able to place the expected value in some part of the Referer header and bypass your security check. This code from the StackOverflow question on checking the PHP referrer is an example of an incorrect implementation of the Referer header check, even though it was accepted as a solution by the original poster.
// Incorrect check: matches 'example.com' anywhere in the Referer value.
$ref = $_SERVER['HTTP_REFERER'] ?? '';
if (strpos($ref, 'example.com') !== false) {
  // redirect to wherever example.com people should go
}

To bypass the check in this code, the attacker can lure the user to an attacker-controlled URL that contains the expected value somewhere in it, such as in the query string of the example below.

http://www.attacker.com/hello_world?example.com
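A stricter check parses the Referer value and compares only its hostname against an allow-list, instead of searching the whole string. The sketch below is one possible approach under stated assumptions, written in JavaScript rather than PHP: the function name isTrustedReferer and the allow-list parameter are hypothetical, and treating a missing or malformed Referer as untrusted is a design choice that may not suit every application.

```javascript
// Hypothetical helper: accepts a Referer only if its hostname is exactly one
// of the allowed hosts. A missing or malformed value is treated as untrusted.
function isTrustedReferer(referer, allowedHosts) {
  if (typeof referer !== 'string' || referer === '') return false;
  let host;
  try {
    host = new URL(referer).hostname; // the URL parser isolates the host part
  } catch (err) {
    return false; // not a valid absolute URL
  }
  return allowedHosts.includes(host);
}
```

With this check, the bypass URL shown above is rejected, because its hostname is www.attacker.com and 'example.com' appears only in the query string.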

The Importance of Correct Implementation of the Content-Type Header

HTTP status codes and HTTP headers are some of the mechanisms that help make web browsing easy and secure. However, when combined with HTML attributes and features, they can become dangerous tools for exposing user data.

The prevention and protection methods we have outlined above will help to develop a more secure implementation. Combined with other security headers, they can help prevent the abuse of these web features.
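As a rough illustration of two of the measures outlined above (always setting the Content-Type header, and returning status code '200' with a custom page for errors), a response builder might look like the sketch below. The function name buildResponse, the outcome keys, and the page bodies are all hypothetical; the X-Content-Type-Options header is included as one example of the other security headers mentioned.

```javascript
// Hypothetical helper: every outcome gets the same status code and an
// explicit Content-Type, so neither value reveals the user's state.
function buildResponse(outcome) {
  // Placeholder page bodies, invented for this example.
  const bodies = {
    ok: '<h1>Your profile</h1>',
    notFound: '<h1>Page not available</h1>',
    forbidden: '<h1>Page not available</h1>',
  };
  const known = Object.prototype.hasOwnProperty.call(bodies, outcome);
  return {
    status: 200, // same code in all circumstances, as suggested above
    headers: {
      'Content-Type': 'text/html; charset=utf-8', // always set explicitly
      'X-Content-Type-Options': 'nosniff',        // ask the browser not to sniff the type
    },
    body: known ? bodies[outcome] : bodies.notFound,
  };
}
```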

Further Reading

For full details of terjanq's write-up, see Cross-Site Content and Status Types Leakage.

You can read more about MIME type sniffing and the X-Content-Type-Options security header in our whitepaper on HTTP Security Headers and How They Work.

For further information, see The Importance of the Content-Type Header in HTTP Requests.