Using Google Bots as an Attack Vector

Category: Web Security Readings - Last Updated: Thu, 08 Nov 2018 - by Netsparker Security Team

Google consistently holds more than 90% of the search engine market, and many users treat their browser's address bar as a Google search box. Being visible on Google is therefore crucial for websites.

In this article, we analyze a study from F5 Labs that draws attention to a new attack vector using Google's crawling servers, also known as Google Bots. These servers gather content from the web to build the searchable index from which Google's search results are taken.

How Search Engines Use Bots to Index Websites

Each search engine has its own set of algorithms, but they all work in broadly the same way: a bot visits a website, examines the content and follows the links it finds (known as 'crawling'), and the search engine then grades and lists the resources. Once one of these bots finds your website, it will visit and index it.

For a good ranking, you need to make sure that search engine bots can crawl your website without issues. Google specifically recommends that you avoid blocking search bots in order to achieve successful indexing. Attackers know that site owners let these bots through, and have developed an interesting technique to exploit that trust: abusing Google Bots.

The Discovery of a Google Bot Attack

In 2001, Michal Zalewski described this trick in Phrack magazine. He also highlighted how difficult it is to prevent. Just how difficult became apparent 17 years later, when F5 Labs inspected the CroniX crypto miner. When F5 Labs' researchers analyzed some malicious requests they had logged, they discovered that the requests originated from Google Bots.

Initially, the F5 Labs researchers assumed that an attacker had simply spoofed the Google Bot User-Agent header value. But when they investigated the source of the requests, they discovered that the requests had indeed been sent from Google.
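The article does not say how F5 Labs confirmed the origin of the requests. One widely documented way to tell a genuine Google Bot from a spoofed User-Agent is a reverse DNS lookup followed by a forward confirmation. The sketch below assumes the crawler hostnames end in googlebot.com or google.com, as Google's own guidance describes; the function name and the injectable lookup parameters are hypothetical and exist only so the logic can be exercised without network access.

```python
import socket

# Assumption: hostname suffixes Google documents for its crawlers.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if `ip` reverse-resolves to a Google crawler hostname
    and that hostname resolves back to the same IP address."""
    # Default to real DNS; the lookups are injectable for offline testing.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip)
        if not hostname.lower().endswith(GOOGLE_CRAWLER_SUFFIXES):
            return False
        # Forward confirmation: a forged PTR record alone is not enough.
        return ip in forward_lookup(hostname)
    except OSError:
        # No PTR record or a failed lookup: treat the source as unverified.
        return False
```

A check like this only establishes who sent the request. As the rest of the article shows, a request that genuinely comes from Google can still carry a malicious payload.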

There were several possible explanations for why Google servers would send these malicious requests. One was that Google's servers had been hacked, but that idea was quickly discarded as unlikely. Instead, the researchers focused on the scenario laid out by Michal Zalewski, in which Google Bots are abused in order to make them behave maliciously.

How Did the Google Bots Turn Evil?

Let’s take a look at how attackers can abuse Google Bots in order to use them as a tool for malicious intent.

First, let's suppose that your website contains the following link:

<a href="http://victim-address.com/exploit-payload">malicious link</a>

When Google Bots encounter this URL, they’ll visit it in order to index it. The request that includes the payload will be made by a Google Bot. This image illustrates what happens:

[Diagram: Using Google Bots as an attack vector]
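The mechanism can be sketched with a toy link extractor. This is not how Google's crawler is implemented; it is a minimal illustration, using only Python's standard library, of why any crawler that follows links ends up requesting whatever URL the page author chose. The class and function names and the attacker.example host are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects href targets the way a simple crawler might."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page URL.
                    self.links.append(urljoin(self.base_url, value))


def links_to_crawl(page_url, html):
    """Return every link target found in `html`, as absolute URLs."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return collector.links


page = '<a href="http://victim-address.com/exploit-payload">malicious link</a>'
for url in links_to_crawl("http://attacker.example/", page):
    # A real crawler would issue a GET request here; this sketch only prints it.
    print("would GET", url)
```

The path of the extracted URL is entirely attacker-controlled, and the request for it would arrive at the victim from the crawler's IP address rather than the attacker's.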

The Experiment Conducted to Prove the Attack

Researchers verified the theory that a Google Bot request would carry the payload by conducting an experiment with two websites: one that acted as the attacker, and one that acted as the target. Links carrying the payload were added to the attacker's website and pointed at the target website.

Once the researchers set the necessary configurations for the Google Bots to browse the website, they then waited for the requests from the Google Bots. When they analyzed the requests, they found out that the requests from the Google Bot servers indeed carried the payload.

The Limits of the Attack

This scenario is only possible with GET requests, where the payload can be sent through the URL. Another drawback is that the attacker won't be able to read the victim server's response, which means that this attack is only practical if the result can be sent out of band, as with a command injection or an SQL injection.

The Combination of Apache Struts Remote Code Evaluation CVE-2018-11776 and Google Bots

Apache Struts is a Java-based framework released in 2001. The regular discovery of code evaluation vulnerabilities in the framework has generated many discussions about its security. For example, the Equifax breach, which led to a loss of $439 million and the theft of a huge amount of personal data, was the result of CVE-2017-5638, a critical code execution vulnerability found in the Apache Struts framework.

A Quick Recap of Apache Struts Remote Code Evaluation CVE-2018-11776

Let’s recap the vulnerability that can be exploited on recent Apache Struts versions. The CVE-2018-11776 vulnerability (discovered in August 2018) is perfect for a Google Bot attack, since the payload is sent through the URL. Not surprisingly, this was the vulnerability that CroniX abused.

Example

The following configuration and test payload illustrate the vulnerability.

In the vulnerable configuration, no namespace is set for the package, so Struts takes the namespace from the user-supplied request path. In this situation it's possible to inject an OGNL (Object-Graph Navigation Language) expression into the path. OGNL is an expression language for Java.

Here is an example of a configuration that is vulnerable to CVE-2018-11776:

<struts>
  <constant name="struts.mapper.alwaysSelectFullNamespace" value="true" />

  <package name="default" extends="struts-default">

    <action name="help">
      <result type="redirectAction">
        <param name="actionName">date.action</param>
      </result>
    </action>
    ..
    ..
    .
  </package>
</struts>

You can use the following sample payload to confirm the existence of CVE-2018-11776. If you open the URL http://your-struts-instance/${4*4}/help.action and you get redirected to http://your-struts-instance/16/date.action, you can confirm that the vulnerability exists.
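The manual check above can be expressed as a small helper. This is a minimal sketch, assuming the application is deployed at the root of the host (so the namespace is the first path segment) and that you only run it against an instance you own. The function names are hypothetical, and the code performs no network request itself; it only builds the probe URL and interprets a Location header you have already obtained.

```python
from urllib.parse import urlsplit, unquote

PROBE = "${4*4}"   # harmless arithmetic expression used in the article
EXPECTED = "16"    # value the expression evaluates to


def probe_url(base_url):
    """Build the test URL for the 'help' action from the sample configuration."""
    return base_url.rstrip("/") + "/" + PROBE + "/help.action"


def namespace_was_evaluated(location_header):
    """Return True if the redirect target's first path segment is the
    evaluated result rather than the literal probe expression."""
    path = unquote(urlsplit(location_header).path)
    segments = [s for s in path.split("/") if s]
    return bool(segments) and segments[0] == EXPECTED
```

If the application lives under a context path, the segment index would need adjusting; that detail is left out to keep the sketch short.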

As mentioned before, this is the perfect context for a Google Bot attack. As CroniX shows, attackers can go as far as spreading cryptomining malware by combining Apache Struts CVE-2018-11776 with Google Bots.

Solutions to the Google Bots Attack

At this point, the possibility of Google Bots delivering malicious links to your website should make you question which third parties you can really trust. Yet blocking Google Bot requests entirely is not an option: if Google Bots cannot crawl your website, your ranking in the search results will drop. Automated blocking carries the same risk. If your application detects malicious requests and blocks them, or even blocks the sending IP address, attackers could deliberately route malicious payloads through Google Bots so that the bots themselves get blocked, further damaging your search rankings.
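One way to reason about this trade-off is to separate rejecting a single request from banning its source. The policy below is not taken from the F5 Labs research or from any product; it is a hypothetical sketch in which both inputs are assumed to come from checks you already have (a request filter and a crawler verification step).

```python
def decide(request_is_malicious, source_is_verified_crawler):
    """Return the action to take for a single incoming request."""
    if not request_is_malicious:
        return "allow"
    if source_is_verified_crawler:
        # Drop the bad request but leave the crawler's IP unblocked,
        # so later legitimate crawl requests still succeed.
        return "reject_request"
    return "reject_request_and_block_ip"
```

Under this policy a malicious payload relayed through a search bot is still rejected, but the attacker cannot use it to get the bot banned.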

Control the External Connections on Your Website

Attackers can use their own websites, or others under their control, to conduct malicious activity using Google Bots. They might also place links on a third-party website, for example in comments under blog posts.

If you want an overview of the external links on your website, you can check the Out-of-Scope Links node in the Netsparker Knowledge Base following a scan.

The Correct Handling of Links Added by Users

Even though it won't prevent attackers from abusing Google Bots to attack websites, you might still be able to avoid damage to your search engine ranking if you take certain precautions. For example, you can prevent search bots from following a link by setting its rel attribute to nofollow. This is how it's done:

<a rel="nofollow" href="http://www.functravel.com/">Cheap Flights</a>

Due to the 'nofollow' value of the rel attribute, the bots will not visit the link.
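For links submitted by users, such as those in blog comments, the attribute can be added at render time. The helper below is a minimal sketch with a hypothetical function name. It escapes both values so they cannot break out of the markup, but it does not validate the URL scheme, which a real application would also need to do.

```python
from html import escape


def render_user_link(url, text):
    """Render a user-submitted link as an anchor carrying rel="nofollow"."""
    # escape() also escapes quotes by default, so neither value
    # can terminate the href attribute early.
    return '<a rel="nofollow" href="{}">{}</a>'.format(escape(url), escape(text))


print(render_user_link("http://www.functravel.com/", "Cheap Flights"))
```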

Similarly, the meta tags you define between the <head></head> tags will help control the behavior of the search bots on all URLs found on the page.

<meta name="googlebot" content="nofollow" />
<meta name="robots" content="nofollow" />

You can give these commands using the X-Robots-Tag response header, too:

X-Robots-Tag: googlebot: nofollow

Note that directives given with X-Robots-Tag or meta tags apply to every link on the page, internal and external alike.
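How the header gets set depends on your stack. As one illustration, the sketch below wraps a Python WSGI application so that every response carries the header. The names add_robots_header and demo_app are hypothetical, and the same effect is often achieved in web server configuration instead.

```python
def add_robots_header(app, value="nofollow"):
    """Wrap a WSGI app so every response carries an X-Robots-Tag header."""

    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Drop any existing X-Robots-Tag, then append ours.
            headers = [(k, v) for k, v in headers if k.lower() != "x-robots-tag"]
            headers.append(("X-Robots-Tag", value))
            return start_response(status, headers, exc_info)

        return app(environ, patched_start)

    return wrapped


def demo_app(environ, start_response):
    """Tiny stand-in application used only for this example."""
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<p>hello</p>"]


application = add_robots_header(demo_app)
```

Because the header applies to every link on the page, a blanket wrapper like this also stops bots from following your own internal links. In practice you would probably apply it only to pages that display user-submitted content.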

Further Reading

Read more about the research on the Google Bots attack in Abusing Googlebot Services to Deliver Crypto-Mining Malware.

Authors, Netsparker Security Researchers:

Ziyahan Albeniz
Umran Yildirimkaya
Sven Morgenroth

