Search engines in research and vulnerability assessment

2007-11-01

Alex Eckelberry

Sunbelt Software, USA
Editor: Helen Martin

Abstract

'Search engines are free, powerful and efficient tools that can be used to find vulnerabilities and hacked sites on the web, and even in your own organization.' Alex Eckelberry, Sunbelt Software.


On 2 October an odd thing happened: state and government websites in California started shutting down. It turned out that the US General Services Administration (GSA) had overreacted somewhat to reports of pornography being hosted on a website for the Transportation Authority of Marin County, and simply pulled the plug on the entire ca.gov domain.

News that the Marin County website was serving porn came as no great surprise to malware researchers. A number of individuals had contacted the owners of the site in mid-September, alerting them to the problem. Unfortunately these alerts were ignored, as they were believed to be ‘phishing’ attempts.

The primary source of the problem was an apparent DNS hack, which redirected parts of the Marin County website to pornographic sites. These were redirects – the government site itself was not hosting porn. The hack occurred at an outsourced provider, which didn’t have the tightest security practices in place.

As many in the research community know, finding these types of hack is trivial work – it can often be a matter of using simple Google searches such as sex porn site:gov.

Malware researchers are quite familiar with the power of search engines in conducting research. Search engines are free, powerful and efficient tools that can be used to find vulnerabilities and hacked sites on the web, and even in your own organization.

Generally, one sees websites compromised through stolen FTP credentials; unpatched (usually open source) software, including poorly maintained LAMP stacks; the increasing use of collaborative, ‘web 2.0’-type software (wikis, tikis, etc.); DNS hacks; poorly written ASP code; sloppy PHP work and SQL hacks.

So-called ‘Google dorks’ can be useful in finding compromised and malware-hosting websites. The term was coined by Johnny Long on his website johnny.ihackstuff.com, where he described the practice (which had been around for some time) of using Google searches to find ‘dorks’ – people who expose too much information on the web.

‘Google dorking’ has evolved, and malware researchers continue to fine-tune their searches to find vulnerable websites and malware. Furthermore, by using Google’s Alerts feature, one can input a number of different searches and receive alerts when a matching site is found – useful in finding newly compromised or rogue sites.

As one example, some broader starter queries might be any of the following: inurl:traff site:.biz (or .info), inurl:in.cgi site:.info, inurl:klik site:.info, or intitle:"index of". The last of these is followed by any of a number of terms, such as "love exe", "jpg exe", porn exe, xxx pif, "bot exe" or "gif exe". Hence, a final search might be intitle:"index of" xxx pif -filetype:html -filetype:php -filetype:htm or, as another example, intitle:"index of" "bot exe" -filetype:html -filetype:php -filetype:htm.
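The ‘index of’ pattern above is regular enough to generate mechanically. What follows is a minimal sketch rather than a tool used by the authors: the function names (index_of_query, all_index_of_queries) are hypothetical, and the term list is simply the one given in the text. It only builds query strings; it does not submit them to any search engine.

```python
# Terms taken from the examples in the text; quoted phrases are kept as written.
TERMS = ['"love exe"', '"jpg exe"', "porn exe", "xxx pif", '"bot exe"', '"gif exe"']

# File types excluded so that ordinary pages drop out and directory listings remain.
EXCLUDED_FILETYPES = ["html", "php", "htm"]


def index_of_query(term, excluded=EXCLUDED_FILETYPES):
    """Return one intitle:"index of" query string for a single term."""
    exclusions = " ".join(f"-filetype:{ext}" for ext in excluded)
    return f'intitle:"index of" {term} {exclusions}'


def all_index_of_queries(terms=TERMS):
    """Return one query per term, in the order the terms are listed."""
    return [index_of_query(term) for term in terms]


if __name__ == "__main__":
    for query in all_index_of_queries():
        print(query)
```

Each generated line can be pasted into a search box or saved as a search alert; the list of terms is the part worth maintaining over time.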

Since ‘jump pages’ are often created with the sole purpose of being indexed on search engines and redirecting visitors to other content, running searches on something like site:nm.ru might prove useful. Alternatively, searches can be performed for specific directory structures of frameworks used in malware, such as /stata/index.php.

Pornography and malware distributors commonly hack into websites for search engine optimization and increased distribution (ironically, Google's work in marking sites as 'unsafe' in search results is likely driving malware and porn distributors to rely increasingly on hacking 'good sites' to perform redirections to their own bad sites). Finding these hacked sites is similarly trivial. One can simply look for any combination of terms, such as ‘porn, free ringtones, free casino’, followed by some operators to narrow down the search.

Some knowledge of the language used by the distributors also helps – ‘sesso’ and ‘fottilo’, for example, are often used by Italian malware and porn distributors (such as Gromozon). At the time of writing this article, the search sesso OR gratuito porno OR fottilo site:gov produces some rather interesting (and sometimes very dangerous) results.

One can continue to experiment by adding different domains and additional operators to the searches. It’s common to find plenty of comment spam using these methods, but very often you’ll also find compromised websites.

Organizations large and small can use similar searches to find vulnerabilities on their own sites. This holds especially true for larger organizations that work in collaborative environments, such as academic institutions and some governmental organizations. For example, problems with vulnerabilities commonly exist in colleges and universities, where students are often provided with their own websites, academic discourse is encouraged through open source collaborative software, and servers are managed by different groups throughout a campus. It’s a recipe for disaster, and that’s often exactly what happens – finding hacked university websites is almost trivial work. IT administrators could complement their security toolboxes with search engines, seeking inappropriate content on their own domains.
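For an administrator, the same query shape (a few terms joined by OR, restricted with site:) can be generated once per domain the organization owns. The sketch below is illustrative only: self_audit_query and search_url are hypothetical helper names, example.edu is a placeholder domain, and the base search URL is an assumption about how one might turn a query into a link to open by hand.

```python
from urllib.parse import quote_plus

# Terms from the examples in the text; extend to suit your own environment.
SUSPICIOUS_TERMS = ["porn", '"free ringtones"', '"free casino"']


def self_audit_query(domain, terms=SUSPICIOUS_TERMS):
    """Build '<term> OR <term> ... site:<domain>' for one domain."""
    joined = " OR ".join(terms)
    return f"{joined} site:{domain}"


def search_url(query, base="https://www.google.com/search?q="):
    """URL-encode a query so it can be opened in a browser (base URL is an assumption)."""
    return base + quote_plus(query)


if __name__ == "__main__":
    # Placeholder domains; replace with the domains you administer.
    for domain in ["example.edu", "students.example.edu"]:
        query = self_audit_query(domain)
        print(query)
        print(search_url(query))
```

Running the queries per subdomain, rather than once for the whole organization, makes it easier to see which group's servers need attention.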

In addition to finding malware on the web, there are numerous (and often hair-raising) searches available that can be used to find vulnerabilities on a site. Queries are limited only by creativity, technical acumen and knowledge of data structures.

Finally, there are distinct differences between search engines. Yahoo, Google and Live.com present similar data, but sometimes one provides clearer results than the others. Live.com has the powerful and unique feature of allowing IP searches. Searching for common malware IPs, with queries such as ip:89.28.13.208 or ip:89.28.13.213, produces useful results.
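When working from a list of suspect addresses, it helps to validate them before turning them into ip: queries. This is a small sketch under stated assumptions: ip_queries is a hypothetical helper, the ip: operator is the Live.com feature described above (its availability on any other engine is not assumed), and the code only produces query strings.

```python
import ipaddress


def ip_queries(addresses):
    """Return 'ip:<address>' queries for every syntactically valid address."""
    queries = []
    for raw in addresses:
        try:
            # ip_address() raises ValueError for anything that is not a valid IP.
            ip = ipaddress.ip_address(raw.strip())
        except ValueError:
            continue  # skip malformed entries rather than emit a useless query
        queries.append(f"ip:{ip}")
    return queries


if __name__ == "__main__":
    # The two valid addresses are the ones quoted in the text.
    for query in ip_queries(["89.28.13.208", "89.28.13.213", "not-an-ip"]):
        print(query)
```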

Some researchers are frustrated by the inability to search within the source of web pages – which, if provided, would open up a mother lode of information, obviating the need to use proprietary spiders. For now, however, one can find plenty of information using simple searches, enabling the research community to find bad things before they get out broadly to the public – and in the process, hopefully making some impact on the safety of the Internet experience for users.

Sunbelt researchers Francesco Benedini and Adam Thomas contributed to this article.
