Web Crawl to Infrastructure Blowout

In our last blog post, we broke apart the RiskIQ web crawlers and outlined all the content they collect when browsing the Internet. This was helpful in understanding the data, but it didn’t provide a good example of how we use this content to link to actor infrastructure. For this post, we are going to focus on a criminal threat that often targets social media services and see how we can leverage the crawlers and PassiveTotal to identify the actors' infrastructure.


If you don’t recognize the name BePush, then chances are you haven’t had to deal with the Turkey-based actors who leverage platforms like Facebook to spread their malicious code. While they may not focus on high-value targets or perform newsworthy hacks, these actors demonstrate a keen attention to detail and a technical sophistication that make their code difficult to combat. For the sake of this example, it’s worth understanding the common tactics deployed by the group.

BePush actors generally infect their victims through a creative post, message or photo shared on Facebook. Upon clicking, the user is typically bounced through a number of short-link services and third-party providers before hitting a final web page composed of highly obfuscated JavaScript. This JavaScript may entice the user to install a Chrome extension or, as in this case, prompt the user to run a JSE file.

Once run, this script uses ActiveX to write both an older version of Chrome (Version 31.0.1650.63 m) and a rogue extension to the file system before redirecting the user back to Facebook. The extension hijacks the user’s Facebook access tokens and uses them to make posts, send messages and share photos on the user’s behalf, all with the goal of spreading the same cycle to as many users as possible.

Web Crawling BePush

A couple of weeks ago, RiskIQ observed a blacklist incident based on the domain notswart[.]xyz, which led our crawlers through a number of different providers before landing on a Dropbox page. At the time, we didn’t know whether this was malicious or which actor it might be related to, so the first place to check was the cause chain.

Viewing this chain gave us a pretty good idea that something strange was going on. The crawler was bounced through a couple short links from Bitly and Google, landed on a random AWS page, was tagged with a tracking code from whos.amung.us and made its way eventually through some registered domains down to the final Dropbox landing. As an analyst, a few things stuck out here as important:

  • Several providers and redirection sequences made it clear that the actor wanted to obfuscate their traffic to make it difficult to follow
  • Two unique domains showed up outside of the 3rd-party services and shared the same netblock allocation
  • Tracking pixels were used as a way to potentially track victims or keep tabs on the campaign
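
The second observation above, separating well-known third-party services from the domains an actor registered themselves, can be sketched in a few lines of Python. This is only an illustration: the suffix list is an assumption an analyst would maintain by hand, and the chain below uses placeholder paths and hypothetical .example hosts rather than the real crawl data.

```python
from urllib.parse import urlsplit

# Assumption: an analyst-maintained list of well-known third-party services.
THIRD_PARTY_SUFFIXES = ("bit.ly", "goo.gl", "amazonaws.com", "amung.us", "dropbox.com")


def split_chain(urls):
    """Return (third_party_hosts, unique_hosts), each in order of first appearance."""
    third_party, unique = [], []
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        if not host:
            continue  # skip entries without a parsable hostname
        is_third_party = any(
            host == suffix or host.endswith("." + suffix)
            for suffix in THIRD_PARTY_SUFFIXES
        )
        bucket = third_party if is_third_party else unique
        if host not in bucket:
            bucket.append(host)
    return third_party, unique


# Hypothetical chain: every path and both .example hosts are placeholders.
chain = [
    "http://bit.ly/placeholder",
    "http://goo.gl/placeholder",
    "http://s3.amazonaws.com/placeholder/index.html",
    "http://whos.amung.us/placeholder",
    "http://redirector-one.example/go.php",
    "http://redirector-two.example/land",
    "https://www.dropbox.com/s/placeholder/file.jse",
]
third_party, unique = split_chain(chain)
print("third-party:", third_party)
print("worth a closer look:", unique)
```

Whatever lands in the second list is a natural starting point for pivots, since those hosts are the ones the actor had to register and host somewhere.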

Unfortunately, by the time we reached this particular crawl, some of the infrastructure had been taken down or disabled. However, the RiskIQ web crawler saved a copy of each of the results as it crawled the links. Taking a look at the Amazon page content and DOM changes gave us some additional context into the first set of redirections.

The page content makes it clear that the actors were abusing the jQuery name in order to redirect the crawler to a PHP script they controlled on their server. Inspecting that page a bit more revealed a significant amount of obfuscated JavaScript.

At this point in my analysis, I would normally need to decode the actors' JavaScript in order to figure out what happened. Fortunately, I could again leverage the DOM changes and cause chain captured by RiskIQ to figure out what was going on inside the page at the time of the crawl. In this case, whos.amung.us was loaded via JavaScript and the crawler was further redirected to the secondary registered domain by setting the location variable. Following the crawl to the final destination leads us to the JSE payload that would install the malicious Chrome extension.
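
For readers who only have a saved copy of script text to work with, a rough way to spot this kind of redirect is to look for assignments to the location variable. The sketch below is not how the RiskIQ crawler works. It is a simple regular-expression pass that only helps once the script is already readable (for example, in a captured DOM snapshot), and the sample script and .example URLs are hypothetical.

```python
import re

# Matches simple forms such as:
#   location = "..."    window.location.href = '...'    location.replace("...")
LOCATION_RE = re.compile(
    r"""(?:window\.|document\.|top\.)?location(?:\.href)?\s*=\s*["']([^"']+)["']"""
    r"""|location\.(?:replace|assign)\(\s*["']([^"']+)["']\s*\)"""
)


def redirect_targets(script_text):
    """Return URLs assigned to the location variable, in order of appearance."""
    targets = []
    for match in LOCATION_RE.finditer(script_text):
        # Exactly one of the two capture groups is set for any given match.
        targets.append(match.group(1) or match.group(2))
    return targets


# Hypothetical, already-readable script text.
sample = """
var counter = 1;
window.location = "http://redirector-two.example/land";
location.replace('http://other.example/');
"""
print(redirect_targets(sample))
```

Heavily obfuscated code will defeat a pattern like this, which is exactly why having the DOM changes and cause chain recorded at crawl time is so useful.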

Investigating in PassiveTotal

Using the crawl data from RiskIQ, we were able to start with an unknown blacklist event and build out a larger picture of the attack. In total, we were able to collect two domains, two IP addresses and a tracking code. Using PassiveTotal, we decided to investigate the actor infrastructure further to see if there was anything else worth blocking.


Querying for the first domain didn't reveal much in the way of passive DNS, but both WHOIS and host pairs showed promising leads.

Performing a reverse WHOIS search on the Hotmail address proved to be instant gold. Not only did we get several more domains, but we had already flagged them in the platform as BePush from previous analysis.

Knowing that these actors used a lot of redirects in the crawl we viewed, it made sense to note the two domains from host pairs (kosmantinablog[.]xyz and pkxjxfbidhfi[.]net) as suspect. Diving into their results didn't yield any additional details, but did show connections to the whos.amung.us ID of "kardasdas", which we had observed in our crawl.
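
Domains in this post are written in a defanged form such as notswart[.]xyz so that they are not clickable. Before pasting them into a search box or a script, they need to be converted back. A minimal pair of helpers, assuming the only defanging in use is the "[.]" convention seen here, might look like this:

```python
def refang(indicator):
    """Turn a defanged indicator such as 'notswart[.]xyz' back into 'notswart.xyz'."""
    return indicator.replace("[.]", ".")


def defang(indicator):
    """Make an indicator safe to share; safe to call on already-defanged input."""
    return refang(indicator).replace(".", "[.]")


for domain in ("kosmantinablog[.]xyz", "pkxjxfbidhfi[.]net"):
    print(refang(domain))
```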

Starting with just one of the two domains, we were able to collect 12 additional domains, an email address used for registration, a pixel tracker connection, an SSL certificate associated with the IP and the use of CDN services like Amazon and Cloudflare.


Much like the first domain, this one also showed few DNS resolutions and the same WHOIS record, but contained a lot more host pairs. In fact, it contained over 500 different pairs.

Looking at the times and sources, it was clear that this infrastructure was active up until at least May 14 and continued to use Amazon as a primary traffic source for redirection.

Pivoting on the currently resolving IP addresses revealed a heatmap with several orange markers indicating new content being associated over the past few days. Since we had already flagged several of the domains found in this investigation and others, most of the results table was colored red and tagged appropriately. However, there were several other white rows revealing domains we had not investigated yet.
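
The red-versus-white split described above is essentially a set comparison between what an IP address resolves to and what has already been flagged. Outside the platform, the same triage could be sketched as follows. The function name and the .example domains are hypothetical, and the inputs are assumed to be plain lists of domain names exported by the analyst.

```python
def triage(resolved_domains, flagged_domains):
    """Split domains into (already_flagged, not_yet_investigated), preserving order."""
    known = {d.lower().rstrip(".") for d in flagged_domains}
    seen = set()
    already_flagged, not_yet_investigated = [], []
    for domain in resolved_domains:
        normalized = domain.lower().rstrip(".")
        if normalized in seen:
            continue  # ignore duplicate rows
        seen.add(normalized)
        if normalized in known:
            already_flagged.append(normalized)
        else:
            not_yet_investigated.append(normalized)
    return already_flagged, not_yet_investigated


# Hypothetical data standing in for a results table and a list of prior findings.
resolved = ["Flagged-One.example", "new-lead.example.", "flagged-two.example", "new-lead.example"]
flagged = ["flagged-one.example", "flagged-two.example"]
red_rows, white_rows = triage(resolved, flagged)
print("already flagged:", red_rows)
print("new leads:", white_rows)
```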

Following several of the new leads a bit further shows connections to even more BePush infrastructure through passive DNS, Google Analytics account and tracking IDs, and host pairs. Again, using this one domain as our source, we were able to identify hundreds of recent Amazon pages, overlap in WHOIS data, more domains not yet investigated and several other tracker IDs.

Proactive Defense

Knowing that these actors change infrastructure but reuse certain aspects of it gives us a good opportunity to monitor them. PassiveTotal allows analysts to monitor domains or IP addresses in order to understand when they change. I have several monitors set on BePush infrastructure and get an email alert each morning with the daily changes. As a defender, this means I can stay ahead of the attackers by simply taking the output from the notifications and using it as new leads for more intelligence gathering.
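
To make the idea of a daily change report concrete, here is a small sketch of the comparison involved. It does not use the PassiveTotal monitoring feature or API. It simply assumes you have two snapshots of domain-to-IP resolutions from some source, and the domains and documentation-range IP addresses below are made up for illustration.

```python
def diff_resolutions(previous, current):
    """Compare two snapshots mapping domain -> set of IPs and report what changed."""
    changes = {}
    for domain in sorted(set(previous) | set(current)):
        before = previous.get(domain, set())
        after = current.get(domain, set())
        added, removed = after - before, before - after
        if added or removed:
            changes[domain] = {"added": sorted(added), "removed": sorted(removed)}
    return changes


# Hypothetical snapshots taken one day apart.
yesterday = {
    "monitored-one.example": {"192.0.2.10"},
    "monitored-two.example": {"198.51.100.7"},
}
today = {
    "monitored-one.example": {"192.0.2.10", "192.0.2.11"},
    "monitored-two.example": {"198.51.100.7"},
    "monitored-three.example": {"203.0.113.5"},
}
for domain, change in diff_resolutions(yesterday, today).items():
    print(domain, change)
```

Each entry in the output is a candidate lead: a new IP address to pivot on, or a newly seen domain to investigate.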

Boosting Your Analysis

In this blog, we started off using the output of RiskIQ's web crawlers in order to analyze some suspicious web traffic. From that, we were able to identify five different points of interest to start an investigation inside of PassiveTotal. By simply clicking through the infrastructure and making pivots, we were able to surface hundreds of new leads, identify that the campaign was likely still active and create a body of knowledge around the BePush actor's tactics, techniques and procedures.

At PassiveTotal, we pride ourselves on making analysis easy. Our goal has always been to provide analysts with as much data as possible in order to determine if something is malicious. With the RiskIQ data we continue to add to the platform, making connections becomes almost too simple. Analysts can access our data from the web, our API or through one of our growing number of integrations. As always, if you have comments, questions or feedback, send us a message at feedback@passivetotal.org.