Learn from the Penguin Penalty of ConcertHotels.com
We are proud to present the 6th deep dive case study by a Certified LinkResearchTools Professional. We greatly appreciate you sharing this quality piece of work by Michael Levin. Enjoy!
Christoph C. Cemper
ConcertHotels.com Penguin 2.0 Penalty Analysis and Link Audit.
In this case study we will analyze ConcertHotels.com - looking into the reasons why we think this website dropped in search engine visibility after the Penguin 2.0 update.
- Defining Penguin 2.0 Update Consequences
- SEO Visibility
- Negative Changes in Keyword Ranks
- Drop for “dc ave”
- Drop for “Hotels Asheville”
- Drop for “Arena Hotel”
- Drop for “New Daisy”
- Drop for “Shephards”
- Drop for “Arena Texas”
- Analysing the Reasons behind a Google Penalty
- Unnatural Anchor Text Diversity
- Defining Competitors
- Competition Analysis
- Weak and Spammy Links
- Power*Trust and Power*Trust dom Metrics
- Competition Analysis
- Link Status
- Deeplink Ratio
- Links from Theme-related or Unrelated Websites
- Victim of a Low-quality, Spammy and Infected Link Network?
- “Toxic” Backlinks
Defining Penguin 2.0 Update Consequences
To analyse the impact of the Penguin 2.0 update on concerthotels.com in more depth, let's start with the Searchmetrics SEO Visibility graph, which shows a significant drop after the Penguin 2.0 release in mid-May 2013.
Looking at this graph, we can also assume that concerthotels.com lost a huge amount of traffic. By identifying the keywords that caused this drop in search engine visibility and website traffic, we may find the reason why it happened.
Negative Changes in Keyword Ranks
In order to look more closely at these keywords, we used the SISTRIX tool and ran a “Negative Changes” report. In our particular case, we compared the keyword ranks of 11.03.2012 with those of 03.06.2013.
This is the outcome:
This is impressive: many keywords that ranked in the top 20 in March 2012 dropped dramatically and now rank somewhere between positions 70 and 100.
It is useful to take a few keywords with high search volume and see the keyword drops graphically:
|Keyword||Local search volume in google.com (USA)|
Following the latest SEO news, we can argue that there is quite a wide range of possible reasons for receiving a Google penalty and, as a consequence, such a dramatic drop in rankings.
Let’s look at these reasons individually and in more detail, to identify the main reason for the penalty and to see whether a combination of reasons led Google to penalise concerthotels.com.
Were the backlinks of concerthotels.com so full of money keyword anchors that Google penalised the site?
We can check the anchor text diversity quickly by running a BLP Report, which gives us the necessary data.
As we want a general overview of the backlink anchors, we switch on the “sitewide links filter”, which keeps the three strongest links from each domain and skips the additional sitewide links from its other subpages. Why is this necessary? Search engines do not count each sitewide link separately, so filtering them out makes the results more reliable.
The anchor text diversity in our results does not support the hypothesis that Google penalised concerthotels.com solely because of money keyword backlinks. It is much more probable that this was a supplementary reason rather than the main one. The biggest money keyword anchor of concerthotels.com, “uk hotels”, accounts for 4% of total links, but if we look at the diversification by keyword type, we can clearly see that the links are spread more or less equally across brand, compound and money keywords.
To establish whether there were also other reasons for Google’s penalty, we need to compare these results with those of two competitors. We can define the competitors of concerthotels.com in many different ways, but first it is useful to see what the SEMrush tool can tell us: enter concerthotels.com in the search bar, choose US for US local results and press search.
On doing this, you should see a screen just like the one below:
By clicking on “Competitors” under “Organic Research” on the left-hand side of the screen, and after sorting the results by competition level, we see the following:
To complete the comparison, we’ll use the site with the highest competition level, namely songkick.com, and the one with the most common keywords, namely hotelguides.com.
In order to check the keyword type spread of these two domains, we can run BLP Reports for each competitor:
We see that these competitors have a much higher percentage of money keyword anchors, so it is valid to argue that anchor text diversity was not the main reason why concerthotels.com was penalised.
Another prevailing penalty reason is “weak links”, or so-called “spammy links”, from very weak and untrustworthy domains; this is a common reason why websites find themselves penalised. It is interesting to establish whether the link profile of concerthotels.com contains weak links and, if so, to what extent. We do this by delving deeper into the LinkResearchTools BLP Report that we used to evaluate our hypothesis regarding unnatural anchor text diversity.
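The brand / compound / money split used in this comparison can be approximated outside the tool. The following is only a minimal sketch under our own assumptions: the function names, the term lists and the sample anchors are hypothetical, and the word-matching rule is a simplification, not the classification that the BLP Report itself applies.

```python
import re
from collections import Counter


def classify_anchor(anchor, brand_terms, money_terms):
    """Label an anchor as 'brand', 'money', 'compound' or 'other'.

    Terms are matched as whole lowercase words, so the brand name
    'concerthotels' does not accidentally count as the money word 'hotels'.
    """
    tokens = set(re.findall(r"[a-z0-9]+", anchor.lower()))
    has_brand = any(term in tokens for term in brand_terms)
    has_money = any(term in tokens for term in money_terms)
    if has_brand and has_money:
        return "compound"
    if has_brand:
        return "brand"
    if has_money:
        return "money"
    return "other"


def anchor_type_shares(anchors, brand_terms, money_terms):
    """Return the percentage of anchors per type (empty dict for no anchors)."""
    counts = Counter(classify_anchor(a, brand_terms, money_terms) for a in anchors)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: 100.0 * n / total for kind, n in counts.items()}


# Hypothetical sample anchors, not taken from the report.
sample = ["ConcertHotels.com", "uk hotels", "concerthotels uk hotels", "click here"]
shares = anchor_type_shares(sample, brand_terms=["concerthotels"], money_terms=["hotels"])
```

Running the same function over the anchors of each competitor would give a comparable breakdown, as long as the term lists are adapted per domain.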
The figures in percent for Power*Trust can be seen by looking at the “Link Profile by Metrics” section and clicking on the “Power*Trust” tab and then the “Relative” tab below the graph.
This diagram is very interesting as you can immediately see that the majority, namely 69% of the total links pointing to concerthotels.com have Power*Trust equal to zero! This is very suspicious and at the same time extremely unusual. If we look at the graph again, we also note that concerthotels.com barely has any links from pages with Power*Trust more than 12. But what do Power*Trust metrics tell us?
They rate the quality of the particular page on which the backlink is placed, according to its strength and trustworthiness. To test our hypothesis that concerthotels.com was a victim of “bad” links, we need to look at the spread of the Power*Trust dom metric. But first, what is the difference between Power*Trust and Power*Trust dom? The only difference is that Power*Trust dom shows us the quality of the linking domain rather than of a particular page. If we click on the “Power*Trust dom” tab, we see the following picture:
The spread looks much better here, but the majority of linking domains still have a Power*Trust of zero. This is potentially harmful for any domain and should therefore be avoided.
To test further our hypothesis that bad backlinks were the major reason for the Penguin 2.0 drop, we take a short look at competitors to see whether such a Power*Trust and Power*Trust dom spread is common for entertainment / travel sites.
We begin with the LRT Competitive Landscape Analyzer (CLA) Quick Report:
Type in www.concerthotels.com and in order to fill in the part “URLs To Compare”, click on “Find Competing Pages”:
So that the tool can search for competitors, we type in a keyword; we took it from the URL, namely “concert hotels”. We use the google.com search engine in English and in the USA. After a few seconds, we have a list of 10 competitors:
As we want to focus on competing websites from more or less the same industry, we filter the list manually, so in the end we have the following competitors:
We then focus on the following metrics:
Power*Trust - CEMPER Power*Trust™ determines the quality of a page according to its strength and trustworthiness
Link Status - Follow/nofollow
Deeplink Ratio - Startpage / Deeplink
Let’s look at the “Power*Trust” tab:
This graph provides further evidence of problematic weak links, the theory already suggested by our BLP Report. While the average Power*Trust value of competitors’ links is somewhere between 13 and 21, more than 90% of the links to concerthotels.com have a Power*Trust of 0 or 1.
Another suspicious issue appears when we look at the “Link Status” tab of the CLA Report in more detail: 100% of the links are follow links. This looks particularly artificial and is usually strongly connected with linkbuilding and linkbuying activities, which Google is keen to prevent.
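For readers who want to spot-check the link status of a single linking page by hand, the idea can be sketched with Python's standard library. This is an illustration under our own assumptions (the class and function names and the sample HTML are hypothetical); it only inspects the `rel` attribute of anchors and is not how the CLA Report gathers its data.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class _LinkStatusParser(HTMLParser):
    """Counts follow and nofollow anchors that point at one target domain."""

    def __init__(self, target_domain):
        super().__init__()
        self.target_domain = target_domain
        self.follow = 0
        self.nofollow = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        host = urlparse(attributes.get("href") or "").hostname or ""
        # Accept the bare domain and any subdomain such as "www.".
        if host != self.target_domain and not host.endswith("." + self.target_domain):
            return
        rel_values = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.nofollow += 1
        else:
            self.follow += 1


def count_link_status(html, target_domain):
    """Return {'follow': n, 'nofollow': m} for links to target_domain."""
    parser = _LinkStatusParser(target_domain.lower())
    parser.feed(html)
    parser.close()
    return {"follow": parser.follow, "nofollow": parser.nofollow}


# Hypothetical page snippet for illustration only.
sample_html = (
    '<a href="http://www.concerthotels.com/">hotels</a>'
    '<a rel="nofollow" href="http://www.concerthotels.com/venue">venue</a>'
    '<a href="http://example.org/">elsewhere</a>'
)
status = count_link_status(sample_html, "concerthotels.com")
```

A natural profile would normally show at least some nofollow links across many such pages, which is why a share of exactly 100% follow links stands out.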
Looking at the “Deeplink Ratio” comparison, we can be fairly sure that unnatural linkbuilding was in play. While competitors have on average a 98% to 2% deeplink to startpage link ratio, the ratio of concerthotels.com looks almost exactly the opposite: 12% to 88%, which could be evidence of manual link spam activities.
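The deeplink to startpage ratio itself is simple arithmetic over the target URLs of the backlinks. As a rough sketch, assuming that a "startpage" link is one whose target has an empty path and no query string (our own simplification, with hypothetical function names and sample URLs):

```python
from urllib.parse import urlparse


def is_startpage(url):
    """True if the URL points at the site root, e.g. http://example.com/."""
    parts = urlparse(url)
    return parts.path in ("", "/") and not parts.query


def deeplink_ratio(target_urls):
    """Return (deeplink_percent, startpage_percent) for a list of link targets."""
    if not target_urls:
        raise ValueError("at least one target URL is required")
    total = len(target_urls)
    startpage = sum(1 for url in target_urls if is_startpage(url))
    deep = total - startpage
    return (100.0 * deep / total, 100.0 * startpage / total)


# Hypothetical targets: seven startpage links and one deeplink.
targets = ["http://www.concerthotels.com/"] * 7 + [
    "http://www.concerthotels.com/some/venue-page"
]
ratio = deeplink_ratio(targets)
```

With this made-up sample the result is 12.5% deeplinks to 87.5% startpage links, which is close to the shape of the profile described above.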
Could it be that concerthotels.com has a lot of weak links, but most of them are natural and come from theme-related websites? It’s definitely worth looking at this hypothesis in further detail. It is important to bear in mind that concerthotels.com offers accommodation near major concert venues and could therefore be classified as an entertainment or travel website. We can use our BLP Report again, more specifically the “Link Profile by Metrics” section of the report. Click on “More” in the menu and choose the submenu “Theme”.
The results are as follows:
Wow! There are two things that need to be highlighted here:
- 17% of the links to concerthotels.com come from harmful domains flagged as infected with malware or viruses, which is a vast amount.
- It would make sense for links to an entertainment/travel site to come naturally from similar sites in these two categories. Instead, we see only a minority of links coming from travel sites (6%) and only 15% (even less than from malware sites) from entertainment sites.
We have now discovered that concerthotels.com has lots of backlinks from malware sites or from sites unrelated to the entertainment / travel topic. But how can one website have such a large percentage of harmful links in its backlink portfolio? Is this an accident? It is important to study this in more detail and see where the links are coming from.
Let’s assume that concerthotels.com accidentally acquired all of the malware links. What would the diversity of the IPs from which the links are coming look like? They would be more or less equally spread among many different IPs. Our beloved BLP Report can deliver this information as well. Click on "More" and then click on "IP Popularity":
Almost 20% of all links, or 289, come from one single IP. As we used the sitewide filter with a maximum of three links per domain, these links cannot come from a single website; they must come from at least 97 different domains (289 / 3, rounded up) sharing the same IP address. This is called a “low quality spammy link network”. Furthermore, have you worked out what most of these domains have in common apart from their IP address? The answer is only one click on the IP away. Click on 18.104.22.168 and look at the updated table below:
Nearly all of the domains located on IP 22.214.171.124 are malware sites and therefore harmful. Moreover, almost all of them are completely unrelated to the English-language concerthotels.com site and come instead from the Polish .pl TLD. If we look at the TLD breakdown (tab “TLD”) and country popularity (tab “Country Popularity”), we obtain further proof that the .com website concerthotels.com has a huge number of unrelated and harmful Polish links:
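The lower bound on the number of domains behind one IP follows from dividing the link count by the per-domain cap of the sitewide filter and rounding up. The sketch below shows that calculation, plus a simple way to count distinct linking domains per IP from an exported link list. The function names and the sample rows are hypothetical, and the IP addresses are reserved documentation addresses, not data from the report.

```python
import math
from collections import defaultdict


def min_domains(link_count, max_links_per_domain):
    """Smallest number of domains that can account for link_count links."""
    return math.ceil(link_count / max_links_per_domain)


def domains_per_ip(links):
    """Count distinct linking domains per IP.

    links: iterable of (domain, ip) pairs, one pair per backlink.
    """
    by_ip = defaultdict(set)
    for domain, ip in links:
        by_ip[ip].add(domain.lower())
    return {ip: len(domains) for ip, domains in by_ip.items()}


# 289 links with at most three links per domain.
lower_bound = min_domains(289, 3)

# Hypothetical export rows for illustration only.
rows = [
    ("a.example", "192.0.2.1"),
    ("a.example", "192.0.2.1"),
    ("b.example", "192.0.2.1"),
    ("c.example", "198.51.100.7"),
]
concentration = domains_per_ip(rows)
```

Many unrelated domains resolving to one address, as in the first sample IP here, is the pattern that suggests a link network rather than accidental links.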
It is essential that we look at these harmful or so called “toxic” backlinks further and analyse them in more detail.
Another possible reason for being penalised could be these so-called “toxic” backlinks, which are considered harmful for a website. These are, for example, links from de-indexed websites (since 2012, Google has been constantly de-indexing link networks and link farm websites) or from malware or virus infected sites. If a site has too many such links, it will probably receive a Google penalty since, according to Google’s ideology, user security is of the utmost importance.
In order to find out how many “toxic” links concerthotels.com has, we ran an LRT Link DTOX report:
Type in www.concerthotels.com, choose the option stating that we don't know whether the site received an "unnatural link warning" from Google, and select "Classic Mode" for checking existing backlinks.
Using the outcome of this report, we can see links from de-indexed sites as well as from infected sites. Already at the general overview part of the report results, we see the following picture:
From this we can of course assume that concerthotels.com has a lot of harmful links, but in order to differentiate between link types we have to filter the results by two of the rules: “TOX1” for links from de-indexed sites and “TOX2” for links from infected sites. Additionally, we see that only 42% of total links are identified as “healthy”. One can argue that a normal website should have at least 50% healthy links, but this of course does not guarantee that Google will not penalise it.
So we start with links from de-indexed sites and choose the “TOX1” filter in the “Rule” Column:
We can clearly see that 99 links come from de-indexed sites and are therefore especially harmful for concerthotels.com. We can also see that many of those links have money keywords as anchor text.
Let’s look at one example link that is recognised as toxic according to the TOX1 rule:
We click on the link and get the following page:
The page looks very unnatural and particularly awful. It is a simple link list on a free host, so definitely spam.
The chart below shows how many links from infected sites concerthotels.com has:
This time we have even more links, namely 158 links coming from malware or virus infected sites. We can see that most of the backlink sources are Polish .pl domains and most of the anchor texts are compound keywords (money keyword + brand).
Let’s look again at one link example from a malware site:
This is clearly a spammy page; moreover, it is infected with malware or a virus and is therefore a "no go" for Google.
What conclusion can we derive from this data?
You could say that this is a possible example of negative blackhat SEO, where competitors spam the site with such harmful links. Given the anchor text diversity, however, it looks more like poor SEO work on behalf of concerthotels.com itself. Surprisingly, we can see many of those toxic sites among the link sources of another site, chepoair.com, which was also hit by Penguin. So are those spammy sites part of one link network?
A total of 257 toxic links (99 from de-indexed sites plus 158 from infected sites), or about 18% of all links, is very high and is probably another important reason why concerthotels.com was hit by the Penguin update.
But what are those 41% of links that were automatically classified as suspicious? In more detail, they are as follows:
As 602 links are classified as suspicious, we do not have the time to verify each of them here, so let’s just pick a few URLs and check why DTOX classified those links as unhealthy.
We’ll start with this page:
It is crucial to bear in mind that the website in question deals with criminal defence issues in Phoenix, and it is this unrelated topic that is suspicious. Moreover, the site consists of nothing more than different articles on unrelated topics with outgoing links.
Furthermore, what we see here is an article with two outgoing links that have similar money keyword anchors. While the second link goes directly to the Mercy Church subpage on concerthotels.com, the first link goes to http://ellanichols.blogspot.com/2013/01/mercy-church-san-luis-obispo-reaches.html and this is what you see if you click on it:
This is what we expected to see: a short text with the same Mercy Church anchor, deeplinking to a concerthotels.com subpage. This is called a Linkwheel. We assume that the first article is a good example of linkbuying activities, for which Google is currently penalising sites.
Is it possible to find other similar structures or was this just one, single case? Let’s check this url:
The structure looks exactly the same: the website consists of numerous articles, and our particular article again has two links with the same anchor. Just as in our first example, the second link is a deeplink to concerthotels.com and the first one links to http://www.tumblr.com/tagged/mercy-church-san-luis-obispo
A miniblog in tumblr relates to Mercy Church and again links to concerthotels.com:
This is just another Linkwheel with the same structure as in our first example. This definitely seems to be another important reason for the Google penalty, as Google "hates" linkbuying. The following url: http://www.90daypowersuccess.com/experience-the-sacrosanct-and-spirit-of-universe-god-through-mercy-church-san-luis-obispo/
may also show suspicious links:
This looks like another bought posting but this time it only has one external link with the same money keyword anchor linking to concerthotels.com.
If we check a few pages with another anchor, for example "uk hotels", which has a 4% share of the total backlink portfolio, will we find something these links have in common? We begin with the url: http://www.diverselist.com/Travel-Tourism/Hotels-00146.htm
This website is nothing more than a very weak link directory. Such directories were often set up to artificially inflate link popularity and/or to sell links. While many SEOs filled up link profiles with these types of links for years, this is not recommended these days.
We continue and check another backlink source:
This is also a typical link directory; however, there is something else to notice: here we have exactly the same description as on the previous screen, together with the same anchor text. This means that the same texts for concerthotels.com were submitted to numerous link directories, and it is quite realistic that such behaviour will be penalised by Google.
All these examples show that we have identified another reason, and probably one of the most important ones, why concerthotels.com was penalised by the Penguin 2.0 update.
Our findings on the possible reasons behind the Penguin 2.0 penalty are that a dangerous mix of factors came together, resulting in a Google penalty and a drop in rankings for concerthotels.com keywords. Our research uncovered the following:
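Identical directory submissions like these can also be found systematically rather than by eye. A minimal sketch, assuming we have already collected the source domain, anchor text and description for each directory listing (the tuple layout, the function names and the sample entries are all hypothetical):

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def duplicated_submissions(entries):
    """Find anchor/description pairs that appear on more than one domain.

    entries: iterable of (source_domain, anchor, description) tuples.
    Returns {(anchor, description): [domains...]} for repeated pairs only.
    """
    groups = defaultdict(set)
    for domain, anchor, description in entries:
        groups[(normalize(anchor), normalize(description))].add(domain.lower())
    return {
        key: sorted(domains)
        for key, domains in groups.items()
        if len(domains) > 1
    }


# Hypothetical listings for illustration only.
listings = [
    ("directory-a.example", "UK Hotels", "Hotels near concert venues."),
    ("directory-b.example", "uk hotels", "Hotels  near concert venues."),
    ("directory-c.example", "concert tickets", "Something else entirely."),
]
repeats = duplicated_submissions(listings)
```

Any pair that shows up on several directories is a candidate for the kind of mass submission described above and deserves a manual look.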
- Weak links from low-quality pages and domains having low Power*Trust & Power*Trust dom
- Unnatural follow / nofollow link ratio: concerthotels.com has 100% follow links, which is extremely suspicious
- Deep link ratio: concerthotels.com has a 12% to 88% deep link ratio, meaning that 88% of total backlinks point to the start page and only 12% are deep links. This is clear evidence of linkbuilding activities, since the average ratio among competitors is 98% to 2%
- Links from malware or unrelated theme domains: concerthotels.com has around 18% of its links from malware or virus infected sites and around 50% from unrelated theme websites.
- Heavy linkbuying activities combined with numerous backlinks from link directories as well as money keywords as anchor texts have been a cause of this penalty.
What could be recommended to a website having such problems as concerthotels.com?
First of all we have to decrease the penalty risk in several ways:
- The first and maybe easiest way to do this is to disavow the so-called toxic links using the Google disavow tool. The LRT DTOX Report also has an option to export all toxic links in the format required by Google. Here is the whole procedure step by step:
1. After we filter all the toxic links, export the links by clicking "Google Disavow Links":
2. You should then see the following screen with a popup asking you to review the chosen links carefully. Since these links were classified as toxic automatically, there is always a chance that some of them were classified by mistake:
After you accept the message, the download of the export file should begin automatically and you should see the following screen:
Click on the link and you will be redirected directly to the disavow tool, where you have to upload the export file. The same procedure can be applied to suspicious links, that is, links that result from linkbuying activities or from manual or automatic link directory submissions.
- The next recommendation would be to get as many links as possible removed manually from low-quality pages and domains, along with as many links bought offline as possible.
- In order to change the follow/nofollow as well as deeplink ratio, it is recommended to either change the target URL and the link status of existing links or to acquire new deep links and nofollow links.
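For the disavow step in the first recommendation, it helps to know what the upload file looks like. Google's disavow tool accepts a plain text file with one URL or one `domain:` entry per line, and lines starting with `#` are treated as comments. The sketch below builds such a file from a reviewed list of toxic URLs. It is our own illustration with hypothetical function names and sample URLs, not the export routine of the DTOX Report.

```python
from urllib.parse import urlparse


def build_disavow(toxic_urls, whole_domain=True, note="reviewed manually"):
    """Return the text of a disavow file for the given URLs.

    whole_domain=True writes 'domain:example.com' entries, which covers every
    link from that domain; False writes the individual URLs instead.
    """
    lines = ["# " + note]
    seen = set()
    for url in toxic_urls:
        if whole_domain:
            host = urlparse(url).hostname
            if not host:
                continue  # skip anything that is not an absolute URL
            entry = "domain:" + host
        else:
            entry = url
        if entry not in seen:
            seen.add(entry)
            lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical toxic URLs for illustration only.
toxic = [
    "http://spam.example/links.html",
    "http://spam.example/more-links.html",
    "http://Malware.example/page",
]
disavow_text = build_disavow(toxic)
```

Disavowing at domain level is usually the safer choice for link networks and malware sites, since new pages on the same domain keep appearing; individual URLs make more sense when only one page on an otherwise healthy site is a problem.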
Final note: if something in this post reads as if the low-quality or spam links were created by the owner, that is a misunderstanding. Based on the publicly available data, you cannot tell whether spammy links were built by the current site owners, past site owners or even a competitor (as part of a Negative SEO attack).
We wish you good Google health!
This case study was written by Michael Levin, CLRTP and was reviewed and approved by Christoph C. Cemper for publishing as Certification work for the Certified LRT Professional level.
A word from Christoph C. Cemper
Michael was one of the first LRT Associates and I'm happy to approve him for the Certified LRT Professional level today! Michael showed proficiency in doing a backlink profile audit using the LinkResearchTools and pointed out critical issues to fix. Therefore I'm happy to certify Michael as the latest Certified LRT Professional by approving and publishing his research on our site.
This is Michael's next step towards the Certified LRT Xpert level, which is a prerequisite for the Certified LRT Agency certification. Both will qualify him to receive consulting leads from us. Our goal is to provide our community and clients with a high quality service, and our certified experts are key to that.
I look forward to future work by Michael on his way. I can recommend Michael to work with you whenever you get a chance!