A Penguin Penalty Recovery Story - In Review
icelolly.com suffered a dramatic Google penalty in May 2013 with the Google Penguin 2.0 update. We analyzed that penalty in detail, just as we have many other Google penalties covered here on LRT, to get a deep understanding of Google Penguin and other penalties and to improve our products and services.
LRT Certified Xpert Derek Devlin did that original analysis and three years on has been given exclusive access to re-analyze the work undertaken in-house by the team at icelolly.com. Thanks to this new detailed case study from Derek, we can peel back the curtain and get unique insights into the actions taken in-house, their disavow and clean up process, the real impact on organic traffic and ultimately, we get to see how icelolly.com not only recovered from the Penalty but also improved their organic traffic way beyond their old traffic levels.
Thanks to the cooperation and transparency of the team at icelolly.com, we can now present you with a fully featured follow-up case study of all the work they did, including:
- Historical data from Google Search Console and Google Analytics
- The impact on the traffic of icelolly.com
- How icelolly.com managed to achieve a turnaround in organic traffic
- How the Google Penguin 3.0 update helped icelolly.com
- How their link graph changed following clean up
- How icelolly.com’s Disavow file looked
- Many more learnings, details and insights
Learn from the actions and drastic changes that the team at icelolly.com took. Follow their path on the way to recover from a penalty, protect from new penalties and build new organic traffic.
Read on and see what icelolly.com did to get rid of a Penguin penalty:
- 2013 Case Study Findings
- Impact of Penguin 2.0 on Organic Search Visibility
- icelolly.com Disavow File
- How has the Link Graph Changed Following Clean Up
- Findings & Conclusion
We look forward to your feedback and help spread the word on this!
- Enjoy & Learn!
Christoph C. Cemper
and the team of
In 2013, I published an investigation that assessed the impact of Google’s Penguin 2.0 Algorithm on the popular UK travel website – icelolly.com. My analysis discovered that icelolly.com had been engaged in link building tactics, which were not compliant with Google’s Webmaster Guidelines. As a result, the site sustained a 67% drop in search visibility and a 28% drop in Organic Traffic.
Three years on, I was given unfettered access to icelolly.com’s internal data and have been able to compare today’s position with that of 2013, looking at the tactics currently employed, the link clean-up undertaken and the Organic Performance over time.
icelolly.com worked diligently over the last three years to build their brand and to make their site 100% compliant with Google Webmaster guidelines. The changes made have been far-reaching and aggressive. Their disavow file contains more than 11,000 domains, and LinkResearchTools (LRT) found that 77% of the live links pointing at the site are now disavowed. This is an astonishing move by a company with a significant organic market share and showed their intent to right the wrongs of old legacy SEO tactics.
There was much to risk in removing so many links, but it has paid off. icelolly.com’s link profile is now far less toxic than those of its competitors, and the company has achieved two consecutive years of Organic Growth. Organic sessions are now above their pre-Penguin 2.0 level, and the company’s focus on brand and quality content looks set to position them for continued success in the coming years.
Derek Devlin, June 2016
Derek created this research using our Superhero Plan, which allows you to perform professional SEO and backlink analysis for you and your competitor’s sites.
On July 23rd, 2013, I published a Case Study on linkresearchtools.com assessing the impact of Google’s Penguin 2.0 update on the UK Travel price comparison website - icelolly.com.
icelolly.com were chosen because they topped a list of Penguin 2.0 “losers”, published by Econsultancy on June 11th, 2013.
Econsultancy cited that icelolly.com had lost 105,712 clicks, equating to a decrease of 75% search visibility across the queries they analyzed.
I speculated that such a high loss in search visibility coinciding with the Penguin 2.0 update indicated that icelolly.com had most likely been engaged in tactics outside the acceptable bounds of the Google Webmaster Guidelines.
Furthermore, a drop of this magnitude was sure to be hurting icelolly.com’s commercial success at what was a critical time of year for all travel websites.
Almost three years on… icelolly.com have now kindly given me unfettered access to the site’s disavow file data, Google Search Console, and analytics.
In this follow-up investigation, I am going to assess the actions taken by the company to try and recover the site from Penguin and to gauge the degree to which they have been successful.
The methodology remains the same: compare and contrast icelolly.com (factoring in their disavow file) with the top ranking competitors who were not hit by Penguin, focusing on the vertical of “Cheap Holidays”, to assess whether anomalies still exist and to draw hypotheses based on their current search visibility.
First, a re-cap of the issues…
1.1 2013 Case Study Findings
Here is the resulting list of hypotheses for factors that I found could have contributed to the icelolly.com Penguin drop on May 23rd, 2013 based on the fact that the icelolly.com link profile was significantly out of proportion with the high ranking competition that passed through the update unscathed:
- Too many site wide links targeted exact match money keywords (even without site wide links, too many links targeted ‘money anchor text’).
- Too many site-wide links targeting the homepage.
- Not enough ‘brand’ OR ‘other’ keywords used.
- Too many weak links coming from sites that pass LRT Power but very little LRT Trust.
- Too many links from US websites.
- Too many links from weak and low-quality domains with LRT Trust 0.
- Not enough quality links as a percentage of their overall links.
- Too many blog links at the expense of ‘general’ sites.
- Too many links from old school content farm sites and domains that have been de-indexed from Google.
- As a whole, link profile was far more toxic than competitors.
2.1 Indexation - Site: Search
Let’s first get an up-to-date snapshot of icelolly.com’s search visibility as of March 2016.
In July of 2013, I established that Google still had well over 400,000 web pages indexed for the site, so it was clear that icelolly.com had not been completely de-indexed, and my assumption was no manual action was in play, which has since been confirmed by icelolly.com – no manual action was to blame for the drop.
Looking now, Google has 1,870 pages indexed based on a site: search.
The number of indexed pages has dropped from 440,000 to just under 2,000. Clearly, this is a massive decrease. Based on my discussions with icelolly.com, this is not the result of a mass de-indexation by Google but more a reflection of the infrastructure improvements they have made.
Poor canonicalization has been addressed and sections of the old site that were “outdated and thin” have been removed altogether.
2.2 Sistrix VisibilityIndex
Sistrix’s VisibilityIndex indicated a very significant 67% drop in relative Search Visibility resulting from the Penguin 2.0 hit.
Looking at the same visibility index graph now in 2016 reveals that there has been very little movement in search visibility since that fateful hit in May of 2013. In fact, there’s been a small decline of 11% from post-Penguin visibility.
What is clear is that, according to Sistrix’s Visibility Index, icelolly.com has not “recovered” in the true sense of the word, insofar as they have thus far been unable to regain the visibility they enjoyed pre-May 2013.
There was a jump in August of 2014; the only notable Google update at this time was Google’s announcement that HTTPS would become a ranking factor. The increase was short-lived. icelolly.com started a phased migration from HTTP to HTTPS in Dec 2014, where you can clearly see a small decrease followed by a gradual increase until April of 2015, when the full HTTPS migration was complete. Looking at the graph trend, migrating to HTTPS did not in itself result in a step-change improvement in search visibility.
Despite the recent discussion coming from Google that the next iteration of Penguin will work in ‘real-time’, this version of the Algorithm is, to our knowledge, not yet live at the time of writing.
Historically, Penguin has been a binary update, so when Google ran the update, sites that triggered a set threshold of unnatural links were ‘flagged’ and demoted. Unfortunately, sites negatively impacted by Penguin have always had to wait for the next update to roll round before they could assess whether they would have their Penguin ‘flag’ removed. This is a painful process, especially when you consider the last confirmed Penguin update was more than 18 months ago.
Looking at the Sistrix Visibility trend, one could assume that icelolly.com’s Penguin flag has not been removed, since the trend line has stayed largely flat. However, there is another scenario: the flag may indeed have been removed, meaning the site is no longer subject to any Penguin suppression, but visibility has not “recovered” to previous levels because extensive disavowing / link removal has devalued the link profile to such an extent that where the site sits now is a true reflection of its ‘authority’ based on its current standing, and not the result of continued Penguin suppression.
2.3 Keyword Changes Over Time
More context can be added to the search visibility graph by looking more granularly at specific keyword movements. I’m most interested in looking at whether any keywords that dropped at Penguin 2.0 have improved ranking – if specific queries have rebounded, it could indicate that these keywords are free from any Penguin action.
Here are the Top 50 Keywords that lost ranking at Penguin 2.0, with their current rank. Rankings have been taken using Authority Labs rank tracker from Google.co.uk.
Only 11 of the 50 keywords that declined with the Penguin 2.0 update have improved their position. Of the ones that have improved, none ranks higher than position 20, so I would not deem these to be competitive.
Interestingly, the two most overly optimized keywords, “cheap holidays” and “cheap holiday deals”, have improved their position, though the improvements are marginal. “Cheap holiday deals” has been the best performer, moving up 11 places from 37 to 26. However, considering this keyword ranked first before Penguin, position 26 on the third page is not that competitive.
25 of the 50 keywords, all of which still ranked post-Penguin, have since dropped out and no longer rank in the top 150 positions.
Given that the majority of the keyword set is performing worse than at post-Penguin levels, and that there have not been any significant recoveries to page 1 of the SERPs, it is hard to say with conviction that the Penguin flag is no longer active on these keywords.
2.4 Impact on Traffic
Although visibility dropped by 67%, the actual real-terms impact on traffic six weeks after the update was a drop of just under 20%.
A page-path-level analysis shows the sections of the site that evidenced the most significant drops.
The homepage saw the biggest real-terms drop. However, proportionately /resorts/ and /hotels/ saw bigger drops of >40% in Organic sessions. It was not just the homepage that was negatively impacted by the update.
Checking the web archive, it looks like /hotels was removed from the site sometime around June / July of 2014 and 302 redirected to the homepage of the site.
A check of this URL now using the server header check from SEOBook shows that it has a chained 301 redirect to the new HTTPS version of the homepage.
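A chained redirect of this kind can be inspected hop by hop. The following is a minimal Python sketch, not the SEOBook tool itself: it walks a hypothetical in-memory table of responses instead of making live requests, and every URL and status code in `RESPONSES` is an assumption invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical responses: URL -> (status code, Location header or None).
# These values are assumptions for illustration, not real crawl data.
RESPONSES: Dict[str, Tuple[int, Optional[str]]] = {
    "http://example.com/hotels": (301, "http://example.com/"),
    "http://example.com/": (301, "https://www.example.com/"),
    "https://www.example.com/": (200, None),
}

REDIRECT_CODES = {301, 302, 303, 307, 308}


def trace_redirects(url: str, responses=RESPONSES, max_hops: int = 10) -> List[Tuple[str, int]]:
    """Follow Location headers and return every (url, status) hop in order."""
    hops: List[Tuple[str, int]] = []
    for _ in range(max_hops):
        # Unknown URLs are treated as a 404 with no Location header.
        status, location = responses.get(url, (404, None))
        hops.append((url, status))
        if status not in REDIRECT_CODES or location is None:
            return hops
        url = location
    raise RuntimeError("redirect loop or chain longer than max_hops")


chain = trace_redirects("http://example.com/hotels")
# Number of redirects traversed before reaching the final response.
redirect_count = sum(1 for _, status in chain if status in REDIRECT_CODES)
```

In this made-up table the old directory reaches the HTTPS homepage through two 301 hops. Against a live site, the same loop could be driven by real HEAD requests with automatic redirect following switched off.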
It’s a similar story for /resorts; this directory was 302 redirected sometime after the 14th of November, which was the last active snapshot of the page captured by the Wayback Machine.
The next capture, on 16th December 2013, shows the 302 redirect pointing not to the homepage this time but to a new version of the same directory renamed as /destinations.
The Wayback Machine first captures /destinations on 27th November 2013; to me, it looks like a new version of /resorts.
Whether the company took these steps to see if their Google Penalty would be lifted from these pages or not is irrelevant. What is interesting here is that we are effectively able to look back through time to see the impact of 302 redirecting a penalty hit directory to a new directory and to see if changing the name and repurposing the same or similar content is an effective technique for lifting sections of your site out of Penguin suppression.
Conventional wisdom has been that 302 redirects do not pass PageRank (I disagree with this assumption), and so some people have hypothesized that a 302 redirect could act as a dead-end for Penguin because the Penalty would not pass over to the new URL.
For me, this graph proves that assertion to be false:
You can clearly see that at the point at which /destinations is launched the directory is flat-lined at 0 search visibility and has no notable ranking visibility.
However, what is amazing to see is that almost one month after the Penguin 3.0 update /destinations rapidly starts to gain visibility. If you recall, Penguin 3.0 was reported to be rolling out over some “weeks” and so it is very plausible that Penguin 3.0 removed some or all of the suppression from icelolly.com:
Is it a coincidence that icelolly.com’s new directory starts to gain traction rapidly after Penguin 3.0? I think not.
I would speculate that, at least on the directory level for /resorts and, in turn, its new incarnation /destinations, the Penguin flag is more than likely no longer present.
Just as a side note, /resorts now 301 redirects through a chain of two redirects to https://www.icelolly.com/destinations.
/hotels was never redirected back into any other directory of the site, and it appears from looking at the Wayback Machine that /hotels was not resurrected in the same way that /resorts was. A new strategy was developed to focus on “types” of holidays rather than specifically on hotels.
A check of the Wayback Machine reveals that the first capture of /holidays was taken in June of 2013, just one month after the Penguin 2.0 update hit. It seems fair to assume that the new structure could have been in response to the update.
As with /destinations the new directory did not immediately gain SEO search visibility and it wasn’t until Penguin 3.0 that it started to gain visibility.
Here is a second example of a section being suppressed by Penguin and then apparently being “released” at Penguin 3.0.
At this point, I can make an interesting observation.
Given that /holidays did not exist pre-penguin 2.0, it would not have been subject to any specific “targeted action”. I can, therefore, draw the conclusion that until Penguin 3.0 rolled out, any new pages / directories added to icelolly.com would have been suppressed.
This is regardless of whether they were linked to from previously hit URLs via redirects or not. The launch of /holidays shows clearly that new pages / sections of icelolly.com were unable to rank.
Looking at the full year following Penguin 2.0, May 23, 2013, to May 23, 2014, and comparing it to the previous year shows that the sustained impact on Organic visits from Google was a decline of 28.55%:
The next year, May 2014 to May 2015 versus the previous year shows a very nice gain of 50.58%, which intuitively does not feel symptomatic of a site that is continuing to be fully suppressed by Penguin and is contrary to the picture painted when looking at visibility movements in keywords hit.
The year-to-date is also showing double-digit growth based on the previous year, so since 2014 the site has achieved continual year-on-year progress and growth in Organic Traffic.
Finally, if we look at the same period above and compare it to pre-penguin Organic Traffic levels, we can see that May 2015 to the end of Feb 2016 is 30.45% up on the same period leading up to Penguin 2.0.
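To make the year-on-year arithmetic explicit, here is a minimal Python sketch. The session counts are hypothetical round numbers chosen only so that the result matches the 28.55% decline quoted above; the real Google Analytics figures are not reproduced here. The second helper compounds the two published year-on-year percentages to show how a 28.55% fall followed by a 50.58% rise nets out.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from the previous period to the current one."""
    if previous == 0:
        raise ValueError("previous period must be non-zero")
    return round((current - previous) / previous * 100, 2)


def compound(changes_pct) -> float:
    """Net percentage change after applying successive percentage changes."""
    factor = 1.0
    for change in changes_pct:
        factor *= 1 + change / 100
    return round((factor - 1) * 100, 2)


# Hypothetical session counts (assumptions for illustration only).
sessions_year_before = 1_000_000
sessions_year_after = 714_500
decline = pct_change(sessions_year_before, sessions_year_after)

# Published figures: -28.55% (May 2013 to May 2014), then +50.58% (May 2014 to May 2015).
net_vs_pre_penguin = compound([-28.55, 50.58])
```

Compounding the two published changes gives a net gain of roughly 7.6% for May 2014 to May 2015 relative to the last pre-Penguin year, which is consistent with organic sessions ending up above their pre-Penguin level.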
2.5 Visibility & Traffic Summary
- icelolly.com was NOT completely de-indexed by Google and was not subject to a manual action at the time of the Penguin hit.
- icelolly.com dropped 67% in the Sistrix Visibility Index after Penguin 2.0, resulting in an immediate drop of 20% in organic traffic and a sustained impact for the following year of 28% drop year-on-year.
- The site has failed to regain overall search visibility based on the Sistrix Search Visibility Index. However, Organic traffic has increased impressively by 50.58% year-on-year for the year of May 2014 to May 2015 and then again by 11.5% for May 2015 to the end of Feb 2016.
- Unbranded keywords that were dropped by Penguin have in the main not improved; the improvements that exist are very marginal, and none is back to pre-Penguin positions. Most are anchored on page 3 or worse of the SERPs.
- I would speculate that the growth has come from continued development and success of icelolly.com as a brand entity and growth into new verticals where previously icelolly.com had not competed aggressively.
- Though it’s very hard to pinpoint because there is a lot of ‘noise’ in the data, particularly from seasonal trends, it certainly appears that there was an uplift in Organic Traffic around the Penguin 3.0 update.
- It’s possible that the current search visibility is a reflection of the site’s “new authority” score rather than the direct result of any Penguin action on the site, but on balance I feel that icelolly.com is free to rank unimpeded, certainly in verticals previously not demoted.
- The most commercially damaging keyword drops from Penguin 2.0 occurred in the “Cheap Holidays”, “Cheep Holidays” and “Cheap Holiday Deals” keyword verticals and only “Cheap Holiday Deals” is in a higher position now than it was post-Penguin.
icelolly.com is gaining Organic Traffic year-on-year, so from that perspective the site has indeed “recovered”. At least one thing is clear: although the site was hit hard by Penguin 2.0, three years on there is still life after Penguin, even if visibility trackers appear to tell a different story. This should be good news to webmasters who previously felt that attempting to achieve pre-Penguin traffic levels was a hopeless endeavor.
3.0 icelolly.com Disavow File
Based on my observations and discussions with icelolly.com, it’s clear to me that they have been committed to cleaning up legacy tactics of the past and making the site fully compliant with the Google Webmaster guidelines. The new head of SEO was appointed 18 months ago and has implemented a rigorous program of link cleanup and disavow.
In total, 11,787 domains have been disavowed, as well as 14 individual URLs.
I have been given full access to the disavow data for this investigation, which allows me to view the link profile the same way Google does, discounting the links that icelolly.com believe to be unnatural.
I have factored the latest disavow file into LinkResearchTools (LRT), where I can compare the Link Graph as it stands now versus the top competitors in the “Cheap Holidays” vertical to see if any anomalies still exist. I will also be looking at the change since 2013 since this will provide an indication of the work that has been done to try and clean up / rectify the legacy tactics.
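For readers who have not worked with a disavow file, the idea of “factoring it in” can be sketched in a few lines of Python. Google’s disavow format is a plain text file with one entry per line: `domain:` entries, full URLs, and `#` comments. The sample entries below are hypothetical, and treating a `domain:` entry as covering its subdomains is an assumption of this sketch, not a description of how LRT or Google process the file.

```python
from urllib.parse import urlparse

# Hypothetical entries for illustration only; not taken from the icelolly.com file.
SAMPLE_DISAVOW = """\
# low-quality directories
domain:spammy-directory.example
domain:link-farm.example
http://blog.example/post-with-paid-link.html
"""


def parse_disavow(text):
    """Split a disavow file into a set of domains and a set of individual URLs."""
    domains, urls = set(), set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower().startswith("domain:"):
            domains.add(line[len("domain:"):].strip().lower())
        else:
            urls.add(line)
    return domains, urls


def is_disavowed(link_url, domains, urls):
    """True if the linking URL is listed directly or sits on a disavowed domain."""
    if link_url in urls:
        return True
    host = (urlparse(link_url).hostname or "").lower()
    # Assumption: a domain entry also covers its subdomains.
    return any(host == d or host.endswith("." + d) for d in domains)


domains, urls = parse_disavow(SAMPLE_DISAVOW)
```

Counting `len(domains)` and `len(urls)` on a real file gives the kind of totals quoted above, and filtering a backlink export through `is_disavowed` leaves only the links that are still expected to count.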
Let’s dive into LinkResearchTools (LRT).
3.1 Topline Metric Comparison using Quick Domain Compare (QDC)
The Quick Domain Compare (QDC) is a great way to get a bird’s eye view of what is going on with the site and to uncover some initial clues that can help decide where to focus during the more in-depth stages of analysis.
In 2013, the top 4 ranked sites for “cheap holidays” were:
Using the “find competing pages” function to pull in the top ranking competitors as of Feb 2016, the tool returns the latest top ranking sites on Google.co.uk:
Thomson is the only site to have made the top 4 in 2013’s analysis and maintained that to the present day.
I am using the current top ranked sites for “cheap holidays” and not the previous set of competitors because I want, as best as I can, to profile the sites that Google continues to view as the best candidates for top rankings, rather than doing a comparison against sites that have since fallen away from the top spots.
Here is the QDC comparison table of the top 4 domains side-by-side with icelolly.com:
What can we learn from this initial topline analysis?
- Trust is still lower than power.
As we have seen with many of the Penguin 2 victims featured in the previous LinkResearchTools case studies, there is a disparity between LRT Power and LRT Trust for icelolly.com. This is symptomatic of overly engineered links that are passing higher PageRank relative to the trustworthiness of the site the links are found on.
Disproportionate power to trust suggests that there may still be a proportion of low-quality links, with more power than trust, pointing at icelolly.com. This is something we will need to explore as we progress through our analysis.
Interestingly, Thomson, the site that has stood the test of time by ranking top 4 both in 2013 and today in 2016, has an equal power to trust ratio. Expedia, the top ranked site as of the time of writing, has a higher Trust score than Power. This emphasizes the importance of having a highly trusted link profile and how it underpins high rankings.
It should be noted that the Power*Trust score for icelolly.com is identical to 2013.
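The Power versus Trust comparison described above can be expressed as a simple check. This is a rough Python sketch under stated assumptions: the 0 to 10 scale, the treatment of Power*Trust as a plain product, and every score below are hypothetical and are not values taken from the QDC table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainMetrics:
    domain: str
    power: int  # assumed 0-10 scale
    trust: int  # assumed 0-10 scale

    @property
    def power_trust(self) -> int:
        """Combined score, modelled here as a simple product (assumption)."""
        return self.power * self.trust

    @property
    def gap(self) -> int:
        """Positive when power outweighs trust."""
        return self.power - self.trust


# Hypothetical scores for illustration only.
METRICS = [
    DomainMetrics("penalised-site.example", power=6, trust=4),
    DomainMetrics("balanced-site.example", power=6, trust=6),
    DomainMetrics("trusted-site.example", power=6, trust=7),
]

# Domains whose link profile carries more power than trust.
flagged = [m.domain for m in METRICS if m.gap > 0]
```

Only the first made-up domain is flagged, mirroring the pattern described above where a penalised site shows more power than trust while the top ranking sites show parity or a trust surplus.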
- Still the Lowest Number of keywords Ranking
As in 2013, icelolly.com still has fewer keywords ranking in comparison to the other leading sites in the “Cheap holidays” vertical.
- Lower ratio of Site wide links than competition
2013’s analysis found that icelolly.com did not have an excessively high ratio of site wide links relative to competitors. Looking at the site-wide ratio of the site revealed that icelolly.com had not overdone it with the NUMBER of site wide links.
This remains true. A site wide ratio analysis reveals that icelolly.com still has the lowest ratio of site wide links per site of any competitor in the vertical. There is scope to increase site wide links back into the link graph without any concern for appearing as an anomaly.
The site wide link ratio was previously 22 in 2013. This number has more than halved, which means site wide links have either been removed or disavowed.
I noted that the volume of site wide links did not in itself appear to have been a contributing factor in the Penguin hit, but that this did not mean that site wide links were not harming icelolly.com.
My hypothesis was that too many site wide links targeted exact match money keywords and that too many site wide links were targeting the homepage, so I am encouraged to initially see the site-wide ratio decrease. Hopefully, further investigation will show the offending links to have been significantly reduced.
- Referring domains have dropped by more than 400
827 referring domains link in now, versus 1265 in 2013.
N.B.: this is the ad-hoc analysis from LRT and so does not include links from all link sources. It is, however, a relative measure that holds validity. The true number of referring domains is higher.
- Fewer .edu and .gov links than main competitors
icelolly.com had 0 .edu and .gov links in 2013, which was substantially fewer than the main big brand competition.
It was unlikely that the lack of .edu and .gov backlinks had been the cause of a penalty from Penguin 2.0, but it could be a contributing factor towards the lack of LRT Trust attributed to the site, inadvertently causing icelolly.com to lose out when the filter was implemented due to a lack of trust and an over-abundance of power.
There are now three .edu domains linking to the site, so this measure has improved.
- icelolly.com is competing well on Facebook
Except for Thomson, who has a very significant social following, icelolly.com looks to be competing well on social metrics. Although well behind on Google +1s, the site is holding its own for Facebook-based social signals.
3.2 Quick Domain Compare Summary
The quick domain comparison has already provided a great starting block from which to start a deeper analysis.
- Low trust links are likely still present in the icelolly.com link profile.
- icelolly.com has a narrower competitive focus than competitors and ranks for the fewest keywords.
- The ratio of site wide links has decreased and remains the lowest of all the top 10 competitors.
- The link graph has reduced in size, with fewer referring domains found versus 2013.
- icelolly.com has gained three .edu links since 2013.
- icelolly.com has good social metrics and is especially strong on Facebook.
In 2013, all hypotheses concerning factors that could have contributed to the Penguin hit were drawn from cross-examining the link profile of icelolly.com against high performing competitors. This provided context and helped to identify anomalies.
In this analysis in 2016, I am still on the lookout for anomalies, but I’m also interested to gauge how the link graph has changed over time and to assess the degree to which disproportionate ratios have been corrected.
4.1 Anchor Text Diversity
To understand the impact of the links as a whole, I always start my link profile analysis by looking at the link profile in its entirety, so site wide links are NOT restricted.
In 2013 post-Penguin, 48.3% of links (including site wide) were found to have the exact match anchor text “cheap holidays”…
48.3% for one single keyword was very, very high.
Having such a high density of exact match commercial anchor text for one keyword is obviously a very unnatural signal to Google.
Looking now, the word cloud for the site’s anchor text profile looks a lot more in keeping with what you would expect from a more “natural” looking site. Brand anchors are the dominant force and the exact match for “cheap holidays” has disappeared.
Looking at the breakdown of SW Anchor Text, the site URL is the top anchor.
Only two anchor texts stand out as being overtly commercial: “Menorca Hotels”, which has 14 links and represents 0.5% of the link graph, and “cheap holidays | cheap holiday deals | low-cost holidays”, which is slightly higher at 0.7% of links. As you might have guessed, the latter is the meta title of the site’s homepage.
It’s common for sites to attract links based on their metadata, and this instance shows how important it is to be conscious of the approach taken to craft your meta tags. Having overly commercial keywords repeated in your title is a risky strategy if you then get a disproportionately high number of links based on that title. In this instance, we are talking about 0.7%, so it is not in my view a threat, but repeating keywords in the title is no longer best practice, so I would look at testing alternatives.
If you consider that, of the three phrases in “cheap holidays | cheap holiday deals | low cost holidays”, “low-cost holidays” ranks highest at position 6 while the others rank below position 20, I would be tempted to focus the homepage exclusively on “low-cost holidays” and then build out some new pages to target “cheap holidays” and “cheap holiday deals”, since it may be that the homepage is incapable of ranking in the top 20 for those keywords. This is certainly worth a test, since “cheap holidays” and “cheap holiday deals” achieved 130-odd clicks according to Search Console in the last 28 days.
I next set the site-wide links filter to 5 per site to get a more accurate picture of how the site wide links are impacting the overall anchor text profile of the site.
There is nothing immediately concerning about the word cloud. Brand URL links are still the most prominent.
Looking at the detailed breakdown, “Menorca Hotels” and “cheap holidays | cheap holiday deals | low-cost holidays” are diluted further to 0.5%, which is, in my view, an acceptable concentration of these links.
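The effect of restricting site wide links to 5 per site can be illustrated with a short Python sketch. It caps the number of links counted from each referring domain and then recomputes each anchor text’s share of the profile. The link list is hypothetical, and the capping logic is a simple assumption about how such a filter might behave, not a description of LRT’s implementation.

```python
from collections import Counter, defaultdict


def cap_per_domain(links, cap=5):
    """Keep at most `cap` links from each referring domain, in input order."""
    seen = defaultdict(int)
    kept = []
    for domain, anchor in links:
        if seen[domain] < cap:
            seen[domain] += 1
            kept.append((domain, anchor))
    return kept


def anchor_share(links):
    """Percentage of links carried by each anchor text (case-insensitive)."""
    counts = Counter(anchor.lower() for _, anchor in links)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {anchor: round(n / total * 100, 1) for anchor, n in counts.items()}


# Hypothetical profile: one blogroll places the same money anchor on 40 pages.
links = [("blogroll.example", "cheap holidays")] * 40 + [
    ("news.example", "icelolly.com"),
    ("forum.example", "icelolly.com"),
    ("travel.example", "www.icelolly.com"),
    ("review.example", "icelolly"),
    ("guide.example", "cheap holidays"),
]

share_all = anchor_share(links)
share_capped = anchor_share(cap_per_domain(links, cap=5))
```

In this made-up profile the money anchor accounts for about 91% of all links but 60% once each domain is capped at 5, which shows how a single site wide placement can dominate the unrestricted view.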
I can now see how the link profile has changed compared to the keyword ratios found against the top performing competitors.
Previously, in 2013 the results with site wide links INCLUDED showed a significant anomaly:
Not enough ‘brand’ OR ‘other’ keywords had been used – instead far too many ‘money’ keywords!
TIP: Read more about the Keyword Intelligence technology built in LinkResearchTools (LRT) and learn how to classify keywords in "money", "brand", "compound" or "other".
Checking this with site wide links set to 5, to take site wide links out of the equation, still showed a significant overindulgence in money anchor text links:
Far too many money keywords were used!
The percentage of brand keywords increased slightly when site wide links were filtered out, so it was clear that site wide links were playing a part in increasing the money anchor text.
That didn’t hide the fact that, even without site wide links, icelolly.com had massively overdone it with money keyword anchor text links.
We can see just how much they had overdone it when we look at the same graphs illustrated with the ‘absolute’ number of links.
icelolly.com had more than double the amount of keyword links compared to the average for the whole of the top 10 ranking sites in this vertical. Clearly, icelolly.com had been far too aggressive towards building money keyword anchor texts.
This was likely a contributing factor to being hit by Penguin 2.0 in 2013.
By comparison, 2016 shows a much-improved picture.
With SW included, money anchor text has been transformed. Relative to the remaining links in icelolly.com’s link graph, money anchors represent 4%, which is just below the vertical target of 6% - while brand anchors are sitting at 80%, which is much higher than the vertical top 3 average of 45%.
N.B.: LRT shows the graphs without disavowed links but does not recalculate the %.
With site wide links set to 5 per site, brand anchors represent 77% of the remaining link graph of icelolly.com, with money anchors comprising just under 6%.
Looking at the absolute number of links tells a story. It’s here you can see just how many links have been removed / disavowed and the implications of these actions.
icelolly.com is now significantly behind the top ranking competition regarding the volume of links they have active in their link graph. There is scope to increase the number of links in all of the keyword categories. That being said, I would want icelolly.com to continue to focus on attracting brand links. This is where the biggest gulf exists.
4.2 Follow / NoFollow Links
In 2013, I concluded that the ratio of Follow to NoFollow links was in line with competitors and did not represent a significant anomaly.
Site wide links were pushing the percentage of followed links up slightly, but I didn’t see anything overly concerning about that. In fact, icelolly.com had a slightly higher proportion of NoFollow links than competitors, which is sometimes an indicator for the presence of more spammy links like blog comments and bookmarks.
Looking now, we can see the extent to which icelolly.com has aggressively sought to eradicate those unnatural links.
In total, just less than half of all live links are actively disavowed. 12.6% of the link profile that is disavowed are actual ‘NoFollow’ links so the company has taken the view that unnatural NoFollow links also present a threat to the integrity of the site and should be disavowed. This is a view I share as I think it shows clear intent to Google that you are taking action against any and all unnatural links and not just ones you deem to have SEO value.
Learn more about the potential risk of NoFollow links from this research published by Christoph C. Cemper: NoFollow Links – Risky or Irrelevant?
Here you see how the link profile looks now in its entirety: 49% of live links are disavowed, 37% are Follow, and 10% are NoFollow. Redirects are 3% of total live links.
4.3 Link Type
In 2013, the analysis showed that Text link percentage was in line with the competition and redirects were four times higher:
In the main, the active redirects were found to be other TLD extensions as well as possible misspellings of the brand name:
Most of the redirected domains only had between 10 and 20 links, so relative to the total number of links for icelolly.com these represented a very small portion, and I deemed their influence to most likely have not been that significant.
From my initial assessment, I felt that redirects did not look like a major contributing factor to the drop in search visibility.
Looking now in 2016, roughly the same volume of redirects exists:
Assessing the redirects with links returns only ten results coming from 2 referring domains: icelolly.co.uk and the icelol.ly branded URL shortener.
A quick look at the link graph for icelolly.co.uk shows there’s nothing untoward. Just one anchor stands out as likely being unnatural and requiring attention.
Assessing the absolute number of links by Link Type again reveals the gap that now exists between icelolly.com and the competition, particularly in terms of the volume of text links required.
4.4 Deep Link Ratio
When more links point at the homepage than the inner pages, I’m always concerned. This is usually a sign of a heavily engineered link profile.
In 2013, site wide links were skewing the link graph. Homepage links went from 67.8% down to 39% when restricted to 5, which allowed me to infer that site wide links had mainly been pointed at the homepage. It’s possible that this was a negative signal.
Previously, with site wide links set to 5 the deep link ratio looked fine:
However, with site wide links activated, icelolly.com looked like an anomaly.
Based on that analysis, I believed that too high a percentage of site-wide links were targeting the homepage relative to the overall link graph.
In 2016, the deep link ratio metrics are much improved, and more links now point to the internal pages than to the homepage as a proportion of the total link graph.
Even with site wide links included, we can see that a higher proportion of links target internal pages:
Restricting site wide links to 5 per site shows that 55.4% are deep links, versus 44.6% that point at the homepage.
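To make the effect of the site-wide restriction concrete, here is a rough sketch of how a deep link ratio could be computed with and without a cap of 5 links per linking domain. How LRT actually applies the cap is not described here, so the "keep the first five links per domain" rule, the `deep_link_ratio` function and the sample URLs are all assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlparse


def deep_link_ratio(links, max_per_domain=5):
    """Return (deep %, homepage %) after capping links per source domain.

    links: iterable of (source_url, target_url) pairs.
    max_per_domain: keep at most this many links from each linking domain;
    pass None to leave site-wide links uncapped.
    """
    kept_per_domain = defaultdict(int)
    deep = home = 0
    for source, target in links:
        domain = urlparse(source).hostname
        if max_per_domain is not None and kept_per_domain[domain] >= max_per_domain:
            continue  # treat further links from this domain as site-wide repeats
        kept_per_domain[domain] += 1
        # Homepage = empty path or "/"; anything else counts as a deep link.
        if urlparse(target).path in ("", "/"):
            home += 1
        else:
            deep += 1
    total = deep + home
    return round(100 * deep / total, 1), round(100 * home / total, 1)


# Hypothetical data: one site links to the homepage from 20 pages (site-wide),
# while three other sites each link once to an inner page.
links = [(f"http://sitewide.example/p{i}", "http://www.icelolly.com/") for i in range(20)]
links += [(f"http://blog{i}.example/post", "http://www.icelolly.com/holidays/spain") for i in range(3)]

uncapped = deep_link_ratio(links, max_per_domain=None)
capped = deep_link_ratio(links, max_per_domain=5)
print(uncapped, capped)
```

In this toy data set a single site-wide link pushes the homepage share to 87%, and capping it at 5 brings the homepage share down to 62.5%, the same kind of shift described for 2013 above.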
In comparison to competitors, this ratio is looking on track:
Looking at the absolute number of links, both homepage and internal, there is scope to increase both. If they are added in even measure, the site will be in good shape:
4.5 Geographic Location of Links
In 2013, icelolly.com had significantly overdone it with US-based links. Given that the primary target market for icelolly.com is the UK, this was most likely NOT a positive signal.
69.9% of links were from sites hosted in the USA, only 21.7% were from the UK:
Even with site wide links restricted, I could see that Icelolly.com had overdone it with US links:
Now in 2016, we can see that the ratios have drastically improved:
The UK is now the dominant location for linking domains.
Looking at the absolute graphs shows that the number of links now coming from UK based sites is double that of the US sites.
Judging by the amount that US-based links have reduced, it’s fair to assume that these links were in the main unnatural, because a very significant majority of them have been removed or disavowed.
Although icelolly.com has a lot of headroom to grow US based links, I would recommend that they continue to focus on the UK market, as this is sure to provide a very strong signal that they are an authority in the UK travel sector.
Link networks were not found to be prevalent in 2013, and this was ruled out as a likely contributing factor for Penguin in this case. Any lack of diversity in the IP address profile could be explained and appeared reasonable.
I want to see if this IP diversity remains strong:
Looking at the Class C Popularity report, I see the icelolly.com misspelling variations are still hosted on the same IP address as before, which is perfectly reasonable. However, they are no longer the dominant IP address linking to icelolly.com.
The top hit is 126.96.36.199, which hosts local news sites that are all owned by the NewsQuest Media Group:
To a search engine, having 21.9% of your link profile coming from 121 domains all on the same IP address most likely looks a bit suspicious. The saving grace is that these domains are all trusted local news entities and the article icelolly.com is linked from has just been syndicated across their local publications. These links were all earned, and so I see no reason to disavow them.
The second top IP address linking in is coming from blogger blogs:
That being said, having disavowed a very significant portion of their link profile due to unnatural links, the links that are left could do with more diversity in IP address.
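The idea behind a Class C popularity check can be sketched in a few lines: group linking domains by the first three octets of their IPv4 address and see what share of the profile each block holds. This is only an illustration of the concept, not LRT's implementation. The `class_c_popularity` function, the domain names and the documentation-range IP addresses are hypothetical.

```python
from collections import Counter


def class_c(ip):
    """First three octets of an IPv4 address, e.g. '192.0.2.17' -> '192.0.2'."""
    return ".".join(ip.split(".")[:3])


def class_c_popularity(domain_ips):
    """Share of linking domains per Class C block, largest first.

    domain_ips: mapping of linking domain -> IPv4 address it is hosted on.
    """
    counts = Counter(class_c(ip) for ip in domain_ips.values())
    total = len(domain_ips)
    return [(block, round(100 * n / total, 1)) for block, n in counts.most_common()]


# Hypothetical hosts using documentation-only IP ranges.
domain_ips = {
    "localnews-a.example": "192.0.2.10",
    "localnews-b.example": "192.0.2.10",
    "localnews-c.example": "192.0.2.44",
    "travelblog.example": "198.51.100.7",
}
popularity = class_c_popularity(domain_ips)
print(popularity)
```

A block that holds a large share of the profile is not automatically a problem, as the syndicated local news example shows, but it is a useful prompt to check who owns the domains behind it.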
4.7 High-Risk Links
icelolly.com’s link profile was previously flagged as “Very High Risk” in 2013:
With only 24% of links marked as “healthy”:
It should be noted that in the last three years, Link Detox has been refined, and toxic link detection has improved greatly. It’s very likely that the Link Detox of today would have calculated a much higher score than 671.
Looking now in 2016, the domain-wide Detox Risk has reduced to 534 and is now classed as below average.
You can see that Detox found that 8,630 links had been disavowed – 77% of the live links found. This is higher than our initial analysis because with Detox I have been able to import many more links into the system for consideration.
Taking disavowed links out, here’s how 2016 looks now:
Only 6.1% of the link profile is scored as being “high risk”. Around 10% are average or above average, and the remaining 84.2% are either below average, low or very low risk. From a toxic links standpoint, this is a great result and a testament to the rigorousness of the link cleanup undertaken.
Comparing the rules triggered by Link Detox to competitors showed clearly where the anomalies were:
Links from de-indexed sites (TOX1) showed the biggest discrepancy, followed by weak links (SUSP1 and SUSP4).
4.7.1 Links from De-indexed Sites
One of the most significant issues in my view from 2013 was the prevalence of links coming from domains that had been de-indexed from Google. It’s my view that this is one of the most damaging signals, since it’s a clear indicator of links coming from “bad neighborhoods”. These are most likely sites that have themselves been penalized for Blackhat SEO techniques.
A massive 23.2% of links were from sites de-indexed by Google (543 domains). My analysis showed these to be primarily content farms:
In 2016, no such links exist. All links from sites de-indexed by Google have either been disavowed or removed:
4.7.2 Weak & Low-Quality Links
In 2013, more than half of the unhealthy links (60.3%) triggered rules indicating that the linking domain was very low quality.
33.4% of links were coming from domains that had no links themselves (SUSP1). These are virtually worthless links as they have 0 LRT Power and 0 LRT Trust.
26.9% of links were from domains that are weak because they have little power and trust, and they also don’t rank for their title (SUSP4).
This explained the lack of earned LRT Trust for icelolly.com.
Today, the proportion of very weak links is much lower, though the number found for SUSP1 this time around is higher.
20.6% for SUSP1, down from 33.4% and just 1.9% for SUSP4, which is a massive improvement on the 26.9% found in 2013.
4.7.3 Relative Link Power
In 2013, there was far too high a proportion of LRT Power 0 links; not enough LRT Power 2, 3’s, 4’s or 5’s as a proportion of total links:
Looking at link POWER pointing back to icelolly.com in 2016 shows the extent to which LRT Power 0 links have come down:
There is still a discrepancy in the mid-level power links: LRT Power 2’s, 3’s and 4’s are some way behind competitors.
4.7.4 Relative Link Trust
TRUST was the same. Too high a proportion of LRT Trust 0 links; not enough LRT Trust 1, 2, 3’s, 4’s or 5’s as a proportion of total links.
Looking at link TRUST on its own in 2016 shows a much-improved picture again:
Trust 0 links have reduced from 78% of links to just 7%, still a little higher than the top 3 competitors but much more in keeping with the Top 5.
4.7.5 Relative Link Power*Trust
Looking at LRT Power*Trust combined, it was the same story as before. In 2013 there was far too high a proportion of LRT Trust 0 links and not enough LRT Trust 2, 3-4, 5-7 or 8-12’s as a proportion of total links:
In 2016, Power*Trust 0 links remain higher than the top 3 ranking sites as a proportion of Icelolly’s link profile, but they have still significantly reduced.
4.7.6 Absolute Link Power*Trust
We get a better view of what has changed by looking at the absolute numbers of links.
In 2013, icelolly.com dwarfed the competition in the volume of low-quality links.
In 2016, the number of Power*Trust 0 links has been reduced to an acceptable level, below that of the top 3 ranked sites in the vertical.
The sobering insight here is the gulf in high authority links.
The priority now has to be on earning links from more established and higher authority sites.
4.7.8 Link Detox Rule Comparison
icelolly.com now sits below competitors on all spam rules in terms of the volume of links present:
Looking at the relative proportions of the Link Detox Rules shows that SUSP7 and SUSP8 are a little higher than we would like to see, as are SUSP22 and SUSP24.
SUSP7 (domain has the same IP as other linking domains) and SUSP8 (domain has the same Class-C as other linking domains) are no surprise, since I have already uncovered that IP diversity has suffered as a result of large scale link cleanup and disavow.
SUSP22 and SUSP24 are also rules related to the detection of link networks. They look for footprints in domains that link to your site and, as with the lack of IP diversity, I’m sure they have been triggered due to the high concentration of local news sites and Blogger blogs linking.
4.8 Links from Malware Sites
Although they were never excessively high, I’m pleased to see that TOX2 links from malware-infected sites are now at 0, down from 12 previously.
A longtime favorite of Blackhat SEOs was using mass-submission software that automated the process of uploading articles and links to directories.
In 2013, 2.5% of unhealthy links triggered SUSP15 & SUSP16:
4.10 Overall Link Risk Profile Versus Competitors
Comparing icelolly.com’s overall Link Risk profile versus competitors back in 2013 showed that the average link risk was “Very High” when they needed to fall into the moderate risk bucket to blend in with competitors. Here is what the Competitive Link Detox (CDTOX) results pointed out:
Suspicious links were not abnormally high as a proportion of the overall link profile. The key problem area was that there were far too many toxic links and not enough healthy links:
The “very low” risk links weren’t high enough, whereas the “high” and “very high” risk links were over-powering the link graph:
I concluded that icelolly.com’s link risk profile as a whole had most likely contributed to the drop in rankings: there were just too many poor quality link signals from bad neighborhoods and weak sites, created for the sole purpose of passing PageRank.
Let’s see how this has changed…
Now the inverse is true. Competitors are returning average Detox Risk scores way above icelolly.com.
Sitting at 260, icelolly.com has the lowest score in the vertical.
Drilling down in the metric comparison, we see the extent to which above average and high-risk links have been eradicated from the link graph. Very low-risk links now dominate.
4.11 Link Velocity Trend
Link Velocity refers to the rate at which a site gains or loses links.
Essentially, we are trying to ascertain whether links are being built at the same rate as competitors or whether there has been some surge in activity either positive or negative that could make the site stand out from the crowd and be targeted by a Google algorithm.
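As a simple illustration of the concept, link velocity can be expressed as the net number of links gained or lost per month. The sketch below assumes each link record carries a first-seen date and an optional lost date. The `monthly_velocity` function and the sample dates are hypothetical and do not reflect how LRT or Ahrefs calculate their trend data.

```python
from collections import Counter
from datetime import date


def monthly_velocity(links):
    """Net change in link count per month.

    links: iterable of (first_seen, lost) date pairs; lost is None while
    the link is still live.
    Returns {"YYYY-MM": gained - lost}, sorted by month.
    """
    net = Counter()
    for first_seen, lost in links:
        net[first_seen.strftime("%Y-%m")] += 1
        if lost is not None:
            net[lost.strftime("%Y-%m")] -= 1
    return dict(sorted(net.items()))


# Hypothetical link history, not real icelolly.com data.
links = [
    (date(2013, 3, 4), None),
    (date(2013, 3, 18), date(2013, 6, 2)),
    (date(2013, 3, 25), date(2013, 6, 20)),
    (date(2013, 5, 1), None),
]
velocity = monthly_velocity(links)
print(velocity)
```

Running the same calculation for each competitor and comparing the monthly figures side by side is what reveals whether a surge or a sudden stop stands out from the rest of the vertical.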
My 2013 analysis showed that March of 2013 was the time when icelolly.com were at their most aggressive in terms of link acquisition, which wasn’t so significant because competitors were also highly active at this time.
Post Penguin 2.0, half of all sites in the top 10 have cut their link building efforts significantly while the others have kept on the gas at a similar pace as before. Presumably, the top sites have been buoyed by their success and given renewed confidence in their tactics.
For icelolly.com, link building in June 2013 immediately after the update ground to a stop as visualized by the lighter areas in the table.
Since 2013, the link graph has decreased in size, which would indicate that new links are not being acquired at the same speed that they are being lost.
This graph from Ahrefs shows the link trend over time; the blue line is total referring pages so this accounts for site wide links, and the yellow line is referring domains.
We can see that 2013 and 2014 saw significant declines in both referring pages and referring domains. Referring pages have stayed roughly static since Sep 2014 but, pleasingly, referring domains have started to turn around. The trend line is up and to the right and continues to grow. This is an indication that the company is back on the right track.
If we look at referring domain Link growth relative to the icelolly.com domain then we see that only Holidaygems is acquiring links at a slower pace.
icelolly.com have some way to go to catch the top competitors in the vertical. Not only are they far behind in the size of the link profile following this extensive cleanup, but they are also not acquiring links at the same pace as competitors. I would recommend that icelolly.com focus on initiatives that will drive growth in links.
In 2013, icelolly.com was hit by a significant Google Penguin update that resulted in a 67% drop in search visibility and a 28% drop in Organic Traffic.
This case study set out to understand the actions taken by icelolly.com to try and correct the anomalies that existed in their link graph as found in July of 2013. I wanted to gauge where the site sits now, what efforts have been made to clean up toxic links and to see if the steps taken have enabled the site to emerge from the Penguin abyss.
Encouragingly, Organic Traffic is up for two years in a row, and Organic sessions are now above the level they were at pre-Penguin. This is despite the fact that none of the top 50 keywords that were hit by Penguin 2.0 have returned to their pre-Penguin levels and only 11 have improved to positions better than those they held just after the update. My assumption is that growth has come primarily from the brand and from exploiting previously untapped keyword verticals, where over-optimization was not an issue.
Penguin 3.0 looks to have been an important update where sections of the site were at least afforded the opportunity to rank; previously, newly launched sections of the site appear to have flat-lined. I would speculate that Penguin 2.0 was not impinging on new sections of content added to the site after Penguin 3.0 ‘released’ some portion of the suppression on the site.
The changes made by icelolly.com have been far-reaching and aggressive. The disavow file contains more than 11,000 domains, and LRT found that 77% of the live links pointing at the site are now disavowed.
Site wide links that had previously negatively influenced icelolly.com’s link graph and disproportionately increased the signals for exact match money anchor texts and homepage link ratio have been dramatically reduced. As a result, the anchor text profile of the site is almost exclusively made up of brand anchors. icelolly.com now has the lowest site-wide links ratio of any site analyzed in the “cheap holidays” vertical at present.
The Deep Link Ratio has been corrected, and the proportion of links pointing at the site now skews slightly towards the internal pages, rather than the homepage.
icelolly.com continues to have a challenge with too many weak links.
Weak links are disproportionately high, but the numbers in real terms have been brought down well below the threshold set by competitors. The link profile is out of proportion with the competition in this regard: there are still too many links from weak and low-quality domains with LRT Trust 0, and not enough quality links, as a percentage of the overall link profile.
The task now is to try and earn links from much more trusted and highly authoritative websites, because top quality links are very sparse and far fewer than those earned by the top ranking competitors.
Taking positive action against the vast degree of unnatural links has had the side effect of leaving a link profile that is significantly reduced in terms of link volume compared to competitors and also weakened in terms of IP diversity. Link networks have not been used, but the concentration of fewer referring IPs has been exacerbated by the link clean up process.
Links coming from the UK now make up the highest proportion of links from any country, and links from the US have been decreased significantly below the levels of competitors. Links from old school content farm sites, link directories, article directories and links from domains that have been de-indexed or infected with malware have also been completely eradicated.
As you would hope with such far-reaching link cleanup and disavowing, icelolly.com's link profile is now far less toxic than those of competitors, providing a good foundation upon which to build. icelolly.com targeted both followed and NoFollow links in the disavow process, and the ratio of followed to NoFollow links remains acceptable and in line with competitors.
Link velocity is a challenge as competitors are in the main earning links at a greater rate than icelolly.com. Encouragingly, after two years of link decline following the Penguin 2.0 Update, the site has started to see an increase in link growth and referring domains are positively increasing over time.
The big challenge here for icelolly.com is to close the gap. The link profile has been dramatically reduced, and rightly so. However, this has left a void between where the site currently sits and where competitors who have never had to go through this process currently sit. They have a distinct advantage that is only going to be made up by building out more long-tail content, broadening the site's reach into new verticals and continuing to focus on building icelolly.com the brand. In the meantime, creative link earning campaigns should continue to be explored.
A word from Christoph C. Cemper
Our LRT Certified Xpert, Derek Devlin conducted and wrote the initial analysis three years ago and I am excited that he not only helped icelolly.com recover from the Google Penalty, but also improve their traffic beyond past levels AND get approval to publish all the details about their success with us.
I am more than happy to publish Derek Devlin's findings and experiences on our site and re-certify Derek as LRT Certified Xpert.
Our goal is to provide our clients with quality service and knowledge. Our LRT Certified Professionals and Xperts are key to achieving this goal.
I look forward to his future work, and recommend Derek Devlin to work with you, whenever you get the opportunity!