
9 useful tips to use ScrapeBox for your SEO workflow




How to use ScrapeBox for link audits, link building, and on-page SEO

Google must hate ScrapeBox. It’s long been a favorite tool of black hat SEOs. But now this infamous tool is finding new life as an excellent timesaver for white hat SEOs.

In this revealing tutorial, LRT’s Xpert and case study extraordinaire, Bartosz Góralewicz, shows us how to use this tool of darkness to make Google very happy with your website.

Ditching Excel is just one of the many great reasons for using ScrapeBox to do the heavy lifting on your next link audit. If you still haven’t audited your sites, please do your sanity a favor and audit your links ASAP.

Going crazy is all too common among us SEOs. Thankfully, tools like ScrapeBox help us put more distance between us and padded walls. I’m sure you will find this tutorial as useful as we did. It’s yet another lifesaver any serious SEO cannot live without.

- Enjoy & Learn!
Christoph C. Cemper




The SEO community seems to be divided into two groups: fans of manual work and link processing with tools like Excel, and fans of tools built specifically to speed up working with backlinks, like Link Research Tools and ScrapeBox. Obviously, I am not a huge fan of Excel myself.

ScrapeBox was one of the first automated Black Hat tools. Nowadays I doubt that anyone is still using ScrapeBox for mass blog commenting. That is not because comment links have lost all their power, but mostly because there are many more advanced Black Hat SEO tools available.

Now, after 5 years, ScrapeBox is making up for all the SPAM issues from the past by helping with link audits, White Hat link building and on-page SEO work.


Why ScrapeBox?

  • ScrapeBox is really cheap: it costs only $57 (or $40 with the BlackHatWorld discount)
  • It is a one-time payment for life
  • It is easy to use


Scrapebox overview

As you see, the tool itself looks quite simple when opened. Now let me explain the most important fields step by step.

Scrapebox overview explained

You can do most of the list processing from the main window above. For some more advanced actions, we need to go to the “Addons” tab.

Scrapebox addons


Processing the link lists

Let’s start with the basics. Each one of you has been in a situation where you were flooded with link reports from a former SEO agency, exports from Google Webmaster Tools, MajesticSEO, Ahrefs and so on. Getting all of this organized has huge value, but only if you can do it within 5 minutes.

Let’s see what we can do with such a situation step by step.


1. Removing duplicate links



  1. Open ScrapeBox
  2. Click “Import URL list”

Scrapebox add url

  3. Click Paste/Add from Clipboard (or “Import and add to current list” if you want to import from a TXT file).
  4. Repeat with each of the lists you have
  5. After you’ve pasted all the URLs, click Remove/Filter

Scrapebox remove duplicates

  6. Click Remove Duplicate URLs

Scrapebox remove duplicates

Now you’ve got a list of unique URLs that you can save and use as a single file for all your reports.
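If you ever need to do the same de-duplication outside ScrapeBox, the operation is trivial to script. Here is a minimal Python sketch of my own (not part of ScrapeBox) that keeps the first occurrence of each URL:

```python
# Equivalent of "Remove Duplicate URLs": keep the first occurrence
# of each URL, preserving the original order of the list.
def dedupe_urls(urls):
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:
            seen.add(url)
            unique.append(url)
    return unique

urls = [
    "http://example.com/page1",
    "http://example.com/page2",
    "http://example.com/page1",  # duplicate
]
print(dedupe_urls(urls))  # only the two unique URLs remain
```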


2. Removing duplicate domains



  1. Open ScrapeBox
  2. Click “Import URL list”

import urls

  3. Click Paste/Add from Clipboard
  4. After you’ve pasted all the URLs, click Remove/Filter

remove duplicate domains

  5. Click Remove Duplicate Domains

remove duplicate domains results

Now you can see the list of unique domains. ScrapeBox will show you a popup window reporting the number of duplicate domains removed.
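The same domain-level de-duplication is easy to script too. This Python sketch (my own illustration, assuming the URLs include a scheme such as http://) keeps the first URL seen for each domain:

```python
from urllib.parse import urlparse

# Equivalent of "Remove Duplicate Domains": keep the first URL
# seen for each domain. URLs need a scheme for netloc to parse.
def dedupe_domains(urls):
    seen = set()
    unique = []
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if domain not in seen:
            seen.add(domain)
            unique.append(url)
    return unique

print(dedupe_domains([
    "http://example.com/page1",
    "http://example.com/page2",   # same domain, dropped
    "http://another.com/home",
]))
```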

Personally, I like to keep things tidy, and I would rather see all of those domains in a top-domain-only format instead of full URLs.


3. Trimming URLs to root

With any list of URLs uploaded to the harvester’s field, just click the “Trim To Root” button.

trim to root

After that we should see only top domains with no subpages on our list.

trim to root results

Now our list is de-duplicated and trimmed to root. This format is usually used when working with disavow files (for example to see the percentage of disavowed domains vs. “alive” domains linking to our website).
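For reference, “Trim To Root” boils down to keeping only the scheme and host of each URL. A small Python sketch of the same idea (mine, not ScrapeBox’s implementation):

```python
from urllib.parse import urlparse

# Equivalent of "Trim To Root": drop the path, query string and
# fragment, keeping only scheme and host.
def trim_to_root(url):
    p = urlparse(url)
    return f"{p.scheme}://{p.netloc}/"

print(trim_to_root("http://example.com/blog/post?id=1"))  # http://example.com/
```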


4. Exporting the lists

If you would like to export this (or any previous) list, you can of course do that as well. There are also some options to choose from while exporting lists.

To export any link list, simply click “Export URL List” and choose the option that suits you.

export urls


Export as Text (.txt)

Simply exports all the URLs to a .txt formatted file with the URLs listed one after another.


Export as Text (.txt) and split list

This option is really helpful when working with very large link lists. URLs will be exported into multiple .txt files with a selected number of URLs in each file.


I personally use it when I want to use the Link Research Tools Link Juice Tool, which accepts up to 10,000 URLs. If I want to analyze a larger list, I split it into chunks that I then paste into the Link Juice Tool.
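The splitting step itself is simple list slicing. A Python sketch of my own, assuming a 10,000-URL limit like the Link Juice Tool’s:

```python
# Split a URL list into fixed-size chunks, e.g. to fit a tool's
# 10,000-URL input limit.
def split_list(urls, chunk_size=10_000):
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]

chunks = split_list([f"http://example.com/{n}" for n in range(25_000)])
print(len(chunks))       # 3 chunks
print(len(chunks[-1]))   # 5000 URLs in the last chunk
```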


Export as Text (.txt) and randomize list

This option is pretty self-explanatory. All the exported URLs will be randomly sorted in the TXT file.


Export as Unicode/UTF-8 Text (.txt)

I have never used this option, but it simply changes the URL list’s encoding to Unicode or UTF-8.


Export as HTML (.html)

This is an interesting option, used for indexing backlinks in the past. It exports all the URLs to the HTML list of links (example below).

export urls html


Export as Excel (.xlsx)

Exports all the URLs to Excel format (.xlsx) (example below).

excel export


Export as RSS XML List

Creates an RSS feed from the link list and exports it in XML format (example below).

export rss xml


Export as Sitemap XML list

Creates a sitemap-format XML file from the link list. Really useful when used with the Sitemap Scraper addon.


Add to existing list

Simply adds the URLs on the list to an existing TXT file.


5. Checking for dead links/domains

After we’ve gathered all the backlinks, we need to find the alive ones and filter out the dead backlinks. We can do this with the ScrapeBox Alive Check addon.



  1. Open ScrapeBox
  2. Go to Addons and click ScrapeBox Alive Check

ScrapeBox alive check

  3. Load the URLs that you would like to check

ScrapeBox alive check

  4. Now, all you need to do is click Start.

ScrapeBox alive check results

  5. After the check, we should see a window similar to the one above. You will find the stats of the check at the bottom of the window. Now all you have to do is save your alive and dead links.
  6. Click Save/Transfer and pick the option that suits you.

ScrapeBox alive check save transfer
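If you want to understand what an alive check does under the hood, here is a rough Python sketch of the same idea (my own illustration, not ScrapeBox’s implementation; it treats 2xx/3xx responses as alive, which is one reasonable definition):

```python
import urllib.request
import urllib.error

# One reasonable definition of "alive": any 2xx or 3xx status.
def is_alive(status):
    return 200 <= status < 400

# Send a HEAD request; connection errors and HTTP errors count as dead.
def check_url(url, timeout=10):
    try:
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return is_alive(resp.status)
    except (urllib.error.URLError, OSError):
        return False
```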


6. Removing duplicate URLs or domains in huge lists

This is a feature that I wish I’d known about while working on my case study. With it, we can merge, de-duplicate, randomize or split huge lists of up to 180 million links.

To open this ScrapeBox Addon, we need to go to Addons and then click ScrapeBox DupRemove.

scrapebox dup remove

Now you can see a new window with the tool’s overview.

scrapebox dup remove

Using the tool is really intuitive. All you have to do is load source and target files for each part of the tool. For target files, I recommend using new, empty TXT files.
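Conceptually, this kind of tool streams the source files line by line and keeps a set of URLs it has already seen, which is how it can handle lists far larger than the main harvester window. A Python sketch of that idea (my own simplification, not DupRemove’s actual code; memory still grows with the number of unique URLs):

```python
# Yield each unique non-empty URL exactly once, in order of first
# appearance. A shared `seen` set lets several files be merged.
def dedupe_stream(lines, seen=None):
    seen = set() if seen is None else seen
    for line in lines:
        url = line.strip()
        if url and url not in seen:
            seen.add(url)
            yield url

# Merge several link files into one de-duplicated target file,
# streaming line by line instead of loading everything at once.
def merge_files(source_paths, target_path):
    seen = set()
    with open(target_path, "w", encoding="utf-8") as out:
        for path in source_paths:
            with open(path, encoding="utf-8") as src:
                for url in dedupe_stream(src, seen):
                    out.write(url + "\n")
```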


7. Scraping Google

This is probably the most popular use of ScrapeBox. Back in 2009, this was the feature that let you harvest more blogs for posting comments.

I personally use the ScrapeBox scraping feature for:

  • Scraping a website’s index in Google (for on-page SEO purposes)
  • Scraping backlinks’ footprints (SEO directories with duplicate content, footprintable link networks, footer links from templates etc.)
  • Looking for link building opportunities

A word about proxies

To start scraping, we need some proxies, otherwise our IP will be blocked by Google after just a few minutes.

Usually the best way to find a Google proxy is to use the built-in ScrapeBox Proxy Harvester. Using it right is quite complicated though, and I will not cover the whole process here. An easier way for SEOs starting out with ScrapeBox is to go to any large SEO forum; there are always a few “public proxy threads”.

As an example, you can go to one of those public proxy threads and simply use the proxies listed there daily.


The average lifetime of a Google proxy is 1 to 8 hours. For more advanced scrapes you’ve got to use either a lot of private proxy IPs or simply use more advanced scraping software (feel free to contact me for info).


Google scraping workflow

1. Find and test Google proxies

We’ve already got a few proxy sources (pasted above). Let’s see if we can get any working Google proxies from those lists.


Find at least 2000 – 3000 proxies and paste them into the Proxy tab in ScrapeBox, then click on “Manage” and start the test. We are obviously looking for a Google Proxy.

This is how proxy testing looks:

proxy check

You will see the number of Google-passed proxies at the bottom of the screen. Google-passed proxies are also shown in green on the list.

After the test is finished, we need to filter the list. To do that, click Filter, then “Keep Google proxy”.

proxy check results

Now we’ve got only Google proxies on the list. We can save them to ScrapeBox and start scraping.

save proxies to scrapebox


Remember to use proxies straight away, as they will usually not be alive for more than 1-3 hours.


2. Set up the desired keywords

Now that we’ve got the proxies, all we need to start scraping are our desired keywords or footprints.

To show the scrape in a real-life example, I will scrape the footprint used for Expedia’s WordPress theme. For those of you who didn’t read the case study: it is a WordPress theme with footer links. Pretty easy to footprint.

expedia footer links

As you can see on the screenshot above, our footprint to scrape is “Designed by Expedia Rental Cars Team.”

Copy the footprint mentioned above and paste it into ScrapeBox.

scrapebox settings

To set up your scraping, follow the screenshot above. Paste your desired footprint into the top right field, then add as many keywords as possible (I only used 3, as this is just an example) to generate more results.

Yahoo, Bing, AOL

I personally don’t like using them. In my opinion, scrapes done with them are not as precise as the ones done with Google. On the other hand, I know that many of my SEO colleagues use those search engines quite successfully. You can run some benchmarks and decide for yourself.

Why should we add extra keywords?

Each Google search returns 10 to 1,000 results (depending on the setup). If we want to scrape, for example, 20,000 results, we need to use extra keywords, so our footprints will look like:

  • “Designed by Expedia Rental Cars Team.” Cars
  • “Designed by Expedia Rental Cars Team.” Travel
  • “Designed by Expedia Rental Cars Team.” hotels
  • etc.

This way we can cover much more “ground” and dig much deeper.
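Generating these footprint-plus-keyword combinations is a one-liner if you ever want to script it. A small Python sketch (my own illustration):

```python
# Combine one footprint with a keyword list to produce the search
# queries, as in the Expedia example above.
def build_queries(footprint, keywords):
    return [f"{footprint} {kw}" for kw in keywords]

queries = build_queries('"Designed by Expedia Rental Cars Team."',
                        ["Cars", "Travel", "hotels"])
for q in queries:
    print(q)
```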


Before scraping, Google your footprint manually. That gives you a clear idea of what you want to accomplish and a benchmark for your results.

For the footprint we’ve got Google shows ~180 unique results.

google search

Of course, having 180 unique pages scraped is a perfect score for us, but it is not always possible. Let’s see if we will be lucky enough to get close to 180 pages.

All we’ve got to do now is press “Start Harvesting”.

start harvesting

Now we can watch ScrapeBox doing what it does best. Scraping Google.

harvester completed

OK, the search is finished, and we’ve got 226 results. This is not epic, but pretty good for only 3 keywords.

After clicking OK, ScrapeBox will show us the good (with results) and bad (no results) keywords.

keywords statistics

The stats above are really helpful, as with more complex searches you can be much more effective by filtering the keywords.

Unfortunately, we are not finished yet. The results we see are coming from different, unique searches, therefore they are almost always heavily duplicated. Fortunately all we need to do is click “Remove/Filter” and “Remove duplicate URLs”.

remove duplicate urls

Let’s see how close we are to our desired 180 results:

unique domains

We’ve got 145 unique results. With only 3 keywords used, this is a really great result. There are probably ~35 more pages out there that we missed, but I’m sure that we’ve got all the unique domains on the list.

remove duplicate domains

Now let’s see the results:

remove duplicate domains results

With ~15 minutes of work, we’ve got the whole footprint scraped: 145 unique URLs across 21 domains.

In my opinion, scraping is a skill that is really important to anyone dealing with link audits. There are some SEO actions that you cannot do without scraping. The Orca Technique is a great example. It is not possible to implement it fully without scraping Google.

Scraping and de-duplicating is not all you can do though. Imagine that you want to see which of the domains above have already been disavowed. We can do that just by a really simple filtering.


8. Filtering the results

This is my favorite part. This is something that is really complicated (at least for me) to do in a tool like Excel. ScrapeBox couldn’t make it any easier.

Let’s go back to the example of one of my customers. This situation was also described in my “Squeeze more juice out of Google Webmaster Tools” case study. Working on their backlinks is quite complex, as the disavow file is huge.

Of course Link Detox is doing all the work for us, but I want to show you how you can filter the disavowed links out of the link list with ScrapeBox.

First, we need to load a large link list to ScrapeBox. A report from Google Webmaster Tools will be a good example here.

All we have to do to start is copy the backlinks from the Google Webmaster Tools CSV export to ScrapeBox.

GWT to notepad

After importing the URLs, let’s remove duplicates (yes – there are duplicate URLs in GWT exports).

de duplicate results

As you can see, there were 240 duplicate URLs.

Now what I would like to do is to filter out all the disavowed URLs.

To do that, we are going to use the “Remove/Filter” options.

remove duplicate urls

As you can see, when we utilize the options listed above wisely, we can filter out almost everything. To filter out all the disavowed links, we are going to use the “Remove URLs Containing” option.



  1. We need the disavow file. It is best to download it from Google Webmaster Tools.
  2. Copy the CSV content to TXT (Notepad or Notepad ++) file (to get out of Excel as quickly as possible) ☺
  3. Use the “Replace all” option in Notepad to replace “domain:” with nothing. This way we get only a list of single disavowed URLs and domains.
  4. Save the TXT file to your hard drive as e.g. “Disavow-extremetacticaldynamics.txt”
  5. Go to “Remove/Filter” and click “Remove URLs Containing entries from…” and select your saved disavow file (without “domain:”).

list filtering

  6. Now we can see the filtered results: ~11k links were removed from the list.

list filtering results

  7. We are done. Our list is filtered by our disavow file.
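The whole filtering procedure is also easy to script if you prefer. A Python sketch of the same idea (mine, not ScrapeBox internals): strip the "domain:" prefix from the disavow entries, then drop every URL containing any of them:

```python
# Strip comments and the "domain:" prefix from disavow-file lines.
def parse_disavow(lines):
    entries = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            entries.append(line.removeprefix("domain:"))
    return entries

# Equivalent of "Remove URLs Containing": drop every URL that
# contains any disavow entry as a substring.
def filter_disavowed(urls, entries):
    return [u for u in urls if not any(e in u for e in entries)]

entries = parse_disavow(["# my disavow file",
                         "domain:spamdomain.com",
                         "http://baddomain.com/page"])
print(filter_disavowed(["http://spamdomain.com/a",
                        "http://gooddomain.com/b"], entries))
```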

I think that the example above should give you an idea about the possibilities of processing your URL lists with ScrapeBox. This is a basic tutorial, but with just a few hours of playing with ScrapeBox you can become an advanced user and forget about Excel.


9. Scraping Google Suggest

This is one of my favorite uses of ScrapeBox. I think that Google Suggest holds a lot of interesting keywords that we can use for scraping, post ideas or keyword research. It is also one of the easiest features to use.



  1. Open ScrapeBox
  2. Click Scrape

scrape keywords

  3. Enter your source keywords

scrape keywords suggest
In this example, let’s use Link Research Tools as our source keyword. Usually we should use many more than just 1, but this is only to show you how this tool works.

  4. Select your Keyword Scraper Sources

keyword suggest settings
As you see on the screenshot above, you can also choose Bing, YouTube and many other engines. In this example, I will only use Google Suggest, though.

  5. Click Scrape

keyword suggest results

And we’re done. You can see the Google Suggest scrape on the right. To get more results, you obviously need more source keywords, or you can play with the scrape Level (1-4).
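Under the hood, a level-N suggest scrape just feeds each round of suggestions back in as new seed keywords. Here is a Python sketch of that expansion (my own illustration; the suggest function is a stand-in for whatever fetches suggestions, and the stub dictionary below is invented purely for the demo):

```python
# Level-N keyword expansion: each round of suggestions becomes the
# next round's seed keywords, like ScrapeBox's Level (1-4) setting.
def expand_keywords(seed, suggest, level):
    results, frontier = set(), [seed]
    for _ in range(level):
        next_frontier = []
        for kw in frontier:
            for s in suggest(kw):
                if s not in results:
                    results.add(s)
                    next_frontier.append(s)
        frontier = next_frontier
    return sorted(results)

# Invented stub suggester, for illustration only:
fake = {"expedia": ["expedia hotels", "expedia flights"],
        "expedia hotels": ["expedia hotels paris"]}
print(expand_keywords("expedia", lambda k: fake.get(k, []), level=2))
```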


I often use it to monitor all the brand-related searches. You would be surprised how much info you can get with just 1 brand keyword (example below).

expedia example brand search

Just by typing in 1 keyword, “Expedia”, and setting the level to 3, I got 321 brand-related keywords searched in Google.



Link Research Tools already makes much of the link audit workflow quite easy. You would be surprised how many of the processes described here actually happen in the background the moment you start any report in Link Research Tools. Still, there are unusual cases when you need to do a lot of manual work, and ScrapeBox is a good tool to organize and speed up your manual searches and link processing.

This was just a basic tutorial. I would be happy to create some more advanced SEO tools tutorials as well. If there is any area that you would like me to cover, please contact me or comment below!


This case study was written by Bartosz Góralewicz, CEO at Elephate, and proud user of Link Research Tools and Link Detox.

A word from Christoph C. Cemper

SEOs live and die by their tools and processes. Fortunately for us, we have an expert SEO willing to crack open his safe of SEO tricks and share some of them with the rest of us. I hope you enjoyed this tutorial and now have many ideas for how to use his tips to simplify your processes.

Our goal is to provide our user community and clients with quality service and knowledge. Our LRT Certified Professionals and Xperts like Bartosz are key to achieving this goal.

I look forward to Bartosz’s future work, and I personally recommend working with him whenever you get the opportunity.


LRT Certified Xpert Bartosz Goralewicz

With the Superhero Plans you can perform link audits, link cleanup, link disavow boost, competitive research, professional SEO and backlink analysis for your own or your competitor's sites.

You can fix or avoid a Google Penalty! Learn more about how you can Recover and Protect with LRT.




Bartosz Goralewicz

Bartosz Góralewicz specializes in link audits. He does consulting mostly for corporate customers and large sites. You can find some of his case studies or interesting posts by clicking the posts tab. 



  9. Sigbjørn Rivelsrud on August 29, 2014 at 09:37

    Great case study. I would love to see some case studies focused on using different tools, tricks and techniques to find areas of improvements for on-page SEO as well. Crossing my fingers you will come up with something shockingly smart 🙂

    • Bartosz Góralewicz on August 29, 2014 at 12:56

      Just as we talked on Facebook Sigbjorn, there are many tools and many complex workflows. I may write about that sometime, but it is a really specific stuff that maybe 1 – 2 % of the SEO community will ever need 😉


  28. Steve Smith on September 1, 2014 at 22:55

    I thought commenting on blogs was not powerful anymore. Aren’t they all no follow mostly?

    • Bartosz Góralewicz on September 2, 2014 at 10:16

      It seems that you didn’t read the article Steve


  30. Michael Vittori on September 4, 2014 at 17:49

    Is it possibile using Scrapebox for discovering and extracting high traffic keywords of websites’ competitors?



  34. Stijn on December 26, 2014 at 15:19

    Hi, nice article. What do you mean by “Each Google search is 10 – 1000 results (depending on the setup)”.
    Which setup, do you mean some configuration in Scrapebox?
    A followup question. If a manual Google search, says 100.000 results for keyword X. Will scrapebox also return those 100.000 results (or at least close to it)?


  37. Morten Ruus on April 7, 2015 at 15:48

    Can you tell me a little bit more about the function “Scraping Google Suggest”. Can you scrape from

