Our web crawler obeys the robots.txt directives

What is spbot?

OpenLinkProfiler.org offers information about the inbound links of websites. It answers the question: Who links to whom on the Internet?

To get this kind of information, OpenLinkProfiler.org uses a web crawler with the name spbot. It starts with a list of known URLs from across the Internet and then follows the links it finds on each page as it goes. This approach has several advantages, most importantly that it causes the least possible disruption to the sites being crawled.

How can I block spbot?

We will not index anything you would like to remain private. All you have to do is tell us. How? By using the official Robots exclusion standard. Example:

                              User-agent: spbot
                              Disallow: /

You can find detailed instructions on how to block web crawlers on Wikipedia and on the robots.txt page. Make sure that you don't inadvertently block crawlers you want to keep, such as Googlebot. Our bot also supports the Crawl-delay directive. We recommend that you use it if you want to limit the request rate:

                              User-agent: spbot
                              Crawl-delay: 10
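You can check how a standards-compliant parser interprets directives like these with Python's standard library. A minimal sketch (the /private/ path and example.com are placeholders, not paths our crawler treats specially):

```python
import urllib.robotparser

# Rules in the same style as the snippets above: restrict spbot
# to public pages and ask it to wait 10 seconds between requests.
rules = [
    "User-agent: spbot",
    "Disallow: /private/",
    "Crawl-delay: 10",
]

parser = urllib.robotparser.RobotFileParser()
parser.parse(rules)

# spbot may fetch public pages, may not fetch /private/, and should wait 10 s.
print(parser.can_fetch("spbot", "https://example.com/index.html"))  # True
print(parser.can_fetch("spbot", "https://example.com/private/x"))   # False
print(parser.crawl_delay("spbot"))                                  # 10
```

The same parse applies to any user agent, so you can verify that a rule aimed at spbot does not accidentally affect Googlebot.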

spbot is a polite web crawler

Before you block spbot, consider the following:

  • We play 100% by the rules and we follow the robots.txt protocol. Our crawler even supports unofficial extensions to the Robots exclusion standard, for example it supports wildcards in path names.
  • You have full control over the pages that will be visited by spbot.
  • The crawler is very bandwidth-friendly to your website: a) it supports gzip compression, b) it does not load any scripts, c) it does not load any images.
  • The crawler system is very polite to websites. Although we operate hundreds of crawl servers, at most one crawl server visits a given website, and it waits between consecutive requests. You can even specify the crawl delay in your robots.txt file.
  • The user agent carries the name of the web crawler so that you can block it by name.
  • The user agent carries the URL of this web page www.OpenLinkProfiler.org/bot.
  • On this page, we publish all IP addresses we use, so you can block our crawler directly (see below).
  • On this page, we offer a direct contact address (see below).
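For example, the wildcard extension mentioned above lets you block only part of a site instead of all of it. A hypothetical sketch (the /internal/ path and the *.pdf pattern are placeholders for your own paths):

```
User-agent: spbot
Disallow: /internal/
Disallow: /*.pdf$
```

Here the trailing $ anchors the pattern to the end of the URL, so only PDF files are excluded.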

Here's our offer: you let our web crawler discover which pages your website links to, and in return we show you which pages link to your website, how they link to it, and how those links can be improved to get better rankings on Google.

Just enter your domain name in the search box at the top of this page to get an analysis.

How can I verify spbot is really spbot?

Other web crawlers can spoof the spbot user agent to make themselves seem legitimate: their requests appear to come from us, but they don't. You can check the IP address to make sure that the spbot visiting your site is actually from OpenLinkProfiler.org. We currently use the hosting company Digital Ocean for our crawlers, so our IP addresses start with the following numbers:

  • 45.55.*.*
  • 95.85.*.*
  • 104.131.*.*
  • 104.132.*.*
  • 104.236.*.*
  • 107.170.*.*
  • 159.203.*.*
  • 162.243.*.*
  • 178.62.*.*
  • 188.226.*.*
  • 192.241.*.*
  • 192.81.*.*
  • 198.199.*.*
  • 198.211.*.*
  • 208.68.*.*

Digital Ocean may add new IP ranges at any time. The most reliable way to block us is therefore the robots.txt file, using the user agent name spbot.
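A simple way to automate the check is to compare a visitor's IP address against the published prefixes. A minimal sketch in Python's standard library (the prefix list is copied from above and may go stale if Digital Ocean adds ranges; the sample addresses are illustrative only):

```python
import ipaddress

# The *.* prefixes published above, expressed as /16 networks.
# Digital Ocean may add new ranges at any time, so keep this list updated.
SPBOT_PREFIXES = [
    "45.55.0.0/16", "95.85.0.0/16", "104.131.0.0/16", "104.132.0.0/16",
    "104.236.0.0/16", "107.170.0.0/16", "159.203.0.0/16", "162.243.0.0/16",
    "178.62.0.0/16", "188.226.0.0/16", "192.241.0.0/16", "192.81.0.0/16",
    "198.199.0.0/16", "198.211.0.0/16", "208.68.0.0/16",
]
SPBOT_NETWORKS = [ipaddress.ip_network(p) for p in SPBOT_PREFIXES]

def looks_like_spbot(ip: str) -> bool:
    """Return True if the IP falls inside one of the published spbot ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in SPBOT_NETWORKS)

print(looks_like_spbot("104.236.12.34"))  # True: inside 104.236.*.*
print(looks_like_spbot("66.249.66.1"))    # False: not in any published range
```

A request that carries the spbot user agent but fails this check is a spoofer, not our crawler.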

Feel free to contact us: spbot _AT_ seoprofiler -DOT- com