Nginx – Block bots, crawlers, etc.

Blocking Bots, Crawlers, and Spiders in NGINX

As web traffic continues to grow, many website owners are increasingly faced with unwanted traffic from bots, crawlers, and spiders. These automated tools are used for various purposes, such as scraping content, overwhelming servers, and compromising security. Fortunately, NGINX, one of the most popular web servers, offers several ways to protect websites from these unwanted visitors. 

What Are Bots, Crawlers, and Spiders?

Before diving into blocking techniques, it’s important to understand the different types of bots, crawlers, and spiders:

  • Bots: Software designed to perform automated tasks on the web. This includes both legitimate bots, such as search engine crawlers, and malicious bots, such as spammers and attackers.

  • Crawlers: Bots that crawl websites to gather information. Search engines use crawlers to index content for search results.

  • Spiders: A type of crawler used to explore the internet and collect data for various purposes, often for indexing or scraping.

While many crawlers are used by search engines like Google, some bots and spiders can be harmful to your website. They may scrape your content, ignore robots.txt rules, or overload your server with requests, leading to slowdowns or even crashes.

To block these nuisances, we use a list of bots, crawlers, and spiders that we compiled ourselves from data collected on the web.

One of our sources is CimTools.net.

We have assembled the data into an Nginx “conf” file that we offer for free.
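
The file itself is not reproduced here. As a rough sketch only, a file of this kind usually relies on a "map" over $http_user_agent to set the $badbot variable that the steps below check. The patterns shown are made-up placeholders, not entries from the actual list, and the sketch assumes the file is included in the "http" context, as conf.d/*.conf is in many packaged installations.

```nginx
# Hypothetical sketch of a bot-list file placed in conf.d/.
# Assumption: it is included inside the http { } block, where map is allowed.
# The user-agent patterns below are illustrative placeholders only.
map $http_user_agent $badbot {
    default             0;   # any user agent not matched below is allowed
    ~*examplescraper    1;   # "~*" marks a case-insensitive regular expression
    ~*badcrawler        1;
}
```

With a map like this, $badbot evaluates to 1 for a matching user agent and 0 otherwise, so it can be used directly in an "if" condition.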

How to use it

  1. Copy the downloaded file into the “conf.d” folder of your Nginx installation.
  2. Add the following directive inside the “server” block of your site’s configuration (in nginx.conf or a file it includes):

       if ($badbot) {
           return 444;
       }

  3. Reload or restart Nginx.
  4. Test it with: curl -A "Googlebot" http://yourwebsite.com
 
curl should report an empty reply from the server, and the Nginx access log should show the request with status 444, which means the connection was closed without sending a response.
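
For orientation, here is a minimal sketch of where the check could sit. The "if" directive is only valid in the "server" and "location" contexts, so it cannot go directly in the "http" block. The server name and document root are placeholders, and $badbot is assumed to be defined by the file copied into conf.d.

```nginx
server {
    listen 80;
    server_name yourwebsite.com;    # placeholder host name

    # Assumption: $badbot is set by the bot-list file in conf.d/.
    if ($badbot) {
        return 444;                 # nginx-specific code: close the connection, send nothing
    }

    location / {
        root /var/www/html;         # placeholder document root
    }
}
```

Running "nginx -t" before reloading checks the configuration for syntax errors, including an unknown $badbot variable if the list file was not picked up.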
 
Enjoy ♥