Here at DigitalFlare we see a lot of traffic from 'bad robots'. These bots cause problems such as high server load and server instability, and can even crash Apache. If you run cPanel on your server, your first line of defence should be mod_security; but if you want to block specific bots globally at the Apache level, the solution below is for you.

On cPanel servers you can't simply edit the httpd.conf file directly, because it is overwritten whenever cPanel rebuilds the Apache configuration. However, you can add custom directives easily through WHM.

From WHM, go to Apache Configuration > Include Editor > 'Pre Main Include', select 'All Versions', then insert the code below:

<Directory "/home">
    SetEnvIfNoCase User-Agent "MJ12bot" bad_bots
    SetEnvIfNoCase User-Agent "coccocbot-image" bad_bots
    SetEnvIfNoCase User-Agent "Baiduspider" bad_bots
    SetEnvIfNoCase User-Agent "AhrefsBot" bad_bots
    SetEnvIfNoCase User-Agent "SemrushBot" bad_bots
    SetEnvIfNoCase User-Agent "DotBot" bad_bots
    SetEnvIfNoCase User-Agent "AlphaBot" bad_bots
    SetEnvIfNoCase User-Agent "ZoominfoBot" bad_bots
    SetEnvIfNoCase User-Agent "ADmantX" bad_bots
    SetEnvIfNoCase User-Agent "Heritrix" bad_bots
    SetEnvIfNoCase User-Agent "Indy Library" bad_bots
    SetEnvIfNoCase User-Agent "Mail.Ru" bad_bots
    SetEnvIfNoCase User-Agent "rogerbot" bad_bots
    SetEnvIfNoCase User-Agent "PHPCrawl" bad_bots
    SetEnvIfNoCase User-Agent "BLEXBot" bad_bots
    SetEnvIfNoCase User-Agent "magpie-crawler" bad_bots
    SetEnvIfNoCase User-Agent "SeznamBot" bad_bots
    SetEnvIfNoCase User-Agent "seoscanners.net" bad_bots
    SetEnvIfNoCase User-Agent "ZumBot" bad_bots
    SetEnvIfNoCase User-Agent "Yandex" bad_bots
    SetEnvIfNoCase User-Agent "MaxPointCrawler" bad_bots
    SetEnvIfNoCase User-Agent "Nutch" bad_bots
    SetEnvIfNoCase User-Agent "Buzzbot" bad_bots
    <RequireAll>
        Require all granted
        Require not env bad_bots
    </RequireAll>
</Directory>

Click Update and then restart Apache. (In the <Directory> section, specify the path to the directory that holds your websites; on cPanel servers this is /home by default.)
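Each SetEnvIfNoCase line treats the quoted string as a case-insensitive regular expression, matches it anywhere in the request's User-Agent header, and sets the bad_bots environment variable on a hit; the Require not env bad_bots rule then denies those requests with a 403 Forbidden. As a rough illustration of the matching logic (a sketch only, not how Apache is implemented):

```python
import re

# A subset of the patterns from the config above. SetEnvIfNoCase treats
# these as case-insensitive regular expressions, so note that "Mail.Ru"
# would also match e.g. "MailXRu" (the dot is a regex metacharacter).
BAD_BOT_PATTERNS = [
    "MJ12bot", "AhrefsBot", "SemrushBot", "DotBot", "Mail.Ru", "Yandex",
]

def is_bad_bot(user_agent: str) -> bool:
    """Return True if any pattern matches the User-Agent string
    (unanchored, case-insensitive), mirroring SetEnvIfNoCase."""
    return any(re.search(p, user_agent, re.IGNORECASE)
               for p in BAD_BOT_PATTERNS)

print(is_bad_bot("Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"))  # True
print(is_bad_bot("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0 Safari/537.36"))            # False
```

Bear in mind that a User-Agent header is trivially spoofed, so this approach only stops bots that identify themselves honestly.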

You may wish to edit the list of bots above, especially if your SEO company uses a tool such as Ahrefs or SEMrush. Some bots that cause high load do respect the robots.txt file, so rather than blocking them outright you can simply rate-limit them via your robots.txt file, like this:


User-agent: SemrushBot
Crawl-delay: 20