Cynicus Rex@lemmy.ml to Privacy@lemmy.ml · 4 months ago · 57 comments
How to block AI Crawler Bots using robots.txt file (www.cyberciti.biz)
NullPointer@programming.dev · 4 months ago
robots.txt will not block a bad bot, but you can use it to lure the bad bots into a “bot-trap” so you can ban them in an automated fashion.
Dave.@aussie.zone · 4 months ago
I’m guessing something like:
Robots.txt: Do not index this particular area.
Main page: invisible link to particular area at top of page, with alt text of “don’t follow this, it’s just a bot trap” for screen readers and such.
Result: any access to said particular area equals insta-ban for that IP. Maybe just for 24 hours so nosy humans can get back to enjoying your site.
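
For anyone who wants to see the trap idea spelled out, here is a rough sketch in Python using only the standard library. It is not taken from the linked article or from any particular tool. The names `TRAP_PATH`, `BAN_SECONDS`, `ROBOTS_TXT`, `TRAP_LINK` and `BotTrap` are all made up for illustration, the ban list lives in memory only, and a real setup would more likely hand the ban off to a firewall or the web server.

```python
import time

# All names below are hypothetical, chosen only for this sketch.
TRAP_PATH = "/bot-trap/"
BAN_SECONDS = 24 * 60 * 60  # 24-hour ban, as suggested in the comment above

# robots.txt tells well-behaved crawlers to stay out of the trap area.
ROBOTS_TXT = f"User-agent: *\nDisallow: {TRAP_PATH}\n"

# Link for the main page, positioned off-screen so sighted visitors do not
# see it while screen readers can still announce the warning text.
TRAP_LINK = (
    f'<a href="{TRAP_PATH}" rel="nofollow" '
    'style="position:absolute;left:-9999px">'
    "Don't follow this, it's just a bot trap</a>"
)


class BotTrap:
    """In-memory ban list keyed by client IP (an assumption of this sketch)."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so the ban expiry can be tested
        self._banned_until = {}      # ip -> timestamp when the ban ends

    def is_banned(self, ip):
        expiry = self._banned_until.get(ip)
        if expiry is None:
            return False
        if self._clock() >= expiry:
            # Ban has run out; forget the entry so the visitor can return.
            del self._banned_until[ip]
            return False
        return True

    def check(self, ip, path):
        """Return the HTTP status to send: 403 if the IP is banned, else 200."""
        if path.startswith(TRAP_PATH):
            # Anything that requests the disallowed area ignored robots.txt.
            self._banned_until[ip] = self._clock() + BAN_SECONDS
        return 403 if self.is_banned(ip) else 200
```

You would call `check(ip, path)` at the top of every request handler and refuse the request when it returns 403. Banning by IP alone can catch innocent users behind a shared address, which is one more reason to keep the ban temporary.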