Bots, most prominently meta-externalagent (Meta's AI crawler), have been hitting our public Pawtucket instance harder than usual lately, and I want to address this before it becomes a problem for our users. We've had issues with server overload in the past. I should add that I'm not very experienced with server administration.
Facebook and Amazon appear to be the biggest culprits, though there are others. I'm seeing a lot of entries like the following in the httpd access log, so maybe this is some kind of "bot trapped in the Browse page" situation:
57.141.0.16 - - [24/Jul/2025:14:11:06 -0400] "GET /Search/Objects/key/key/f0e2c0de69a4c520ea5f84d744055275/facet/subject/id/1123/view/list HTTP/1.1" 200 57999 "-" "meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)"
52.54.15.103 - - [24/Jul/2025:14:56:24 -0400] "GET /Browse/Objects/key/d2b43bdc51bf1d1da81e7160c2da455f/facet/entity/id/3531/view/list HTTP/1.1" 200 57999 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36"
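To confirm which crawlers actually dominate the traffic (rather than just the ones that stand out while scrolling the log), a quick tally of User-Agent strings can help. This is a rough sketch of my own, not CollectiveAccess code; it assumes the standard Apache combined log format shown above:

```python
import re
from collections import Counter

# Combined log format ends with: "request" status size "referer" "user-agent"
# This regex captures the final quoted User-Agent field.
UA_RE = re.compile(r'"[^"]*"\s+\d{3}\s+\S+\s+"[^"]*"\s+"([^"]*)"$')

def top_user_agents(lines, n=10):
    """Return the n most common User-Agent strings in an access log."""
    counts = Counter()
    for line in lines:
        m = UA_RE.search(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts.most_common(n)

# Usage (path is an example; adjust to your httpd log location):
# with open("/var/log/httpd/access_log") as f:
#     for ua, hits in top_user_agents(f):
#         print(hits, ua)
```

If the top entries are overwhelmingly the Browse/Search facet URLs from one or two crawlers, that confirms the "trapped in faceted browse" theory, since every facet combination is a distinct URL.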
I thought that ban_hammer.conf might help me out here (https://docs.collectiveaccess.org/pawtucket/ban_hammer), but I'm not seeing any effect.
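In case it's useful context for anyone answering: as a stopgap independent of ban_hammer, both of these crawlers identify themselves in the User-Agent header, so in principle they can be refused at the web-server level before the request ever reaches Pawtucket. A sketch for Apache 2.4 with mod_rewrite enabled (the pattern and placement are my guess, not something from the CollectiveAccess docs):

```apache
# Return 403 to requests whose User-Agent matches known heavy crawlers.
# Place in the vhost config or .htaccess for the Pawtucket docroot.
<IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} (meta-externalagent|Amazonbot) [NC]
    RewriteRule .* - [F,L]
</IfModule>
```

Both vendors also claim their bots honor robots.txt, so a `Disallow` on the `/Browse/` and `/Search/` paths might be a gentler first step, though it takes a while for crawlers to pick up changes.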
What I tried:
Can anyone offer advice as to what I might be doing wrong, or other options I might try?
We are running Providence 2.0 and Pawtucket 1.8.