AI bots everywhere. Does anyone have a good whitelist for robots.txt?

My niche little site, http://golfcourse.wiki, seems to be very popular with AI bots. They have basically become most of my traffic. Most of them follow robots.txt, and that's nice and all, but they are costing me non-trivial amounts of money.

I don't want to block most search engines. I don't want to block legitimate institutions like archive.org. Is there a whitelist that I could crib instead of pretty much having to update my robots file every damn day?
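For reference, the kind of allowlist being asked about could look something like the sketch below: name the crawlers you want, allow them, and disallow everything else by default, so the file only changes when you add a bot you trust. The user-agent tokens in `ALLOWED_BOTS` are illustrative assumptions, not a vetted list (check each crawler's own documentation for its current token), and the helper names `build_robots_txt` and `can_fetch` are made up for this example. It only helps against bots that actually honor robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Assumed allowlist: these tokens are examples only and should be verified
# against each crawler's published documentation before use.
ALLOWED_BOTS = ["Googlebot", "Bingbot", "DuckDuckBot", "archive.org_bot"]


def build_robots_txt(allowed):
    """Build a default-deny robots.txt that allows only the named crawlers."""
    lines = []
    for bot in allowed:
        # One group per allowed crawler, granting access to the whole site.
        lines += [f"User-agent: {bot}", "Allow: /", ""]
    # Catch-all group: every crawler not named above is asked to stay out.
    lines += ["User-agent: *", "Disallow: /"]
    return "\n".join(lines) + "\n"


def can_fetch(robots_txt, agent, url="/"):
    """Check how a compliant crawler with this user agent would read the file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    robots = build_robots_txt(ALLOWED_BOTS)
    print(robots)
    # GPTBot is used here only as an example of a crawler not on the list.
    for agent in ["Googlebot", "archive.org_bot", "GPTBot"]:
        print(agent, can_fetch(robots, agent))
```

Generating the file from a list keeps the maintenance down to editing `ALLOWED_BOTS`, and the standard-library parser gives a quick sanity check that the rules read the way you intended.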


Comments URL: https://news.ycombinator.com/item?id=42861047

Points: 15

# Comments: 8

Created 1mo ago | 29 Jan 2025, 04:50:08

