writing.exchange is one of the many independent Mastodon servers you can use to participate in the fediverse.
A small, intentional community for poets, authors, and every kind of writer.

#robotstxt

Inautilo<p><a href="https://mastodon.social/tags/Development" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Development</span></a> <a href="https://mastodon.social/tags/Techniques" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Techniques</span></a><br>Poisoning well · An effort to dupe nasty AI crawlers with nonsense <a href="https://ilo.im/1632tq" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">ilo.im/1632tq</span><span class="invisible"></span></a></p><p>_____<br><a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/ChatBots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ChatBots</span></a> <a href="https://mastodon.social/tags/SEO" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SEO</span></a> <a href="https://mastodon.social/tags/Content" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Content</span></a> <a href="https://mastodon.social/tags/Protection" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Protection</span></a> <a href="https://mastodon.social/tags/RobotsTxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RobotsTxt</span></a> <a href="https://mastodon.social/tags/WebDev" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WebDev</span></a> <a href="https://mastodon.social/tags/Backend" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Backend</span></a> <a href="https://mastodon.social/tags/Frontend" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Frontend</span></a> <a href="https://mastodon.social/tags/HTML" class="mention 
hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>HTML</span></a></p>
Inautilo<p><a href="https://mastodon.social/tags/Business" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Business</span></a> <a href="https://mastodon.social/tags/Introductions" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Introductions</span></a><br>Meet LLMs.txt · A proposed standard for AI website content crawling <a href="https://ilo.im/16318s" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">ilo.im/16318s</span><span class="invisible"></span></a></p><p>_____<br><a href="https://mastodon.social/tags/SEO" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SEO</span></a> <a href="https://mastodon.social/tags/GEO" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>GEO</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Bots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Bots</span></a> <a href="https://mastodon.social/tags/Crawlers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Crawlers</span></a> <a href="https://mastodon.social/tags/LlmsTxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LlmsTxt</span></a> <a href="https://mastodon.social/tags/RobotsTxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RobotsTxt</span></a> <a href="https://mastodon.social/tags/Development" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Development</span></a> <a href="https://mastodon.social/tags/WebDev" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WebDev</span></a> <a href="https://mastodon.social/tags/Backend" class="mention hashtag" 
rel="nofollow noopener noreferrer" target="_blank">#<span>Backend</span></a></p>
ResearchBuzz: Firehose<p>Search Engine Land: Meet LLMs.txt, a proposed standard for AI website content crawling. “While many content creators are interested in the proposal’s potential merits, it also has detractors. But given the rapidly changing landscape for content produced in a world of artificial intelligence, llms.txt is certainly worth discussing.”</p><p><a href="https://rbfirehose.com/2025/03/29/search-engine-land-meet-llms-txt-a-proposed-standard-for-ai-website-content-crawling/" class="" rel="nofollow noopener noreferrer" target="_blank">https://rbfirehose.com/2025/03/29/search-engine-land-meet-llms-txt-a-proposed-standard-for-ai-website-content-crawling/</a></p>
Winbuzzer<p>AI Crawlers Overwhelm Open-Source Projects, Forcing Developers to Block Entire Countries</p><p><a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Web" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Web</span></a> <a href="https://mastodon.social/tags/Robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Robotstxt</span></a> <a href="https://mastodon.social/tags/AIScraping" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AIScraping</span></a> <a href="https://mastodon.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>OpenSource</span></a> <a href="https://mastodon.social/tags/Cybersecurity" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Cybersecurity</span></a> <a href="https://mastodon.social/tags/DataScraping" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>DataScraping</span></a> <a href="https://mastodon.social/tags/Scraping" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Scraping</span></a> <a href="https://mastodon.social/tags/WebScraping" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WebScraping</span></a> </p><p><a href="https://winbuzzer.com/2025/03/26/ai-crawlers-overwhelm-open-source-projects-forcing-developers-to-block-entire-countries-xcxwbn/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">winbuzzer.com/2025/03/26/ai-cr</span><span class="invisible">awlers-overwhelm-open-source-projects-forcing-developers-to-block-entire-countries-xcxwbn/</span></a></p>
Ben<p>---<br>❯ ollama run llama3-chatqa:70b</p><p>&gt;&gt;&gt; Who are you?<br> I'm your assistant!</p><p>&gt;&gt;&gt; Why should i trust you?<br> I am an open-source AI assistant trained on a diverse range of datasets to provide helpful and<br>informative responses.</p><p>&gt;&gt;&gt; When training, did you respect the robots.txt?<br> No, I didn't.<br>---</p><p>At least this model is open about ignoring the <a href="https://vmst.io/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a>! Let's see what it says when asked why.</p><p><a href="https://vmst.io/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://vmst.io/tags/Llama" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Llama</span></a> <a href="https://vmst.io/tags/nvidia" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>nvidia</span></a> <a href="https://vmst.io/tags/ollama" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>ollama</span></a> <a href="https://vmst.io/tags/rude" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>rude</span></a> <br>(1/x)</p>
𝕂𝚞𝚋𝚒𝚔ℙ𝚒𝚡𝚎𝚕<p>When greed becomes automated, something like this happens:</p><p>»Open Source World – FOSS infrastructure is under attack by AI companies:<br>LLM scrapers are taking down FOSS projects' infrastructure, and it's getting worse.«</p><p>😐 <a href="https://thelibre.news/foss-infrastructure-is-under-attack-by-ai-companies/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">thelibre.news/foss-infrastruct</span><span class="invisible">ure-is-under-attack-by-ai-companies/</span></a></p><p><a href="https://chaos.social/tags/foss" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>foss</span></a> <a href="https://chaos.social/tags/opensource" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>opensource</span></a> <a href="https://chaos.social/tags/llm" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>llm</span></a> <a href="https://chaos.social/tags/attack" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>attack</span></a> <a href="https://chaos.social/tags/floss" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>floss</span></a> <a href="https://chaos.social/tags/kde" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>kde</span></a> <a href="https://chaos.social/tags/gnome" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>gnome</span></a> <a href="https://chaos.social/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a> <a href="https://chaos.social/tags/stolendata" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stolendata</span></a> <a href="https://chaos.social/tags/LLMs" class="mention hashtag" rel="nofollow noopener noreferrer" 
target="_blank">#<span>LLMs</span></a> <a href="https://chaos.social/tags/freedom" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>freedom</span></a> <a href="https://chaos.social/tags/infrastructure" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>infrastructure</span></a> <a href="https://chaos.social/tags/greed" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>greed</span></a></p>
Inautilo<p><a href="https://mastodon.social/tags/Development" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Development</span></a> <a href="https://mastodon.social/tags/Reports" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Reports</span></a><br>Google AI Mode is here · How to access it and control it with robots.txt <a href="https://ilo.im/162o8h" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">ilo.im/162o8h</span><span class="invisible"></span></a></p><p>_____<br><a href="https://mastodon.social/tags/Business" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Business</span></a> <a href="https://mastodon.social/tags/Google" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Google</span></a> <a href="https://mastodon.social/tags/SearchEngine" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>SearchEngine</span></a> <a href="https://mastodon.social/tags/AnswerEngine" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AnswerEngine</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/RobotsTxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RobotsTxt</span></a> <a href="https://mastodon.social/tags/WebDev" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WebDev</span></a> <a href="https://mastodon.social/tags/Frontend" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Frontend</span></a> <a href="https://mastodon.social/tags/Backend" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Backend</span></a></p>
Fred<p>Website owners are fighting back: <a href="https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">arstechnica.com/tech-policy/20</span><span class="invisible">25/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/</span></a></p><p><a href="https://mastodon.social/tags/News" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>News</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/AntiAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AntiAI</span></a> <a href="https://mastodon.social/tags/Tarpits" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Tarpits</span></a> <a href="https://mastodon.social/tags/Scrapers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Scrapers</span></a> <a href="https://mastodon.social/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a></p>
Dawn Tåke 🌙 :sparkletrans:<p>Hi, got a question.</p><p>Is there a standard for Anti-AI/Anti-SEO etc robots.txt file? Or a trustworthy site that explains how to build one if prefab isn't available? Is there anything else I should consider? </p><p>Thanks.</p><p><a href="https://tech.lgbt/tags/AskFedi" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AskFedi</span></a> <a href="https://tech.lgbt/tags/TechHelp" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>TechHelp</span></a> <a href="https://tech.lgbt/tags/RobotsTXT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RobotsTXT</span></a> <a href="https://tech.lgbt/tags/RobotsDotTXT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RobotsDotTXT</span></a></p>
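One way to approach the question above, as a minimal sketch and not a vetted standard: generate one `User-agent` group per crawler token and confirm the result with Python's standard `urllib.robotparser`. The token list is an illustrative assumption and far from complete, and the helper names `build_robots_txt` and `is_allowed` are hypothetical. As other posts in this feed point out, robots.txt only helps against crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative, non-exhaustive tokens (assumption); verify against each vendor's docs.
AI_USER_AGENTS = ["GPTBot", "CCBot", "ClaudeBot", "Google-Extended"]


def build_robots_txt(blocked_agents):
    """Return robots.txt text that disallows the whole site for each listed agent."""
    lines = []
    for agent in blocked_agents:
        # One group per agent keeps the file unambiguous for simple parsers.
        lines.append(f"User-agent: {agent}")
        lines.append("Disallow: /")
        lines.append("")
    # Everyone else may crawl everything.
    lines.append("User-agent: *")
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"


def is_allowed(robots_txt, agent, url):
    """Check a URL against robots.txt text using the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    text = build_robots_txt(AI_USER_AGENTS)
    print(text)
    print(is_allowed(text, "GPTBot", "https://example.org/poems/1"))
```

Maintained lists such as the ai.robots.txt project mentioned elsewhere in this feed are likely a better source for the tokens than a hand-written list.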
Inautilo<p><a href="https://mastodon.social/tags/Development" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Development</span></a> <a href="https://mastodon.social/tags/Releases" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Releases</span></a><br>AI Insights · Cloudflare Radar brings deeper insights into AI trends <a href="https://ilo.im/1626sk" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">ilo.im/1626sk</span><span class="invisible"></span></a></p><p>_____<br><a href="https://mastodon.social/tags/Cloudflare" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Cloudflare</span></a> <a href="https://mastodon.social/tags/CloudflareRadar" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>CloudflareRadar</span></a> <a href="https://mastodon.social/tags/Trends" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Trends</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/AiModels" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AiModels</span></a> <a href="https://mastodon.social/tags/Bots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Bots</span></a> <a href="https://mastodon.social/tags/RobotsTxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RobotsTxt</span></a> <a href="https://mastodon.social/tags/WebDev" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>WebDev</span></a> <a href="https://mastodon.social/tags/Frontend" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Frontend</span></a> <a 
href="https://mastodon.social/tags/Backend" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Backend</span></a></p>
pcyx<p>Nepenthes</p><p>This is a tarpit intended to catch web crawlers. Specifically, it's targeting crawlers that scrape data for LLMs - but really, like the plants it is named after, it'll eat just about anything that finds its way inside.</p><p><a href="https://zadzmo.org/code/nepenthes/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="">zadzmo.org/code/nepenthes/</span><span class="invisible"></span></a></p><p><a href="https://c.im/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://c.im/tags/LLM" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>LLM</span></a> <a href="https://c.im/tags/scrapers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>scrapers</span></a> <a href="https://c.im/tags/stopai" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>stopai</span></a> <a href="https://c.im/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a> <a href="https://c.im/tags/Webhosting" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>Webhosting</span></a></p>
Michael Martine<p><strong>e499 — Seeking Data</strong></p> <p>e499 with Michael and Michael – all about #AI with #DeepSeek, #coldstart, #hallucinations, #BerlinWall, #AttentionEconomy, #IntentionEconomy and a whole lot more!</p> <a href="https://media.blubrry.com/gamesatwork/op3.dev/e,pg=6e00562f-0386-5985-9c2c-26822923720d/gamesatwork.biz/wp-content/uploads/2025/02/E499.mp3" rel="nofollow noopener noreferrer" target="_blank">https://media.blubrry.com/gamesatwork/op3.dev/e,pg=6e00562f-0386-5985-9c2c-26822923720d/gamesatwork.biz/wp-content/uploads/2025/02/E499.mp3</a> <p><a href="https://gamesatwork.biz/2025/02/03/e499-seeking-data/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">gamesatwork.biz/2025/02/03/e49</span><span class="invisible">9-seeking-data/</span></a></p>
PrivacyDigest<p><a href="https://mas.to/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> haters build <a href="https://mas.to/tags/tarpits" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tarpits</span></a> to trap and trick <a href="https://mas.to/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://mas.to/tags/scrapers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>scrapers</span></a> that ignore <a href="https://mas.to/tags/robots" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robots</span></a>.txt <br><a href="https://mas.to/tags/tarpit" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>tarpit</span></a> <a href="https://mas.to/tags/security" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>security</span></a> <a href="https://mas.to/tags/privacy" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>privacy</span></a> <a href="https://mas.to/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a> </p><p><a href="https://arstechnica.com/tech-policy/2025/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">arstechnica.com/tech-policy/20</span><span class="invisible">25/01/ai-haters-build-tarpits-to-trap-and-trick-ai-scrapers-that-ignore-robots-txt/</span></a></p>
Dr Pen<p>Protecting your blog from the dead-eyed <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> crawlers. You can experiment with specific robots.txt rules, and I also run a script in htaccess. I think there are metadata properties you can declare. None of this stops your pages from being crawled, but it may afford some legal protection (see the recent German LAION case). I'm doing a short blog post on this soon.</p><p><a href="https://mastodon.social/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a> <a href="https://mastodon.social/tags/aicrawlers" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>aicrawlers</span></a> <a href="https://mastodon.social/tags/htaccess" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>htaccess</span></a></p>
mʕ•ﻌ•ʔm bitPickup<p><a href="https://troet.cafe/tags/fediVerse" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>fediVerse</span></a> <a href="https://troet.cafe/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://troet.cafe/tags/dataMining" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>dataMining</span></a> <a href="https://troet.cafe/tags/robotsTXT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotsTXT</span></a> <a href="https://troet.cafe/tags/fediAdmin" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>fediAdmin</span></a> </p><p><span class="h-card" translate="no"><a href="https://mastodon.green/@gimulnautti" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>gimulnautti</span></a></span> </p><p>To me this looks much more like we should bury Trojan horses right in the bellies of the beasts. My server rules and profiles state that all data is CC-BY-SA-NC.</p><p>If they use that data for training, they should definitely land in serious legal and financial trouble.</p><p><span class="h-card" translate="no"><a href="https://mastodon.social/@khobochka" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>khobochka</span></a></span> <br><span class="h-card" translate="no"><a href="https://mastodon.social/@maxschrems" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>maxschrems</span></a></span></p>
jesuiSatire …ᘛ⁐̤ᕐᐷ<p><span class="h-card" translate="no"><a href="https://mastodon.de/@ErikUden" class="u-url mention" rel="nofollow noopener noreferrer" target="_blank">@<span>ErikUden</span></a></span> </p><p>What is worrying is their self-centered, megalomaniac ego trip: they do not realize that they are the remaining world power, armed to the teeth with weapons of all kinds and holding all the private data of the world's population.</p><p>That said, it is kind of embarrassing that you, apparently in charge of several <a href="https://social.tchncs.de/tags/mastodon" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>mastodon</span></a> instances in the <a href="https://social.tchncs.de/tags/fediVerse" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>fediVerse</span></a>, are not able to fix their <a href="https://social.tchncs.de/tags/robotsTxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotsTxt</span></a> while wasting time talking about other countries' internal affairs.<br>sry</p><p><a href="https://social.tchncs.de/tags/justSayin" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>justSayin</span></a> <a href="https://social.tchncs.de/tags/fediAdmin" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>fediAdmin</span></a></p>
MxFraud<p>New robots.txt just dropped <a href="https://github.com/ai-robots-txt/ai.robots.txt/releases/tag/v1.22" rel="nofollow noopener noreferrer" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">github.com/ai-robots-txt/ai.ro</span><span class="invisible">bots.txt/releases/tag/v1.22</span></a></p><p><a href="https://tabletop.social/tags/AI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>AI</span></a> <a href="https://tabletop.social/tags/genAI" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>genAI</span></a> <a href="https://tabletop.social/tags/RobotsTXT" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>RobotsTXT</span></a></p>
Seirdy<p>New page on <code>seirdy.one</code>: <a href="https://seirdy.one/meta/scrapers-i-block/" rel="nofollow noopener noreferrer" target="_blank">Scrapers I block (and allow), with explanations</a>.</p><p>I’ve replaced all the comments in my robots.txt file with a more readable and detailed web page on scrapers I block. It includes info on the multiple blocking-approaches and criteria I use, commonly-blocked scrapers I <em>allow,</em> and more fact-checking than most of the more comprehensive alternatives.</p> <p><a class="hashtag" href="https://pleroma.envs.net/tag/robotstxt" rel="nofollow noopener noreferrer" target="_blank">#RobotsTxt</a> <a class="hashtag" href="https://pleroma.envs.net/tag/scrapers" rel="nofollow noopener noreferrer" target="_blank">#Scrapers</a> <a class="hashtag" href="https://pleroma.envs.net/tag/posse" rel="nofollow noopener noreferrer" target="_blank">#POSSE</a></p>
Konstantin Weddige<p>Extending the meta tags would be fairly straightforward. In addition to the existing "INDEX", "NOINDEX", "FOLLOW", "NOFOLLOW", we could introduce "MODELTRAINING" and "NOMODELTRAINING".</p><p>Of course, just because there is an RFC does not mean that anyone will follow it. But it would be a start, and something to push for. I would love to hear your opinion. </p><p>3/3</p><p><a href="https://gruene.social/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a> <a href="https://gruene.social/tags/rfc" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>rfc</span></a></p>
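To make the meta-tag idea concrete, here is a rough sketch of how a well-behaved crawler might read such a tag using Python's standard `html.parser`. The "NOMODELTRAINING" value is only the proposal from the post above, not an existing standard, and the names `RobotsMetaParser` and `may_train_on` are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated and compared case-insensitively.
        self.directives.update(
            token.strip().upper() for token in content.split(",") if token.strip()
        )


def may_train_on(html):
    """Return False when the page carries the proposed NOMODELTRAINING directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "NOMODELTRAINING" not in parser.directives


if __name__ == "__main__":
    page = '<head><meta name="robots" content="index, follow, nomodeltraining"></head>'
    print(may_train_on(page))
```

A page without the directive is treated as permitting training here, which mirrors how "INDEX" and "FOLLOW" are the implicit defaults today; a stricter default would be an equally defensible design choice.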
Konstantin Weddige<p>This is not an acceptable situation, and therefore I propose extending the robots.txt standard and the corresponding HTML meta tags.</p><p>For robots.txt, I see two ways to approach this:</p><p>The first option would be to introduce a meta-user-agent that can be used to define rules for all AI bots, e.g. "User-agent: §MODELTRAINING§".</p><p>The second option would be a directive, similar to "Crawl-delay", that indicates how the data may be used. For example, "Model-training: disallow".</p><p>2/3</p><p><a href="https://gruene.social/tags/robotstxt" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>robotstxt</span></a> <a href="https://gruene.social/tags/rfc" class="mention hashtag" rel="nofollow noopener noreferrer" target="_blank">#<span>rfc</span></a></p>
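The second option could be honored by a crawler roughly as follows. This is a sketch under stated assumptions: the `Model-training` directive exists only in the proposal above, no parser supports it today, and the function name `model_training_allowed` is hypothetical. Group handling is simplified compared with a full robots.txt parser.

```python
def model_training_allowed(robots_txt, user_agent):
    """Return False if the group matching user_agent says 'Model-training: disallow'."""
    current_agents = []   # agent tokens of the group being read
    in_rules = False      # True once the current group has seen a non-User-agent line
    decision = {}         # lowercased agent token -> training allowed?

    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                current_agents = []
                in_rules = False
            current_agents.append(value.lower())
        else:
            in_rules = True
            if field == "model-training":  # hypothetical directive
                for agent in current_agents:
                    decision[agent] = value.lower() != "disallow"

    ua = user_agent.lower()
    for agent, allowed in decision.items():
        if agent != "*" and agent in ua:
            return allowed  # a specific group wins over the wildcard
    return decision.get("*", True)  # no directive means no restriction


if __name__ == "__main__":
    example = "User-agent: *\nModel-training: disallow\n"
    print(model_training_allowed(example, "ExampleBot/1.0"))
```

One appeal of a directive over a meta-user-agent is that it composes with existing groups: a site can keep its current `Allow` and `Disallow` rules and add a single line per group.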