It seems like there’s a new reason to be frustrated at AI almost every day, a reason that bubbles up through the primordial soup of festering unconsciousness and into the light of anger. There’s the effect on the economy, job thievery, copyrighted content thievery, and the simple fact that it’s just not right that lifeless neural networks should be able to convince anyone they’re conscious. And now, there’s another reason to be mad at the ethereal AI buggers: they’re clogging up server bandwidth without permission.
Yes, I was aware that some companies scrape websites for content to train their AI models, and yes, I was aware that they sometimes do this without a website’s permission. But I hadn’t considered the impact this might have on the servers running those websites. iFixit CEO Kyle Wiens is here to let us all know that it does, in fact, happen, asking AI company Anthropic: “Do you really need to hit our servers a million times in 24 hours?”
Assuming Wiens isn’t massively exaggerating, it’s no surprise that this is “tying up our devops resources.” A million “hits” per day would do it, and would certainly be enough to justify more than a little annoyance.
The thing is, putting this bandwidth chugging in context only makes it more ridiculous, which is what Wiens is getting at. It’s not just that an AI company is seemingly clogging up iFixit’s server resources, but that it’s been expressly forbidden from using the content hosted on those servers anyway.
There should be no reason for an AI company to hit the iFixit site at all, because its terms of service state that “copying or distributing any Content, materials or design elements on the Site for any other purpose, including training a machine learning or AI model, is strictly prohibited without the express prior written permission of iFixit.” Unless, that is, it wants us to believe it’s not going to use any of the data it scrapes for these purposes, and it’s just doing it for… fun?
“Hey @AnthropicAI: I get you’re hungry for data. Claude is really smart! But do you really need to hit our servers a million times in 24 hours? You’re not only taking our content without paying, you’re tying up our devops resources. Not cool.” (Kyle Wiens, July 24, 2024)
Well, whatever the case, iFixit’s Wiens decided to have some fun with it and ask Anthropic’s own AI, Claude, about the matter, telling Anthropic, “Don’t ask me, ask Claude!” It seems Claude agrees with iFixit, because when asked what it should do if it were training a machine learning model and found the above wording in a site’s terms of service, it responded, in no uncertain terms: “Do not use the content.”
This is, as Wiens points out, something anyone could have seen by simply reading the terms of service. It makes me wonder whether at least some AI companies would rather ask for forgiveness than permission, and so don’t bother checking the ToS in the first place.
As a side note, I can see that iFixit’s website now has Claude disallowed in its robots.txt file, expressly forbidding it from crawling the site (something that, unfortunately, any “bad bot” can simply ignore). That entry could’ve been there already, but I like to imagine iFixit only just added it to make a statement against a naughty bot, a statement wrought upon the bot by its own effective admission of guilt.
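For the curious, here’s a minimal sketch of how that robots.txt mechanism works, using Python’s standard library. The “ClaudeBot” user-agent token here is my assumption for illustration, as is the example page path; check iFixit’s live robots.txt for the exact names it actually disallows.

```python
from urllib import robotparser

# A robots.txt entry blocking a specific crawler would look roughly like
# this (the "ClaudeBot" user-agent token is assumed for illustration):
ROBOTS_TXT = """\
User-agent: ClaudeBot
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# A polite crawler calls can_fetch() before every request and backs off
# when it returns False; a "bad bot" simply never performs this check.
print(rp.can_fetch("ClaudeBot", "https://www.ifixit.com/Guide"))     # False
print(rp.can_fetch("SomeOtherBot", "https://www.ifixit.com/Guide"))  # True
```

The key point is that nothing enforces the answer: honoring robots.txt is entirely voluntary, which is why the only real recourse against a crawler that ignores it is blocking it at the server level or, as here, public shaming.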