It says they used ethical anonymization, but every other scraper we’ve seen has been completely in violation of Discord’s TOS.
So did Discord cooperate, or give special authorization for this collection? It wouldn’t appear that they could do so, if privacy belongs to their users at all.
User bots (including hacked clients) are officially banned by the TOS, which addresses that concern.
The only acceptable API usage is via bots that server owners choose to invite. And while it might be legally OK (if the bot's own TOS says so), I promise no server owner expects an invited bot to slurp up every message for use in a data set, whether for academic purposes or a potential stalking/"dirt" database.
I highly doubt this is the most ethical instance of data collection.
IIRC, data slurping (for exporting) is also not an allowed use of bots.
> B. API Data Sharing & Retention
> You will not share API Data with any third party, except in the following circumstances, subject to compliance with the Terms and applicable laws and regulations: (i) with a Service Provider; (ii) to the extent required under applicable laws or regulations; and (iii) when a user of your Application expressly directs you to share their API Data with the third party (and you will provide us proof thereof upon request).
I'm not sure what you mean by "prevent". A TOS is a legal document designed to put down rules and a legal basis for the service.
I don't know what a "guild" is, if it's some Discord thing, and you don't say whether this is a good-faith human who joins or a bot operator intending to scrape. The hypothetical is irrelevant here; what is germane is the expectation of privacy of the individual participants, and the terms which bind people who use that service.
The TOS clearly didn't prevent the use of the API, but it may indeed prohibit such scraping, or threaten repercussions for people who break the terms, especially for someone who republishes the data. Your example of a simple download dump doesn't seem to involve republication, and that seems to be the major issue with scrapers.
> The hypothetical is irrelevant here; what is germane is the expectation of privacy of the individual participants, and the terms which bind people who use that service.
How can you have an expectation of privacy in a public forum? Where did this bizarre disorder originate, where people knowingly put their writing out there for literally anyone to read, then turn around and start talking about "expectations of privacy" when they realize what it entails?
Well unfortunately it originated in the human condition, my friend.
I take it back about "expectation of privacy". Perhaps that is an outmoded concept.
Humans used to sort of have a default expectation of privacy. Being that gossip, slander and libel were sins and crimes, we could often safely gather in a room and isolate ourselves in a select group, and share our thoughts openly.
Most humans could go into a living room with their family, a pub or bar, a classroom, or a treehouse, and say/do things that were shared only by the local group of gathered humans. You could go into a public park and speak to a fire hydrant. It was not usual, or even possible, 100 years ago for the news media to go around with recorders and cameras and record/preserve/transmit/broadcast everything everyone said in every place they were doing it.
Expectations of privacy were just sort of... humankind's default setting. And so betrayals were sins and crimes. And we sit alone at our keyboard looking at a screen. It feels private, all right. Where are we really? Where are our words being carried? We can't know anymore.
Unfortunately we've built online and virtual worlds around paradigms that imply privacy or confidentiality, but don't actually afford it. You can go into a "chat room" or a "forum" or change your "privacy settings" but they mean nothing. Nothing at all. Because everything we're sending across the net can be perfectly recorded, preserved, retransmitted, and it's no longer gossip, it's just business.
> Where did this bizarre disorder originate
I don't believe that any other living organism has had to deal with the complete and total collapse of "privacy" like humans in the 21st century. Surely, termites in Australia don't know, and couldn't care, about what's going on with honeybees in California.
And here we have people calling it a bizarre disorder. Yes, it's mistaken and misguided, but who can call it unreasonable?