Hacker News
Post Ghost Shutdown: An Open Letter to Twitter (postghost.com)
239 points by doctorshady on July 9, 2016 | hide | past | favorite | 82 comments



Twitter has been shutting down API access to clients that don't honor deletions for years now. I don't know why this is news. Note: somehow deleted tweets get covered anyway. If someone is a sufficiently public figure, Twitter's policy doesn't save them from being publicly shamed when they tweet something the world doesn't approve of.

In fact you could easily argue that European law requires Twitter to disable this access. If I want to delete my online presence, surely being able to actually delete my tweets is as important as being able to delete things from the Google cache.


> I don't know why this is news.

If you read the post by the company, they clearly explain why this should be news. The fact that Twitter allows at least one company to do this already is, IMO, enough to make it news.

Why can Politwoops exist but not this site?


Because they're doing different things? Also clearly outlined?


"delete my tweets".

Would you say a Major Influencer owns his public presence? Does Donald Trump own his words? Does J. K. Rowling? (as per the examples given in the letter). I honestly don't know.

We hope you’ll consider the fact that as Twitter has become a dominant platform of communication, verified users with huge follower bases influence the public dialog as much as elected officials, and should be accountable for their public statements on Twitter just as they are for public statements they make anywhere else.

I find it hard to believe that people agree with a Developer Agreement and Policy such as Twitter's. Do you? An 'agreement' orders you to delete content because it says so, and you agree with it? What if I use the API to flag content I want, then wget the relevant HTML. Would an API agreement have any say in that? Must all screenshots of deleted tweets be deleted? Articles about deleted tweets? Are the original tweeters forced to deny having tweeted what they did, unless Twitter undeletes their tweets? Even if taken to court?

Sir, did you, or did you not, tweet that you want to acquire jell-o and take a bath in it?

Well, just check the API. It says I didn't tweet such a thing.

Case closed.

Should I tweet my orders to my army of nuclear-powered androids, and then delete them, so that no proof exists?

Furthermore, the API exists in the world. The deleted tweet is a detail of the API and of the Twitter infrastructure, which allows deletion. However, the fact that a specific tweet was deleted does not belong to the API, but to meatspace. Are we to deny that a tweet was deleted? More than that, are we to deny that a specific tweet was sent at a specific date?

Maybe the question really is, can a Developer Agreement and Policy change the world?

How is this enforceable? It is news both because of Politwoops' existence, and because we are letting an API agreement apply where it does not belong.


> In fact you could easily argue that European law requires Twitter to disable this access

But you'd be wrong. The right to be forgotten does not apply to public figures making political statements.


Why doesn't PostGhost just record tweets using the browser? I'm sure it can be done without much problem. This is blatant abuse by Twitter. Public tweets are public, and the readers should be able to record/screenshot/save them, without Twitter having a say in the matter.

If they don't allow using the API for that, use the browser directly.


Using a 'browser' from their base of operations in an automated manner would fit the definition of scraping, which is not allowed by Twitter's TOS.


What are they going to do if they break the TOS? It's not like the developer agreement where they'll shut down your API key - the best they can do is try to blacklist your IP address.


See Craigslist vs 3taps. It's not clear if scraping is legal either.

https://en.wikipedia.org/wiki/Craigslist_Inc._v._3Taps_Inc.


Wow, Google must be on some pretty shaky ground then...


No, they're not. To quote from the linked Wikipedia article:

> the court held that sending a cease-and-desist letter and enacting an IP address block is sufficient notice of online trespassing, which a plaintiff can use to claim a violation of the Computer Fraud and Abuse Act

If someone wants to block Google, they can simply add a robots.txt entry. An IP block would work too. (Sending a cease-and-desist letter might or might not also work, but it doesn't really matter - context and intent matters in court; Google accepting the standard robots.txt as a 'go away' signal should be more than sufficient.)
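
To illustrate, the 'go away' signal mentioned here is just a plain-text file at the site root; a robots.txt that tells Google's crawler the whole site is off-limits (agent name per Google's documented crawler name) looks like:

```text
# Tell Googlebot the entire site is off-limits
User-agent: Googlebot
Disallow: /
```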


Think about The Internet Archive!


Google/Alphabet is very close to the Obama administration, so it is unlikely they would be prosecuted: https://theintercept.com/2016/04/22/googles-remarkably-close...


On today's shaky legal ground, it would not shock me if Twitter could get some agency in the federal government to arrest the founders/programmers/executives of the site under the CFAA and charge them with criminal felonies.


Sue?


Scrape using an IP address registered to a corporate body located in an unfriendly jurisdiction.


Twitter will just block it. Scraping should be done from multiple random addresses on networks all around the globe, at random intervals. I guess the only option is a voluntary botnet, where users sympathetic to the cause would run limited (so, not prone to abuse) proxies. (To avoid users messing with tweets, just proxy TCP connections to the mothership and let it verify TLS.)

However, this would hinder the service, as with the polling model it may not always notice all the tweets.
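
The randomized-polling part of this idea is simple to sketch; here is a minimal illustration in Python (the proxy addresses are placeholders, and the actual TCP proxying and TLS verification are out of scope):

```python
import random

def next_poll_delay(base_seconds=60, jitter=0.5):
    """Pick a randomized delay so polls don't arrive at fixed intervals."""
    low = base_seconds * (1 - jitter)
    high = base_seconds * (1 + jitter)
    return random.uniform(low, high)

def pick_proxy(proxies):
    """Choose a volunteer proxy at random for the next request."""
    return random.choice(proxies)

# Example: plan the next few polls across a pool of volunteer proxies.
proxies = ["203.0.113.1:8080", "198.51.100.7:8080", "192.0.2.9:8080"]
schedule = [(next_poll_delay(), pick_proxy(proxies)) for _ in range(5)]
```

With a 60-second base and 50% jitter, each delay falls somewhere between 30 and 90 seconds, which is exactly why the polling model can miss short-lived tweets.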


luminati.io

Zing.


Luminati shuts down the accounts if the owners of the scraped websites complain about it.


Perhaps. But it would be up to Twitter to prove such a thing is actually occurring, and it would be difficult or impossible to prove that if set up properly. The information on the site could come from anywhere, since tweets can be viewed even when someone is not logged in.

It is worth noting that TMZ and other sites have been storing and detecting deleted celebrity tweets for years, and still do it to this day. If you're not using the API, Twitter has no affirmative way of knowing how you obtained the information.


> "in an automated manner"

I wonder where that line is drawn. A person using their browser, clicking a link to load a web page initiates a hundred different automated routines[1]. Is that not automation? Is it then automation to have a click initiate another click or two more clicks? Where do you draw the line?

[1]: https://github.com/alex/what-happens-when


The spirit of the rule should be self-evident.


And yet it isn't.

edit: the spirit of the rule is to grant Twitter control over when to apply the rule. Is that what you meant?

Is [F5] + [CTRL+S] (to save) automated? Is a browser's auto-refresh capability? Is `$ watch wget` (or something to that effect)? getpocket.com? An IFTTT recipe? A crawler?

See what happens?


It is.

Automated means: without human intervention. So anything that accesses more than one URL without human intervention would violate the TOS.

I'm quite amazed that this would have to be spelled out. If you're going to break someone's TOS then at least be aware of it. That helps later on when you have to face the consequences. Just like I'd like to know when I'm speeding, rather than to pretend to be ignorant of the limit. That way I'd know my exposure.


It seems like you are happy with how early you truncate "intervention".

Let me know when you find a non-human "[F5] + [CTRL+S]". Or link me to the alien crawlers you seem to imply. Are there Amazon instances of crawlers owned by non-human entities?

Do you consider "death by bullet" to be caused by humans? By the transfer of momentum of lead? By gunpowder combustion? By human fingers pressing triggers? By time? Is time killing humans? Oh no.

How long must the causal chain be for you to consider it non human?

> [F5] + [CTRL+S]

> a browser auto-refresh capability

> $ watch wget

> getpocket.com

> an IFTTT recipe

> A crawler

I can put all of these in motion. I can also write a macro that triggers them. Is there a difference? Twitter seems to want to be the arbiter of it (if it makes a difference).

That is the only spirit of the law: We reserve the right to consider your actions wrong.


So if I just visit twitter without logging in, do I have to abide by their ToS?


You may not do any of the following while accessing or using the Services: (i) access, tamper with, or use non-public areas of the Services, Twitter’s computer systems, or the technical delivery systems of Twitter’s providers; (ii) probe, scan, or test the vulnerability of any system or network or breach or circumvent any security or authentication measures; (iii) access or search or attempt to access or search the Services by any means (automated or otherwise) other than through our currently available, published interfaces that are provided by Twitter (and only pursuant to the applicable terms and conditions), unless you have been specifically allowed to do so in a separate agreement with Twitter (NOTE: crawling the Services is permissible if done in accordance with the provisions of the robots.txt file, however, scraping the Services without the prior consent of Twitter is expressly prohibited); (iv) forge any TCP/IP packet header or any part of the header information in any email or posting, or in any way use the Services to send altered, deceptive or false source-identifying information; or (v) interfere with, or disrupt, (or attempt to do so), the access of any user, host or network, including, without limitation, sending a virus, overloading, flooding, spamming, mail-bombing the Services, or by scripting the creation of Content in such a manner as to interfere with or create an undue burden on the Services.

Relevant part: "You may not do any of the following while accessing or using the Services: (...) crawling the Services is permissible if done in accordance with the provisions of the robots.txt file, however, scraping the Services without the prior consent of Twitter is expressly prohibited"


But the thing is, I didn't read or agree to the ToS at all. It was never presented to me, and content was presented on my screen when I visited twitter.com or followed a link. I am not bound by terms I did not agree to.

At least that is my layman logic. But we live in a world where ~50 page long EULAs exist, so layman logic may not be applicable in this case.


True. Consider also whether adding something like "Each request to this server constitutes agreement to a $50 fee which is selectively enforced at our discretion" would make a scraper liable when they didn't agree to the terms.

If the law truly agrees that acceptance of undisclosed terms is granted simply by virtue of access, it opens the door to rampant abuse.

Though, if you want to try the protecting yourself in a similar approach, add a header to every request in your browser that says "By responding to this request, you accept that I do NOT agree to your terms and you are willing to serve me anyway." At least in this case they would actually receive your terms with the request. I wish there was an RFC for this.
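
As a rough sketch of attaching such a header with Python's urllib (the header name `X-Client-Terms` is made up, since no standard exists - hence the RFC wish):

```python
from urllib.request import Request

# Hypothetical counter-terms text to send with every request.
MY_TERMS = ("By responding to this request, you accept that I do NOT agree "
            "to your terms and you are willing to serve me anyway.")

# Attach the header to an outgoing request; a browser extension or a
# custom opener would do this for every request rather than one at a time.
req = Request("https://example.com/", headers={"X-Client-Terms": MY_TERMS})
```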


The way this works isn't that you do something against the TOS and then they try to sue you out of existence. The way it will work is that Twitter will notify you "Hey what you're doing is against our TOS". At which point you'll be aware and if you continue your actions you're now liable.


I'm guessing the word to pay attention to is "accessing". So I guess as long as you are accessing the site you are under ToS.


Which doesn't matter if you are based in a country that doesn't respect US law.


If your nation has an extradition treaty with the US, I would be cautious as well...

The US has been very good of late about charging people who have never been to the US with violations of US law online. It is bullshit, but it seems many nations refuse to protect their citizens from harassment by the US legal system.


Copyright issues don't go away if you capture via browser.


Copyright belongs to the person tweeting, not to Twitter (they just get a license from the person tweeting). And fair use specifically mentions "criticism, comment, news reporting, [...] scholarship, or research", several of which apply to reporting on deleted tweets from prominent public figures.

This isn't a copyright issue at all; it's exclusively a ToS issue. Twitter can choose to prevent someone from using their API. They'd have a harder time forcing the takedown of the existing content.


It's a 1st amendment issue. If someone says something in a public forum what they said becomes public domain.


Yes, and what do you think the person who deleted a tweet will say about you using their copyrighted content? I would argue it is a copyright issue because by deleting content a user is saying they don't want anyone to access it. Why would Twitter then want to turn around and allow access to it through an API?

As a user I think this is the right tack to take. There is no possible advantage for Twitter to have sites that make the delete feature not really work.


Copyright is the right we give to content creators so that they can financially benefit from their work. It is not a generic "I control everything I have ever said" power.

In particular, "fair use" specifically doesn't require people to get permission from a copyright holder:

http://fairuse.stanford.edu/overview/fair-use/what-is-fair-u...

Discussing the tweets of public figures is clearly "commentary and criticism", which would give things like PostGhost "substantial non-infringing use" (which I believe is the relevant legal threshold).


> and what do you think of the person who deleted a tweet will say about you using their copyright?

How could this possibly be relevant? Copyright won't give them any grounds to stop you from saying "he's trying to walk it back now, but he said <embarrassing statement> earlier".


> Why would Twitter then want to turn around and allow access to it through an API?

Then? No, this is about getting it through the API while it's still up.


Obviously, but with the restriction you also delete it when it's no longer in the API. It would be ridiculous if Twitter helped developers to embarrass/blackmail their users.


If it is stored for purposes of commentary or news reporting, then it would fall under fair use and neither Twitter nor the original author have any power to stop you from republishing the tweets as long as it is considered fair use.


You can't undo saying something to a hundred thousand people. You can demand everyone forget, but it won't work.


Big Brother, 1984


When someone's a verified user, they're a public figure and have an outsize impact on the Twitter discourse (they tend to have a lot of followers). The things they say are inherently newsworthy. If the archiving is done for the purpose of scholarship or commentary, it sounds like a compelling argument for it being fair use.


I think that one of the things that an account should have to trade in for being a Verified Account is the ability to delete tweets. Being a public figure and having the notoriety that is implicit in having a Verified Account, the account should have to stand by what it tweets, or at least not be able to modify the public record and break references to its content.


I like this idea, but I think a big function of the checkmark is reassuring people that an account belongs to who it says it belongs to. For example, @realDenaldTrump is not verified (even though people still fall for it).

That said, Twitter administers the checkmark in puzzling and thoroughly non-transparent ways.


> I think a big function of the checkmark is reassuring people that an account belongs to who it says it belongs to

That is in fact the only reason Verified Accounts exist, and is the reason that non-celebrities or people otherwise lacking notoriety cannot get Verified Accounts. It's not worth Twitter's effort to make my account, or any random user's, Verified, it seems. But as Twitter's role continues to expand in the public sphere, it is really the public who gives out Verified Accounts (through making someone famous), and the public should have requirements, such as a certain level of consistency, for the accounts it gives the power/influence of the Verified Account to.


Then do it anonymously


What about a browser plugin where users could click and it takes a copy from their screen of the tweet they're seeing and sends it to the project?

Having multiple users send in the same tweet could count as additional validation that it was not edited.
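
The "multiple users as validation" idea could be as simple as counting independent byte-identical captures; here is a sketch (the quorum size of three is arbitrary):

```python
import hashlib
from collections import Counter

def fingerprint(tweet_text):
    """Stable fingerprint of a captured tweet's content."""
    return hashlib.sha256(tweet_text.encode("utf-8")).hexdigest()

def validated(submissions, quorum=3):
    """Return fingerprints of captures that `quorum` or more distinct
    users submitted with byte-identical content."""
    counts = Counter(fingerprint(text) for user, text in set(submissions))
    return {fp for fp, n in counts.items() if n >= quorum}

# Example: three users agree on one capture, a fourth saw something else.
subs = [("alice", "covfefe"), ("bob", "covfefe"),
        ("carol", "covfefe"), ("dave", "something else")]
agreed = validated(subs)
```

Hashing the text rather than storing raw screenshots makes it cheap to check agreement, though it only catches exact edits, not rendering differences.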


This could work, but there are two separate problems to solve: knowing when a verified account tweets, and capturing the tweet in a way that doesn't provably infringe on the TOS. A crowdsourced capture solves the second stage, but only if someone in the crowd happens to be browsing the right page at the time an update is made.

To make this work, someone can set up a 'shell company' that signs up for the Twitter API for a seemingly legitimate reason, and captures the actual events of verified accounts tweeting. Then, this service sends out events to all the crowdsourced clients to browse to the Twitter page and capture the content, and submit it to the project.

The behavior of the crowdsourced clients would be undetectable from a normal web user, and there would be a layer of indirection between the client signed up for the Twitter API and the curators of the submitted tweets.


I thought the Library of Congress was archiving all tweets? Perhaps I'm mistaken, or perhaps they do archive all tweets, but this archive isn't available via API such that PostGhost could rely on it? Or perhaps deleted tweets are also deleted from the LoC archive?


They were planning to, but never actually did it. http://www.politico.com/story/2015/07/library-of-congress-tw...



This is what I make of the situation:

- Foo tweets

- Bar takes a snapshot of Foo's tweet

- Foo deletes tweet

- Bar displays Foo's deleted tweet on own website

Twitter tells Bar to shut up.

(Twitter would, however, continue to store the deleted tweet. It wouldn't display it, though.)


I wonder how Politwoops is allowed to function when PostGhost is shut down.


They address this in their open letter: the Politwoops API credentials appear to have been reauthorized based on their tracking of political figures, whereas PostGhost tracks a broader range of verified users spanning journalism and media.



What is this trend with Reddit, Twitter and Facebook suddenly going full gestapo and [REDACTED] everything? These kinds of services are very important, and if the companies don't want to play ball, then they should be circumvented to preserve the record.


The sad truth is (in the case of Reddit anyway) that allowing people to freely publish their opinions on your platform hurts your brand and your advertising revenue.

This is why /r/creepshots was taken down, despite being completely legal and within the Reddit rules. It damaged the brand.

Same reason why the "hate speech" against Islam was taken off of /r/news.


>The sad truth is (in the case of Reddit anyway) that allowing people to freely publish their opinions on your platform hurts your brand and your advertising revenue.

They're a user-generated content (UGC) site. Truth is that their brand was mostly in the hands of their users from day one. And it's extremely hard to wrestle that control out of the hands of your users without suffering some pushback. In some cases, this pushback can even kill a site by making the community feel disenfranchised.

Reddit has been doing a lot of censorship lately, and people are slowly leaving the site. By imposing more free-speech restrictions, they're starting to alienate large swaths of their userbase.

Here's an example of Reddit censorship in action: https://www.washingtonpost.com/news/the-intersect/wp/2016/06...

And then there's this: https://www.washingtonpost.com/news/the-intersect/wp/2016/06...

And this is just creepy:

‘We know your dark secrets. We know everything,’ boasts Reddit CEO Steve Huffman

http://thenextweb.com/socialmedia/2016/05/30/reddit-knows-yo...


> they're starting to alienate large swaths of their userbase.

Alienating racists, woman-haters, and pedophiles! Heavens, what an awful thing to do!


This behavior has still alienated people who hate every one of the subreddits that have been tanked, and were never members. There are people of the opinion that a place like Reddit should, as a policy, extend the same kind of freedom of speech that many enjoy (or at least in principle should enjoy) from government interference. I don't always agree with them, but they have a logic that has nothing to do with secretly loving racists, woman-haters, or pedophiles.

Personally as long as Stormfront, etc.'s ISPs don't get hit by government orders to shut them down I don't mind Reddit deciding to nix such stuff on their site. Social pressure is in some instances a good way to push change that we for whatever reason don't want to mandate legally. But if the discourse surrounding the recent political upheavals that the US and UK have experienced has demonstrated anything, it's that immediately accusing anyone who even considers disagreeing with you of secret racism does nothing but preclude anything productive from happening.

Even if you think such tactics are fair game, they are consistently ineffective. Save such things for people you actually know enough about to know they are racist, sexist, etc.


I imagine he was referring to normal people who wanted news about the Orlando shooter but instead found thread after thread of [DELETED].

https://news.ycombinator.com/newsguidelines.html

>Avoid gratuitous negativity.


Although I detest that incident, I haven't seen any evidence that this went further up the chain than renegade moderator(s). This doesn't seem like the kind of thing Reddit's staff would get involved in so quickly and so heavy-handedly.


Have you seen the threads before the deletion? There was a huge brigade of "DEPORT ISLAM" and "Yes all muslims" comments – over 90% of the comments were that, and /r/the_donald ensured they would be constantly upvoted, too.

If such a thread happened here, you can be sure @dang would nuke them, too.


If you have a significant % of the population screaming X, you don't have a problem. You have a population that believes in X.

of course, https://en.wikipedia.org/wiki/Silent_majority and https://en.wikipedia.org/wiki/1%25_rule_(Internet_culture) still apply.

What I am saying is, terrible as it may seem, hate can also be belief.


Yes, but belief is not something that is supposed to be on that subreddit. The subreddit is for a balanced discussion about news.

(And that belief is very similar to "The Third Wave" experiment – I recommend reading the book or watching the movies to try to understand the right-populist movements in many countries today)


Thank you for the books, will look them up.


It's an experiment from the US in the '60s, I think, by a teacher trying to understand how the general population could support the Nazis – the events really happened, but the book differs from reality in most parts except the ending, while the movie differs only in the ending, but there it's very different from what really happened.

After all, though, it’s well worth it, and should be standard material in all schools.

If you want a more technical view, there’s also scientific writeups about "The Third Wave".


It's also that Reddit isn't a place without culture. If people don't want something on their platform, they're under no obligation to host it.

Granted, Reddit is a bit wider in scope than Something Awful (for example), but it's still a community in itself that can make decisions on content.


What's the purpose of going through the API? Why wouldn't someone just set up crawlers for public figures in countries outside of jurisdiction.

To be clear I'm not promoting that somebody do this, just wondering why there may not be a viable alternative that does it this way.


Because it would still be against the Terms of Service and you still could get very much sued.


IANAL but how would a foreign company get sued under US Law?

I understand that it would violate the ToS if they conducted this directly through Twitter, and that may or may not be a big deal depending on the local jurisprudence and other factors - but say an application was using the Twitter API to reproduce the tweets of persons of interest, and promptly deleting them as required - what then is to prevent our fictitious foreign company from scraping THAT service?

Twitter should not be asserting itself in this kind of overreach. Personally I look forward to its continued decline in value.


As my first lawyer explained to me, "Anybody can sue you for anything. The question isn't 'Can they sue me?', it's 'Can you afford to go the distance?'"

Twitter has plenty of lawyers, and can afford to hire plenty more, including lawyers in whatever country you set this up in. They could go after owners, workers, hosting companies, ad providers, network providers, and anybody else with a significant connection.

Who knows if they'd bother. But I don't think anybody should expect a simple dodge to make them completely safe.


This brings some questions to mind:

Can an archive project or similar crawl Twitter to save content?

What kind of checks and balances should social media networks have? Should they be regulated?


Archive projects usually listen to robots.txt


ArchiveTeam on the other hand is a bit more aggressive:

http://www.archiveteam.org/index.php?title=Main_Page

Though they generally do last minute emergency backups of closing sites, so this probably isn't a job for them.


PostGhost might have been violating the european right to be forgotten. If this was the case then it was a good move on Twitter's part.


Maybe it’s time to give Sublevel a new try. It’s not perfect, but it’s orders of magnitude faster than Twitter.


Your comment history is filled with references to whatever Sublevel is, if you're affiliated you should be disclosing that in your comments. Additionally, a Twitter competitor has nothing to do with the current conversation unless it's also widely used by public figures to broadcast to millions of people.


That's some crappy name.



