What a graph of 8,000 fake Twitter accounts looks like (shkspr.mobi)
199 points by edent on March 9, 2015 | 60 comments



We've done some analysis as well, so it will be interesting to compare notes. Let's connect. The best bots have real-sounding names, a photo, and a bio, and they randomly inject human comments. We found huge networks built almost entirely on IFTTT. They don't retweet master accounts (that would only facilitate detection); they fetch the text and re-post it as their own original message.

One area we looked into recently was Russia's propaganda machine (after discovering strange patterns following the recent murder of Nemtsov). A typical spam account looks like the link below. Unlike the accounts in your screenshots, these accounts sometimes post thousands of messages a day. The one below has posted 190,000 times so far, in Russian:

https://twitter.com/mashabardaeva

Now, if you copy any of her tweets (it's a "she"), and paste it into the Search field, you will see the "rebroadcasters":

http://i.imgur.com/MVsga8X.png
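
The same check can be run in bulk if you already have a dump of tweets. Here is a minimal sketch, assuming the tweets are available as (account, text) pairs; the function name, the `min_accounts` threshold and the sample data are all made up for illustration, and this is not how we or Twitter actually do it:

```python
from collections import defaultdict

def find_rebroadcasters(tweets, min_accounts=3):
    """Return texts posted verbatim by at least `min_accounts` distinct accounts.

    `tweets` is an iterable of (account, text) pairs (an assumed input format).
    """
    accounts_by_text = defaultdict(set)
    for account, text in tweets:
        # Normalize case and whitespace so trivial variations still match.
        key = " ".join(text.lower().split())
        accounts_by_text[key].add(account)
    return {
        text: sorted(accounts)
        for text, accounts in accounts_by_text.items()
        if len(accounts) >= min_accounts
    }

# Hypothetical sample data, not real accounts.
sample = [
    ("account_a", "Breaking news about the city"),
    ("account_b", "breaking news  about the city"),
    ("account_c", "Breaking News about the city"),
    ("account_d", "Lunch was great today"),
]
rebroadcast = find_rebroadcasters(sample)
```

Exact matching only catches verbatim copies; networks that reword each message would need fuzzier matching.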


If anyone from Twitter is reading this, please connect with me regarding the problems we uncovered; I have practical suggestions on how to make your service better.


There do seem to be multiple approaches to these zombie accounts. Lots of different ways to avoid Twitter's rate limits and spam detection.


Twitter, Facebook... Why do you think they would like the dummy accounts removed? As long as the bots are not overdoing it, they bump the statistics. I personally know people who run hundreds of thousands of accounts on Facebook for marketing etc., yet they say FB gives them very little trouble as long as they stick to some limits on posting, sharing and liking.

Unfortunately, reaching 1 billion users while turning a blind eye to bots is very tempting.


Astroturfers need to have their internet privileges permanently revoked.


Because if it's blatant, they may be liable.


As long as they can show "we have deleted 100K bot accounts a day for the past year", they won't be. No one will blame them for not inventing a foolproof or even workable spam filter, even if it is a job for a small team of security engineers. All they need is some number of abusive accounts to ban, so the statistics look right.


Thousands of messages per day should warrant immediate termination. What real-life person would ever tweet steadily multiple times per minute?


Personally, I agree with you. Twitter is very much like an economy: currently, so much of the Supply (publishing) is being manufactured that it's exceeding Demand, so now the Demand (real human readership) has to be manufactured with more bots... Vicious cycle. All this results in a bad experience for the average user. Twitter is a fantastic platform, it just has certain qualities that are now becoming more of a problem than before.

As for immediate automatic termination for thousands of messages per day, that's a gray area because of automation (IFTTT etc.).

There are also people making a living off Twitter; some of the top "influencers" or "internet marketers" post hundreds of times a day. Look at someone like Marc Andreessen (I can't see @pmarca, he blocked me on Twitter :)): his total tweet count is approaching 50,000. Some may argue even this isn't possible for a typical person with a job. He pulls it off though.

Going back to the topic of bots: thousands of messages per day is extreme, and that's not what most smart bots do. The biggest effect comes from the ones that have learned to live and act well within Twitter guidelines, which doesn't just make them harder to detect, it makes them harder to ban.

Here's a random account, it posts a million times a year, or 2,700 times per day:

https://twitter.com/toohsuite
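
For anyone who wants to redo the arithmetic, a quick sketch, assuming a 365-day year and perfectly even posting (the function name is just for illustration):

```python
def posting_rate(total_posts, days):
    """Return (posts per day, posts per minute), assuming evenly spread posting."""
    per_day = total_posts / days
    per_minute = per_day / (24 * 60)
    return per_day, per_minute

per_day, per_minute = posting_rate(1_000_000, 365)
print(f"{per_day:.0f} posts/day, {per_minute:.1f} posts/minute")
# prints "2740 posts/day, 1.9 posts/minute"
```

That is close to two posts every minute, around the clock.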


Social media is unfortunately a new favorite propaganda playground for nation states: to cause instability and revolution in enemy nations (e.g. the USAID attempt in Cuba [1]), to push their narrative in another country (e.g. US use of Twitter to try to deradicalize Muslims [2]), and to cause confusion to disrupt the narratives of others.

What can be done about this? This article is the first step: preliminary fingerprinting of the low hanging fruit.

[1] http://www.washingtonpost.com/lifestyle/style/usaid-effort-t...

[2] http://minerva.dtic.mil/doc/samplewp-Lieberman.pdf


So, would you all like to see an example of some really sophisticated ongoing twitter spam, that normally stays undetected?

On the one hand, it's our competitor (so I understand my motives could be questioned). On the other hand, I feel companies that employ these practices (and VC firms that support them) should be called out.

WDYT?


Disclaimer:

We built a product for researching competition. Part of its core functionality is monitoring Twitter activity related to a given company and determining where their popularity is coming from, and who is influencing it.

Naturally, we’re testing the product by monitoring other companies in the Competitive Intelligence space, so we know what we’re up against. The list of companies we added was determined by questions we’d been asked by prospects, “how does iTrend compare against X in terms of Competitive Intelligence?”.

One company, Owler, immediately stood out due to some very unusual patterns. All of their top promoters appear to be their own accounts. Each account is going through companies' Twitter handles alphabetically, sending them a message based on what appears to be a series of predetermined templates.
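
For anyone who wants to look for this kind of pattern in their own data, here is a minimal sketch of the idea, assuming you have the message texts as plain strings. The messages and handles below are invented for illustration, and this is not our product's actual method:

```python
import re
from collections import Counter

MENTION = re.compile(r"@\w+")

def template_counts(messages):
    """Replace every @handle with a placeholder and count identical skeletons."""
    return Counter(MENTION.sub("@HANDLE", m).strip() for m in messages)

def is_alphabetical(handles):
    """True if the handles were contacted in case-insensitive alphabetical order."""
    lowered = [h.lower() for h in handles]
    return lowered == sorted(lowered)

# Hypothetical messages, listed in the order they were sent.
messages = [
    "@acme Check out your company profile",
    "@apex Check out your company profile",
    "@atlas Check out your company profile",
    "@acme thanks for the reply!",
]
top_template = template_counts(messages).most_common(1)
first_handles = [MENTION.findall(m)[0] for m in messages[:3]]
```

A skeleton that repeats across many recipients, sent in alphabetical order, is a hint of templated outreach rather than proof of it.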

Details with screenshots, if anyone is interested:

http://blog.itrendcorporation.com/2015/03/09/researching-com...


As long as you disclose your conflict of interest upfront, the facts are the facts if you stick to them.


Show us.


The list of trending topics has been infested by these bot networks for a very long time. Here's an example from the Turkish TT: http://i.imgur.com/kMXK2zu.png (Notice the telephone number separated by "x"s: "Call this number to get on the TT list".)

I think Twitter would have stopped these much earlier if they could.


>I think Twitter would have stopped these much earlier if they could.

If Twitter clamped down on bots on the site/network, it could end up revealing a very unflattering (to investors) picture of the site's true user traction.

The site has struggled to grow to its targets and has a significant issue with people creating accounts and quickly abandoning the site.

So Twitter has a very large incentive to do nothing about the bots.


While bots might help Twitter's numbers look better in terms of metrics, I would say Twitter has a large incentive to do something about bots.

1. Bots degrade the experience for the regular user. The site feels less authentic with bots, and the quality of the content overall goes down. If you're trying to get eyes on Twitter, bot content does not help you.

2. Bots are not helpful when looking at your ad metrics. Bots are not actually viewing ads and are not interacting with them. If your goal is to get actual eyeballs on ads and to show that those ads have an impact, then bots make your site look weaker.


Pre-IPO Twitter was all about growth. Their 1.0 API made making a spambot ridiculously easy. 1.1 is only a slight improvement. One problem is that the line between real and bot has been blurred. There are many real accounts out there whose owners have attached bots to them. I frequently get bot follows only to see that I have been unfollowed in short order. The obvious hope is that I will reciprocate the follow and not notice the unfollow. But often these are coming from real accounts. One of the downsides of being a programmer and being interested in CS-type things is that the Twitter circles I would be interested in are loaded with these semi-faux/semi-real accounts. The result is that the spam-to-signal ratio is much too high.


I'm just using a simple tactic: if the user follows more than 10k users, that means that he has no interest in my tweets and will probably never interact with me.

So, if I like what he is really doing, sure, I might follow him back. Otherwise, no way.


The Turkish TT scene is so fucked up and the TT list is so useless that I have changed my settings to show trends from Greece. One or two are in English from time to time, but most are in Greek, and I can't even read the characters, let alone the words. Problem solved.


Here's a Medium post from last year about buying bots, for those interested:

https://medium.com/i-data/fake-friends-with-real-benefits-ee...

Thanks for making the data set available.


That's a useful post. I wonder what they used for the visualisations. Anyone know?



I always wondered how much is in the interest of Twitter to get rid of fake accounts.

At the end of the day, the "Who to follow" box indicates that the more connections each node has, the better.


>> I always wondered how much is in the interest of Twitter to get rid of fake accounts. <<

Mr. Eden there found out how to identify them quite nicely without having access to internal Twitter tech, so my guess is Twitter could have done the same, but hasn't. So I think the answer is, "not much."


I wonder if Twitter's hand can be forced to deal with bots by users running an automated bot identifier + spam reporter. Maybe a botnet of bot reporters... :)


Ha, that does sound like fun. I think I might attempt to build something like that.


If you do, I will be happy to contribute the criteria for "fakeness". malatortsev at itrend dot tv


These sorts of articles always raise an interesting question in me. Where is Twitter?

I get a fair amount of bogus traffic: robots scraping for site vulnerabilities, black-hat SEOs trying to find every blog that allows anonymous comments, etc. And I'm reasonably aggressive about shutting those folks out. If you're abusing the TOS I try to shut you down; if you complain we'll talk about it; if you don't, well, you stay shut down.

So for anyone in Twitter, these are pretty clearly bogus accounts. Bogus enough to support mass banning and waiting for the 'real' ones (if they exist) to complain. So why not just ban/suspend them? Does anyone even try to weed out robots?

Obviously there is a market alignment issue: Twitter likes to quote 'number of accounts' or 'number of signups' or 'monthly active users' as a metric for "goodness". And suspending all the robots would make all of those numbers go down. But it would make their actual user experience better, right? Or would it strip away the 'purchased hordes'[1] and suddenly make their users depressed that they had lost a big chunk of their followers that day?

One could be very pragmatic about such things, and consider Twitter to essentially be a game, where the score is "followers" and the rules are flexible (basically anyone Twitter doesn't suspend/ban is a legit point in your score), and as a game it has recruited some interesting high level players (like major brands). When you look at it that way, it makes them look like geniuses, "Hey look, we got General Motors playing Twitter and their in-game purchases are going through the roof!"

[1] http://techcrunch.com/2012/07/31/caught-blue-handed-someone-...


There are smarter bots, which just retweet/favorite tweets that contain certain tags or words and follow the owner of the original tweet. Many owners of the original tweets will follow back; they are not bots, they are real people.


Almost every online marketer does this type of automation. As you said, the goal is to get real people to look at their account, and to hopefully follow them.

This is mild compared to some of the other activity that goes on.


Some market nothing; their intention is just to get more followers. Maybe they will market something later.

I know such a bot; it gets 100+ new followers every day, and all of them are real people.


How do you know they are really people? That behaviour is exactly what one would program a bot to do when interacting with unknown accounts.


By your definition, the spam bots in that article are not bots either because an actual human being created that bot, and made it follow other spam bots.


The value of the spam bots in that article is really small. The only benefit I can think of is to sell them as followers.


Is this a number station? https://twitter.com/googuns_staging


That's actually an example that I am Okay with.

Machines can be a benefit or a hazard etc... This one is harmless (and potentially useful to some), because:

- it's not 'Following' anyone

- it's not Retweeting other accounts (not on the surface at least)

- you don't get to see it on your timeline (unless you're one of the followers), so it's not cluttering anything

- it's not promoting any products, so it's not ending up on any third-party reports that agencies will then bill to the advertiser.

I like machines, and I think machines should be on Twitter (if they play by the rules), and they can be incredibly effective.


I have one account that has been active for a few years now (I created it in 2010 or 2011, not sure). I didn't tweet anything at the time and only started using it (and Twitter altogether) sometime in November 2013. It already had ~300 followers, all of them bots (like this one: https://twitter.com/rachid82369439). I have no idea how I got them. Maybe I had some specific keyword in my profile summary that they were looking for. I honestly don't remember at all.

After using Twitter for more than a year, I have seen some slight impact from having those 300 Twitter bots. Because I am only following around 150 people (and around 500 Twitter profiles follow me), whenever I follow someone new they feel kind of privileged and think that I have some kind of influence, like a steady number of favorites and retweets per tweet. After they consider that, they usually follow me back.


You may be interested in this: https://www.academia.edu/6932933/Twitter_Who_gets_Caught . Section 7.2 has a study of a spambot social graph.


What is the business model behind motivating someone to deploy Twitter bots?


It's all about ad views. Twitter's investors are interested in the MAU (monthly active users), which is measured as the number of users who log in at least once in a month. Showing a growing, active user base means the company is seen as more valuable by said investors.


I don't buy this. Twitter has a lot to lose if they were caught creating fake accounts.

However, other people creating fake accounts certainly might benefit them for the reason you suggested - but these bots were not made by Twitter.


Selling Twitter followers to companies, or to people that need a boost.


I understand markuns's theory that it would benefit Twitter (ad views), but why would I, if I were a Twitter user, care if I have 10,000 bots "following" my messages? Bots don't buy anything. Do Twitterers get paid by the follower or something?

(not being facetious, I'm honestly curious, as someone who does not use Twitter or Facebook or any of these other social promotion services)


If you are a popular TV show, your social tv rating ends up in your Nielsen reports, enabling you to charge more for advertising. Ad agency gets paid for their promotional work, etc.

As an individual with lots of followers, you could advertise paid gigs, like "I will mention your product to my 100k followers for $10". Then somebody like Kim Kardashian can charge $100k or more per "endorsement".

There's a whole vaporware "value chain" here.


The more followers you have, the greater your social status.


There's a very real benefit to having a lot of followers.

The user withdavidle posted this above, which is a great article: https://medium.com/i-data/fake-friends-with-real-benefits-ee...


What makes a bot acceptable on Twitter?

There are a ton of fascinating Twitter bots, some humorous, some creative, some little experiments, and some impenetrable to anyone but the creator.

But there are also a lot of spam bots that exist for anti-social causes - to boost a followers count, to spam a product, to (presumably) link to viruses or literal scams.

Does anyone know how Twitter decides which bots are unacceptable?


Is it likely that these bots show up as unique visitors to Twitter? I don't know if the bots need to use unique IPs or anything to avoid getting caught. Is it likely that Twitter charges advertisers when a bot views an ad? I would be extremely interested in extra analysis of how this affects Twitter's core business.


First of all, there are different types of bots.

Some are (or are virtually indistinguishable from) real people, typically in Asia, performing simple tasks à la Mechanical Turk.

The bigger networks, though, appear to be completely automated, and what's really important is that they don't look "fake", in the sense that they look, and act, like a typical user.

Spam is a grey area, since automation is so widespread now. It should be noted that some of the top marketers on Twitter are very prolific. Guy Kawasaki has posted 138,000 tweets since 2007 - that's more than 60 tweets a day, every day. I don't mean to pick on him; there are hundreds like him. I've seen accounts posting thousands of messages a day. The simple act of promoting something very actively from the same address isn't enough; it's done by legitimate marketers all the time.

Keep in mind there are different sources of ad revenue, and there are multiple parties involved. Twitter will count an ad impression when the message is viewed "above the fold", for example your iPhone Twitter app displays the promoted message in the active portion of the screen, which implies you can "see" it. I believe that bots do not have a significant effect on this particular type of revenue. However, many commercial deals involve a different definition of 'impressions':

     Number of tweets promoting the product x number of followers the promoting accounts have
... and that's where bots play a huge role.
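
To make the effect concrete, here is a rough sketch of that formula. It assumes one particular reading (tweets times followers, summed per promoting account), and every number below is invented purely for illustration:

```python
def potential_impressions(promoters):
    """Sum tweets * followers over every promoting account.

    `promoters` is a list of (promotional_tweets, followers) pairs.
    """
    return sum(tweets * followers for tweets, followers in promoters)

# Hypothetical campaign: two real accounts...
real_only = [(5, 2_000), (3, 10_000)]
# ...plus 100 hypothetical bot accounts, each with 300 followers and 50 tweets.
with_bots = real_only + [(50, 300)] * 100

organic = potential_impressions(real_only)
inflated = potential_impressions(with_bots)
```

In this made-up case the reported figure goes from 40,000 to 1,540,000 without a single additional human reader.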

So, this is getting too big for posting here. If anyone is interested in this, I can blog. This topic is surprisingly complex.


People like Kawasaki often hire assistants to tweet for them: not necessarily Mechanical Turk-mediated, but legit jobs as social media managers.


I once wrote to him asking if he was a bot, and got an immediate reply saying no.

The point I was trying to make: even thousands of messages per day can be explained and justified, and if you follow someone like that, it's up to you to filter the noise.


I would be very interested in a blog post on this


It's hard to say. The bots make posts - and the metadata shows that they do so using the Twitter website. Either they've got a sweatshop full of people doing the clicking, or they're automating their access to the site.

In any case, it would be fairly easy to simulate an "organic" pattern of use.

No idea if they're clicking adverts.


On the topics that we tracked, the user agent was almost exclusively IFTTT.


I'm currently sitting on some research on Twitter bots (which I found… while writing a Twitter bot). I have a very basic bot detector that I'm looking to improve… but I didn't think of using graph analysis. Very interesting.


I've seen these before. Not sure it's the same, but in my case it was an elaborate form of follow-spam: they followed each other, and a few levels deep there was a master node that tweeted the spam link, which all the others retweeted.


Those are the least sophisticated spam networks, based on what we've seen. Following each other and properly retweeting makes it easy to plot the network and the relationships within.

The bigger (and more effective) ones are designed around a few rules, we found:

- accounts they "follow" are often a diversion, they are usually selected at random

- who they truly "follow" (i.e. actively listen to) is defined outside of Twitter

- they don't do "retweets" correctly, because that would also make connections visible; instead, they post messages fed to them externally as their own original tweets

- IFTTT plays a big role in simplifying automation while keeping the true relationships "invisible" to Twitter and people using Twitter APIs
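
Since the follow graph is a diversion in these networks, one way to recover the hidden relationships is to link accounts by what they post instead. A minimal sketch, assuming tweets are available as (account, text) pairs; the function name and sample data are made up, and real networks would need fuzzier text matching than this:

```python
from collections import defaultdict

def hidden_clusters(tweets):
    """Link accounts that posted identical text and return the connected groups."""
    accounts_by_text = defaultdict(set)
    for account, text in tweets:
        accounts_by_text[" ".join(text.lower().split())].add(account)

    # Two accounts are neighbours if they share at least one identical text.
    neighbours = defaultdict(set)
    for accounts in accounts_by_text.values():
        for account in accounts:
            neighbours[account] |= accounts - {account}

    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over the shared-text graph.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        if len(group) > 1:  # ignore accounts that share text with nobody
            clusters.append(sorted(group))
    return clusters

# Hypothetical sample: bot_1 and bot_3 never share a text directly,
# but both are linked through bot_2.
sample = [
    ("bot_1", "Buy this now"),
    ("bot_2", "Buy this now"),
    ("bot_2", "Great deal here"),
    ("bot_3", "Great deal here"),
    ("human_1", "Good morning"),
]
clusters = hidden_clusters(sample)
```

The point of walking connected components is that the network shows up even though no account follows or retweets another.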


I find this inspiring. I love when I see someone doing something very poorly but obviously having success with it. (The twitter-bots, not the research)

It's time for a new side project.


Nice bit of work.



