Call me crazy, but things going UP is generally good. I don't see how you can logically describe something as having an "increase in worsening satisfaction".
"AT&T is ranked 1st out of 244 brands"... they must be AWESOME...oh wait, no they're not.
It says it's a customer satisfaction index, when it's actually a customer DISsatisfaction index.
Also, I don't mean to sink the boots in but...
"that is awesome that you got SVU to discuss the boycott of the pedophile book on Amazon! I cannot wait to see how it goes!"
Is that a good comment or a bad comment?
This is not how sentiment analysis works (or should work). I worked on something similar, a Naive Bayes-based sentiment analyzer: https://github.com/mohitranka/TwitterSentiment.
I also work for a company that is in the same space as groubalcsi.com (brand/product opinion mining).
Sentiment analysis is not a classification problem (like spam detection) but an identification problem, because sentiments are always associated with an entity (and an attribute, if specified).
For example, a tweet saying "Dell is not as good as apple" requires identifying the entities (Dell and Apple) and associating sentiments with them (negative and positive, respectively). It is incorrect to try to attach a single sentiment (whatever it may be) to the tweet itself, as the sketch below illustrates.
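To make that concrete, here is a minimal sketch in Python of the difference (the entity list and the "not as good as" heuristic are invented for illustration; a real system would use proper entity recognition and comparison parsing):

    # Toy entity-level sentiment: assign a label per entity, not per tweet.
    # ENTITIES and the comparison heuristic below are made-up examples.
    ENTITIES = {"dell", "apple"}

    def entity_sentiments(tweet):
        tokens = tweet.lower().split()
        found = [t for t in tokens if t in ENTITIES]
        sentiments = {}
        # Toy pattern: "X is not as good as Y" -> X negative, Y positive.
        if "not as good as" in tweet.lower() and len(found) == 2:
            sentiments[found[0]] = "negative"
            sentiments[found[1]] = "positive"
        return sentiments

    print(entity_sentiments("Dell is not as good as apple"))
    # {'dell': 'negative', 'apple': 'positive'} -- one tweet, two opposite
    # labels; a single tweet-level classification would lose this.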
Interesting but possibly flawed exercise. It would be good to show the entire set of brands sorted from bottom (i.e., good) to top (i.e., bad).
I sorted the data and present here two groups:
1. This is a sample of supposedly the most satisfying, from the best on down (er, up): TGI Fridays, Best Western, Zenith Electronics, JVC, Chili's, Denny's, Hampton Inn, Olive Garden, Applebee's, Sams Club, Yahoo, AOL.
2. By contrast, here is a sample of some of the worst, listed from the top (high dissatisfaction) on down: Wikipedia, Apple, Nokia, Facebook, Volkswagen, YouTube, Amazon, Nike, Sony, Ikea, Range Rover, Rolex, Porsche, Google, Netflix, Louis Vuitton, CNN, American Express, Wall Street Journal, Intel.
Groups 1 and 2 do not overlap in their scores, meaning that Intel (the best of the worst), at 404, has a higher dissatisfaction rating than AOL (the worst of the best).
This grouping does not make sense to me, because if you showed me the two lists above and asked which set had better satisfaction scores, I would have picked Group 2 over Group 1.
What could explain this? Perhaps there is demographic skew, in that down-market brands (Denny's, Sams Club, Zenith) are not talked about as much by upscale social-media people, who would rather complain about Apple, Sony, and Porsche.
Or perhaps there is a mismatch of expectations: people expect premium brands to deliver more and complain loudly when they fall short in the slightest, while expecting only a mediocre experience from down-market brands.
What are the units of dissatisfaction used throughout the page? How do they map to the y-axis of the dissatisfaction graph? What sense of scale do I need to understand the units? Is a 945 bad? How bad? Is hate linear? Since AAPL scores roughly half as much as AT&T, does that mean the average Twitterer hates AAPL half as much? What happens if someone scores a perfect 1000? Can they be hated no further?
What time zone is the next update measured in? What makes your classifier 'Bayesian' besides just using something called a 'Naive Bayes Classifier'? What is the 90% accuracy determined from? Why should I care? Is a 24-hour improvement in customer satisfaction a significant thing? How quickly does hate fluctuate? What is your uncertainty in each of these measurements? Is there an overall brand hate level that I can compare these things to? How are they affected by overall sentiment toward companies?
------
It's an interesting complementary site to your primary interest in Groubal. I'm just skeptical of sentiment-analysis methods in general: analyzing data properly is very hard. Applying tools and observing what happens is still interesting, though.
But I'm not sure I learned a whole lot from seeing graphs proclaiming that Twitterers dislike AT&T, Time Warner, banks, internet providers, and Zynga. Tylenol and Enterprise were interesting finds, though I have no idea what it means for Tylenol to be 100 units less hated.
So perhaps what you should tune your ML stuff to seek out is not some hard-to-quantify measure of dissatisfaction, but cases like Tylenol and Enterprise, where people might not expect to have such trouble with the brand. Then it becomes automatic, insightful rabble-rousing instead of methodologically sparse hate-ranking.
If anyone's interested, we're using the Google Graph API for all the graphs (the sparklines and the big transparent ones at the top), and the Bayesian stuff is based on the PHP work I wrote up here: http://danzambonini.com/self-improving-bayesian-sentiment-an...
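For the curious, here's a minimal Naive Bayes sketch in Python showing the shape of the general technique (the training tweets are invented; this is the textbook algorithm, not the PHP implementation from the link):

    import math
    from collections import Counter

    # Toy Naive Bayes sentiment classifier with add-one smoothing.
    # Training data is made up for illustration.
    train = [
        ("i love this phone", "pos"),
        ("great service fast shipping", "pos"),
        ("terrible support never again", "neg"),
        ("my connection keeps dropping", "neg"),
    ]

    word_counts = {"pos": Counter(), "neg": Counter()}
    class_counts = Counter()
    for text, label in train:
        class_counts[label] += 1
        word_counts[label].update(text.split())

    vocab = set(w for c in word_counts.values() for w in c)

    def classify(text):
        scores = {}
        for label in class_counts:
            # log prior + sum of smoothed log likelihoods
            score = math.log(class_counts[label] / sum(class_counts.values()))
            total = sum(word_counts[label].values())
            for w in text.split():
                score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

    print(classify("love the fast shipping"))   # pos
    print(classify("support keeps dropping"))   # neg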
EDIT: Also, we're not really using it yet, but I thought it was interesting how easily you can calculate the 'agreement' on sentiment by using the MySQL STDDEV function (or similar) to work out the spread of sentiment per brand; there's a sketch of the idea below.
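Something like this, in Python for illustration (the brands and scores are invented; in MySQL it would amount to STDDEV(sentiment_score) with a GROUP BY brand on whatever table holds the per-tweet scores):

    import statistics
    from collections import defaultdict

    # 'Agreement' as the per-brand standard deviation of sentiment scores.
    # Low stddev = Twitterers mostly agree; high stddev = opinion is split.
    # Data below is invented for illustration.
    scores = [
        ("att", -0.9), ("att", -0.8), ("att", -0.85),     # consistent dislike
        ("apple", -0.9), ("apple", 0.8), ("apple", 0.7),  # split opinion
    ]

    by_brand = defaultdict(list)
    for brand, score in scores:
        by_brand[brand].append(score)

    for brand, vals in by_brand.items():
        # pstdev matches MySQL's STDDEV (population standard deviation)
        print(brand, round(statistics.pstdev(vals), 3))
    # att comes out low (agreement); apple much higher (disagreement).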
Thanks so much - I knew I could rely on HN'ers to find these things. I'm hoping the Rogers one is a one-off (we're just adding it this morning), but I'll double check all of this. Thanks again.
EDIT: just fixed the 'TED' (company doesn't exist) issue. Thanks!
EDIT2: just fixed the Rogers issue too. Thanks! (Plus, I love Coda for making my life easier/faster for versioning and uploading changes!)
Pretty cool. I would change high meaning bad and low meaning good, unless it's a rank out of the total; it's a bit counterintuitive. Why do you place the emphasis on dissatisfaction instead of giving the option to look at both?
Yeah, certainly the 'high = bad' thing is something we grappled with (and still do). The site is a sister-site to a consumer-complaint/petition website (http://www.groubal.com/), hence we're more interested in measuring/highlighting who is doing 'badly'. But yes, this could be done in a more intuitive way (showing the 'bottom' of a graph that had the axis in the traditional orientation, for example).
Amount of dissatisfaction is too mushy IMHO. Just title it "Crapometer" or "Hate-o-meter".
Better yet, just flip the Y-axis. I'd think that it would be easier to get a company to pay to improve upwards. Do you really have to match the sister site?
Something came to mind that will skew this heavily. People mention a company by name mainly for one of two reasons: either to complain or to tell people about some cool new thing. If someone mentions a ubiquitous company like Google, Verizon, etc., it's usually to complain; they're probably not telling the world about the wonders of Google search. On the other hand, if someone mentions a smaller company, it's probably the cool-new-thing factor.
Is the time span so short just because of an initial lack of data? If not, I think it would be useful to extend the graph's span beyond a week to show the long-term trend. For some of the lines I see high fluctuations, so the graph is not very meaningful.
I don't know if it's intentional, but I would expand your scope beyond complaints. If your math is good you could have a very nice reputation tracker and analytics package in general.
It picks up on any Twitter/FB updates that we can measure a discernible 'sentiment' for, so although it's limited to companies/brands at the moment, it could certainly be used for other things.
Just an example: some of the latest sentiment on Google, which would suggest why it hasn't got a great ranking (though it's not a terrible ranking either):
Ok, my phone or Google Voice is unable to pick up calls when I press '1' to accept, and I can't connect via Gizmo5. WTF?
Google gps sucks all of a sudden
WTF? I open up Google and the first thing i see is "Will Justin Bieber get naked for Love Magazine?" and im like WHAT???
"AT&T is ranked 1st out of 244 brands"... they must be AWESOME...oh wait, no they're not.
It says it's a customer satisfaction index, when it's actually a customer DISsatisfaction index.
Also, I don't mean to sink the boots in but... "that is awesome that you got SVU to discuss the boycott of the pedophile book on Amazon! I cannot wait to see how it goes!" Is that a good comment or a bad comment?