What Is a Data Scientist? (forbes.com/sites/danwoods)
38 points by sweetdreamerit on Nov 28, 2011 | 19 comments



A data scientist is two things:

1. A business analyst who has traded Excel for a Unix command prompt and a scripting language, because (a) there's more data now than before, and (b) data is harder to clean because of (a).

2. A mythical unicorn creature that has spawned a conference, multiple books, and thousands of page views for O'Reilly, Business Insider, Forbes, etc. They're all "data-driven organizations" that double their page views any time the word "data" is whispered. As a result, they parse shitty articles into four ad-driven pages and publish books on "Data Mining Your Cat" or "Machine Learning Models for Organic Dietary Schedules."

Here are some better questions that will save you some time:

1. What is this article? Buzzword soup mixed with terrible journalism.

2. How do I become a data scientist? Don't--instead, build a good foundation in statistics and keep coding. You'll be happier and wealthier for it.

EDIT: removed a bunch of snark. Not all of it, obviously. ;)

EDIT 2: added some snark back. I really hate this "data scientist" meme.


I second your snark, and also find "data scientist" to be analogous to "food cooker", but I've also come to realize that there is a need for branding whatever-it-is as separate from "business analyst" because otherwise expectations from non-technical customers get very confused.

I interact with many buyers of buzzword-laden "big data" (snark on that one reserved for another day) and "data science" offerings. If you tell them you're a business analyst they will make many hidden assumptions based on the overall population of people using that title. Among those assumptions:

1. The business analyst will probably know SQL, but if we want to do something like analyze our query logs it will cost us developer time.

2. The business analyst will have domain knowledge about financials and maybe user engagement, but probably not anything else. The business analyst will not be able to tell us, for example, the characteristics of our 99th-percentile slowest page loads.

3. The business analyst is not hip and modern, because everything is about the "big data" these days.

Realistically, the language of modern business data applications requires a name for someone who is not just a business analyst as we knew them from the 90s. The people offering the services need a name to differentiate themselves when applying for jobs, and the customers need a way to indicate that they are not looking for "just a business analyst." We seem to have ended up with a rather foolish bit of terminology to serve this need, which is not surprising given that it emerged to facilitate an asymmetric market of the clueless looking for the knowledgeable.


What's with all the resentment? A few years back, titles like UX designer, UI designer, front-end developer, and back-end developer were non-existent too. Aren't they all just designers or programmers?

As a maturing technology tree branches out, more and more specialists will be born.


Totally agreed. Those titles serve to differentiate between responsibilities given to designers and programmers.

So how do a data scientist's responsibilities differ from those of a business analyst?


From a 30,000-foot point of view, the difference is that a business analyst or data miner focuses on answering questions with data, whereas a data scientist focuses on asking questions with data.


What does "asking questions with data" mean?


Quoting from the article,

"On one side, I’ve been working on building products, like the recommender system, Talent Match, modeling and finding ways to empower users to use LinkedIn through their products. Groups You May Like was another product I started.

The other side is finding interesting stories in the data. It’s exciting to be able to tell stories collected from the careers of 120 million professionals, and trying to learn what that data can tell us about the world at large. That has given us stories such as, “top times of year to be promoted,” “overused buzzwords” and “top CEO names.” Industries are ebbing and flowing, and there are a lot of insights about the world at large from people’s profiles when they are combined and aggregated."


A data scientist is to a business analyst as a Senior Certified Java Enterprise Solutions Architect is to a programmer.


When I see

"It is a new and emerging field, and it has to do with adapting to demand."

I know I don't have to read anymore. This sounds like describing your job as an elevator operator as "managed a complex interface to adapt to customer demands by efficiently handling their transport."

She answers it in the first sentence: Monica Rogati: By definition all scientists are data scientists.

That's it. Article done. No more SEO bullshit. For that matter, all statisticians are, too.


A "data scientist" is a business analyst who has never worked in industry before and is unaware that there already is a job title for what they do.


I rather thought it was a euphemism for statistician, a title that often gets a bad rap from the whole "lies, damn lies and statistics" type thinking.


Or quantitative analyst, but that sounds like you're participating in some kind of banking fraud.


I agree. There are different kinds of value to be discovered in data, not just business value.


That's interesting. I always thought business analysts worked on processes, not data. I've never met a business analyst who I'd consider an expert in statistical analysis. Focus was always process, and they (not to be demeaning) didn't seem to have the intelligence required of what people consider to be a data scientist (i.e. heavy math, scripting, etc).


Some do, some do R/SPSS/SAS/MATLAB etc. It's a broad role. If anything, so-called data scientists lack the big picture view and are just junior analysts (unsurprising, since they think they invented what they do).


The article seems to describe a person who does data mining for a living. I guess "data scientist" sounds better than "data miner".


I've seen a wide variety of job descriptions for these jobs. Sometimes it sounds like a data miner, sometimes a business analyst, sometimes an accountant. It's hard to know what a Data Scientist is until you look at the specific job description.


I work in scientometrics, which is like econometrics but with scientific papers and patents (patents are not optimal, but they're an adequate proxy for innovation). It's useful for detecting new trends in science and adjusting funding accordingly.

We do a lot of data mining, information extraction from very organic sources, data visualization, etc. It's a growing field and it's very challenging and we have access to incredible datasets.

I think I fit their definition of a data scientist, but I'll never use that title. It's pompous and sounds like something someone in marketing would make up (and I worked as a market research analyst for a number of years). My title is simply research analyst.

Data scientist is bad, but the worst title I've seen was: web strategist & tactician


The term is pretty broad. I think the difference between a Data Scientist and a Statistician is the emphasis Data Scientists put on machine learning, databases, and distributed processing.



