Insanely simple Cloud-based replacement for statsd + graphite

nhashem · on July 18, 2012

Very cool service!

Question about the architecture though: what were the reasons for using HTTP as opposed to UDP (which is typically how these stats collection servers receive data)? It looks like it's possible to keep system load manageable since you aggregate the data and space out the HTTP requests, but why do this instead of just blasting the server with UDP requests?

spolu · on July 18, 2012

Since we are cloud-based, our servers are remote to our customers'... and since there's no UDP availability over WAN... Then we had to find a solution to make it work nicely with HTTP. Does it make sense?

theatrus2 · on July 18, 2012

Since when does UDP not work over a WAN/the internet?

spolu · on July 19, 2012

Definitely talked too fast! You're right, it definitely works. It's a little bit shadier on WAN but the main reason for the pre-aggregation is that the fire+forget behavior is much harder to scale as a service and would wastefully cost a lot of WAN bandwidth.

Once you have a pre-aggregation and only periodic updates... UDP does not make sense anymore.

HTTP also good for browser clients which could make sense.

shepik · on July 19, 2012

Some time ago I wrote a system to collect performace statistics for my company's internal use (we have 100+ application servers - 4-6core 2xcpu hardware servers - and 1 server running the statistics). And then we released it as open-source. Our system shows not only by-service- or by-source-script- statistics, but also which services are used by a particular script. Drawback: all information is in russian.

pics: http://imgur.com/a/idFF6 https://github.com/mambaru/btp-daemon - and google-translated text https://gist.github.com/3139890

Anyone interested? Is it worth to translate it to english?

spolu · on July 19, 2012

That looks cool!

anrope · on July 18, 2012

Does the name mean anything? I guess it's an anagram of statsd, but it's tough on the eyes.

Maybe I'm just not seeing them, but this seems to be missing some of my favorite graphite features.

- Combining multiple metrics on a single graph

- Filters (e.g. lowestCurrent)

- Graph labels and resizing

Is there a feature like graphite events?

spolu · on July 18, 2012

It's an acronym for statsd indeed. Tough on the eyes yet easy to remember I think... Not the best name ever I admit :)

- Combining graph is one of the biggest missing feature at the moment. We'll have simultaneous hover very soon to kind to help with that, and we'll work on combining the graphs right after! (useful for scale of course!)

- On timers you have the main "filters" built-in: min, max, 0.1 / 0.9 percentile and average. The filters are more an UI challenge than anything else so we should be able to iterate on that. We wanted to get something out fast!

- Graph resizing: same as above. The sky's the limit, we're using d3.js and it's pretty awesome but we wanted to limit ourselves to the core features for this first release!

Thanks a bunch for your comments they're really useful!

arrsingh · on July 19, 2012

Do you support aggregating the data and showing graphs for percentiles? So I'd like to send you latency stats and then get a graph for latency at the 99th percentile (or 90th percentile) or whatever.

spolu · on July 19, 2012

Yep for timers (latency). And we have the data for counters and gauges as well.

You can even configure client-side which percentile you want, by default 10th / 90th

moe · on July 18, 2012

Sorry to be that guy, but I don't like linkbait headlines.

If you spend a lot of work then perhaps you might be a "replacement for graphite" in about 1 year.

Today you're barely a prototype.

spolu · on July 18, 2012

Agreed! We are trying to communicate on a clear vision, but at the same time we already are a replacement for statsd + graphite in PaaS (and some IaaS) environment where there is no UDP availability... So we're not entirely lying :)

moe · on July 19, 2012

trying to communicate on a clear vision

That's fine. Then how about "wants to become a graphite replacement" or something in the headline?

we already are a replacement for statsd + graphite

Excuse me? As it stands your frontend apparently can't even plot two lines in the same chart.

spolu · on July 18, 2012

Funny how there are more people hitting the login button than people hitting the big demo button... that's hackernews :)

lifeisstillgood · on July 18, 2012

ok, I'll bite, signing up now But How do I call you from inside my server? Are there libraries other than JS?

spolu · on July 18, 2012

It's a rest based API with a very thin library. We're working on Ruby, Python and Java as we speak. Which language are you working with, I can keep you posted as soon as we have the driver ready!

tehnorm · on July 18, 2012

Do you have any documentation on the rest API at this point? Interested in rolling our own client library.

spolu · on July 18, 2012

We'll be working on it tomorrow. Please drop us a line at team@dattss.com so that we can come back to you when it's ready.

In the meantime please have a look at http://github.com/teleportd/dattss-sdk to see how the nodeJS client works!

Cheers

heretohelp · on July 18, 2012

Oh come on man, don't tell me it's free. Tell me what numbers I can go up to before you give me the "phone call" (if that, you could go Google and just shut stuff off without any notification or warning of any kind) telling me I've hit some sort of API limit.

The ambiguity is like that Oracle Linux debacle.

I don't even really care if you intend on living in fantasy-land and never intend on charging for the service (get real), at least give me some idea of what the limitations are.

I like the live demo though, good idea.

I want to use your service but you need to set up some clearly defined boundaries. I'm not some sort of enterprise-wonk trying to hold you to an uptime contract, I'm a startup guy trying to make certain he's not hinging stuff his company uses on something overly uncertain.

spolu · on July 18, 2012

Free for now. For now, we're thinking of a $1-2 / stat / month price point with free first 5 stat/months.

Still a lot of discovery to do before we get a clear pricing. Not a priority RIGHT NOW, will become one rapidly.

We're free because we're in beta. Won't stay free. It's a promess. We may open source the whole thing though. Not contradictory at all.

Glad you liked the demo! :)

Nikkau · on July 18, 2012

The greatness of graphite is the ease of graphing thousands of stats in no time and without headache.

$1 per stats isn't prohibitive?

spolu · on July 18, 2012

Yep it might be. We still need to discover that! We might cap it to a certain price!

heretohelp · on July 18, 2012

Thank you for the quick response!

$1-2 / stat / month with no limit on traffic/storage?

Also the visualization UI is freakin' sweet!

spolu · on July 18, 2012

traffic is not a big issue since we have pre-aggregation client-side. So you can aggregate 5 event / hour or 10 millions it's all the same for DaTtSs... and it does not kill your network stack

As far as retention is concerned let say we aim for 1 month history. Might be more if people say they want it but we believe the value is in the analysis of what is happening now.

Make sense?

heretohelp · on July 18, 2012

So far, just need a Python client now :)