In the past few years, there has been a surge in the number of log management solutions — Loggly, LogDNA, Scalyr, Sumo Logic. Which one do you / your company use?
Papertrail, super friendly and insightful support.
So let me elaborate. Mostly what you get from support is "We're fixing the problem," but in our case they were specific: "We have problems with the Heroku logspout connection; 'heroku logs' should still work." Another time we went a bit over our limit, so they upped our plan for free for a short period so we could figure out what the problem was. Alerts are also what we use the most (no limits, no delays), which I can't say for the other providers.
Having tried self-hosted and cloud logging solutions, Papertrail is one of the services we subscribe to where I actually feel a pang of gratitude when our monthly billing notification arrives.
Some of the things I liked about Papertrail:
- Super easy to set up (and to automate the setup). Just a dozen lines in rsyslog.conf (see the sketch after this list)
- Also ships with a tiny client written in Go if you want to tail specific files
- Sensible pricing. Charges per GB, which, for us at least, correlated with how much business we were doing. Now we'd gladly pay more for the same service
- Great dashboard/UI. Having started with Loggly, we were used to a slow, unresponsive, unintuitive dashboard, and as Loggly grew, its dashboard only got more complex. Papertrail, by contrast, is fast, simple, and makes sense (at least to us). It's quite surprising how simple it is and yet how well it does its job. Although we don't use live log tailing much, the log grouping and notification system is very intuitive and easy to set up.
- Easy S3 archival. Took about 10 minutes to set up
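For reference, the rsyslog side really is tiny. A minimal sketch, where logsN.papertrailapp.com:XXXXX stands in for the endpoint Papertrail assigns you:

    # /etc/rsyslog.conf: forward everything to Papertrail
    # (single @ = UDP; use @@ for TCP)
    *.* @logsN.papertrailapp.com:XXXXX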
Using Papertrail as well, but I really miss my terminal and awk, grep, and less here. Or even regex search. I know I can download the archives, and I do, but that puts everything into one file. Saving different topics in different files just makes sense, IMHO...
Still getting used to the ways of the cloud, I suppose...
The biggest downside to Papertrail is the price. You can log the important stuff for a reasonable price, but if you want to debug-log everything from a fairly big site, you're looking at a Papertrail bill 10x your hosting bill.
Not a bad idea to send info and above to Papertrail and ship debug off to S3 or similar directly, so you can dig in when something weird happens...
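A rough sketch of that split in rsyslog terms (the endpoint, path, and bucket are placeholders):

    # info and above goes to Papertrail
    *.info @@logsN.papertrailapp.com:XXXXX
    # everything, debug included, also lands in a local file...
    *.* /var/log/everything.log
    # ...which a cron job can push to S3 after rotation, e.g.:
    #   aws s3 cp /var/log/everything.log.1.gz s3://my-log-archive/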
We built our own. All events are published to Tank (https://github.com/phaistos-networks/TANK), and we have a bunch of consumers that consume from various Tank topics. They process the data and either publish to other Tank topics (to be consumed by other services) or update state on various services.
- For data exploration, we use memSQL. We keep the last day's worth of data there (we DELETE rows to keep the memory footprint down), and because most of the time it's about understanding something that happened recently, that's almost always sufficient. Each row contains the event's representation as JSON, and we also have a few more columns for faster lookups (a sketch of the layout follows below). memSQL's JSON support is great (we used MySQL for this, but it was too slow), so we can take advantage of joins, aggregations, windowing, etc.
- For data visualisation, we use ELK (though it's pretty slow), a tool our ops folks built ("otinanai": https://github.com/phaistos-networks/otinanai), and a few smaller systems that generate graphs and reports.
- For alerts and tickets, our ops folks built another tool that monitors all those events, filters them, and executes domain-specific logic that deals with outliers, notification routing, and more.
This covers most of our needs, but we plan to improve the setup further by monitoring even more resources and introducing more tools (Tank consumers) to get more out of our data.
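For the curious, a sketch of what the memSQL side could look like; table and column names are made up for illustration:

    -- one row per event: the full JSON payload plus a few
    -- extracted columns for fast lookups
    CREATE TABLE events (
        id BIGINT AUTO_INCREMENT PRIMARY KEY,
        created_at DATETIME NOT NULL,
        event_type VARCHAR(64) NOT NULL,
        payload JSON NOT NULL,
        KEY (created_at),
        KEY (event_type)
    );

    -- periodic trim keeps roughly a day's worth in memory
    DELETE FROM events WHERE created_at < NOW() - INTERVAL 1 DAY;

    -- aggregations over recent events are then plain SQL
    SELECT event_type, COUNT(*)
    FROM events
    WHERE created_at > NOW() - INTERVAL 1 HOUR
    GROUP BY event_type;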
Great tools, congrats. The name 'otinanai' is rather dodgy, though (for those who know what it means), although I can see it stems from [...] designed to graph anything.
We're actually building (and using) a log alternative called OverOps (https://www.overops.com). It's a native JVM agent that adds links to each log warning / error / exception that lead to the actual variable state and code that caused them, across the entire call stack. Disclaimer: I work there; happy to answer any questions.
I'm really excited about Takipi (I guess OverOps now) but the per-JVM pricing kills it for us, especially as we look to moving to microservices. Any plans for alternative pricing, such as per-GB or per-exception?
Hey, no, not at the moment, but we'll be able to offer discounted prices depending on your specific requirements. If you're a startup, we also have discounted pricing for that. Drop us an email at hello@overops.com and I'll make sure someone follows up.
One thing it has over ELK worth mentioning is that it includes authentication in the open-source version. You can also configure more of it through the web interface than with ELK.
It's a lot easier to set up and manage, in my experience. The interface is very user-friendly. It all depends on your specific needs, of course. ELK is a bit more performant, I think, although I have never compared them properly. Both have good scaling capabilities.
...which makes multi-line grep, variable field-based searching, and multi-system cross-comparison a PITA. But sure, if it works for you, then great.
The one I really wanted to use/like was http://scalyr.com. However even after their redesign, I still can't use their query language. With LogEntries, it's pretty natural.
I was used to Google Cloud logs (they come for free with App Engine). Now I'm working with an AWS-based system with an ELK stack. Its UI is horrible, finding the right log entries is hell, and it often breaks so somebody has to update it. I hope we can move to some cloud log provider soon.
Logmatic.io was not mentioned, but we are better known in Europe so far. Disclaimer: I work there.
We invest a lot in analytics, parsing/enrichment, and a fast & friendly UX. We try to be a premium solution at the same reasonable price as the others, and our users tend to say great things about us (e.g. http://www.capterra.com/log-management-software/). Happy to answer if you have any questions. :)
We are using Logmatic.io in my team (we switched from Logentries). Our stack is based on Mesos/Docker with plenty of microservices. Sending logs and building analytics is very easy, and the clickable dashboards are just amazing.
Logentries. Not sure if I'd say I'm satisfied, but I haven't found anything better.
Pros:
* Decent Java logging integration (some services treat things line-by-line, which is a deal breaker for things like multi-line Java exceptions)
* Reasonably priced
* Alerts are kinda nice
* Hosted
Cons:
* Sometimes the UI maxes out my Chrome CPU
* Live mode not stable at all
* The UI is clunky, to say the least. It's not always clear what the context of a search is, and the autocomplete is obnoxious. I heard they have a new UI coming out sometime; who knows when
LogDNA: powerful, easy to get started and still improving. Using in parallel with Papertrail and instead of Logentries (which we had horrific problems with earlier in the year).
We had an awful time with Logentries: "live" mode never working, a bizarre search facility, overcharging us, terrible UX. We've been with LogDNA for about 2 months and are quite happy with it.
We use EK but not L; instead we wrote our own daemon that rsyslog sends log lines to, which bulk-inserts them into ES. We use Kibana & Grafana for visualization. We index approx 20k log lines per second (at peak) without breaking a sweat (whereas Logstash would choke fairly often). A little over half a billion log lines a day, retained for a week, costs us around $800/mo on GCE (for storage & compute).
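For the curious, a minimal sketch of such a daemon in Python; the port, index name, and batch size are assumptions, and a production version would also flush on a timer and handle retries/backpressure:

    # accept newline-delimited log lines over TCP (e.g. from rsyslog's
    # omfwd output) and bulk-insert them into Elasticsearch
    import json
    import socket
    import urllib.request

    ES_BULK = "http://localhost:9200/_bulk"  # assumed local ES
    BATCH = 500

    def flush(lines):
        # the bulk API takes NDJSON: an action line per document line
        body = ""
        for line in lines:
            body += json.dumps({"index": {"_index": "logs"}}) + "\n"
            body += json.dumps({"message": line}) + "\n"
        req = urllib.request.Request(
            ES_BULK, data=body.encode(),
            headers={"Content-Type": "application/x-ndjson"})
        urllib.request.urlopen(req).read()

    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 5140))  # rsyslog forwards here
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        buf, pending = b"", []
        while True:
            data = conn.recv(65536)
            if not data:
                break
            buf += data
            while b"\n" in buf:
                line, buf = buf.split(b"\n", 1)
                pending.append(line.decode("utf-8", "replace"))
                if len(pending) >= BATCH:
                    flush(pending)
                    pending = []
        if pending:
            flush(pending)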
We use ENK instead, with N as nxlog; the open-source release is great and supports many backends. Unlike Logstash it's fast (written in C), scriptable, and needs no downtime for reconfiguration. It's extendable with Ansible & co. through include files.
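For anyone who hasn't seen nxlog, its config style looks roughly like this (a minimal sketch; the path and port are placeholders):

    # tail a file and forward lines over TCP
    <Input app>
        Module im_file
        File "/var/log/app.log"
    </Input>

    <Output forwarder>
        Module om_tcp
        Host 127.0.0.1
        Port 5140
    </Output>

    <Route r>
        Path app => forwarder
    </Route>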
Whatever solution you use to store your logs, I'd suggest generating them as events. This will help you reconcile two important aspects that have been separated for too long, for no real reason: logging and analytics. It may require a little more effort, but I believe it's worth it.
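To make that concrete, a made-up example: instead of a free-form line like "user 123 checked out, total $45.99", emit the same information as a structured event, which serves both logging and analytics:

    {"ts": "2017-04-01T12:00:00Z", "event": "checkout_completed",
     "user_id": 123, "total_cents": 4599}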
Interesting! This is something I have often thought about. My own experience of log aggregation is limited to the ELK stack and Loggly (for a brief time), where the setup worked fine but the workflow didn't: we just stopped browsing logs after a while. Although a giant centralised system for logs sounds incredibly convenient, making sense of them becomes a huge problem, and then it's just easier to ignore the log system.
I am sure the solutions discussed here have features to overcome this (filters / alerts), but IMHO we'd be better off collecting less: a limited set of app events with fixed formatting that are easier to make use of in debugging and monitoring.
If you come to think of it, logs are app events. You actually want to collect as many events as you can, within performance constraints, and analyze the hell out of them.
You would also get all sorts of benefits, because you can then correlate the app events with user events, which makes it easier to track bugs and unintended behaviour.
Previously used Logentries and Papertrail, but they became expensive as our log volumes grew, and we missed having more flexibility.
Now we use self-hosted ELK (Elasticsearch, Logstash & Kibana), and I'm not itching to go back to any of the hosted services. It's not as good as something like Papertrail for tailing log streams live (although that isn't very useful at larger scale), and Kibana's UI does take a bit of getting used to.
We use https://github.com/gliderlabs/logspout to forward all our Docker logs to Papertrail... it's like watching your Node.js services running in your terminal. Seamless experience.
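For reference, the whole setup is roughly a one-liner; the Papertrail endpoint below is a placeholder for your own:

    docker run -d --name=logspout \
        -v /var/run/docker.sock:/var/run/docker.sock \
        gliderlabs/logspout \
        syslog+tls://logsN.papertrailapp.com:XXXXX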
I used Sumo Logic at some point.
I was able to query months of data; it was very powerful.
You can achieve additional optimization by processing logs as they get ingested. A very well-thought-out product.
I have also used ELK, but I have to say that Sumo Logic felt like a superior product by far.
Right now we do allow @gmail emails. A few years ago, when we started, our initial focus was larger clients with a lot of personal touch. We later added a self-serve model: public prices, credit cards, easier setup of popular data sources... As a startup, it's really tricky to address all segments at once.
We're hosted on AWS and used Papertrail. We found it super useful, but it got really expensive. Since the new CloudWatch UI improvements, we're down to only using CloudWatch Logs. The UI still sucks quite a lot, but not enough to justify tripling our logging costs.
Hi all,
The stack we use in our organisation is as follows (a minimal Fluentd config sketch comes after the list):
1) Fluentd - for log line transport
2) Elasticsearch - for indexing
3) Kibana - for viewing (remote log viewer)
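A sketch of the Fluentd piece of such a stack; the paths and host are placeholders, and the output assumes the fluent-plugin-elasticsearch plugin:

    # tail application logs...
    <source>
      @type tail
      path /var/log/app/app.log
      pos_file /var/lib/fluentd/app.log.pos
      tag app.logs
      format none
    </source>

    # ...and index them into Elasticsearch for Kibana to query
    <match app.logs>
      @type elasticsearch
      host localhost
      port 9200
      logstash_format true
    </match>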
Splunk was missed as well. I am a freelance Splunk consultant and use it on a daily basis with great results for all my clients. Price can be an issue, but user friendliness and the number of features almost always win.
Papertrail is beautiful for watching loglines roll in.
ELK (Logstash, self-hosted) is... consuming. The software is free, but it takes a lot of compute resources and isn't trivial to come to grips with (setup or daily use). If you can spare the staff-hours, though, ELK can be pretty powerful.
Good work Papertrail, if you are reading this.