Hacker News
Request logging and web vitals for Vercel apps (axiom.co)
96 points by mschoening on May 16, 2022 | 16 comments



Axiom is an awesome choice for this; I'm wondering why it isn't more commonly used already. IMO it's the most sensible solution out there at the moment for the vast majority of projects.


What tsdb does axiom use? I presume it's just hosted prometheus or influxdb or something like that?


CTO here... we built our own homegrown tsdb that runs on top of S3: coordination-free ingestion, serverless querying :D


Interesting. What made you believe that a homegrown solution was better than existing alternatives?

I'm not being critical, I'm genuinely interested


Great question (and apologies for the length of the answer)! Through our previous experiences in building services around tsdbs & relying on tsdbs to host monitoring data, we kept hitting the same issues around ingest, retention, and querying capabilities.

Existing solutions, and those architected in a more traditional way than Axiom, require highly coordinated nodes running on expensive VMs that are bound by CPU, memory, and storage, depending on your use case:

- Want to store TBs of data for months/years and query any piece of that at any time? Prepare to have expensive SSDs or wait for 'archived' data to swap in from cheaper storage.

- Want to run a query that combines N datasets, calculates aggregations, over TBs of data, and then compare that against data with the time shifted back a week, month, or year? Great, fire up some heavy VMs with enough cpu, memory, and bandwidth to compute all that.

- What if your use case varies greatly in each dimension (how much ingest you use, how much storage you need, and how heavy your queries are) across a day, week, or month? You'd have to find a way to adapt to changing requirements, or just scale up for the worst case x2 and pay the $$$.

With Axiom, we had three key goals:

- Hyper-efficient, coordination-free, schema-less, and index-free ingest (~1.4TB/day on a $5/mo container)

- Cheapest storage for hot, cold, archived, and warehoused data: all data is stored in object storage, highly compressed and ready to query. Data that's 10 seconds old or 10 years old is treated the same, and is already as cheap as possible.

- Serverless querying that can expand from 0 to as-much-as-AWS-will-allow depending on what your query needs + how many of your team are querying at once.

The above was achieved through a lot of trial-and-error, tweaking, testing, and head-scratching, but we're really proud of what we've built. We can run super-fast ad-hoc queries, we've eliminated the need to think about retention, we have live streaming, and we have an incredible query language inspired by Microsoft's Kusto (Splunk users will be familiar!).
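
As an illustration of the general idea only (the names and details below are my assumptions, not Axiom's actual implementation): "coordination-free" ingest on object storage usually means each ingester batches events, compresses them, and writes an immutable block under a time-partitioned, collision-free key, so no two writers ever need to talk to each other.

```python
import gzip
import json
import time
import uuid

# Stand-in for an object store like S3: key -> bytes.
object_store = {}

def flush_batch(events, dataset):
    """Compress a batch of events and write it as one immutable block.

    The key embeds the dataset, an hour-granularity time partition, and
    a random ID, so any number of ingesters can write concurrently
    without coordinating: keys never collide, blocks are never rewritten.
    """
    partition = time.strftime("%Y/%m/%d/%H", time.gmtime())
    key = f"{dataset}/{partition}/{uuid.uuid4().hex}.ndjson.gz"
    payload = "\n".join(json.dumps(e) for e in events).encode()
    object_store[key] = gzip.compress(payload)
    return key

key = flush_batch([{"level": "info", "msg": "hello"}], dataset="vercel-logs")
```

Querying then reduces to listing keys in a time range and scanning blocks in parallel, which is what makes a scale-from-zero serverless query tier plausible.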

We still have a lot I'd like to see done, but I think we really do have something unique :)


I love this and I love the response!!

And I'm glad you went to the lengths you did in your explanation!

Those specs are _insane_. What was your stack other than S3?

Do you have plans on PaaS/SaaS-ing your tsdb?

I've kind of taken it on as my life's work to study the software development process when it comes to "Product vs Platform" solutions. It's something I'm passionate about, and I'm forming strong opinions, loosely held for now.

In my (still short) career I've encountered so many "we couldn't find the right fit, so we built our own"-isms, and with very, very few have I ever felt it was necessary.

Here's a rough draft of my opinion on internal solutions vs 3rd-party/FOSS solutions. I wrote it in the heat of building a series of internal solutions that weren't necessary, given the constraints and resources available.

https://mikercampbell.bearblog.dev/build-tech-or-product/

I'm currently working on a blog post that explains how (this time with more tech than just opinion) 95% of tech problems could easily be solved by 5% of the technology out there.

With the traffic and demands you have, you're in the 5% of tech problems where it's entirely plausible that no existing tech could meet your needs.

I just need more experience in the field of tooling before I feel confident putting my foot down anywhere and drawing the line. So I'd love to talk more if you're open to it!


I created my own log drain for some side projects hosted on Vercel. It was a fun weekend project, and it definitely makes you appreciate the many parts and complexities that go into creating a "proper" solution such as what Axiom looks to be!
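
For anyone curious what the weekend-project version looks like: a log drain is essentially just an HTTP endpoint that receives batches of JSON log entries and does something with them. A minimal parsing sketch, assuming a newline-delimited JSON payload (the exact format, headers, and signature verification are simplified here — check Vercel's log drain docs):

```python
import json

def handle_drain_payload(body: bytes):
    """Parse a newline-delimited JSON log drain payload into entries.

    A real drain also needs to verify a signature header and respond
    quickly (buffer, then process async) — omitted in this sketch.
    """
    entries = []
    for line in body.splitlines():
        if not line.strip():
            continue
        entries.append(json.loads(line))
    return entries

payload = (b'{"source": "lambda", "message": "GET /api/hello 200"}\n'
           b'{"source": "build", "message": "Compiled successfully"}\n')
entries = handle_drain_payload(payload)
```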

Slightly off-topic, but Vercel's docs are really awesome these days. Good job, Leerob and co.; it has all come a long way in only a few short years. Not much left for me to bitch about ;)


Thank you! Appreciate the kind words. If there's anything we can improve, you can always reach me at lee at vercel dot com.


Leerob and co are fantastic :D The feedback we got from the Vercel team while building the integration was invaluable :D

Also feel free to join our slack for more feedback :D


I've been using Axiom on my site (leerob.io) for a few months now, really enjoyed it. It was far easier to set up than other observability tools and I even created some custom alerts.


Neil here, CEO of Axiom, glad it’s working out for you! Would love to hear what custom alerts you’ve set up.

Our goal is to provide some out-of-the-box alerting tuned to Vercel deployments as we learn more about how devs are using Axiom & Vercel!


We've been using Axiom at Ping for a while now and it's 100% the drop-in log drain we've been looking for.

Couldn't be simpler. The team's great too. Highly recommend.


Really excited to see this launch!

We use Axiom at Prisma for various things and love it. Happy to answer questions if anybody is interested. Either here or on email.


Something I’ve been thinking about lately with all these external observability tools: does streaming logs count toward your outbound bandwidth from providers like Vercel/Netlify/etc.? One huge benefit of using everything the cloud provider gives you is that all of it stays “on-net”, so to speak. If I use a tool like this or Honeycomb, am I now watching not only ingest costs but also egress from the app provider? A lot of these solutions are hosted on top of AWS, so it would be awesome to find out they zero-rate traffic between them, but I feel like that’s probably not the case?


In the case of Axiom & Vercel, you'll only be paying for ingest (if you need over 2GB/day), with no bandwidth costs. I believe the latter is the case with all of Vercel's observability partners (and with most services providing log drains).

It's a fair point, though... at some point, with enough TBs of traffic, it could get expensive to provide log-drain support. One thing we've been working on (more on this in the coming weeks) is putting together a benchmark of log-drain mechanisms, to work out the most efficient combination of compression/CPU/memory for log-drain providers to use.

Our hope is we can work on making it efficient enough that we can encourage more services to provide log-drains (and therefore allow us to make great experiences around those logs!).
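
The kind of benchmark described above could start as simply as measuring compression ratio against CPU time for a representative batch of log lines. A toy sketch using only stdlib codecs (a real benchmark would also cover zstd, streaming, and memory use):

```python
import bz2
import gzip
import lzma
import time

# Repetitive, structured log lines — the typical log-drain payload —
# tend to compress very well.
batch = "\n".join(
    f'{{"ts": {1650000000 + i}, "msg": "GET /api/item/{i % 50} 200"}}'
    for i in range(2000)
).encode()

results = {}
for name, compress in [("gzip", gzip.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)]:
    start = time.perf_counter()
    out = compress(batch)
    results[name] = {
        "ratio": round(len(batch) / len(out), 1),  # higher is better
        "seconds": time.perf_counter() - start,    # lower is better
    }
```

The interesting trade-off is exactly the one mentioned above: the stronger codecs buy ratio at the cost of CPU, which matters when the sender is a provider's edge, not your own box.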


Ask your observability provider about PrivateLink, or talk to your account rep at AWS about discounting traffic.

But you can also sample some of the data within your own VPCs (see Honeycomb's Refinery or Lightstep's Satellites).
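
The simplest form of in-VPC sampling is deterministic head sampling: hash the trace ID and keep ~1/N of traces, so every event of a kept trace survives and traces stay whole. (This is an illustration of the general technique only — Refinery, for instance, does tail-based sampling on complete traces, which is more sophisticated than this.)

```python
import hashlib

def keep_trace(trace_id: str, sample_rate: int) -> bool:
    """Deterministically keep roughly 1/sample_rate of traces.

    Hashing the trace ID (rather than sampling each event at random)
    means every event belonging to a kept trace is kept, so no trace
    ends up with missing spans.
    """
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % sample_rate
    return bucket == 0

# Roughly 10% of 10,000 distinct traces should be kept at rate 10.
kept = sum(keep_trace(f"trace-{i}", 10) for i in range(10_000))
```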



