Hi HN! I’m Peter, the co-founder of OpenMeter (
http://openmeter.io). We are building an open-source project that helps engineers meter and attribute AI and compute usage for billing and analytics. Our GitHub is at
https://github.com/openmeterio/openmeter, and there’s a demo video here:
https://www.loom.com/share/bc1cfa1b7ed94e65bd3a82f9f0334d04.
Why? Companies are increasingly adopting usage-based pricing models, requiring accurate metering. In addition, many SaaS products are expected to offer AI capabilities. To effectively cover costs and stay profitable, these companies must meter AI usage and attribute it to their customers.
When I worked at Stripe, my job was to price and attribute database usage to product teams. You can think of it as internal usage-based pricing to keep teams accountable and the business within its margins. This was when I realized how challenging it is to extract usage data from various cloud infrastructure components (execution time, bytes stored, query complexity, backup size, etc.), meter it accurately, and handle failure scenarios like backfills and meter resets. I was frustrated that no standard exists for metering cloud infrastructure, and we had to build it on our own.
Usage metering requires accurately processing large volumes of events in real time to power billing use cases and modern data-intensive applications. Imagine you want to meter and bill workload execution at per-second granularity, or meter the number of API calls you make to a third party and act instantly on events like a user hitting a billing threshold. The real-time aspect requires instant aggregations and queries; scalability means being able to ingest and process millions of usage events per second; it must be accurate—billing requires precise metering; and it must be fault tolerant, with built-in idempotency, event backfills, and meter resets.
This is challenging to build out, and the obvious approaches don’t work well: writing to a database for each usage event is expensive; monitoring systems are cheaper but inaccurate and lack idempotency (distributed systems use at-least-once delivery); batch processing in the data warehouse has unacceptable latency.
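To make the idempotency requirement concrete, here's a minimal sketch (not our actual pipeline—the class and field names are illustrative): with at-least-once delivery, the same event can arrive twice, so the meter has to dedupe by event id before aggregating.

```python
# Minimal sketch of idempotent usage aggregation. With at-least-once
# delivery, the same event may be redelivered; dedupe by unique event id.
from collections import defaultdict

class Meter:
    def __init__(self):
        self.seen_ids = set()            # dedup window (bounded in practice)
        self.totals = defaultdict(float) # aggregated usage per subject

    def record(self, event_id, subject, value):
        if event_id in self.seen_ids:    # duplicate delivery: count once
            return False
        self.seen_ids.add(event_id)
        self.totals[subject] += value
        return True

meter = Meter()
meter.record("evt-1", "customer-a", 120.0)  # 120 units of usage
meter.record("evt-1", "customer-a", 120.0)  # redelivered duplicate, ignored
meter.record("evt-2", "customer-a", 30.0)
```

A production system would bound the dedup window (e.g. by time) rather than keep every id forever, but the principle is the same.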
Companies also need to extract usage data from cloud infrastructure (Kubernetes, AWS, etc.), vendors (OpenAI, Twilio, etc.), and hardware components to attribute metered usage to their customers. Collecting usage in many cases requires writing custom code like measuring execution duration, listening to lifecycle events, scraping APIs periodically, parsing log streams, and attributing usage of shared and multi-tenant resources.
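The "measuring execution duration" case, for example, often ends up as a hand-rolled wrapper like this (a sketch with hypothetical names, not a prescribed pattern):

```python
# Sketch of hand-rolled duration metering: wrap a workload, measure
# wall-clock time, and emit a usage record attributed to a customer.
import time
from contextlib import contextmanager

records = []  # stand-in for an event pipeline

@contextmanager
def metered(subject, meter_type):
    """Measure execution time and emit a usage record for `subject`."""
    start = time.monotonic()
    try:
        yield
    finally:
        records.append({
            "type": meter_type,
            "subject": subject,
            "duration_seconds": time.monotonic() - start,
        })

with metered("customer-a", "workload_execution"):
    time.sleep(0.01)  # stand-in for the actual workload
```

Every team reinvents some variant of this, which is part of why we think a standard is worth building.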
OpenMeter leverages stream processing to update meters in real time while processing large volumes of events. The core is written in Go and uses the CloudEvents format to describe usage, Kafka to ingest events, and ksqlDB to dedupe and aggregate meters. We are also working on a Postgres sink for long-term storage. Check out our GitHub to learn more: https://github.com/openmeterio/openmeter
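For a feel of what a usage event looks like, here is an AI-token event in the CloudEvents JSON format (the event type and data fields are illustrative, not a schema we prescribe—see the repo for the actual ingest API):

```python
# A usage event described in CloudEvents 1.0 JSON format.
import json
import uuid
from datetime import datetime, timezone

event = {
    "specversion": "1.0",
    "id": str(uuid.uuid4()),   # unique id: this is what enables deduplication
    "source": "my-ai-service",
    "type": "ai.tokens.consumed",
    "subject": "customer-a",   # the customer the usage is attributed to
    "time": datetime.now(timezone.utc).isoformat(),
    "data": {
        "model": "gpt-4",
        "prompt_tokens": 120,
        "completion_tokens": 30,
    },
}
payload = json.dumps(event)  # POST this to the ingest endpoint
```

The `id` and `subject` attributes do the heavy lifting: `id` lets the pipeline dedupe redeliveries, and `subject` is how usage gets attributed to a customer.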
Other companies in the usage-based billing space are focused on payments and basically want to be Stripe replacements. With OpenMeter, we’re focusing instead on the engineering challenge of collecting usage data from cloud infrastructure and balancing tradeoffs between cost, scale, accuracy, and staleness. We’re not trying to be a payment platform—rather, we want to empower engineers to provide fresh and accurate usage data to Product, Sales, and Finance, helping them with billing, analytics, and revenue use cases.
We’re building OpenMeter as an open-source project (Apache 2.0), with the goal of making it the standard to collect and share usage across many solutions and providers. In the future, we’ll offer a hosted / cloud version of OpenMeter with high availability guarantees and easy integrations to payment, CRM, and analytics solutions.
What usage metering issues or experiences do you have? We would love to hear your feedback on OpenMeter and to learn from which sources you need to extract usage and how the metered data is leveraged. Looking forward to your comments!
We run millions of tiny VMs. Each gets billed on a number of dimensions: egress, runtime (per cpu / memory combo), storage / io. We also have other metered services: ssl certificates, IP addresses, etc.
The thing is, we _already_ have metrics for everything we want to bill. They're in a time series DB (VictoriaMetrics, in this case). Sending a shit ton of events to yet-another-system is complicated and brittle.
Your k8s pods example is a good source of my hives. Anything that runs on a timer to fire off events is going to get bogged down and lose data at our scale. And we're not very big!
This is a somewhat solved problem for metrics stacks. It's relatively straightforward to scale metrics scrapes. And once we have that infra, it's pretty easy to just start pushing "events" to it when events become useful. We don't end up needing Kafka and its complexity.
My dream for billing is: a query layer on top of a time series database that takes data I already have and turns it into invoices.
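Roughly, something like this—aggregate the per-tenant samples I already have into invoice line items (rates, metric names, and sample shape are all hypothetical):

```python
# Sketch: turn time-series samples you already have into invoice lines.
from collections import defaultdict

RATES = {"cpu_seconds": 0.00005, "egress_bytes": 0.00000002}  # made-up prices

# Samples as they might come out of a time-series query: (tenant, metric, value).
samples = [
    ("tenant-1", "cpu_seconds", 3600.0),
    ("tenant-1", "cpu_seconds", 1800.0),
    ("tenant-1", "egress_bytes", 5e9),
]

def invoice_lines(samples):
    usage = defaultdict(float)
    for tenant, metric, value in samples:
        usage[(tenant, metric)] += value
    return [
        {"tenant": t, "metric": m, "quantity": q, "amount": round(q * RATES[m], 2)}
        for (t, m), q in sorted(usage.items())
    ]
```

The aggregation is the easy part; the hard part is everything around it (proration, plan changes, credits), which is the last mile I'd rather buy.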
One thing about your post that struck me: the last mile to billing and reporting is the thing we're most interested in buying. It's less specialized. There also aren't any products out there that have really figured this out, I don't think (we've evaluated pretty much all of them).
Usage tracking and reporting is a thing we're ok building, because it's core to our product.