
As a relative outsider to the observability space, I have always wondered this:

Is observability/telemetry only about engineering-related issues (performance, downtime, bottlenecks etc.) or does it include the "phone-home" type of telemetry (user usage statistics, user journeys)? Looking through the websites of most of the observability SaaSes, they seem to only talk about the first. Then how do people solve the second? Is it with manual logging to the server from the client?




I think usage statistics tend to require longer retention, since they are used to discover user behavior and understand how to optimize revenue. In the general case people probably won't care much whether their widget was running at X% CPU on Dec 5th 2019, but they might care more about what percent of users did Y action on that date. When I worked on an observability team (not as an expert but as a general SWE) we had two metrics pipelines: one was strictly usage statistics, which came from the client; the other was purely server metrics, but a subset of them were considered usage metrics, which were aggregated and sent down with the client metrics for the folks upstairs.


It is sometimes the second. Apollo (the GraphQL one) uses OpenTelemetry for tracing and monitoring but also for usage tracking: when a field was last used, how frequently it is included in queries, etc.


I would think anyone trying to put an OTEL source on an embedded device (IoT) was out of his goddamned mind. OTEL assumes data sources have ample hardware and particularly memory. It periodically summarizes all traffic since startup instead of streaming things as they happen.


I assume you're talking about metrics. You should be able to switch the aggregation temporality from cumulative to delta.
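
For anyone unsure what the difference is, here is a minimal sketch in plain Python. It is a toy counter, not the OpenTelemetry SDK, and the class and method names are made up for illustration. Cumulative export reports the sum since startup on every export; delta export reports only what changed since the previous export. As far as I know, OTLP exporters that follow the spec read the OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE environment variable (cumulative, delta, or lowmemory), but check your SDK's docs.

```python
from dataclasses import dataclass


@dataclass
class ToyCounter:
    """Hypothetical counter illustrating the two temporalities."""

    total: int = 0          # sum of all increments since process start
    last_exported: int = 0  # value of `total` at the previous delta export

    def add(self, n: int) -> None:
        self.total += n

    def export_cumulative(self) -> int:
        # Cumulative: always report everything since startup.
        return self.total

    def export_delta(self) -> int:
        # Delta: report only the change since the last delta export,
        # then remember where we were.
        delta = self.total - self.last_exported
        self.last_exported = self.total
        return delta


c = ToyCounter()
c.add(3)
first_cumulative, first_delta = c.export_cumulative(), c.export_delta()
c.add(2)
second_cumulative, second_delta = c.export_cumulative(), c.export_delta()
```

After the second export the cumulative value is the running total, while the delta value is just the two new increments. With delta, whatever receives the points has to add them up itself.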


Have you gotten that to work? I tried several times, got everything from zero stats to high CPU.

It might work now, it definitely did not as recently as about six months ago. At some point I had to get other things done this year besides OTEL.


we do otlp (delta) -> otel collector -> other process that does temporal aggregation
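
A rough idea of what that last step could look like, as a sketch only: the function name, the (timestamp, name, value) tuple shape, and the 60-second window are assumptions for illustration, not what any particular collector or backend does. It sums delta points into fixed time windows.

```python
from collections import defaultdict


def aggregate_deltas(points, window_seconds=60):
    """Sum delta points into fixed windows.

    points: iterable of (unix_ts, metric_name, delta_value) tuples
    (a hypothetical shape, chosen for this example).
    Returns {(metric_name, window_start_ts): summed_value}.
    """
    buckets = defaultdict(int)
    for ts, name, value in points:
        # Align the timestamp to the start of its window.
        window_start = int(ts) - int(ts) % window_seconds
        buckets[(name, window_start)] += value
    return dict(buckets)


points = [(0, "requests", 3), (30, "requests", 2), (61, "requests", 5)]
result = aggregate_deltas(points)
```

The first two points fall in the window starting at 0 and the third in the window starting at 60. The sender stays cheap because it only ships deltas, and the summing happens downstream.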


I believe that, right now, this type of telemetry data is meant for whitebox monitoring of backend components.


They are working on RUM (real user monitoring) support in OpenTelemetry.



