As a relative outsider to the observability space, I have always wondered this: ...

renlo · on Jan 13, 2024

I think usage statistics tend to require more retention time to discover user behavior and understand how to optimize revenue. In the general case, people probably won't care much whether their widget was running at X% CPU on Dec 5th 2019, but they might care more about what percent of users did Y action on that date. When I worked on an observability team (not as an expert but as a general SWE), we had two metrics pipelines: one was strictly usage statistics, which came from the client; the other was purely server metrics, but a subset of them were considered usage metrics, which were aggregated and sent down with the client metrics for the folks upstairs.

rafamct · on Jan 13, 2024

It is sometimes the second. Apollo (the GraphQL one) uses OpenTelemetry for tracing and monitoring reasons but also for usage tracking: when a field was last used, how frequently it is included in queries, etc.

hinkley · on Jan 13, 2024

I would think anyone trying to put an OTEL source on an embedded device (IoT) was out of his goddamned mind. OTEL assumes data sources have ample hardware and particularly memory. It periodically summarizes all traffic since startup instead of streaming things as they happen.

arccy · on Jan 13, 2024

i assume you're talking about metrics. you should be able to switch aggregation mode to delta
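
For readers unfamiliar with the term, here is a minimal sketch of the difference between cumulative and delta temporality for a single counter. It is plain Python, not the OpenTelemetry SDK; `CumulativeCounter`, `DeltaCounter`, and `run` are hypothetical names used only for illustration.

```python
class CumulativeCounter:
    """Reports the running total since startup on every export."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n

    def export(self):
        return self.total


class DeltaCounter:
    """Reports only what was added since the previous export, then resets."""

    def __init__(self):
        self.pending = 0

    def add(self, n):
        self.pending += n

    def export(self):
        value, self.pending = self.pending, 0
        return value


def run(counter):
    # Three export intervals: two measurements, one measurement, none.
    exports = []
    for batch in ([3, 2], [4], []):
        for n in batch:
            counter.add(n)
        exports.append(counter.export())
    return exports


cumulative_exports = run(CumulativeCounter())  # [5, 9, 9]
delta_exports = run(DeltaCounter())            # [5, 4, 0]
```

In the OpenTelemetry SDKs the temporality is typically chosen on the OTLP metric exporter, for example through the `OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE` environment variable; whether a given SDK honors it is worth checking in that SDK's documentation.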

hinkley · on Jan 14, 2024

Have you gotten that to work? I tried several times, got everything from zero stats to high CPU.

It might work now, but it definitely did not as recently as about six months ago. At some point I had to get other things done this year besides OTEL.

arccy · on Jan 14, 2024

we do otlp (delta) -> otel collector -> other process that does temporal aggregation
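
A rough sketch of what a downstream temporal aggregation step could look like, assuming it simply sums delta points into fixed time windows. The comment does not say how the real process works; `aggregate_deltas`, the 60-second window, and the `(name, timestamp, value)` tuple layout are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_deltas(points, window_seconds=60):
    """Sum delta points per (metric name, window start).

    `points` is an iterable of (name, unix_timestamp, value) tuples.
    This layout is hypothetical, not an OTLP structure.
    """
    buckets = defaultdict(int)
    for name, timestamp, value in points:
        # Align each point to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[(name, window_start)] += value
    return dict(buckets)


points = [
    ("requests", 5, 3),
    ("requests", 42, 2),
    ("requests", 61, 4),
    ("errors", 70, 1),
]
windows = aggregate_deltas(points)
# {("requests", 0): 5, ("requests", 60): 4, ("errors", 60): 1}
```

Because each delta point only covers its own interval, the source does not need to keep totals since startup; the summing happens in the process that has the memory to spare.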

mongrelion · on Jan 13, 2024

I believe right now this type of telemetry data is for whitebox monitoring of backend components.

wdb · on Jan 13, 2024

They are working on RUM (real user monitoring) support in OpenTelemetry.