As a relative outsider to the observability space, I have always wondered this:
Is observability/telemetry only about engineering issues (performance, downtime, bottlenecks, etc.), or does it also include "phone-home" telemetry (usage statistics, user journeys)? Looking through the websites of most observability SaaS vendors, they seem to talk only about the first. So how do people solve the second? Is it manual logging from the client to the server?
I think usage statistics tend to need longer retention, since you use them to discover user behavior and figure out how to optimize revenue. In the general case people probably won't care much whether their widget was running at X% CPU on Dec 5th 2019, but they might care what percentage of users did action Y on that date. When I worked on an observability team (not as an expert, but as a general SWE), we had two metrics pipelines. One was strictly usage statistics that came from the client. The other was purely server metrics, but a subset of those were considered usage metrics; these were aggregated and sent along with the client metrics for the folks upstairs.
It is sometimes the second. Apollo (the GraphQL one) uses OpenTelemetry for tracing and monitoring, but also for usage tracking: when a field was last used, how frequently it is included in queries, etc.
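To make the field-usage idea concrete, here is a minimal sketch of that bookkeeping in plain Python. It is not Apollo's implementation and does not use the OpenTelemetry API; `FieldUsageTracker`, `record_query`, and the field names are hypothetical, and I'm assuming the list of fields touched by each query has already been extracted somewhere upstream.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FieldUsage:
    count: int = 0                      # number of queries that included the field
    last_used: Optional[float] = None   # timestamp of the most recent such query


class FieldUsageTracker:
    """Hypothetical tracker: per-field query count and last-used time."""

    def __init__(self) -> None:
        self.total_queries = 0
        self.fields = defaultdict(FieldUsage)

    def record_query(self, field_names: Iterable[str], timestamp: float) -> None:
        self.total_queries += 1
        # Count each field at most once per query, even if it appears twice.
        for name in set(field_names):
            usage = self.fields[name]
            usage.count += 1
            if usage.last_used is None or timestamp > usage.last_used:
                usage.last_used = timestamp

    def frequency(self, name: str) -> float:
        """Fraction of recorded queries that included the field."""
        if self.total_queries == 0:
            return 0.0
        usage = self.fields.get(name)   # .get() avoids creating an empty entry
        return usage.count / self.total_queries if usage else 0.0
```

A field whose `last_used` is months old, or whose `frequency` is close to zero, would be a candidate for deprecation. That is a product question answered with the same kind of data you would otherwise collect for monitoring.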
I would think anyone trying to put an OTEL source on an embedded device (IoT) was out of his goddamned mind. OTEL assumes data sources have ample hardware and particularly memory. It periodically summarizes all traffic since startup instead of streaming things as they happen.
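For anyone unfamiliar with the distinction being drawn here: OpenTelemetry's metrics data model calls this "aggregation temporality", and it defines both a cumulative and a delta form. The sketch below is plain Python rather than the OTEL SDK, and the class names are made up for illustration. It only shows the memory trade-off: the cumulative reporter has to keep every series it has ever seen, while the delta reporter can drop its state after each export.

```python
from collections import Counter


class CumulativeReporter:
    """Reports totals since startup; state grows with every new series."""

    def __init__(self) -> None:
        self.totals = Counter()

    def record(self, key: str, value: int = 1) -> None:
        self.totals[key] += value

    def export(self) -> dict:
        # Every export repeats the full running totals.
        return dict(self.totals)


class DeltaReporter:
    """Reports only what changed since the last export, then forgets it."""

    def __init__(self) -> None:
        self.pending = Counter()

    def record(self, key: str, value: int = 1) -> None:
        self.pending[key] += value

    def export(self) -> dict:
        snapshot = dict(self.pending)
        self.pending.clear()  # memory is released after each export
        return snapshot
```

Which form a given SDK or exporter uses depends on how it is configured, so how bad this is on a memory-constrained device is worth checking against the specific SDK rather than assuming.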