Why and how GitHub is adopting OpenTelemetry

wdb · on May 26, 2021

It's a really nice solution. I am using it to send traces and metrics via a standardised API in Go, and Node.js. During development I am sending without sampling it to Jaeger but in production I am sending ratio sampled traces to Google Trace but you can use Honeycomb or Lightstep. For metrics, I am sending them to local Prometheus during development, and cluster prometheus in production.

It's nice to see all the traces through all the different services of one API request and all its log items it generated. Using a custom span exporter to ensure PPI is stripped away as much as possible for span attributes.

Using W3C trace context and baggage to send some more data along, e.g. sending the trace id, span id etc into pub sub events to ensure to connect them with the trace.

Loving it! For now, I think it's worth the costs of adding tracing to services.

rektide · on May 26, 2021

A lot of very confused weird comments here.

This article is about a fairly large sized tech company adopting a fairly recent but increasingly mainstream & popular tool that helps them understand their operations. It'll give them a standard way to see what their computers are doing, across their various systems.

OpenTelemetry is one of the key emerging cloud standards, and I expect many many many many more articles like this going forwards, from all kinds of companies.

As for concerns about user tracking & privacy, that's not generally what these operational tools are used for. I haven't heard of a single case of them being used for user tracking or behavioral analytics. Thusfar these are purely operational tools, to understand the health of systems, to debug & typically to understand what happens to an incoming request as it works through dozens of systems & services to get processed. That said, I tend to think over time the importance of this distributed after-the-fact log we are building is going to become inverted. That we will start to see the potential to harvest these records for analytics, and more generally, to forward-feed them into other processes to automatically build & advance Event Sourcing systems. Right now these systems are relatively pure & good, but what's really at stake here is that we've been doing computing blind, with no record of what's happened, and OpenTelemetry is a key first step in lifting that veil of ignorance as to what computing has happened. We are beginning to capture the data of what compute occurred. Many things will emerge as we open this box.

infogulch · on May 26, 2021

Even other responses to your comment are confused. The primary use case is to trace requests that go through your backend of distributed microservices. The fact that it can also be used to collect PII-adjacent data as in sibling's example is about as relevant as the fact that it uses the HTTP protocol. I didn't see any similar comments raging against the HTTP protocol on the last curl post that came through here.

People here are blindly rage-triggered by the word "telemetry" without even taking 2 seconds to glance through to see what it actually is.

cogman10 · on May 26, 2021

Agreed. This comment section is bizarre for HN. It's sort of like getting scared that "Prometheus" collects "metrics". Or that "Kafka" tracks "events".

These aren't scary things.

handrous · on May 26, 2021

"Telemetry" has come to be associated with activities that definitely qualify a product as spyware (though they've become fairly common—thanks, "data-driven" marketers).

From reading the OpenTelemetry site, that doesn't seem to be their main, stated purpose, but thefounder's post in another part of the thread leads me to think it may, in fact, see that kind of use, too.

[EDIT] damn, sorry, downvoters, for explaining the reason this is getting knee-jerk negative reactions from people, when someone expressed confusion about it. Again, I think the main reason is the word used, and what it's mostly associated with now, among some folks.

infogulch · on May 26, 2021

Every tool is also a weapon, obviously. We haven't stopped making hammers because they can be used to hit people.

cogman10 · on May 26, 2021

The sorts of stuff collected by opentelemitry are benign. Can it be weaponized? Of course, but that's like using a screwdriver to kill a fly. If I wanted to collect evil "telemitry" I wouldn't touch opentelemitry because it isn't useful for that.

It's like being afraid your browser profiler is being used to spy on you. Could it do that? Sure. but there are so many easier ways to accomplish the same task.

handrous · on May 26, 2021

It's why people are making these assumptions about the term "telemetry". In certain circles it usually means spyware. One of those circles includes front-end web, which is pretty well-represented on HN.

[EDIT] and, especially, that's why they're jumping to the conclusion that github is gearing up to do more spying on its users, which is the part that I think people are bothered by, not the existence of this software package.

cogman10 · on May 26, 2021

OpenTelemitry isn't useful for spying purposes. It just isn't. If github wanted to spy on users they have way more direct methods that don't involve collecting the time your browser executes a function or waits on an AJAX request.

handrous · on May 26, 2021

Cool, that's fine, I was just explaining why it's getting that reaction from people. Telemetry means different things to different people and a (quite new) use is basically just a euphemism for "watching over people's shoulder while they use our software"—but, among some people, that's the main use they encounter, so they assumed that's what this is.

Lazare · on May 27, 2021

It's a bit disappointing, really. There are people commenting who clearly have no idea what the tool even is, haven't even tried to figure it out, haven't read the other comments already here.

I'd really hope for better among the HNews readership.

yannoninator · on May 26, 2021

Wait, you said:

> I haven't heard of a single case of them being used for user tracking or behavioral analytics.

and then you said:

> That we will start to see the potential to harvest these records for analytics…

So this can be used to gather analytics of any sort of data, such as spying then? This is still worrisome.

geofft · on May 27, 2021

In that sentence, "these records" means "information about how GitHub's servers are behaving," not, say, "user data."

It is technically true that such a tool could be used on any sort of data, in the same way that, say, Perl or SQL can be used on any sort of data.

AlexandrB · on May 26, 2021

> As for concerns about user tracking & privacy, that's not generally what these operational tools are used for.

I'm not sure that the intended use matters. Unless the telemetry systems are carefully designed to not capture PII at all they become yet another channel that must be secured as if they are collecting PII. For example, see Windows 10 telemetry vs. HIPPA: https://hipaaone.com/2015/09/22/windows-10-and-hipaa/

AYBABTME · on May 26, 2021

I suspect you're tripping on the flowers of the carpet. Telemetry here is not a word meaning "call home and report user data", it means "telemetry" in the sense that a Mars rover sends back positioning & orientation data back to NASA for control. The usage of this tool is equivalent to having a query engine for application logs, or a statsd server to render graphs.

It's just a protocol to see how systems are performing, but a better one than statsd or logs.

random3 · on May 26, 2021

They are agnostic data collection protocols/ systems / libraries. Just like a protocol they could do PII collection and unlikely to prevent that.

Ultimately the intended use matters more than the tech. And in this case it seems purposely built to measure performance.

michaelperel · on May 26, 2021

For anyone interested in learning how to use Open Telemetry for distributed tracing in go, I recently made a demo app to share with some friends: https://github.com/michaelperel/otel-demo . To run, clone it & docker-compose up

hendry · on May 27, 2021

Amount of "otel" instrumentation required in your code by LOC looks ridiculous to me.

Isn't it a better approach to just use logs?

e.g. defer ctx.WithField("path", path).Trace("opening").Stop(&err)

from https://medium.com/@tjholowaychuk/apex-log-e8d9627f4a9a

chrisz4 · on May 28, 2021

Once setup, I found that amount of code required for creating a simple trace via otel and logging is roughly the same.

Refer to: https://github.com/michaelperel/otel-demo/blob/master/cmd/se...

51: Start a span. Equivalent to one line of log 60: Add a event. Equivalent to one log 69: Set attribute. This retroactively add attribute to whole span since the start. While log don't have exact same effect, one line of log can be used here. 74: RecordError. Equivalent to one log

I haven't compare amount of code to setup a proper logger which connected to correct infra with amount of code to setup otel yet. Still, I don't think it gonna make much difference.

In general, I won't mind either approach if I get a great visualizer. The main reason I would choose OpenTelemetry is I get trace visualizers for free and I can switch to better visualizer anytime that I want.

aseipp · on May 27, 2021

Your counterexample is misleading, because a single simplistic function call is what happens in most of the code, e.g. creating spans for an arbitrary point in the code is like, 1 function call[1] after setup. I don't see anything egregious here. Of course, you also have to record where the span ends, but that's part of the game you're playing when you move beyond logs. You have to record that.

What you're probably looking at is all the boilerplate set up, e.g. configuring the provider and backend to point to the right stuff. It reminds me of SL4J, which isn't actually an insult. It's just boilerplate, because people want a lot out of their logging and tracing systems.

Demo applications like this are often easily misleading because there are only like 50 lines of "business logic" and 50 lines of tracing setup, so it makes it seem like the tracing is excessively hard. But those two things don't scale the same way. In a large application where tracing is really valuable, you'll have 100,000 lines of business logic, and still only 50 (or maybe like 100) lines of tracing setup, per application.[2] Actual usage at the call sites remains only a line or two in most cases, and easy to add as you need, where you need it, just like a logger.

It is also worth noting in other ecosystems like when I played with tokio_trace, I found integrating tracing easy, even at the very start. So some of this definitely involves the "philosophy" of the client library.

[1] https://github.com/michaelperel/otel-demo/blob/master/cmd/cl...

[2] I guess if the 100,000 LoC running your business is split into 2000 microservices with 50 lines each, then yes, it may be excessive.

bassdropvroom · on May 27, 2021

Opentelemetry is a lot more than that. It can handle traces across services.

hendry · on May 27, 2021

You can log and use a tool like AWS LogInsights to see the trace across services..

No code changes needed.

bassdropvroom · on May 27, 2021

Great, so now link each request across services, combined with the timings of each individual component within each service. LogInsights is for logging, not tracing. Logging is not a replacement for tracing.

hendry · on May 28, 2021

link request with X-Amzn-Trace-Id https://docs.aws.amazon.com/elasticloadbalancing/latest/appl... and yes you will have timings also in the structured log.

I am effectively getting trace level insights from my logs. I must be doing something wrong!

avinassh · on May 27, 2021

How much changes I need to make if I want to add open telemetry in my existing application?

kortex · on May 27, 2021

Mandatory "do some research before jumping to freaking out at GitHub for 'spyware.'"

Yes technically you could use OT to exfiltrate data; guess what, you can do it with http headers, too.

Been using OpenTelemetry for python and golang. It's great. I really like using Jaeger for tracing across small ML webapps. We have users on the scale of dozens so I just trace everything without sampling. Way, way easier than going through logs.

Also used python + jaeger to narrow down some wifi issues on my home network. Basically run a server on each device and round-robin a request through each and look at the flame graph. Turns out, some Macs' location preferences can cause 500-800ms outage multiple times an hour, as it tries to scan know AP ssids or something. Yeah, dumb.

weberer · on May 27, 2021

They problem is that Microsoft continuously used the term "telemetry" as a euphemism for the dystopian corporate spyware they shipped with Windows 10. So now to the average end user, the two concepts are the same.

This is compounded by the fact that Github is now a Microsoft company and Microsoft has been caught in the past hiding "telemetry" in programs compiled with Visual Studio.

https://old.reddit.com/r/cpp/comments/4ibauu/visual_studio_a...

kortex · on May 27, 2021

I get what you are getting at, but It's not really a euphemism. Telemetry means literally "distant measurement." You can use telemetry to track sensors in a rocket, a request across a service mesh, a traceback from a crash in a user program, fine-grained details about everything a user does in Windows, or a straight-up keylogger. All are forms of telemetry.

The VS telemetry injection is scary as hell though.

Telemetry is like a knife. Context and location matters. Great in the kitchen, bad when lodged in your back.

Sharing this from myself downthread:

Telemetry is the in situ collection of measurements or other data at remote points and their automatic transmission to receiving equipment (telecommunication) for monitoring. The word is derived from the Greek roots tele, "remote", and metron, "measure".

All spyware is, by definition, telemetry. Heck, the whole field of spycraft ( espionage ) is telemetry of sorts. Not all telemetry is spyware.

iddan · on May 27, 2021

It’s exciting an organisation like GitHub gives a vote of confidence in an open standard for tracing. Making OpenTelemetry an industry standard will help everyone to develop better backend services

thefounder · on May 26, 2021

Off topic: At some point my wasm app was consuming excessive memory bringing the browser to a halt. The issue was the OpenTelemetry package(Go-lang) used by various Google sdks. Forking/sanitizing the sdks fixed the issue.

tignaj · on May 26, 2021

Please submit a bug at https://github.com/open-telemetry/opentelemetry-go/issues We want to make the SDKs rock-solid.

ehershey · on May 26, 2021

What do you mean by “sanitizing” the sdk’s? Did you remove the OpenTelemetry package?

thefounder · on May 26, 2021

>> Did you remove the OpenTelemetry package?

Exactly!

tedk-42 · on May 27, 2021

[flagged]

Datagenerator · on May 27, 2021

Removed code is debugged code, according to the infamous suckless website and the creator of ZeroMQ. (1)

1) http://suckless.org/philosophy/

tedk-42 · on May 27, 2021

OK but not related to the issue mentioned by the poster. Good code isn't minimalistic, else we'd be writing all our code in a code golf style with fewest possible characters.

Funny that my comment got flagged - If you really believe removing something fixes your problem, what about other people who could have the same issue as you or the people that don't have your issue and used the library successfully?

You've effectively removed yourself as the possible cause to the problem and you probably are the problem if no-one else has reported the issue. You're the problem for not reporting it and you're the problem for complaining about it 'not working' when you did so little to try socialise a fix or look into it more deeply.

StreamBright · on May 26, 2021

I think it would be great to show the performance impact of these SDKs because it is one of the really important aspects of monitoring (being non-intrusive).

crandycodes · on May 26, 2021

As someone who's needed to maintain complex, high-performance database drivers that needed to work across a bunch of different platforms, I've been following them and their predecessors of OpenTracing/OpenCensus. The problem that's always been interesting to me as a library maintainer is consistency across platforms and well maintained multi-platform libraries.

I hadn't really found an acceptable solution that would work across Java, Node.js, browser, and so on. We'd invented our own formats and then we owned all the integration problems with various monitoring tools. I left the team before we started to adopt, but they've started doing it and it looks like it's help with reducing integration burden. I also think using someone else's opinionated library can help avoid bikeshedding on concepts not related to your core value.

jeffbee · on May 26, 2021

I've noticed that orgs where I've worked vary between being totally insensitive to observability cost to being real hardasses about it. But I think most smaller shops are falling into the former category. I've even heard in meetings crazy shit like "It's very low overhead, only about 5%" which would get you laughed out of the office at, say, Google. Unfortunately (to me) the focus on ease-of-use has meant that OpenTelemetry concepts are structured in such a way to preclude even the possibility of a very efficient implementation, which means that there will be a schism between people who are happy in the otel ecosystem and people who can't use it on cost grounds, who probably will splinter into distinct home-grown solutions.

drewbug01 · on May 26, 2021

> Unfortunately (to me) the focus on ease-of-use has meant that OpenTelemetry concepts are structured in such a way to preclude even the possibility of a very efficient implementation

Curious what you mean about the design of OpenTelemetry precluding efficient implementation?

legulere · on May 26, 2021

It’s also about where your costs are. Google has much less revenue per compute-time/requests/whatever than most other companies. If you target the business to business field your computing costs are usually negligible while your developers cost a lot of money. Throwing more hardware at the problem is often the most economical solution.

pm90 · on May 26, 2021

You're correct that 5% increase in resource usage is probably not noticeable for most orgs.

Its important to know what audience you're building for. I believe the audience for otel consists largely of companies that don't look anything like Google. So its fine to sacrifice that last bit of perf gain if it means the code is easier to use and maintain.

FWIW, it also appears that companies like Google would fork or reimplement such systems anyway.

austinlparker · on May 27, 2021

I want to point out that OpenTelemetry is explicitly designed to decouple the API from the SDK, allowing for other people to reimplement parts of the SDK as needed while maintaining compatibility with not only other SDK components, but also the overall ecosystem. This was one of the major changes we introduced as part of the OpenTracing/OpenCensus merger.

ungzd · on May 27, 2021

> Unfortunately (to me) the focus on ease-of-use has meant that OpenTelemetry concepts are structured in such a way to preclude even the possibility of a very efficient implementation

Looks like Opentelemetry (at least its precedessor, OpenCensus) is originated from Google. From OpenCensus website (https://opencensus.io/):

> OpenCensus and OpenTracing have merged to form OpenTelemetry

> OpenCensus originates from Google, where a set of libraries called Census are used to automatically capture traces and metrics from services.

Original internal Google tracing system was probably designed for scale. And opentelemetry's design is probably based on that internal system.

So, maybe poor performance is just an implementations issue.

jeffbee · on May 27, 2021

OpenCensus bears no resemblance whatsoever to Google Census, except for the name. Census at Google is wired tight as hell. Any time it showed up on the first page of fleet-wide profiles it would get hammered back down. At the same time it also has more capabilities than its open source successors. Unfortunately, nothing has ever been published about it.

Philip-J-Fry · on May 26, 2021

Well, the second you do any database call or other service call you've already spent 100x longer doing that than you have recording some timings.

These clients will usually buffer the stats in memory and push them out asynchronously. Performance is definitely affected but I'm pretty sure it's negligible for most cases.

Best practice would be to reduce tracing ratio in production too. So most requests are literally just a timing.

sethammons · on May 26, 2021

it is a solid call out. One of our teams had to rip out and completely re-think our integration with OpenTracing because of allocations in the client at the time. I believe that's been fixed.

malkia · on May 26, 2021

Look into envoy/istio - e.g. these introduce side-processes (sidecars) where your process talks to, and these create the traces for you at some perf cost. There are proxies for some of the existing services - like mysql - https://www.envoyproxy.io/docs/envoy/latest/configuration/li...

I haven't used it myself, I've used census, and now looking into OpenTelemetry (though from the least finished version - C++). Had mixed success with it in the past, but trying again. Also not looking at all into side-cars, etc. - We compile all our internal tools, so adding this inside is where I'm getting into.

I've had several times (while at Google), being asked by an SRE that I would call on issues, and they would request to bump the sample tracing from minimal defaults (was it 1 in 100,000 or million - forget) sometimes to 1:1 - for say 30 seconds. This way they'll receive on their end (in their systems that we use) flags to sample too, and at the end get full logs.

Usually the whole trace is visible in few minutes. There were few UI's (nothign like zipkin/jaeged/others outside) - some of them with very "imgui"-like hacky (in good sense) view - like programmer art all over (which I loved - it was much more condensed than standard zipkin/jaeger).

You could've marked something as important, and it'll retain for longer period - otherwise - poof - soon gone. Also it would collect info sometimes directly from the machine it was in (rather than wait to populate).

Obviously, I don't know the details - I was just an user, or more like - allowing (when oncall) trace sampling to be bumped by the SRE - so they would get more info. It's what hooked me actually, because how else would one get everything end-to-end.

Surprisingly it's also useful for single apps, where you have threads (or concurrency tasks, like with TBB/ConCRT) doing nested parallel_for's or spawning jobs, and you want to get idea what's going on. The only tricky bit is how to get your "context" propagated from one thread to another (also not readily done).

It's one thing that the "golan" got right with their context for example.

So it's really awesome, but probably really hard to get right the first few times.

Thaxll · on May 26, 2021

envoy/istio do not replace telemetry in application because they only see what's pass through them and don't know anything else. You're missing a lot if you only instrument through a proxy.

domlebo70 · on May 27, 2021

What is that app they are using in the screenshot?

apk · on May 27, 2021

It's Lightstep (https://lightstep.com).

TheRealWatson · on May 27, 2021

Recently acquired by ServiceNow

dgildeh · on May 27, 2021

I did a blog on using OpenTelemetry with Python/Google Cloud Run and source code here if anyone is interested:

https://davidgildeh.com/2021/03/08/running-python-openteleme...

I need to update the code as I didn't use the BatchSpanProcessor and as a result all my calls are taking 2-3 seconds vs. a more reasonable 300ms.

My only issue so far is the stability and maturity of the project. Now they're at V1 stability of the main SDK looks better, but the instrumentation libraries are still all over the place and need some cleaning up/maturing.

I like the vision though and we're keeping an eye on the project as it matures.

devops000 · on May 27, 2021

What are good solutions for charts and alerting for data sent from OpenTelemetry?

speedgoose · on May 27, 2021

Is OpenTelemetry the right tool ? In my team, we started using it but we kept Prometheus and Grafana to have charts and alerts.

devops000 · on May 27, 2021

Why not using Datadog?

speedgoose · on May 27, 2021

We are in the health care industry, so we are not keen about sending our data to third parties. We also prefer open-source technologies.

joeblubaugh · on May 27, 2021

The "simplest" self-hosted tools are probably Prometheus (https://prometheus.io) and Jaeger (https://jaegertracing.io), but they're big distributed systems. Running them may or may not be worth the cost, though you might be using something like Prometheus already.

I really would love to see a simpler set of tools that works better for small / medium companies to self-host if they really can't send data to a vendor for some reason.

usr1106 · on May 27, 2021

Because it's proprietary IIRC?

flemhans · on May 27, 2021

Is it production ready yet? From what I could gather it seems still pretty experimental. But I guess GitHub adopting it would cement the standard a bit.

jonatan-ivanov · on May 27, 2021

It's an interesting move given that OTel is not stable yet. :o

kortex · on May 27, 2021

Why is this controversial? OT is great but it's not totally production ready imho. I tried integrating it last year in Python and had almost weekly breaking changes to the internal API calls. I mean, it was a 0.x, so they were well within their liberty, also they don't owe me anything. But it was a bit of a false start for me.

Finally they released a 1.0 and the interfaces are more stable, but the docs are still incomplete or stale in some places. I currently use it in a minor way.

Frankly I can't wait because I love the tools and ecosystem, but I personally would be hesitant to put it into full blown production at such a big company.

drewbug01 · on May 27, 2021

One of our goals with early adoption is to contribute back the things needed to make it stable, performant, and useful for everyone. It might make things more difficult for us at times, but we feel we can manage the burden.

It’s obviously not our only goal, but we are in a position to help make the system better for everyone and sincerely wish to do so.

hobofan · on May 27, 2021

And why exactly is that interesting? Tech companies often use new products before they are stable, how else would they become battle-tested?

On the website it even says: "OpenTelemetry is in beta across several languages and is suitable for use. We anticipate general availability soon."

jonatan-ivanov · on May 27, 2021

I guess because you might not want to go to prod with something on GH scale that is not stable and still subject to have breaking changes?

Fyi: that quote has been there for a while.

hobofan · on May 27, 2021

It's usually the opposite: You might not want to go to prod with something like this on an SMB scale where you have little resources to navigate tech debt.

On a Github scale where you have more engineers available, it's often worth it to adopt cutting-edge alpha/beta software for their novel features, as you generally also have more expertise/mechanisms to ensure stability of the overall product and run less risk of being suffocated in tech debt.

infogulch · on May 26, 2021

For everyone blindly rage triggered by the presence of the bytes "t-e-l-e-m-e-t-r-y", cogman10 elsewhere ITT [1] summarized that concern well:

> It's like being afraid your browser profiler is being used to spy on you. Could it do that? Sure. but there are so many easier ways to accomplish the same task.

So take a second to straighten out your panties and then actually look at the thing first. It's just an open-source (!) APM protocol that competes with (edit: more like, adjacent to) DataDog, DynaTrace, New Relic etc etc. It's not even a suitable tool for spying.

[1]: https://news.ycombinator.com/item?id=27296419

nonbirithm · on May 26, 2021

It's interesting how names can send signals without any substance of the actual product being relevant. (The thread over 'bro pages' was one example.) This specific instance is rather revealing of one of the biases of some HN users.

I look at myself and I find that many opinions on technology that I hold have clearly been shaped by how much time I've spent on HN. Data-exfiltration telemetry is one example. But I now think about a lot of things I might not have given much thought to before, like how being backed by a VC can shape the direction of a service for the worse. I also find that I'm unnecessarily hardline on topics like paid services versus open source, and shut out anything positive people have to say about advertising as a revenue source.

I sometimes imagine what would happen if I was born inside the borders of a different country. Some countries have very nationalistic citizens. How much of my thought processes are a result of the people and signals I surround myself with? How many of those signals will reach me whether or not I consent to seeing them (advertisements being one example)?

I think I could have been a very different person if I was raised even a hundred miles from where I actually grew up, despite having a similar chemical makeup. That makes me be more conscious about where I'm likely to seek information, and to try not to look for opinions that I already agree with entirely.

userbinator · on May 27, 2021

The knee-jerk reactions are unfortunately because the word "telemetry" has been perverted into having that connotation by those who don't want to call what they're doing "spyware". It certainly doesn't help to see that word along with "GitHub", who is now owned by a company which is doing exactly that.

Other relevant euphemisms that have appeared within the past few years: "analytics", "diagnostics", "usage reporting", "connected experiences", "experience improvement".

wdb · on May 26, 2021

It doesn't really compete Datadog, Dynatrace, Honeycomb, Lightstep, new Relic are all partners in defining the specification and help implementing it. Most of them already allowing ingesting traces over the OLTP protocol

porker · on May 27, 2021

Yes, where it helps is every programming language and framework can have one integration with OpenTelemetry, rather than custom ones for each separate telemetry system.

cogman10 · on May 26, 2021

Doesn't even compete with them :) Each of those have opentelemetry adapters.

You'd write your opentelemitry traces throughout your code and at some top level point in your app you configure and say "Hey, OpenTelemetry, report to New Relic".

By itself, OpenTelemetry does nothing.

It's usefulness is that someone writing a lib can add OpenTelemetry calls throughout and anyone using that lib can then collect metrics/traces into whatever metrics solution they are currently usings (be it DynaTrace, DataDog, or New Relic).

rektide · on May 26, 2021

I tend to think computing itself has been in kind of a dumb rut. Right now, OpenTelemetry is just logging & tracing, imagined as tools for ops, but over time, I fully expect we begin to see this as an event-stream in itself, something we can use for Event Sourcing, to watch & trigger new compute based off of.

That's not in the cards today. But longer term, I think "knowing what computers are doing" is big business. And, I am very sad to say, eventually it will the obvious & logical way to spy on folks too. Because it will be the obvious & logical way to do many many many things, not because it's tech that's built or intended for spying. But right now computing is ephemeral, we don't persist any of the stack traces we compute through, and fundamentally, tracing really is about distilling out & keeping higher level stack traces. It's something computing needs to have been doing, that will bring us to a radically higher level of understanding (but it also does have some scary uses).

It's been a couple years, but tracing is a powerful new basis with which to start re-engaging the "Turning the datqbase inside out"[1] / "I <3 Logs" view of data-storage (& computing) that had some buzz for a bit. We're not at all here yet, but I think it's coming.

[1] https://martin.kleppmann.com/2015/11/05/database-inside-out-...

cogman10 · on May 26, 2021

Who will be doing this spying?

It certainly wouldn't be opentelemetry itself as that's just a interface you add adapters to.

Are you thinking a man in the middle would spy? How would that work? This information is pushed over secure connections on the backend likely in a VPN. On the front end, it'd be transmitted over HTTPS. Shouldn't we be more fearful of information collected from DNS than encrypted data sent over HTTPS?

Or are you thinking the metrics aggregators are going to do the spying? How would that impact their business model? Do you think a company would continue to pay the likes of new relic if they were caught giving access to metrics data to outside groups? Do you worry about postgres or prometheus sharing your data with 3rd parties? What about SQL server?

What is the risk model and how would it be different from say the risk model of making an REST call or using a 3rd party library?

Or is it just that "because this is well integrated throughout, it could be used to spy"? Because, generally speaking, these traces don't have enough information to identify what individual users are doing to the system. Even if they did, that wouldn't be a great way to track a user, you'd simply put that tracking information right at the front end of the system. Plumbing it from one end of the system to the other gives little value for a spy. It's adding a bunch of noise to the question you'd want to ask "what are the user's behaviors with our product?"

And even if the demand is there, why would you do this through tracing an not a purpose built spy tool. Wouldn't it be easier for a nefarious lib writer to make a plugin purpose built to collect evil data? If that sells, why wouldn't a tech company buy that instead of buying a solution which combs 3 layers of separation to get worse answer? Why wouldn't google analytics still exist?

rektide · on May 26, 2021

We're forming the best, most complete, competent view of computing-that's-happened that we've ever formed.

You have a bunch of weird straw men that I don't get. I tried to de-emphasize the role of behavioral analytics & user-tracking, because I think it's just one small part of what this will be used for. But I am fairly confident we will eventually start to do more user-tracking via these systems. I've used half a dozen different user-tracking products at various companies, and they all read like ultra-low-fi versions of the ops tools. Ops tools have been evolving, at a far faster rate, far more in the public domain, and at some point, it just wont make sense to instrument your product twice.

I want to re-iterate that I see this as one of the smallest, least interesting aspects of a coming Event-Sourcing-powered-by-tracing world. There's far more profound implications for what could happen to computing in general here (a de-wiring of the request/response microservice world & a shift towards async, reactive systems). But even today, it feels to me like folks work very hard to draw a distinction between their ops tools & their behavioral analytics tool. As a developer, they often work, I often interface with them in very very similar fashion. The desire to draw the distinction has felt illogical, and felt unsupported. Especially as the ops tools advance, I think it will be harder to reconcile the idea that there are & ought be separate systems of tracking/viewing/understanding.

thrixton · on May 27, 2021

I thoroughly agree about the event-based-computing model but telemetry (as per the current common definition) won't drive this far enough. We need to take the telemetry model one step further and have the easy facility to insert shims at set levels (e.g. between the business layer and the data store) with open auth to be able to insert subscribers in an open manner to the entire payload. This should then be mandated by any org implementing a product and especially any gov org to prevent vendor lockin and allow easy extensibility. Same concept, but more data.

OldGoodNewBad · on May 27, 2021

[flagged]

CommanderData · on May 27, 2021

This is more like application logging and tracing.

At Github's scale their setup and micro services depth would necessitate some type of correlated based tracing like this.

Heck even if your architecture is 1 layer thin OpenTelementry is still useful than crawling through logs every time there's an issue.

heinrichhartman · on May 27, 2021

I think you are confusing "telemetry" here. This is not about Cookies or "phoning home" or Advertizing. This is about collecting logs, metrics, traces from (Micro-)Services running inside your own DC/Cloud for operational purposes (e.g. troubleshooting incidents).

yannoninator · on May 26, 2021

[flagged]

scrollaway · on May 27, 2021

You're getting downvoted because you are complaining about something you don't understand, which puts into question the "after reading this" part of your post.

Jumping on the first opportunity to rant about telemetry without even checking what the article is talking about will do that.

yannoninator · on May 27, 2021

then why name it open "telemetry" in the first place? you know it has an extremely bad connotation to spyware.

> which puts into question the "after reading this"

> without even checking what the article is talking about will do that.

please don't do that, this is against the HN guidelines.

https://news.ycombinator.com/newsguidelines.html

kortex · on May 27, 2021

Telemetry is the in situ collection of measurements or other data at remote points and their automatic transmission to receiving equipment (telecommunication) for monitoring. The word is derived from the Greek roots tele, "remote", and metron, "measure".

Why name it the precise thing that it literally is?

Telemetry is used everywhere: rockets, drones, airplanes, boats, water meters, microservices, applications, and yes spyware, bots, viruses, and keyloggers. All spyware is telemetry by definition: "remote measuring." Not all telemetry is spyware.

yannoninator · on May 27, 2021

And how do you know if GitHub is spying on you with this system? Am I able to "opt-out" of this telemetry on GitHub?

In a closed source backend system like GitHub (Microsoft) how do you exactly know that this tool is also being used for spyware?

I wouldn't be surprised if this 'OpenTelemetry' system is being used for exactly that.

scrollaway · on May 27, 2021

You're doing it again.

"Telemetry" doesn't have spyware connotations. You choose to view it as such. Telemetry means taking remote measurements.

OpenTelemetry facilitates this. It doesn't facilitate Github installing spyware on your system. It's an internal tool. Something you'd understand, even as someone unfamiliar with it, if you would actually have paid any attention to the article.

And OpenTelemetry IS in fact open source, so your argument, confusing as it is, falls flat on that too.

Note that it's very hard to give you the benefit of the doubt when you're so incredibly hostile to a word you've chosen to view as bad, it sounds like telemetry is just a trigger word for you.

yannoninator · on May 27, 2021

The fact that this thing is open source is entirely irrelevant.

It's quite simple. You don't know what Github is doing privately with this tool running in their backend systems.

I still don't understand the praise and the excitement of telemetry (open or not) at all from engineers.

They may think that they are doing this 'tracing' or whatever that is, when in-fact it is still spying.

Again, can I opt-out of this 'telemetry' on GitHub? If I cannot and I did not agree to this, then this is very troublesome.

scrollaway · on May 27, 2021

> You don't know what Github is doing privately with this tool running in their backend systems.

What of it? You don't know what they're doing regardless. You're attacking a strawman, because of its name. Please for the love of everything, take a step back and understand what you're even arguing about.

YOU are not the target of said telemetry. Their frickin' servers are.

yannoninator · on May 28, 2021

> YOU are not the target of said telemetry. Their frickin' servers are.

Extremely naive claim.

What data do you think is included in their traces/metrics/logging with this telemetry tool? Do you know more of this?

You should know that a single blogpost doesn't tell you the entire story. What data are they collecting with this? Are these metrics/traces/data transparent? if not, why has GitHub not made this transparent?

Again, Am I able to "opt-out" of this telemetry on GitHub?

If I cannot, then they are spying. pure and simple.

yunohn · on May 26, 2021

I believe you (and many others here) have misunderstood the article. They are talking about telemetry on their backend systems to track performance metrics and the like. Not the usual telemetry that HN loves to argue about.

intricatedetail · on May 27, 2021

When I wrote that once MS gets GH it will turn onto another spying service and I got rained with downvotes. Looks like it's finally time to use something else. Shame the tech community is okay with such dystopian development.

Animats · on May 26, 2021

Does this put "telemetry" in Git itself? If you're just using git to access Github, is it snooping on you?

vbsteven · on May 26, 2021

No, OpenTelemetry is focussed on gathering metrics in distributed systems. Metrics, log aggregation and tracing of requests from the entrypoint (load balancer) all the way to the backend services and data stores.

Take a look at the Jaeger and Zipkin websites and it should be pretty clear what it is used for.

cogman10 · on May 26, 2021

And, it should be stated, that it isn't even really about gathering metrics. It is about providing a standard interface to gather metrics. (I know you know this, but with the confusion here, I figure I should go into details).

The point of OpenTelemitry is to make it so you could write the places you get your traces/metrics in one part of code and configure which backend system it is collected into in another part.

So, for example, you'd add a `trace("my slow thing"){ be slow }` into your code and later add `report to zipkin` or `report to Jaeger` in another part. The place where you trace "my slow thing" doesn't care about how to interact with the backend system or which backend vendor is ultimately used. You can start using prometheus, zipkin, new relic, Jaeger, whatever, just so long as they have an OpenTelemetry adapter you are golden.

The analog is SLF4J in Java.

stefan_ · on May 27, 2021

That makes zero sense. Of course accessing a GitHub remote with git means accessing a huge distributed system and involves "load balancer all the way to the backend services and data stores". If anything, that is the core of GitHub and the webui is bolted on.

It's baffling to me, what do people think happens when they pull/push from a GitHub repo?

vbsteven · on May 27, 2021

I interpreted the original question to be concerned about phone home telemetry in the git binary.

cosmojg · on May 26, 2021

Huh? The day GitHub has that much control over the Git project, the world of software as we know it will end.

GitHub ≠ Git. Far from it.

4f77616973 · on May 26, 2021

Add to this that it’s Microsoft owned.

mikece · on May 26, 2021

Are there any examples of companies making a contractually binding pledge to never use any telemetry data for anything than improving the application/service... NEVER for creating marketing profiles or for surfacing ads?

handrous · on May 26, 2021

They're not using "telemetry" as a euphemism for "shipping spyware to end users to record their actions on their own machine", in this case. This appears to be a very fancy log aggregation product that has also eaten logging/tracing related configuration deployment functionality.

cogman10 · on May 26, 2021

It's not even that. It's the interface that other logging/tracing apis can plug into. It's the "SLF4J" of tracing/metrics gathering.

Why is this useful? Because it lets people writing libraries provide the ability for users of their libraries to track performance information without dictating to them what performance tracking tools they should use.

wdb · on May 26, 2021

There is the web-packages in Javascript that creates traces for click events etc. but also the track how much time requests took on the client-side. Or render times of the UI e.g. Vue/React components

See: https://github.com/open-telemetry/opentelemetry-js-contrib/t...

postpawl · on May 26, 2021

In the context of this article - opentelemetry is for performance tracing on the backend. The call to the endpoint would already end up in access logs, and this would just give more detailed performance metrics on queries and function calls.

adamcstephens · on May 26, 2021

This is likely more in the realm of observability than what you’re thinking. eg metrics, logs and traces. Could you store marketing data in here? Sure, but that’s not really what opentelemetry is built for.

enriquto · on May 26, 2021

even if they did, that would't mean anything. They could easily construct an argument that surfacing ads serves to improve the application (with a few extra rethoric steps in between).

pdkl95 · on May 26, 2021

If a company is serious about this type of promise, they should take away their ability to change their mind in the future with a Ulysses pact[1]. See this[2] older post where for a bit more detail from a talk by Cory Doctorow.

>> The answer to not getting pressure from your bosses, your stakeholders, your investors or your members, to do the wrong thing later, when times are hard, is to take options off the table right now.

[1] https://en.wikipedia.org/wiki/Ulysses_pact

[2] https://news.ycombinator.com/item?id=20411018

enriquto · on May 26, 2021

How would that be implemented in practice, for the case of telemetry? The data is held by a third party? Couldn't they always "buy" that third party? Is there a standard Ulysses mechanism for data compartmentalization?

chippy · on May 26, 2021

It seems that, in the majority of cases, we are left with trust at the end of the day. They might say "we will never do X" but if someone buys them, then the "we" becomes something different, and the new company is free to do whatever. With github, it's that paragon of good behaviour Microsoft that controls things.

These promises need to be written down in the articles of association in a company, and I have never seen this.

sebow · on May 26, 2021

Considering GH is MS-owned, i'm even more reluctant to use it.

Anything that leaves your computer and enters hardware owned by MS should make you think twice about using a certain service/product.

Same goes for almost every other company out there.Also FOMO is not an argument, there are plenty of alternatives for almost everything out there.If it's comfort people are worried about then they should stick to iPads and ditch software-dev, because trends like these are why the quality of software has been declining drastically in the last decade.

These streamlined processes of using telemetry for the "product's good" are just a way to compensate that.

jetpks · on May 26, 2021

I think you have a misunderstanding of what OpenTelemetry is. It helps you understand what your backend software is doing with insights like how long database queries take to complete.