
I think you're misreading a singular opinion as occurring between two disparate points here.

The initial phrase was:

> doctors hurt people for years while learning to save them

It's then a separate reply from someone else about deaths from errors/malpractice.

So, nobody seems to be expressing the mentality you are, correctly, lambasting (at least so far as I've read in the comments). But, as it is relevant to the lessons we'd all want to pass back to ourselves (in my opinion, it's because we wish to hold ourselves accountable to said lessons for the future), let's address the elephant in the comment.

Normalising deaths, especially the preventable ones, is absolutely not something that anybody here is suggesting (so far as my read of the situation is).

Normalising responsibility, by recognising that we can cause harm and so do something about it, seems far more in line with the above comments.

As you say it yourself, there's a time and a place, which is undoubtedly what we hope to foster for those who are younger or at the start of learning any discipline, beyond programming, medicine, or engineering.

Nobody is saying that the death and loss is acceptable and normalised, but rather that in the real world we need a certain presence around the knowledge that they will occur, to a certain degree, regardless. So, we accept responsibility for what we can prevent, and strive to push the frontier of our capacity further with this in mind. For some, that comes from experience, unfortunately. For others, they can start to grapple with the notion in lieu of it through considering the consequences, and the scale of consequence; as the above comments would be evidence of, at least by implication.

These are not the only ways to develop a mindset of responsibility, fortunately, but that is what they can be, even if you find the literal wording to suggest otherwise. I cannot, of course, attest to the "true" feelings of others, but neither can anyone else... But in the spirit of the matter, consider: Your sense of responsibility, in turn, seems receptive to finding the areas by which such thinking can become justification for the very thing it would otherwise prevent, either as a shield purpose-built for the role or co-opted out of convenience. That too becomes integral, as we will always need to avoid complacency, and so must also promote this vigilance to become a healthy part of the whole for a responsible mindset -- lest we become that which we seek to prevent, and all that.

Exactly as you say, there's a greater problem, but this thinking is not necessarily justification for it, and can indeed become another tool to counter it. More responsibility, more mindfulness about our intents, actions, and consequences? That will prove indispensable for us to actually solve the greater problem, so we must appreciate that different paths will be needed to achieve it, after all, there are many different justifications for that problem which will all need to be robustly refuted and shown for what they are. Doing so won't solve the problem, but is rather one of many steps we will need to take.

Regardless, this mindfulness and vigilance about ourselves, as much as about each other, will be self-promoting through the mutual reinforcement of these qualities. If someone must attempt to visualise the staggering scale of consequence as part of developing this, then so be it. In turn, they will eventually grapple with this vigilance as well, as the responsibility behoves them to, else they may end up taking actions and having consequences that are equivalent to the exact mentality you fear, even if they do/did not actually "intend" to do so. The learning never ends, and the mistakes never will, so we must have awareness of the totality of this truth; even if only as best we can manage within our limited abilities.


> A recent Johns Hopkins study claims more than 250,000 people in the U.S. die every year from medical errors. It is ok to be an imposter.

Is what I was originally replying to, which, at least to me, does seem to imply preventable deaths are some kind of training tax during one's 'imposter stage'. Perhaps not the intention of the poster.

Thanks for your thoughtful reply.


Having ended up with a critical bug in the SAT solver I wrote for my undergrad thesis, I know it really can be a challenge to fix without clear logs. So, it's always nice to see a little love for contribution through issues and finding minimal ways to reproduce edge cases.

While we do mention how good issue contributions are significant and meaningful, we often forget there's more to them than the initial filing, and may overlook the contributions from those who join lengthier issue threads later.

(Oh, and yes, that critical bug did impact the undergrad thesis; it could be worked around, but it meant I couldn't show the full benefits of the solver.)


Many startups seem to aim for this; naturally it's difficult to put actual numbers to it, and I'm sure many pursue multiple aims in the hope one of them sticks. Since unicorns really just describe private valuation, it's the same as saying many aim to get stupendously wealthy. You can't put a number on that, but you can at least see it's a hope for many, though "goal" probably makes it seem like they've got actually achievable plans for it... That, at least, I'm not so convinced of.

Startups are, however, atypical of new businesses, hence the unicorn myth, and we see many attempts to follow that path which likely stand in the way of new businesses actually achieving the more real goals of, well, being a business: succeeding in their venture to produce whatever it is and reach their customers.

I describe it as a unicorn "myth" as it very much behaves in such a way, and is misinterpreted similarly to many myths we tell ourselves. Unicorns are rare and successful because they had the right mixture of novel business and the security of investment or buyouts. Startups purportedly are about new ways of doing business, however the reality is only a handful really explore such (e.g. if it's SaaS, it's probably not a startup), meaning the others are just regular businesses with known paths ahead (including, of course, following in the footsteps of prior startups, which really is self-refuting).

With that in mind, many of the "real" unicorns are realistically just highly valued new businesses (that got lucky and had fallbacks), as they are often not actually developing new approaches to business, whereas the mythical unicorns that startups want to be are half-baked ideas of how they'll achieve that valuation and wealth without much idea of how they do business (or that it can be fluid, matching their nebulous conception of it), just that "it'll come", especially with "growth".

There is no nominative determinism, and all that, so businesses may call themselves startups all they like; but if they follow the patterns of startups without the massive safety nets of support and circumstance many of the real unicorns had, then a failure to develop the business proper means they do indeed suffer, by not appreciating 5000 paying customers and instead aiming for "world domination", as it were, or acquisition (which, as an actual business venture, they typically don't "survive"). The studies have shown this really does contribute to the failure rate and instability of so-called startups, effectively due to not cutting it as businesses, far above the expected norm of new businesses...

So that pet peeve really is indicative of a much more profound issue that, indeed, seems to be a bit of an echo chamber blind spot with HN.

After all, if it ought to have worked all the time, reality would look very different from today. Just saying how many don't become unicorns (let alone the failure rate) doesn't address the dissonance from then concluding "but this time will be different". It also doesn't address the idea that you don't need to become a "unicorn", and maybe shouldn't want to either... but that's a line of thinking counter to the echo chamber, so I won't belabour it here.


So, from the perspective I have within the subfield I work in, explainable AI (XAI), we're seeing a bunch of fascinating developments.

First, as you mentioned, Rudin continues to prove that the reason for using AI/ML is that we don't understand the problem well enough; otherwise we wouldn't even think to use it! So, by pushing our focus to better understand the problem, and then leveraging ML concepts and techniques (including "classical AI" and statistical learning), we're able to make something that not only outperforms some state of the art in most metrics, but is often much less resource-intensive to create and deploy (in compute, data, energy, and human labour), with added benefits from direct interpretability and post-hoc explanations. One example has been the continued primacy of tree ensembles on tabular datasets [0], even for the larger datasets, though they truly shine on the small-to-medium datasets that actually show up in practice, which from Tigani's observations [1] would include most of those who think they have big data.
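
(Purely as an illustration of that tabular point, not code from [0]: a minimal scikit-learn sketch comparing a gradient-boosted tree ensemble against a small MLP on a toy tabular dataset; the dataset choice and hyperparameters are arbitrary.)

  # Illustrative only: small tabular benchmark in the spirit of [0].
  from sklearn.datasets import load_breast_cancer
  from sklearn.ensemble import HistGradientBoostingClassifier
  from sklearn.neural_network import MLPClassifier
  from sklearn.pipeline import make_pipeline
  from sklearn.preprocessing import StandardScaler
  from sklearn.model_selection import cross_val_score

  X, y = load_breast_cancer(return_X_y=True)

  models = {
      "gradient-boosted trees": HistGradientBoostingClassifier(random_state=0),
      "small MLP": make_pipeline(
          StandardScaler(),
          MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0),
      ),
  }

  for name, model in models.items():
      scores = cross_val_score(model, X, y, cv=5)
      print(f"{name}: mean accuracy {scores.mean():.3f}")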

Second, we're seeing practical examples of exactly this outside Rudin! In particular, people are using ML more to do live parameter fine-tuning that would otherwise need more exhaustive searches or human labour that is difficult for real-time feedback, or copious human ingenuity to resolve in a closed-form solution. Opus 1.5 is introducing some experimental work here, as are a few approaches in video and image encoding. These are domains where, as in the first, we understand the problem, but also understand well enough that there are search spaces we simply don't know enough about to be able to dramatically reduce. Approaches like this have been bubbling out of other sciences (physics, complexity theory, bioinformatics, etc), leading to some interesting work in distillation and extraction of new models from ML, or "physically aware" operators that dramatically improve neural nets, such as Fourier Neural Operators (FNO) [2], which embed FFTs rather than forcing them to be relearned (as has been found to often happen), for remarkable speed-ups with PDEs such as fluid dynamics, and which have already shown promise with climate modelling [3] and materials science [4]. There are also many more operators, which all work completely differently, yet bring human insight back to the problem, and sometimes lead to extracting a new model for us to use without the ML! Understanding begets understanding, so the "shifting goalposts" of techniques considered "AI" is a good thing!
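
(To give a flavour of what "physically aware" means here, a heavily simplified sketch of the spectral convolution idea behind FNOs [2] in PyTorch, my own illustration rather than the reference code: the FFT is a fixed, human-supplied operator, and only the mixing of a truncated set of Fourier modes is learned. Real FNOs add pointwise/residual paths, normalisation, and 2D/3D variants.)

  import torch
  import torch.nn as nn

  class SpectralConv1d(nn.Module):
      # Toy 1D FNO-style layer: FFT -> learned mixing of low modes -> inverse FFT.
      def __init__(self, channels, modes):
          super().__init__()
          self.modes = modes  # number of low-frequency modes kept (<= grid//2 + 1)
          scale = 1.0 / (channels * channels)
          self.weights = nn.Parameter(
              scale * torch.randn(channels, channels, modes, dtype=torch.cfloat))

      def forward(self, x):                      # x: (batch, channels, grid)
          x_ft = torch.fft.rfft(x)               # fixed operator: to the frequency domain
          out_ft = torch.zeros_like(x_ft)
          out_ft[..., :self.modes] = torch.einsum(
              "bim,iom->bom", x_ft[..., :self.modes], self.weights)
          return torch.fft.irfft(out_ft, n=x.size(-1))  # back to the spatial grid

  layer = SpectralConv1d(channels=8, modes=16)
  print(layer(torch.randn(4, 8, 64)).shape)      # torch.Size([4, 8, 64])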

Third, specifically on improvements in explainability, we've seen the Neural Tangent Kernel (NTK) [5] rapidly go from strength to strength since its introduction. While rooted in core explainability, vis-à-vis making neural nets more mathematically tractable to analysis, it has not only inspired other approaches [6] and behavioural understanding of neural nets [7, 8], but also novel ML itself [9], with ways to transfer the benefits of neural networks to far less resource-intensive techniques; [9]'s RFM kernel machine proves competitive with the best tree ensembles from [0], and even has an advantage on numerical data (plus outperforms prior NTK-based kernel machines). An added benefit is that the approach underpinning [9] itself leads to new interpretation and explanation techniques, similar to integrated gradients [10, 11] but perhaps more reminiscent of the idea in [6].
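
(For operational intuition about the NTK [5], a rough sketch of my own, not from the papers: the empirical kernel between two inputs is just the inner product of the parameter-gradients of the network output, computable directly with autograd for a small scalar-output net; the theory then studies the infinite-width limit, where this kernel stays essentially fixed during training.)

  import torch
  import torch.nn as nn

  def empirical_ntk(model, x1, x2):
      # Empirical NTK for a scalar-output model:
      # K[i, j] = <d f(x1_i)/d theta, d f(x2_j)/d theta>.
      # Only feasible for small models, but enough to illustrate the idea.
      def jacobian_rows(x):
          rows = []
          for xi in x:
              out = model(xi.unsqueeze(0)).squeeze()
              grads = torch.autograd.grad(out, list(model.parameters()))
              rows.append(torch.cat([g.reshape(-1) for g in grads]))
          return torch.stack(rows)               # (n_samples, n_params)
      return jacobian_rows(x1) @ jacobian_rows(x2).T

  net = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 1))
  x = torch.randn(5, 3)
  print(empirical_ntk(net, x, x).shape)          # torch.Size([5, 5])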

Finally, specific to XAI, we're seeing people actually deal with the problem that, well, people aren't really using this stuff! XAI in particular, yes, but also the myriad of interpretable models a la Rudin or the significant improvements found in hybrid approaches and reinforcement learning. Cicero [12], for example, does have an LLM component, but uses it in a radically different way compared to most people's current conception of LLMs (though, again, ironically closer to the "classic" LLMs for semantic markup), much like the AlphaGo series altered the way the deep learning component was utilised by embedding and hybridising it [13] (its successors obviating even the traditional supervised approach through self-play [14], and beyond Go). This is all without even mentioning the neurosymbolic and other approaches to embed "classical AI" in deep learning (such as RETRO [15]). Despite these successes, adoption of these approaches is still very far behind, especially compared to the zeitgeist of ChatGPT style LLMs (and general hype around transformers), and arguably much worse for XAI due to the barrier between adoption and deeper usage [16].

This is still early days, however, and again to harken Rudin, we don't understand the problem anywhere near well enough, and that extends to XAI and ML as problem domains themselves. Things we can actually understand seem a far better approach to me, but without getting too Monkey's Paw about it, I'd posit that we should really consider if some GPT-N or whatever is actually what we want, even if it did achieve what we thought we wanted. Constructing ML with useful and efficient inductive bias is a much harder challenge than we ever anticipated, hence the eternal 20 years away problem, so I just think it would perhaps be a better use of our time to make stuff like this, where we know what is actually going on, instead of just theoretically. It'll have a part, no doubt, Cicero showed that there's clear potential, but people seem to be realising "... is all you need" and "scaling laws" were just a myth (or worse, marketing). Plus, all those delays to the 20 years weren't for nothing, and there's a lot of really capable, understandable techniques just waiting to be used, with more being developed and refined every year. After all, look at the other comments! So many different areas, particularly within deep learning (such as NeRFs or NAS [17]), which really show we have so much left to learn. Exciting!

  [0]: Léo Grinsztajn et al. "Why do tree-based models still outperform deep learning on tabular data?" https://arxiv.org/abs/2207.08815
  [1]: Jordan Tigani "Big Data is Dead" https://motherduck.com/blog/big-data-is-dead/
  [2]: Zongyi Li et al. "Fourier Neural Operator for Parametric Partial Differential Equations" https://arxiv.org/abs/2010.08895
  [3]: Jaideep Pathak et al. "FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators" https://arxiv.org/abs/2202.11214
  [4]: Huaiqian You et al. "Learning Deep Implicit Fourier Neural Operators with Applications to Heterogeneous Material Modeling" https://arxiv.org/abs/2203.08205
  [5]: Arthur Jacot et al. "Neural Tangent Kernel: Convergence and Generalization in Neural Networks" https://arxiv.org/abs/1806.07572
  [6]: Pedro Domingos "Every Model Learned by Gradient Descent Is Approximately a Kernel Machine" https://arxiv.org/abs/2012.00152
  [7]: Alexander Atanasov et al. "Neural Networks as Kernel Learners: The Silent Alignment Effect" https://arxiv.org/abs/2111.00034
  [8]: Yilan Chen et al. "On the Equivalence between Neural Network and Support Vector Machine" https://arxiv.org/abs/2111.06063
  [9]: Adityanarayanan Radhakrishnan et al. "Mechanism of feature learning in deep fully connected networks and kernel machines that recursively learn features" https://arxiv.org/abs/2212.13881
  [10]: Mukund Sundararajan et al. "Axiomatic Attribution for Deep Networks" https://arxiv.org/abs/1703.01365
  [11]: Pramod Mudrakarta "Did the model understand the questions?" https://arxiv.org/abs/1805.05492
  [12]: Meta FAIR Diplomacy Team et al. "Human-level play in the game of Diplomacy by combining language models with strategic reasoning" https://www.science.org/doi/10.1126/science.ade9097
  [13]: David Silver et al. "Mastering the game of Go with deep neural networks and tree search" https://www.nature.com/articles/nature16961
  [14]: David Silver et al. "Mastering the game of Go without human knowledge" https://www.nature.com/articles/nature24270
  [15]: Sebastian Borgeaud et al. "Improving language models by retrieving from trillions of tokens" https://arxiv.org/abs/2112.04426
  [16]: Umang Bhatt et al. "Explainable Machine Learning in Deployment" https://dl.acm.org/doi/10.1145/3351095.3375624
  [17]: M. F. Kasim et al. "Building high accuracy emulators for scientific simulations with deep neural architecture search" https://arxiv.org/abs/2001.08055


Thank you for providing an exhaustive list of references :)

> Finally, specific to XAI, we're seeing people actually deal with the problem that, well, people aren't really using this stuff!

I am very curious to see which practical interpretability/explainability requirements enter into regulations - on the one hand it's hard to imagine a one-size-fits-all approach, especially for applications incorporating LLMs, but Bordt et al. [1] demonstrate that you can provoke arbitrary feature attributions for a prediction if you can choose post-hoc explanations and parameters freely, making a case that it can't _just_ be left to the model developers either.

[1] "Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts", Bordt et al. 2022, https://dl.acm.org/doi/10.1145/3531146.3533153


I think the situation with regulations will be similar to that with interpretability and explanations. There's a popular phrase that gets thrown around, that "there is no silver bullet" (perhaps most poignantly in AIX360's initial paper [0]), as no single explanation suffices (otherwise, would we not simply use that instead?) and no single static selection of them would either. What we need is to have flexible, adaptable approaches that can interactively meet the moment, likely backed by a large selection of well understood, diverse, and disparate approaches that cover for one another in a totality. It needs to interactively adapt, as the issue with the "dashboards" people have put forward to provide such coverage is that there are simply too many options and typical humans cannot process it all in parallel.
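
(A tiny illustration of the "no silver bullet" point, not taken from [0]: two standard post-hoc views of the same scikit-learn model, impurity-based and permutation-based feature importances, can already rank features differently, and neither replaces an interpretable model.)

  from sklearn.datasets import load_breast_cancer
  from sklearn.ensemble import RandomForestClassifier
  from sklearn.inspection import permutation_importance
  from sklearn.model_selection import train_test_split

  X, y = load_breast_cancer(return_X_y=True)
  X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

  model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

  # Explanation 1: impurity-based importances (computed from the training data).
  impurity_rank = model.feature_importances_.argsort()[::-1][:5]

  # Explanation 2: permutation importances (computed on held-out data).
  perm = permutation_importance(model, X_te, y_te, n_repeats=10, random_state=0)
  perm_rank = perm.importances_mean.argsort()[::-1][:5]

  print("top features by impurity:   ", impurity_rank)
  print("top features by permutation:", perm_rank)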

So, it's an interesting unsolved area for how to put forward approaches that aren't quite one-size fits all, since that doesn't work, but also makes tailoring it to the domain and moment tractable (otherwise we lose what ground we gain and people don't use it again!)... which is precisely the issue that regulation will have to tackle too! Having spoken with some people involved with the AI HLEG [1] that contributed towards the AI Act currently processing through the EU, there's going to have to be some specific tailoring within regulations that fit the domain, so classically the higher-stakes and time-sensitive domains (like, say, healthcare) will need more stringent requirements to ensure compliance means it delivers as intended/promised, but that it's not simply going to be a sliding scale from there, and too much complexity may prevent the very flexibility we actually desire; it's harder to standardise something fully general purpose than something fitted to a specific problem.

But perhaps that's where things go hand in hand. An issue currently is the lack of standardisation, in general; it's unreasonable to expect people to re-implement these things on their own given the mathematical nuance, yet many of my colleagues agree it's usually the most reliable way. Things like scikit had an opportunity, sitting as a de facto interface for the basics, but niche competitors then grew and grew, many of which simply ignored it. Especially with things like [0], there are a bunch of wholly different "frameworks" that cannot intercommunicate except by someone knuckling down and fudging some dataframes or ndarrays, and that's just within Python, let alone those in R (and there are many) or C++ (fewer, but notable). I'm simplifying somewhat, but it means that plenty of isolated approaches simply can't work together, meaning model developers may not have much choice but to use whatever batteries are available! Unlike, say, Matplotlib, I don't see much chance for declarative/semi-declarative layers to take over here, such as pyplot and seaborn could, which enabled people to empower everything backed by Matplotlib "for free" with downstream benefits such as enabling intervals or live interaction with a lower-level plugin or upgrade. After all, scikit was meant to be exactly this for SciPy! Everything else like that is generally focused on either models (e.g. Keras) or explanations/interpretability (e.g. Captum or Alibi).

So it's going to be a real challenge figuring out how to get regulations that aren't so toothless that people don't bother or are easily satisfied by some token measure, but also don't leave us open to other layers of issues, such as adversarial attacks on explanations or developer malfeasance. Naturally, we don't want something easily gamed that the ones causing the most trouble and harm can just bypass! So I think there's going to have to be a bit of give and take on this one, the regulators must step up while industry must step down, since there's been far too much "oh, you simply must regulate us, here, we'll help draft it" going around lately for my liking. There will be a time for industry to come back to the fore, when we actually need to figure out how to build something that satisfies, and ideally, it's something we could engage in mutually, prototyping and developing both the regulations and the compliant implementations such that there are no moats, there's a clearly better way to do things that ultimately would probably be more popular anyway even without any of the regulatory overhead; when has a clean break and freshening up of the air not benefited? We've got a lot of cruft in the way that's making everyone's jobs harder, to which we're only adding more and more layers, which is why so many are pursuing clean-ish breaks (bypass, say, PyTorch or Jax, and go straight to new, vectorised, Python-ese dialects). The issue is, of course, the 14 standards problem, and now so many are competing that the number only grows, preventing the very thing all these intended to do: refresh things so we can get back to the actual task! So I think a regulatory push can help with that, and that industry then has the once-in-a-lifetime chance to then ride that through to the actual thing we need to get this stuff out there to millions, if not billions, of people.

A saying keeps coming back to mind for me: all models are wrong, some are useful. (Interpretable) AI, explanations, regulations, they're all models, so of course they won't be perfect... if they were, we wouldn't have this problem to begin with. What it all comes back to is usefulness. Clearly, we find these things useful, or we wouldn't have them, necessity being the mother of invention and all, but then we must actually make sure what we do is useful. Spinning wheels inventing one new framework after the next doesn't seem like that to me. Building tools that people can make their own, but know that no matter what, a hammer is still a hammer, and someone else can still use it? That seems much more meaningful of an investment, if we're talking the tooling/framework side of things. Regulation will be much the same, and I do think there are some quite positive directions, and things like [1] seem promising, even if only as a stop-gap measure until we solve the hard problems and have no need for it any more -- though they're not solved yet, so I wouldn't hold out for such a thing either. Regulations also have the nice benefit that, unlike much of the software we seem to write these days, they're actually vertically and horizontally composable, and different places and domains at different levels have a fascinating interplay and cross-pollination of ideas, sometimes we see nation-states following in the footsteps of municipalities or towns, other times a federal guideline inspires new institutional or industrial policies, and all such combinations. Plus, at the end of the day, it's still about people, so if a regulation needs fixing, well, it's not like you're trying to change the physics of the universe, are you?

  [0]: Vijay Arya et al. "One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques" https://arxiv.org/abs/1909.03012
  [1]: High-Level Expert Group on AI "Ethics Guidelines for Trustworthy AI"
  Apologies, will have to just cite those, since while there are some papers associated with the others, it's quite late now, so I hope the recognisable names suffice.


Thanks a lot. I love the whole XAI movement, as it often forces you to think about the cliffs, limits, and non-linearity of the methods. It makes you circle back to an engineering process of thinking about specification and qualification of your black box.


Thank you! especially for the exhaustive reading list!!


Those are numbers from 7 years ago, so they're beginning to get a bit stale as people start to put more weight behind having frame pointers and make upstream contributions to their compilers to improve their output. Much more recent testing puts it at <1%: by the very R.W.M. Jones you're replying to [0], and separately by others like Brendan Gregg [1b], whose post this is commenting on (and who included [1b] in the Appendix as well), with similar accounts from others in the last couple of years. Oh, and if you use flamegraph, you might want to check the repo for a familiar name.

Some programs, like Python, have reported worse, 2-7% [2], but there is traction on tackling that [1a] (see both rwmj's and brendangregg's replies to sibling comments, they've both done a lot of upstreamed work wrt. frame pointers, performance, and profiling).

As has been frequently pointed out, the benefits from improved profiling cannot be overstated; even a 10% cost to having frame pointers can be well worth it when you leverage that information to target the actual bottlenecks that are eating up your cycles. Plus, you can always disable it in specific hotspots later when needed, which is much easier than the reverse.

Something, something, premature optimisation -- though in seriousness, this information benefits actual optimisation, exactly because we don't have the information and understanding that would allow truly universal claims, precisely because things like this haven't been available, and so haven't been widely used. We know frame pointers, from additional register pressure and extended function prologue/epilogue, can be a detriment in certain hotspots; that's why we have granular control. But without them, we often don't know which hotspots are actually affected, so I'm sure even the databases would benefit... though the "my database is the fastest database" problem has always been the result of endless micro-benchmarking, rather than actual end-to-end program performance and latency, so even a claimed "10%" drop there probably doesn't impact actual real-world usage, but that's a reason why some of the most interesting profiling work lately has been from ideas like causal profilers and continuous profilers, which answer exactly that.

  [0]: https://rwmj.wordpress.com/2023/02/14/frame-pointers-vs-dwar...
  [1a]: https://pagure.io/fesco/issue/2817#comment-826636
  [1b]: https://pagure.io/fesco/issue/2817#comment-826805
  [2]: https://discuss.python.org/t/the-performance-of-python-with-...


While improved profiling is useful, achieving it by wasting a register is annoying, because it is just a very dumb solution.

The choice Intel made when they designed the 8086, to use 2 separate registers for the stack pointer and for the frame pointer, was a big mistake.

It is very easy to use a single register as both the stack pointer and the frame pointer, as it is standard for instance in IBM POWER.

Unfortunately in the Intel/AMD CPUs using a single register is difficult, because the simplest implementation is unreliable since interrupts may occur between 2 instructions that must form an atomic sequence (and they may clobber the stack before new space is allocated after writing the old frame pointer value in the stack).

It would have been very easy to correct this in new CPUs by detecting that instruction sequence and blocking the interrupts between them.

Intel had already done this once early in the history of the x86 CPUs, when they discovered a mistake in the design of the ISA: interrupts could occur between updating the stack segment and the stack pointer. They corrected this by detecting such an instruction sequence and blocking the interrupts at the boundary between those instructions.

The same could have been done now, to enable the use of the stack pointer as also the frame pointer. (This would be done by always saving the stack pointer in the top of the stack whenever stack space is allocated, so that the stack pointer always points to the previous frame pointer, i.e. to the start of the linked list containing all stack frames.)


I'd prefer discussing the technical merits of a given approach rather than who is who and who did what, since that leads to an appeal-to-authority fallacy.

You're correct, results might be stale, although I wouldn't hold my breath for it, since there has been no fundamental change in the way frame pointers are handled, as far as my understanding goes. Perhaps smaller improvements in compiler technology, but CPUs did not undergo any significant change in that context.

That said, nowhere in this thread have we seen a dispute of those Linux kernel results other than categorically rejecting them as being "microbenchmarks", which they are not.

> though the "my database is the fastest database" problem has always been the result of endless micro-benchmarking, rather than actual end-to-end program performance and latency

Quite the opposite. All database benchmarks are end-to-end program performance and latency analysis. "Cheating" in database benchmarks is done elsewhere.


> As has been frequently pointed out, the benefits from improved profiling cannot be overstated; even a 10% cost to having frame pointers can be well worth it when you leverage that information to target the actual bottlenecks that are eating up your cycles.

Few can leverage that information because the open source software you are talking about lacks telemetry in the self hosted case.

The profiling issue really comes down to the cultural opposition in these communities to collecting telemetry and opening it up for anyone to see and use. The average user struggles to ally with a trustworthy actor who will freely share information like profiles and anonymize it at a per-user level, the level that is actually useful. Such things exist, like the Linux hardware site, but only because they have not attracted the attention of agitators.

Basically users are okay with profiling, so long as it is quietly done by Amazon or Microsoft or Google, and not by the guy actually writing the code and giving it out for everyone to use for free. It’s one of the most moronic cultural trends, and blame can be put squarely on product growth grifters who equate telemetry with privacy violations; open source maintainers, who have enough responsibilities as is, besides educating their users; and Apple, who have made their essentially vaporous claims about privacy a central part of their brand.

Of course people know the answer to your question. Why doesn’t Google publish every profile of every piece of open source software? What exactly is sensitive about their workloads? Meta publishes a whole library about every single one of its customers, for anyone to freely read. I don’t buy into the holiness of the backend developer’s “cleverness” or whatever is deemed sensitive, and it’s so hypocritical.


> Basically users are okay with profiling, so long as it is quietly done by Amazon or Microsoft or Google, and not by the guy actually writing the code and giving it out for everyone to use for free.

No; the groups are approximately "cares whether software respects the user, including privacy", or "doesn't know or doesn't care". I seriously doubt that any meaningful number of people are okay with companies invading their privacy but not smaller projects.


"Agitators". We don't trust telemetry precisely because of comments like that. World is full of people like you who apparently see absolutely nothing wrong with exfiltrating identifying information from other people's computers. We have to actively resist such attempts, they are constant, never ending and it only seems to get worse over time but you dismiss it all as "cultural opposition" to telemetry.

For the record I'm NOT OK with being profiled, measured or otherwise studied in any way without my explicit consent. That even extends to the unethical human experiments that corporations run on people and which they euphemistically call A/B tests. I don't care if it's Google or a hobbyist developer, I will block it if I can and I will not lose a second of sleep over it.


> The world is full of people like you who apparently see absolutely nothing wrong with exfiltrating identifying information from other people's computers.

True. But such people are like cockroaches. They know what they are doing will be unpopular with their targets, so they keep it hidden. This is easy enough to do in closed designs: car manufacturers selling your driving habits to insurance companies, and health-monitoring apps selling menstrual-cycle data to retailers selling to women.

Compare that to, say, Debian and Red Hat. They too collect performance data. But the code is open source, Debian has reproducible builds so you can be 100% sure that is the code in use, and every so often someone takes a look at it. Guess what: the data they send back is so unidentifiable it satisfies even the most paranoid of their thousands of members.

All it takes is a little bit of sunlight to keep the cockroaches at bay, and then we can safely let the devs collect the data they need to improve code. And everyone benefits.


I fully support the schemes for telemetry that already happens. It’s just obnoxious that there’s no apparent reason behind the support of one kind of anonymous telemetry versus another. For every one person who digests the package repo’s explanation of its telemetry strategy, there are 19 who feel okay with GitHub having all the telemetry, and none of the open source repos it hosts having any, because of vibes.


I think the kind of profiling information you're imagining is a little different from what I am.

Continuous profiling of your system that gets relayed to someone else by telemetry is very different from continuous profiling of your own system, handled only by yourself (or, generalising, your community/group/company). You seem to be imagining we're operating more in the former, whereas I am imagining more in the latter.

When it's our own system, better instrumented for our own uses, and we're the only ones getting the information, then there's nothing to worry about, and we can get much more meaningful and informative profiling done when more information about the system is available. I don't even need telemetry. When it's "someone else's" system, in other words, when we have no say in telemetry (or have to exercise a right to opt-out, rather than a more self-executing contract around opt-in), then we start to have exactly the kinds of issues you're envisaging.

When it's not completely out of our hands, then we need to recognise different users, different demands, different contexts. Catering to the user matters, and when it comes to sensitive information, well, people have different priorities and threat models.

If I'm opening a calendar on my phone, I don't expect it to be heavily instrumented and relaying all of that, I just want to see my calendar. When I open a calendar on my phone, and it is unreasonably slow, then I might want to submit relevant telemetry back in some capacity. Meanwhile, if I'm running the calendar server, I'm absolutely wanting to have all my instrumentation available and recording every morsel I reasonably can about that server, otherwise improving it or fixing it becomes much harder.

From the other side, if I'm running the server, I may want telemetry from users, but if it's not essential, then I can "make do" with only the occasional opt-in telemetry. I also have other means of profiling real usage, not just scooping it all up from unknowing users (or begrudging users). Those often have some other "cost", but in turn, they don't have the "cost" of demanding it from users. For people to freely choose requires acknowledging the asymmetries present, and that means we can't just take the path of least resistance, as we may have to pay for it later.

In short, it's a consent issue. Many violate that, knowingly, because they care not for the consequences. Many others don't even seem to think about it, and just go ahead regardless. And it's so much easier behind closed doors. Open source in comparison, even if not everything is public, must contend with the fact that the actions and consequences are (PRs, telemetry traffic, etc), so it inhabits a space in which violating consent is much more easily held accountable (though no guarantee).

Of course, this does not mean it's always done properly in open source. It's often an uphill battle to get telemetry that's off-by-default, where users explicitly consent via opt-in, as people see how that could easily be undermined, or later invalidated. Many opt-in mechanisms (e.g. a toggle in the settings menu) often do not have expiration built in, so fail to check at a later point that someone still consents. Not to say that's the way you must do it, just giving an example of a way that people seem to be more in favour of, as with the generally favourable response to such features making their way into "permissions" on mobile.

We can see how the suspicion creeps in, informed by experience... but that's also known by another word: vigilance.

So, users are not "okay" with it. There's a power imbalance where these companies are afforded the impunity because many are left to conclude they have no choice but to let them get away with it. That hasn't formed in a vacuum, and it's not so simple that we just pull back the curtain and reveal the wizard for what he is. Most seem to already know.

It's proven extremely difficult to push alternatives. One reason is that information is frequently not ready-to-hand for more typical users, but another is that said alternatives may not actually fulfil the needs of some users: notably, accessibility remains hugely inconsistent in open source, and is usually not funded on par with, say, projects that affect "backend" performance.

The result? Many people just give their grandma an iPhone. That's what's telling about the state of open source, and of the actual cultural trends that made it this way. The threat model is fraudsters and scammers, not nation-state actors or corporate malfeasance. This app has tons of profiling and privacy issues? So what? At least grandma can use it, and we can stay in contact, dealing with the very real cultural trends towards isolation. On a certain level, it's just pragmatic. They'd choose differently if they could, but they don't feel like they can, and they've got bigger worries.

Unless we do different, the status quo will remain. If there's any agitation to be had, it's in getting more people to care about improving things and then actually doing them, even if it's just taking small steps. There won't be a perfect solution that appears out of nowhere tomorrow, but we only have a low bar to clear. Besides, we've all thought "I could do better than that", so why not? Why not just aim for better?

Who knows, we might actually achieve it.


[flagged]


Telemetry is exceedingly useful, and it's basically a guaranteed boon when you operate your own systems. But telemetry isn't essential, and it's not the heart of the matter I was addressing. Again, the crux of this is consent, as an imbalance of power easily distorts the nature of consent.

Suppose Chrome added new telemetry, for example like it did when WebRTC was added in Chrome 28, so we can track this against something we're all familiar enough with. When a user clicks "Update", or it auto-updates and "seamlessly" switches version in the background / between launches, well, did the user consent to the newly added telemetry?

Perhaps most importantly: did they even know? After all, the headline feature of Chrome 28 was Blink, not some feature that had only really been shown off in a few demos and was still a little while away from mass adoption. No reporting on Chrome 28 that I could find from the time even mentions WebRTC, despite entire separate articles going out just based on seeing WebRTC demos! Notifications got more coverage.

So, capabilities to alter software like this can, knowingly or unknowingly, undermine the nature of consent that many find implicit in downloading a browser, since what you download and what you end up using may be two very different things.

Now, let's consider a second imbalance. Did you even download Chrome? Most Android devices have it preinstalled, or some similar "open-core" browser (often a Chromium derivative). Some are even protected from being uninstalled, so you can't opt out that way, and Apple only just had to open up iOS to non-Safari-backed browsers.

So the notion of consent via the choice to install is easily undermined.

Lastly, because we really could go on all day with examples, what about when you do use it? Didn't you consent then?

Well, they may try to onboard you, and have you pretend to read some EULA, or just have it linked and give up the charade. If you don't tick the box for "I read and agree to this EULA", you don't progress. Of course, this is hardly a robust system. Enforceability aside, the moment you hand it over to someone else to look at a webpage, did they consent to the same EULA you did?

... Basically, all the "default" ways to consider consent are nebulous, potentially non-binding, and may be self-defeating. After all, you generally don't consent to every single line of code, every single feature, and so on, you are usually assumed to consent to the entire thing or nothing. Granularity with permissions has improved that somewhat, but there is usually still a bulk core you must accept before everything else; otherwise the software is usually kept in a non-functional state.

I'm not focused too specifically on Chrome here, but rather the broad patterns of how user consent typically assumed in software don't quite pan out as is often claimed. Was that telemetry the specific reason why libwebrtc was adopted by others? I'm not privy to the conversations that occurred with these decisions, but I imagine it's more one factor among many (not to mention, Pion is in/for Go, which was only 4 years old then, and the pion git repo only goes back to 2018). People were excited out of the gate, and libwebrtc being available (and C++) would have kept them in-step (all had support within 2013). But, again, really this is nothing to do with the actual topic at hand, so let's not get distracted.

The user has no opportunity to meaningfully consent to this. Ask most people about these things, and they wouldn't even recognise the features by now (as WebRTC or whatever is ubiquitous), let alone any mechanisms they may have to control how it engages with them.

Yet, the onus is put on the user. Why do we not ask about anything/anyone else in the equation, or consider what influences the user?

A recent example I think illustrates the imbalance and how it affects and warps consent is the recent snafu with a vending machine with limited facial recognition capabilities. In other words, the vending machine had a camera, ostensibly to know when to turn on or not and save power. When this got noticed at a university, it was removed, and everyone made a huge fuss, as they had not consented to this!

What I'd like to put in juxtaposition with that is how, in all likelihood, this vending machine was probably being monitored by CCTV, and even if not, that there is certainly CCTV at the university, and nearby, and everywhere else for that matter.

So what changed? The scale. CCTV everywhere does not feel like something you can, individually, do anything about; the imbalance of power is such that you have no recourse if you did not consent to it. A single vending machine? That scale and imbalance has shifted, it's now one machine, not put in place by your established security contracts, and not something ubiquitous. It's also something easily sabotaged without clear consequence (students at the university covered the cameras of it quite promptly upon realising), ironically, perhaps, given that this was not their own property and potentially in clear view of CCTV, but despite having all the same qualities as CCTV, the context it embedded in was such that they took action against it.

This is the difference between Chrome demanding user consent and someone else asking for it. When the imbalance of power is against you, even just being asked feels like being demanded, whereas when it doesn't quite feel that way, well, users often take a chance to prevent such an imbalance forming, and so work against something that may (in the case of some telemetry) actually be in their favour. However, part and parcel with meeting user needs is respecting their own desires -- as some say, the customer is always right in matters of taste.

To re-iterate myself from before, there are other ways of getting profiling information, or anything you might relay via telemetry, that do not have to conform to the Google/Meta/Amazon/Microsoft/etc model of user consent. They choose the way they do because, to them, it's the most efficient way. At their scale, they get the benefits of ubiquitous presence and leverage of the imbalance of power, and so what you view as your system, they view as theirs, altering with impunity, backed by enough power to prevent many taking meaningful action to the contrary.

For the rest of us, however, that might just be the wrong way to go about it. If we're trying to avoid all the nightmares that such companies have wrought, and to do it right by one another, then the first step is to evaluate how we engage with users, what the relationship ("contract") we intend to form is, and how we might inspire mutual respect.

In ethical user studies, users are remunerated for their participation, and must explicitly give knowing consent, with the ability to withdraw at any time. Online, they're continually A/B tested, frequently without consent. On one hand, the user is placed in control, informed, and provided with the affordances and impunity to consent entirely according to their own will and discretion. On the other, the user is controlled, their agency taken away by the impunity of another, often without the awareness that this is ongoing, or that they might have been able to leverage consent (and often ignored even if they did, after all, it's easy to do so when you hold the power). I know which I'd rather be on the other end of, at least personally speaking.

So, if we want to enable telemetry, or other approaches to collaborating with users to improve our software, then we need to do just that. Collaborate. Rethink how we engage, respect them, respect their consent. It's not just that we can't replicate Google, but that maybe we shouldn't, maybe that approach is what's poisoned the well for others wanting to use it, and what's forcing us to try something else. Maybe not, after all, that's not for us to judge at this point, it's only with hindsight that we might truly know. Either way, I think there's some chance for people to come in, make something that actually fits with people, something that regards them as a person, not simply a user, and respects their consent. Stuff like that might start to shift the needle, not by trying to replace Google or libwebrtc or whatever and get the next billion users, but by paving a way and meeting the needs of those who need it, even if it's just a handful of customers or even just friends and family. Who knows, we might start solving some of the problems we're all complaining about yet never seem to fix. At the very least, it feels like a breath of fresh air.


You’re agreeing with me.

> Rethink how we engage, respect them, respect their consent.

One way to characterize this is leadership. Most open source software authors are terrible leaders!

You’re way too polite. Brother, who is making a mistake and deserves blame? Users? Open source non corporate software maintainers? Google employees? Someone does. It can’t be “we.” I don’t make any of these mistakes, leave me out of it! I tell every non corporate open source maintainer to add basic anonymized telemetry, PR a specific opt-out solution with my preferred Plausible, and argue relentlessly with users to probe the vapors they base their telemetry fears on. We’re both trying to engage on the issue, but the average HN reader is downvoting me. Because “vibes.” Vibes are dumb! Just don’t be afraid to say it.


> it's quite impressive for a single author to have a functional, fast language with a working garbage collector and arena allocator (with some issues) in only a few years.

As the included code shows, the gc is boehm gc, and checking their repo shows they just include libgc/bdwgc. This is absolutely not a knock against anyone here, it's just about the standard library for this need, and I think it is a far smarter move to use it than for most to attempt to make their own general-purpose gc (though boehm can't catch all leaks).

I feel it would be wrong, however, to characterise this as being a single author having made a language with a gc and arenas, as if those were significant parts of the author's own developments, rather than using a well-picked import and a half-baked implementation of "arenas", which here are really just a global linked list of buffers, freed only at exit, and so everything leaks [0]. They're not really arenas, you can't use them locally in a region of code or as scratchpads, let alone multi-thread it. By their code's own admission, it's just a little pre-allocation to batch mallocs for all the little heap allocations of a GC-assuming codebase, so they're not really arenas like you'd use in C or elsewhere.

Not unimpressive, it's a valid approach for some uses (though not general purpose), it's just different from a language with its own gc and actual arenas. Indeed, just implementing an arena barely even registers in the complexity, I feel, as arenas really should be very simple in most use cases [1]. It would be far more impressive to have them actually integrated and be available as a true alternative to a GC for memory management, particularly integrating common patterns (e.g. [2]) in a way that could serve as a drop-in replacement, such that we can actually provide bounded lifetimes and memory safety without a full GC, let alone support multiple concurrencies with it from multi-threading to coroutines -- this would likely still be unsafe without a GC compared to, say, Rust or Vale, let alone Pony or SPARK, and would likely require a cultural shift in manual management akin to Zig or Odin, as it may be largely moot if dependencies end up enforcing the gc themselves. Still, again, making anything substantial is never unimpressive, we just need to retain the perspective of what was achieved and how.

As to the rest, well, I think it's fair to say that there should be a clear delineation between statements of "we can do this and here's how" and roadmaps with "we're aiming to do these things and here's our current progress". In my experience, people are quick to get these mixed up when they're excited about making something, and none of us are fully immune to this. It's not some moral failing or anything in and of itself, it can very easily be an honest mistake, but humans see patterns everywhere, so we often need to be receptive when others are trying to help us be level-headed and clear things up; otherwise a reputation begins to form. Especially in this industry, reproducibility matters, as we're all too familiar with the smoke-and-mirrors of demos (not to personally claim there is any here, just that it obviously helps dispel such concerns).

And, of course, second chances are always offered if someone is willing to amend mistakes.

[0]: prealloc.c.v is barely over 100 lines long and quite manageable, https://github.com/vlang/v/blob/master/vlib/builtin/prealloc...

[1]: Chris Wellons, "Arena allocator tips and tricks", https://nullprogram.com/blog/2023/09/27/

[2]: Ryan Fleury, "Untangling Lifetimes: The Arena Allocator", https://www.rfleury.com/p/untangling-lifetimes-the-arena-all...


Personally, whatever helps with the specific writing part of it all the most is what's best. If you find writing in a given dialect of Markdown or LaTeX or Org-mode is easiest, do that. For me, that's Markdown with embedded LaTeX, for others it's Org-mode, or RST, and so on.

Pandoc handles these fairly seamlessly, and with many options for PDF engines, though I'd say it has a preference for LaTeX and HTML in the backend and Markdown in the frontend, based on my experiences with the edge cases (sometimes entirely solvable with a little Haskell or Lua).

Since LaTeX is the default for PDFs, it pays to keep that in mind and help LaTeX help you (you can use it inline with Markdown or included as preamble in configuration), but sometimes I've just had better luck converting via HTML to PDF ("-t html -o output.pdf" or directly chaining on from output.html) for what I'm writing in the moment, though other times I'm not stressing LaTeX as much and can just go straight from Markdown to PDF (for example, just writing up something with inline maths). I prefer to avoid LaTeX or HTML's escaped character encoding and often need far more than a single Latin font can provide, so I've ended up dealing with LaTeX's limitations here (even in lualatex and xelatex) more than what I'd suspect is typical. Meanwhile, the standard HTML to PDF backend uses Qt, and I've found it works for everything else I've needed when LaTeX isn't the right backend (and it does come up). On one occasion, I did have to switch that to weasyprint, and that was everything sorted. Alternative backends are an unsung power that few tools have, while pandoc not only has many built in (or is at least internally aware of them) but will also integrate with any CLI needed.

Output to all three of HTML, EPUB, and PDF can just need a bit of fiddling before it comes out right, depending on how much you're willing to mess with specific metadata for each versus accepting the limits of what Pandoc can handle universally in its AST. Invariably, some compromise is required, but the core semantics of Markdown (including extensions) almost always translate without an issue. The dialect problem of Markdown is really just in the confluence of said semantics with things that have not been separately included, such as the lack of an actual header in Markdown (Pandoc here allows a YAML metadata block for some of it, or you just fall back to HTML).

So, tl;dr: there's no "best" input format, except the one that you find most comfortable to just write the book in, but I find Pandoc is usually best approached from Markdown with the LaTeX or HTML backends. It's powerful and oh so very handy, but it's not going to do all the thinking for you, just a lot of the grunt work, same as any other tool. When in doubt, the user manual is quite readable, and I've found it answered almost every question I had. When it doesn't, other people do, and when they don't, it means I'm either going about it the wrong way or I get to solve an actual problem (but usually the former). But, as always, the most important thing is actually writing it, distribution comes later, so focus your efforts on that and the tools you need to do that effectively.


> If you find writing in a given dialect of Markdown or LaTeX or Org-mode is easiest, do that.

I find Org-mode the easiest but like I said in my comment, the conversion quality is not great. Pandoc breaks a lot of stuff in Org-mode in edge cases. One example I shared in my comment was Pandoc breaking internal links.

So by selecting something I find the easiest I have burned many hours of troubleshooting figuring out why the output does not look right.

That's why I want to draw upon the wisdom of the community here to find out which input format works best, and by best I mean flawlessly. No edge-case issues. No rendering flaws. If I get specific recommendations, I'll try them out for some time and then commit myself to one instead of burning more time trialling all of the different input formats.


Unfortunately, the perfect is very much the enemy of the good here. Aside from HTML, I'm afraid that PDF and EPUB are very much driven by purpose-built tools designed to show interactively what it will look like as output. This means that they've both delved into a depth of subtle semantic differences that makes flawless output an extremely difficult task. Of course, practically, pandoc can resolve the vast majority of what people actually use, but everything will still be hit by edge cases from time to time, leading to subtle issues or incompatibilities between EPUB, PDF, and HTML. Each edge case can, of course, be solved in isolation, so finding something that's solved the ones you are encountering already is the ideal, providing a seamless experience for your work. Sadly, each of those is built to solve someone else's specific work, and so sometimes we just have to accept that we either need to compromise on something, we need to paper over the gaps by combining the right tools, or we have to write something ourselves. Fortunately, it isn't the 80s anymore, so many of the tools we have are the "right" ones, and pandoc is very good at combining them.

Again, I find that Markdown (with inline LaTeX or HTML) seems to be Pandoc's preferred starting point, and that the HTML backends are quite useful (particularly when not needing full LaTeX), so perhaps there's some luck to be had there, since HTML may preserve Org's linking and such a bit better, though I don't use Org myself so can't attest to it. And if there's really a problem, then perhaps Pandoc needs some help sorting Org-mode out!


Riffing on crafting pipelines by combining tools...

Org mode can also export html and markdown, so that's three potential pandoc inputs, with potentially different properties. All of which might be massaged before input. And in extremity, an org-mode parser permits emitting customized input. Then pandoc's parsing and filters permit altering the pandoc ast in flight. And the ast isn't hard (assuming comfort with ASTs), so if some other tool has templates and output one likes, one might skip the pandoc backend and emit it oneself from pandoc ast json. Rather than hoping to persuade that other tool to both accept and generate what's needed.

So for instance, last year I had a project written in a project-specific markdown dialect, kludged to pandoc-flavored markdown, parsed with `pandoc -t json`, and html emitted custom from the pandoc ast. With embedded directives from dialect to emitter. And html templates copied from non-pandoc tools. In a language with nice pattern matching (julia's Match), the emitter was a short page of code.
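
(A small, hypothetical sketch of that kind of pipeline, with an invented file name: shell out to `pandoc -t json`, treat the AST as plain JSON, and emit whatever you like, here just listing the headers. Assumes pandoc is on PATH.)

  import json
  import subprocess

  # Parse any format pandoc can read into its JSON AST; "notes.md" is made up.
  result = subprocess.run(
      ["pandoc", "-t", "json", "notes.md"],
      capture_output=True, text=True, check=True)
  ast = json.loads(result.stdout)

  def plain_text(inlines):
      # Very rough inline flattening: keep strings and spaces, ignore the rest.
      parts = []
      for el in inlines:
          if el["t"] == "Str":
              parts.append(el["c"])
          elif el["t"] == "Space":
              parts.append(" ")
      return "".join(parts)

  # The AST has a "blocks" list; each block carries "t" (type) and "c" (contents).
  for block in ast["blocks"]:
      if block["t"] == "Header":
          level, _attr, inlines = block["c"]
          print("#" * level, plain_text(inlines))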

"Avoid reinventing wheels, but sometimes it's easier to assemble a satisficing custom vehicle, than to find and adapt a previously-built one."


Great comment! Thanks for engaging in this discussion and offering some good perspective about my Pandoc issues. Really appreciate it!


It uses JavaScriptCore, which is part of WebKit and licensed under the LGPL.



The add_zero_attn parameter in PyTorch [1] is used for this, but by default their softmax is the regular kind. It has been in flaxformer for a couple of years now, though it claims to be a compatibility variant for older models [2], and I haven't seen any mention of it in their recent papers (though I've not checked exhaustively).

  [1]: https://pytorch.org/docs/stable/generated/torch.nn.Multihead...
  [2]: https://github.com/google/flaxformer/blob/main/flaxformer/co...
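
(For reference, a minimal sketch of the idea, my own illustration rather than the PyTorch or flaxformer code: appending an all-zero attended slot is equivalent to adding a constant 1 to the softmax denominator, which lets a query put probability mass on "nothing".)

  import torch

  def softmax_one(scores, dim=-1):
      # Softmax with an implicit extra zero logit: exp(x_i) / (1 + sum_j exp(x_j)).
      # Equivalent to appending an all-zero key/value slot (cf. add_zero_attn).
      m = scores.max(dim=dim, keepdim=True).values.clamp(min=0)
      e = torch.exp(scores - m)
      return e / (torch.exp(-m) + e.sum(dim=dim, keepdim=True))

  scores = torch.randn(2, 4)
  print(softmax_one(scores).sum(dim=-1))  # sums to less than 1: the rest is "attention to nothing"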


> "[!] Your paswords will be saved as readable text (e.g., BadP@ssw0rd) so anyone who can open the exported file can view them."

That's effectively what almost all of them say when you export your logins (usually as CSV, JSON, or XML), because they export in plain text, because you don't know what the user needs it for, up to and including manual entry (better than expecting a random user to have to learn how to print out a database, or worse, submit that database file to some online service to print it out).

Users aren't necessarily highly computer-literate, and we don't want to prevent people from having security; but even if they were, they may still have use cases that do not accept such a database (migrating to a password manager that doesn't know your previous one's format, perhaps), so most of them use (unencrypted) plain text, accept they'll have to leave it in the user's hands, and warn them it's exposed.

We'd absolutely love there to be safe, portable ways to move our data around such that it remains encrypted while migrating, yes, but that's just not something our current crop of software really enables fully these days, unfortunately.

