Essentially we're trying to figure out when it's appropriate for "my" code to become "everyone's" code, and whether there are steps in between. ("Standard library", for example.)
One thing that would be useful to this debate is an analysis of a language ecosystem where there are only "macropackages", to see if the same function shows up over and over again across packages.
> One thing that would be useful to this debate is an analysis of a language ecosystem where there are only "macropackages", to see if the same function shows up over and over again across packages.
Look no further than C++, where nearly every major software suite implements its own strings, vectors, etc., frequently duplicating functionality already available in (1) the STL and (2) Boost. I seem to recall that the original Android Browser, for example, had no fewer than 5 kinds of strings on the C++ side of the code base, because it interfaced with several different systems and each had its own notion of what a string should be.
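To give a flavor of what that juggling looks like in practice, here's a minimal sketch. The UiString/WireString types are hypothetical stand-ins for the real interop types (WTF::String, jstring, std::string, ...), and the conversions are deliberately naive:

    #include <string>

    // Hypothetical stand-ins for the string types a large C++ codebase
    // accumulates; each real one has its own encoding and ownership rules.
    struct UiString   { std::u16string chars; };  // the UI toolkit's notion
    struct WireString { std::string    utf8;  };  // the network layer's notion

    // Every seam between subsystems needs an explicit conversion, so the
    // same transcoding logic gets re-implemented at each boundary.
    // (ASCII-only here purely to keep the sketch short; real code needs
    // proper UTF-8 <-> UTF-16 transcoding.)
    UiString toUi(const WireString& s) {
        UiString out;
        for (unsigned char c : s.utf8) out.chars.push_back(c);
        return out;
    }

    WireString toWire(const UiString& s) {
        WireString out;
        for (char16_t c : s.chars) out.utf8.push_back(static_cast<char>(c));
        return out;
    }

Multiply that by five string types and every pairwise seam becomes a chance for an encoding bug.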
The advantage (or disadvantage) of including common functionality in macropackages is that you can optimize for your particular use case. In theory, this may result in better-tailored abstractions and performance speedups, but in practice it also results in code duplication, bugs, and potentially even poor abstractions and missed optimization opportunities (just because you think you are generalizing/optimizing your code doesn't mean you actually are).
Clearly, we need some sort of balance, and having official or de facto standard libraries is probably a win. Half the reason we're even in this situation is that both JS and C++ lack certain features in their standard libraries, which implicitly encourages people to roll their own.
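To make the stdlib-gap point concrete: JavaScript only gained a built-in String.prototype.padStart in ES2017, so before that every project either wrote or imported its own. Here's a hedged sketch of what that kind of utility looks like in C++ (the name and ASCII-only behavior are my own simplifications):

    #include <cstddef>
    #include <string>

    // The sort of trivial utility that a stdlib gap turns into a thousand
    // private reimplementations (or a micro-dependency). Sketch only.
    std::string leftPad(const std::string& s, std::size_t width, char fill = ' ') {
        return s.size() >= width ? s : std::string(width - s.size(), fill) + s;
    }

Trivial to write, but once thousands of projects each write (or import) it, you get exactly the duplication described above.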
> Look no further than C++, where nearly every major software suite implements its own strings, vectors, etc., frequently duplicating functionality already available in (1) the STL and (2) Boost.
In many ways this problem exists because there used to be competing ideas about how strings should be represented. Today we have mostly settled on UTF-8 and convert only at the boundary to legacy APIs when needed. That makes this a bad comparison: the duplication is literally caused by legacy code, and C++ projects cannot avoid having different string classes because of it.
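A minimal sketch of that "UTF-8 internally, convert at the edge" pattern, assuming a Win32 legacy API as the boundary (error handling omitted):

    #include <string>

    #ifdef _WIN32
    #include <windows.h>

    // Keep std::string/UTF-8 everywhere internally; transcode to UTF-16
    // only at the call into the legacy API.
    std::wstring toUtf16ForLegacyApi(const std::string& utf8) {
        if (utf8.empty()) return std::wstring();
        int n = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                    static_cast<int>(utf8.size()), nullptr, 0);
        std::wstring utf16(static_cast<std::size_t>(n), L'\0');
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                            static_cast<int>(utf8.size()), &utf16[0], n);
        return utf16;
    }
    #endif

The point is that the conversion lives in one place at the boundary, instead of a new string class per subsystem.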
As the author of Groups on iOS and the Qbix Platform (http://qbix.com/platform), I think I have a pretty deep perspective on this issue of when "my thing" becomes "everyone's thing".
When a company goes public, there are lots of regulations. A lot of people rely on you. You can't just close up shop tomorrow.
When your software package is released as open source, it becomes an issue of governance. If you can't stick around to maintain it, the community has a right to appoint some other people. There can be professional maintainers taking over basic responsibilities for various projects.
Please read this article I wrote a couple of years ago called the Politics of Groups:
If the publisher is an individual, the risk is that the individual may have too much power over others who come to rely on the stream. They may suddenly stop publishing it, or cut off access to everyone, which would hurt many people. (I define hurt in terms of the needs or strong expectations of people, which form over time.)
For workaday engineers (i.e., not people attempting to build distributed libraries), the appropriate question about dependency management is, "under what conditions is it appropriate for a person or group outside of your org to publish code into your project?"
Note that this isn't a cost-benefit approach. It simply asks what needs to be the case for you to accept third-party code. Every project is going to answer this slightly differently. I would hope that running a private repo, pinning versions, learning a bit about the author, and some sort of code review would be the baseline for most, but apparently many folks feel OK about giving net.authors, once accepted into a project, the unrestricted right to publish code at seemingly random times, which those folks then happily ingest on update.
A lot of coders seem to see only the "neat, I don't have to write a state machine to talk to [thing]" or "thank god someone learned how to [X] so I don't have to." But that, combined with folks who don't manage their own repos or even pin things, leads to people whose names the coders probably don't even know essentially having commit privileges on their code.
> the appropriate question about dependency management is, "under what conditions is it appropriate for a person or group outside of your org to publish code into your project?"
>Note that this isn't a cost-benefit approach.
Eh, it is a cost-benefit approach if the appropriate circumstances are "where the benefit outweighs the cost/risk." And I don't know any good way to answer it without doing that.
What are your own examples of specification of such circumstances that don't involve cost-benefit?
Unless you just decide, when asking the question like that, "Geez, there's no circumstances where it's appropriate" and go to total NIH, no external dependencies at all (no Node, no Rails, no nothing).
Well, in that everything is eventually a cost-benefit decision, sure.
What I was getting at is that "under what conditions" is a baseline gating requirement that needs to reflect the nature of what you're building. If I'm building something that is intended to replace OpenSSL, my gating conditions for including third-party code are going to be a lot different than if I'm building what I hope is the next Flappy Bird's high-score implementation.
People have all sorts of gating functions. I don't write code on Windows, because I never have and see no reason to start. Baseline competency of developers is another one. You can view baseline requirements as part of the cost-benefit if you like, but generally, if the requirements need to change to make a proposed course of action "worth it", you probably need to redo the entire proposal because the first one just failed. (If your requirements are mutable enough that ditching some to make the numbers work seems acceptable, I have grave doubts about the process and/or the decision making.)
I basically included my own examples already: a private repo to freeze versions in use and host our own forks when appropriate, code review of any included modules by at least one experienced developer, and a look at the developer[1].
[1] This is pretty fuzzy (the process, not necessarily the developer). I generally google-stalk them to see what else they've worked on, if there have been public spats that lead me to believe the future upgrade path might be trouble for us, whether they have publicly visible expertise in what they're coding, etc. Basic reputation stuff, plus evidence of domain expertise, if appropriate.
So when you say "code review by at least one experienced developer" as a condition, you really mean "code review AND the reviewer thinks it's 'good enough'", right? Presumably the same goes for "a look at the developer": the condition isn't just that someone is looking at the developer, it's that they are looking and satisfied.
Without being clear about that, it can sound like your conditions are just about having a certain internal _process_ in place, and not about any evaluation at all -- if the process is in place, the condition is met. Rather, you're saying you won't use any dependencies without reviewing them more thoroughly than people typically do and deciding they are okay.
The problem of HOW you decide if something is okay still remains, although the time it takes to review already means you are definitely going to be using fewer dependencies than many people do, even before you get to the evaluation, just by having only finite time to evaluate. And you'll be especially unlikely to use large dependencies, since they will take so much time to review. (Do you review every new version too?)
And you still need to decide if something is even worth spending the time to code review. I'd be surprised if expected 'benefit' doesn't play a role in that decision. And, really, I suspect expected benefit plays a role in your code review standards too -- you probably demand a higher level of quality for something you could write yourself in 8 hours than for something that might take you months to write yourself.
What you do is different from what most people do, in that you code-review all dependencies and make sure you have a private repo mirror. I'm not sure it's really an issue of "conditions instead of cost-benefit analysis", though. You just do more analysis than most people, mainly, and perhaps have higher standards for code quality than most people. (Anyone doing "cost-benefit analysis" probably pays _some_ attention to code quality via some kind of code review, just not as extensive as yours and without requirements as high. If you're not paying any attention to the code quality of your dependencies, you probably aren't doing any kind of cost-benefit analysis either; you're just adding dependencies with no consideration at all, which is another issue entirely.)
I'm actually having difficulty seeing whether there's a real disagreement here or whether we're arguing semantics. And for the record, you are correct that, for instance, I don't ignore what was learned in code reviews; I assumed I didn't need to spell everything out, but no, I'm not endorsing cargo-cult code review or whatever.
Try this. Say you want to build a house. You plan it out, and while doing so write requirements into your plan. One is that you want zombie-green enameled floors, and a second is that it needs to comply with local construction ordinances.
Neither of those is costless. It is possible that your green-floor-related urges are something you'll compromise on - if it is too expensive, you'll suffer through with normal hardwood. This, to me, looks like a classic cost-benefit tradeoff: you want to walk on green enamel, but will take your second choice if it means you can also have a vacation instead of a bigger mortgage.
The building code requirement looks different. While you can certainly build your home while ignoring codes, that is usually not a very good strategy for someone who has the usual goals of home ownership in mind. Put a different way, unless you have rather unusual reasons for building a house, complying with building codes is only subject to cost-benefit insofar as it controls whether the home is built at all.[1]
Does this better illustrate where I'm coming from?
When one begins a coding project, hopefully one has a handle on the context in which it is to be used. That knowledge should inform, among many other things, expectations of code quality. Code quality in a given project is not just the code you write, but also the code you import from third parties. If you don't subject imported code to the same standards you have for internally produced code, you have a problem, in that at the very least you don't know whether it is meeting the standards you set.[2]
So… I agree that, from one perspective, every decision is cost-benefit, including eating and putting on clothes when you leave the house. I think it is useful, however, to distinguish factors for which tradeoffs exist within the scope of succeeding in your goals from factors that amount to baseline conditions for success.
As an aside, if you wanted to jump up and down on this, I'm surprised you didn't take the route of asking how far down the stack I validate. What's _really_ running on that disk controller?[3]
[1] I do hope we can avoid digressing into building code discussions.
[2] "I don't care" is a perfectly fine standard, too, depending. Not everything is written to manage large sums of money, or runs in a life-or-death context, or handles environment control for expensive delicate things, or... I absolutely write code that I'm not careful about.