It's not what programming languages do, it's what they shepherd you to (nibblestew.blogspot.com)
493 points by ingve on March 27, 2020 | 339 comments



This is what people mean when they say Haskell is "opinionated."

Haskell shepherds you into separating out IO code from library code to such an extent that literally any function that has an IO action taints the value returned from that function, causing it to be an IO value, and trying to pass that IO value into another function makes the return type of that function IO, too. Parametric polymorphism is the default, too, so it also shepherds you into writing general purpose code. Haskell is full of these little decisions where it just won't let you do something because it's not "correct" code, and they kind of don't care if that makes coding in it a fight against the compiler.
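A minimal sketch of what that tainting looks like (illustrative names, nothing from a real codebase):

    -- countWords is pure; its type says nothing about IO.
    countWords :: String -> Int
    countWords = length . words

    -- The moment readFile is involved, IO appears in the signature,
    -- and any caller that wants this Int has to be in IO as well.
    wordsInFile :: FilePath -> IO Int
    wordsInFile path = fmap countWords (readFile path)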

Rust took that philosophy and applied it to pointers. Every value has a lifetime and an owner, which makes it quite hard to do things that aren't memory safe.

Both Rust and Haskell wrap values that can fail in little boxes, and to get them out you have to check which type of value it is, and in C# there's nothing stopping you from returning null and not telling anyone that you can return null, and just assuming people will check for null all the time. Haskell has a philosophy of "make invalid states unrepresentable": because the value lives in a box, rather than null being a possible value of every type, it's impossible to use it without first getting it out.
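In Haskell terms, a small sketch (made-up data): you can't touch the wrapped value until you've said what happens in the Nothing case.

    lookupAge :: String -> Maybe Int
    lookupAge name = lookup name [("alice", 30), ("bob", 25)]

    -- There's no way to use the Int without unwrapping the Maybe first.
    greeting :: String -> String
    greeting name = case lookupAge name of
      Just age -> name ++ " is " ++ show age
      Nothing  -> "no idea how old " ++ name ++ " is"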

People who write Go love that concurrency is easy and that gofmt has enforced a single canonical style. Building these sorts of things into the language goes a long way toward getting them adopted and becoming the norm.

I think we saw a rise of the easy, anything-goes, screw-performance scripting languages. I think the next fashion seems to be in enforcing "correct" coding style. They all have their place.


I'm very much in agreement with your assessment here. If you rely on programmers to "do the right thing", some people will break those rules, and systems will suffer varying levels of quality decay as a result. Language-level enforcement of key concepts prevents folks from making bad decisions, or at least makes it harder, in whatever areas those concepts apply. Clojure is another good example: it provides concurrency primitives that let you avoid all the major pitfalls you typically see in multithreaded Java programs. As a result, most Clojure code does concurrency the "right way", and in 10+ years of using it, I've never seen a deadlock.


>I think we saw a rise of the easy, anything-goes, screw-performance scripting languages. I think the next fashion seems to be in enforcing "correct" coding style. They all have their place.

Having freedom to do what you want is great. Even if you shoot yourself in the foot, you learn your lesson and become a better developer. But as you work with increasing numbers of people, many making the same mistakes you have made, and especially as you end up having to fix their mistakes, you begin to look for a tool that takes away their ability to shoot themselves in the foot. That was one appeal of Rust when I was learning it. It is a pain to fight the compiler over memory, especially coming from a garbage collected background, but it both protected me from myself and protected me from others. At a certain point, at least on large enough group projects, the benefits of that protection outweigh the costs.


   > "It is a pain to fight the compiler over memory, especially coming from a garbage collected background"
What I don't understand: why don't people stick with garbage collected languages whenever possible?


Lots of reasons.

- You don't want to spend time tuning your GC.

- Response Latency REALLY matters

- Response Throughput REALLY matters

- Memory footprint REALLY matters

- Application + runtime footprint matters

- Memory isn't the only resource you need to manage

- Cost matters

I get that you said "whenever possible" but figured I'd list reasons for "not possible" because I think they're closely tied together.

In particular, I could imagine that cost is going to be a driving factor in the future for people wanting languages like Rust. They want the absolute fastest app runtime with the lowest resource footprint, because AWS charges you for the memory and CPU time you consume. In that case, the most economical thing to do is to favor the languages that are fastest to start and run while using the smallest amount of resources.

You might technically be able to do the same job with Python, but if you can reduce your operating cost by 10x by switching to (or starting with) a slimmer language, why not?


Because GC addresses only one type of resource: memory. There are many other types of resources, and handling them correctly is a PITA in most GCed languages.
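For instance, even in a GC'd language like Haskell you still end up managing non-memory resources by hand; a sketch using bracket from base:

    import Control.Exception (bracket)
    import System.IO (IOMode (ReadMode), hClose, hGetLine, openFile)

    -- The GC will eventually collect the Handle, but nothing says *when*
    -- the underlying file descriptor is released; bracket guarantees it.
    readFirstLine :: FilePath -> IO String
    readFirstLine path = bracket (openFile path ReadMode) hClose hGetLine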


> in C# there's nothing stopping you from returning null and not telling anyone that you can return null, and just assuming people will check for null all the time

You mean there was nothing?

https://docs.microsoft.com/en-us/dotnet/csharp/nullable-refe...


Doesn't that break a lot of legacy code?

Also, it is certainly an improvement, but having `Foo?` as a type is still less explicit than having `Maybe<Foo>` as a type. If you miss the question mark, you can still have null pointer exceptions.


> Also, it is certainly an improvement, but having `Foo?` as a type is still less explicit than having `Maybe<Foo>` as a type. If you miss the question mark, you can still have null pointer exceptions.

A perfect example of Stroustrup's Rule.

  * For new features, people insist on LOUD explicit syntax.
  * For established features, people want terse notation.
A question mark is concise, but it's just as explicit. The risk of people glossing over it isn't much worse, and avoiding tons of repeated keywords has benefits to comprehension.


The only problem I have is that the new syntax has trained me to expect a ? when null is possible and to assume null is not allowed if the ? is missing, but in older libraries not yet updated to C# 8 nullable reference types, not having a ? means null is allowed. So reading the syntax in my IDE is a lot harder, as sometimes an un-annotated type means null is allowed and sometimes not. I wish cross-over projects had the option to use ! to say null shouldn't be allowed, visually (and temporarily) distinguishing new from old code; to that end, the ! could be inserted as an overlay by my IDE, I suppose... Maybe I should try to write an IDE plug-in for this, or have a look for one.


> `Foo?` as a type is still less explicit than having `Maybe<Foo>` as a type.

I actually disagree with this. As long as `Foo?` is checked by the compiler, I think they are almost identical in use. It doesn't matter if you don't notice the `?` if it's a compile error to miss the null check.


It's the same if your Maybe<Foo>=Just<Foo>|Nothing, and in fact in that case I often prefer the nullable version, unless there's a dedicated, terse syntax for Maybe-checking built in to the language, the equivalent of null coalescing (Kotlin's ?: (Elvis operator), Typescript's ??), along with optional chaining calls (?.).
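For comparison, Haskell's library equivalents of those operators (a sketch; the Address type here is made up):

    import Data.Maybe (fromMaybe)

    newtype Address = Address { street :: Maybe String }

    -- fromMaybe plays the role of the Elvis operator (?:) ...
    portOrDefault :: Maybe Int -> Int
    portOrDefault mp = fromMaybe 8080 mp

    -- ... and (>>=) chains optional accesses the way ?. does.
    streetOf :: Maybe Address -> Maybe String
    streetOf maddr = maddr >>= street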

If you made a Maybe with a Nothing<String> that comes with a reason why nothing was returned, or any more complicated structure like that, then it's better[1] to use that instead of approximating it with exceptions, null, callbacks with an optional error argument, etc.

[1] In most cases. There's always exceptions, no pun intended.


My view is that `foo?` is a nice sugar for the common Option/Maybe/null case, but that a language is severely missing out if it doesn't also offer general sum types. I don't understand why more languages don't offer them; it seems like it'd be a fairly easy feature to add without breaking backwards compatibility.
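For anyone unfamiliar, a sum type is just "one of several alternatives, each carrying its own data"; a Haskell sketch with made-up constructors:

    data Payment
      = Cash
      | Card String    -- masked card number
      | Invoice Int    -- net days until due

    describe :: Payment -> String
    describe Cash        = "paid in cash"
    describe (Card num)  = "card " ++ num
    describe (Invoice n) = "invoice, net " ++ show n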


Rich Hickey's talk "Maybe Not" has some interesting thoughts on this: https://youtu.be/YR5WdGrpoug


I spent three years making iOS programs in Swift, which uses Foo! and Foo?, and never, NEVER, missed a question or exclamation mark. Even if I did, the compiler would complain.


It does break legacy code so you need to opt-in per file or per project. It also doesn't fix legacy code automatically. Any of your dependencies that haven't yet opted in (and thus added the right annotations to their assemblies) to strict null checking are assumed `Foo?` (as they've always been) and may still give NullReferenceExceptions.

Almost all of the BCL (base class library; the system libraries) has been annotated at this point, but there will still be plenty of libraries on NuGet that haven't been.

If you miss the question mark and you've opted in to strict null checks you won't get a null pointer exception, you'll get a compiler error when trying to assign null. (That's why you have to opt-in: it makes `Foo` without the question mark non-nullable.)


I recall `Nullable<Foo>` is equally valid c# if we want a verbose syntax.


Yes it does, but you enable it with a flag, per file or per project. Converting an existing solution can make you find a lot of bugs...


I hope that this will be widely adopted and not end up like many other good things with "you know, our code currently 'works' .. why would we invest so much effort to make the compiler happy?"


Microsoft is currently updating a lot of code to add this feature, and in the end the community will pressure library authors to do the same. I guess that in a few years all popular libs will use the nullable feature.


See also: adoption of TypeScript in the JavaScript community. There is pressure from the TS community for libraries to either be created in TS or for popular libraries to adopt it, and it has very quickly become the way a plurality of JS devs write JS.


Or it may introduce new ones? This is seriously the most confusing feature I've ever seen in C#.

I'll try it on a new project, but converting my existing code that is working fine, no way.


Haskell is not opinionated. All in all, it's probably easier (but misses much of the point of Haskell) to just write "IO" and "do/return" on every function in your program than to use IO in a disciplined way. Haskell even supports this with special do-syntax (and the fortuitously-named "return") to make monadic code look more imperative!
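To illustrate the point, nothing stops you from writing Haskell in an undisciplined, everything-in-IO style (a sketch):

    -- Both of these could be pure functions; the compiler won't object
    -- if you smear IO and do/return over everything.
    addTax :: Double -> IO Double
    addTax x = do
      return (x * 1.2)

    total :: Double -> Double -> IO Double
    total a b = do
      a' <- addTax a
      b' <- addTax b
      return (a' + b')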

Paying that IO/do/return syntax tax (sin-tax? syn-tax?) is still cheaper than the signature/return boilerplate in competing compiled languages like C and Java. Haskell invites you to avoid that syn-tax by writing principled IO.

One of the major complaints about Haskell is that it is so expressive and powerful that there are so many incompatible ways of architecting modules (see the incompatibilities in implementations of Monad Transformers / Effects, Lens, etc).

Rails is perhaps the original "opinionated" system. https://guides.rubyonrails.org/getting_started.html


I wonder how much of an uptick Adacore has seen with people using Ada and especially Spark in projects lately. Ada has a different niche than Haskell and Rust, but they're obsessed with software quality and provability. I've only played with Ada, but really liked the code that came out of it. If only they could make strings less painful to deal with.


I'd be very curious if, in the same way that "TypeScript is a superset of JavaScript", there could be a superset of TypeScript that encouraged one to annotate when they're performing IO operations (reading from/writing to the DOM, the network, workers, storage, etc) and if you consumed a function that had said annotation it would further encourage you to annotate that function as well. Something like:

    function writeStringToLocalStorage(key: string, value: string): void, *localStorage {
      // impl
    }

    function persistUsername(username: string): void {
      // ...
      writeStringToLocalStorage('username', username)
      // ...
    }
^ The compiler complains that persistUsername writes to localStorage but does not have the *localStorage IO annotation. While I feel like TypeScript does a great job of working with arguments and return values, there's still a whole class of issues, like unexpected DOM manipulation, that it would be useful to detect.


> Haskell is full of these little decisions where it just won't let you do something because it's not "correct" code, and they kind of don't care if that makes coding in it a fight against the compiler.

They care more about predictability and compositionality than about a novice's struggles. Professional programmers should prioritise those things. Certainly you can choose not to care about those things for personal projects.

That said, Haskell could of course use plenty of ergonomic improvements, but the ones you describe are not among them.


> I think we saw a rise of the easy, anything-goes, screw-performance scripting languages. I think the next fashion seems to be in enforcing "correct" coding style. They all have their place.

This is the sane way of looking at it.

We noticed that a lot of tasks were not worth the trouble of ensuring correctness, and so dynamic languages took over.

But then a lot of systems scaled to a point where complexity was hard to manage. And those systems had huge economic impact, which made perf and correctness valuable again, especially since they had a lower price of entry.

I do a lot of Python, usually with dynamic types and mixing IO everywhere. It works surprisingly well for a ton of cases, and can scale quite far. But I recently wanted to make a system that would provide a plugin system including a scriptable scenario of input collection, chained up to a rendering of some sort. This required completely disconnecting the logic of the scenario (controlled by the 3rd-party dev writing it) from the source of the input, and making the API contract very strict.

It was quite a pleasant experience, seeing that Python was capable of all that. Type hints work well now, and coroutines are exactly made for that use case. You can make your entire lib Sans I/O using coroutines as contracts. The automatic state saving and step-by-step execution is not inelegant.

But you can feel that it's been added on top of the original language design. It's not seamless. It's not shepherding you at all, you need to have discipline, and a deep understanding of the concepts. Which is the opposite of how it feels for other more core features of Python: well integrated, inviting you to do the right thing.

I'm hoping the technology will advance enough so that we eventually get one language that can navigate both sides of the spectrum. Giving you the ease of Python/Ruby scripting, data analysis, and agility for medium projects, but letting you transition to Haskell/Rust safety nets and perf, with strict typing + memory safety without GC + good I/O hygiene, in a progressive way. Something that can scale up, and scale down.

Right now we always have to choose. I've looked at Swift, Go, V, Zig, Nim, various Lisps and JVM/.NET based products. They always have those sweet spots, but also those blind spots. Which of course people who love the language often don't see (I know some people reading this comment will want to chime in with their favorite as a candidate; don't bother).

Now you could argue that we can't have it all: choose the right tool for the right job. But I disagree. I think we will eventually have it all. IT is a young field, and we are just at the beginning of what we can do.

Maybe as a transition, we will have a low level runtime language that also includes a high level language runtime. Like a Rust platform with a Python implementation, or what v-lang does with v-script. It won't be the perfect solution, but I'd certainly use something like that.


> a lot of tasks were not worth the trouble of ensuring correctness, and so dynamic languages took over.

I'm not sure about this ... to me it seems the dominant effects were firstly that Javascript is the only language allowed in browsers, and secondarily that nobody had really cracked the usability issues to give us a language that had type-safety and no explicit precompile phase and easy integration with the webserver.

(It doesn't help that a lot of experienced developers are either actively hostile to the concept of developer-usability, or think that their own idiosyncratic habits are the definition of usable and cannot be improved!)

People instead moved their correctness work to unit-testing.


Javascript is only dominant in the browser.

Other scripting languages (python, bash, ruby, php) dominate other parts of automation.


>I'm hoping the technology will advance enough so that we eventually get one language that can navigate both sides of the spectrum. Giving you the ease of Python/Ruby scripting, data analysis, and agility for medium projects, but letting you transition to Haskell/Rust safety nets and perf, with strict typing + memory safety without GC + good I/O hygiene, in a progressive way. Something that can scale up, and scale down.

[...]

>Now you could argue that we can't have it all: choose the right tool for the right job. But I disagree. I think we will eventually have it all.

The idea of One Language that covers everything (or even most scenarios) is a seductive goal but it's not mathematically possible to create because the finite characters we use to design a language's syntax forces us to make certain concepts more inconvenient than others. This inevitably leads to multiple languages that emphasize different techniques. Previous comments about why this is unavoidable:

https://news.ycombinator.com/item?id=15483141

https://news.ycombinator.com/item?id=19974887

(One could argue that the lowest level of binary 0s and 1s is already the "One Programming Language For Everything" because it's the ancestor of all subsequent languages but that's just an academic distinction. Working in pure 0s and 1s is not a realistic language for working programmers and they'd inevitably find the syntax too inconvenient and thus invent new languages on top of it such as assembly code, Lisp, etc.)


> not mathematically possible to create because the finite characters we use to design a language's syntax forces us to make certain concepts more inconvenient than others

There is a huge number of combinations possible, especially once keywords come into play. I don't think that's the limitation.

One big limitation is that people who are good at creating dynamic languages are bad at creating strict ones, and vice versa.

You can see in a comment below that some people suggest Swift, Raku, and Kotlin as solutions to this problem (as I mentioned in my post would happen). But of course, they don't have the I/O solution Haskell has, the borrow checker Rust has, nor the agility or signal/noise ratio of Python. They have a compromise. A compromise that can be good. Those languages are well designed. But it's not "the ultimate solution", because they don't navigate either end of the spectrum.


">There is a huge number of combinations possible, especially once keywords come into play. I don't think that's the limitation."

The mathematical limitation still remains even if you switch from 1-character symbols like '+' to verbose words like "plus()". You can attempt to invent a new programming language using longer keywords but you'll still run into contradictions of expressing concepts because there's always an implicit runtime assumption behind any syntax that hides the raw manipulation of 0s and 1s. If you didn't hide such assumptions (i.e. "abstractions"), then that means the 0s & 1s and NAND gates would be explicitly visible to the programmer at the syntax layer and that's unusable for non-trivial programs.

There's a reason why no credible computer science paper[0] has claimed to have invented the One Programming Language that can cover the full spectrum of tasks for all situations. It's not because computer scientists with PhDs over the last 50 years are collectively dumb. It's because it's not mathematically possible.

[0] e.g. one can search academic archives: https://arxiv.org/archive/cs


It would be like saying you can't have high level languages because assembly has only a limited set of register combinations. You can always abstract things away.

In fact, you are making the assumption that there are no more general logical concepts we can discover that can abstract or merge what now appear to be conflicting paradigms.

I imagine people said that to Argand. Nah, you can't do that sqrt(-1) thing, you'll run into contradictions.

Given that we have been at the task for less than a century, which is like 1 minute of human history, I'm not inclined to such categorical rejection of the possibility.


Unlambda makes everything equally difficult!


I believe this is a failure of imagination. I'm not a beginner programmer. I've played with a lot of different languages and read about a lot more, and I still have a strong gut feeling that they can be unified if we just figure out the right framework. Note that I, for one, would consider a language that shepherds you towards highly interoperable DSLs to be (close to) a success; something like Racket that could generate efficient native binaries would be really close...

I don't believe for a second that syntax is the obstacle. You can express all the complexity you need with composition. Also, the One True Language will obviously have macros.


> I, for one, would consider a language that shepherds you towards highly interoperable DSLs [...] Also, the One True Language will obviously have macros.

But that just motivates someone else who isn't you to prefer another language that has that "DSL" and macros as the baseline syntax of the new language for convenience. Now you have 2+ languages again.

If someone else prefers not to type out extra parentheses ")))))" to balance things and/or requires highest performance of no GC, then a "Racket/Lisp-like" language can't be the basis of The One True Language.

>You can express all the complexity you need with composition.

True, and to generalize further, a Turing Complete language can be used to create another Turing Complete language. But the ability to build any complexity by composition is itself the motivation to create another programming language that doesn't require the extra work of composition.

For example, one can program in the C Language and combine its primitives to build the "C++ Language" (the first C++ compiler was C-with-classes), and the C++ Language can be used to build a Javascript interpreter (the Netscape browser was written in C++). And then Javascript can be used to build the first Typescript compiler. Thus we might say (via tortured logic) that C Language can let you write a "DSL" as complicated as C++ and Javascript and Typescript. Even though that's true in a sense, people don't think of "C Language" as the One True Language. It's the same situation as not thinking of the low-level 0s and 1s of NAND gates as being the One True Language, even though composition of NAND gates will let you build any other language.


> But that just motivates someone else who isn't you to prefer another language

Sure, they'll want to, and they probably will, but they won't have to due to performance or other constraints, which is how I understood the goal.

> But the ability to build any complexity by composition is itself the motivation to create another programming language that doesn't require the extra work of composition.

That's what the macros are for. All part of the plan.

> ... Even though that's true in sense, people don't think of "C Language" as the One True Language.

C is certainly not a convenient language for hosting DSLs, due to insufficient abstraction capabilities, but the real missing ingredient is interop between the DSLs. C doesn't make it easy to pass data between them, etc.

NAND gates are not a great comparison. You want to be composing abstractions to create other composable abstractions. You could extend the analogy to composing circuits into bigger circuits, but that's really just converging back to a high level language.


>, but they won't have to due to performance or other constraints,

They have to because a GC runtime is too heavy for embedded environments with low resources.

>That's what the macros are for. All part of the plan.

But there's still the motivation for another language that doesn't require creating the macros. E.g. Lisp macros are so powerful that they can recreate C#'s syntax feature of LINQ queries. That's true -- but C# doesn't require making the macros.

Each programming language has a different "starting point" of convenience. If you try to invent the One Language that can create all other languages' convenience syntax via macros, you've simply motivated the existence of those other languages that don't require the macros.

And the NAND gate is an abstraction. It's an abstraction of decidable logic based on rules instead of thinking about raw voltages. We do combine/compose billions of NANDs abstractions to create higher abstractions.


GC is obviously optional for OTL. That's definitely one of the tricky bits (note: not a syntactic problem).

For the rest, I think your goal definition is too narrow. Removing every last desire for people to create alternatives is an unreasonable bar for literally any artifact, based on human psychology alone. That's explicitly not my goal (see previous comment), and with anyone who does have that goal you need to have an entirely different conversation (which, again, does not involve syntax).


>GC is obviously optional for OTL. That's definitely one of the tricky bits (note: not a syntactic problem).

If a GC language lets the programmer "opt out of GC" to mark a variable or block of memory as "fixed" so the GC doesn't scan it or move it to consolidate free space, how would one annotate that intention unless there's extra syntax for it?

Likewise in the opposite direction: if a non-GC language lets one "opt into GC", you will have ambiguities if you have syntax that allows raw pointers of dynamically calculated addresses to point to any arbitrary offset into a block of memory. That means that memory can't be part of the optional GC, which would invalidate the pointer. If you restrict the optional-GC language to ban undecidable dynamic pointers, it means you've created the motivation for another language with syntax that lets you program with the freedom of dynamic pointers!

The general case of "optional GC" that covers all situations and tuning its behavior is tied to syntax because you can't invent a compiler or runtime that can read the mind of the programmer.


Set a flag during the (optional) compile phase that tells the compiler to error out if it can't statically determine where to allocate/free. (No, it's not Halting-complete because the compiler has the option of bailing due to insufficient evidence) Without that flag, it still tries but will include a GC if needed. Same for types, btw.

Ok, fine, you probably want some annotations (like for types). You got me. There's syntax involved. It's still fundamentally a semantics problem. If that can be solved, the syntax will follow.

Your post reads like, "You want to add features? But you'll have to add syntax! It's impossible!" Even if syntax is necessary for GC-obliviousness (it isn't for type inference), it implies no more about whether the project is possible than that for any other feature. Note how far we've strayed from mathematical absolutes about possible strings.

Even on the semantics side, 50 years is far too early to declare defeat. There are no actual impossibilities stopping this, unless you have a formal proof you're not telling us about. Even that would just be a guide of how to change the problem definition, in the same way that Rice's Theorem tells us to add the "insufficient evidence" output to our program verification tools. Have some more imagination.


>Note how far we've strayed from mathematical absolutes about possible strings.

Well sure, we can just theoretically concatenate all the existing programming languages' syntax today into one hypothetical huge string and call _that_ artificial mathematical construct, The One True Language. But obviously, we don't consider OTL solved so "mathematical impossible strings" is constrained to mean "nice strings" advantageous to human ergonomics: reasonable lengths that are easy to read, and easy to type, with no ambiguous syntax causing contradictions in runtime assumptions, no long compile times, etc. E.g. I have no problem typing out balanced parentheses for Lisp but I don't want to do that when writing a quick script in Linux so Bash without all those parentheses is much more convenient.

>There are no actual impossibilities stopping this, unless you have a formal proof you're not telling us about.

The mathematical limitation is that all useful higher level abstractions must have information loss of the lower level it is abstracting. This can be visualized with surjection: https://en.wikipedia.org/wiki/Bijection,_injection_and_surje...

In the wiki diagram, we can think of 'X' as low-level assembly language and 'Y' as higher-level C Language. In C, a line of code to add 2 numbers might be:

  a = b + c;
In the wiki diagram we see X elements '3' and '4' both mapped to Y element 'C'. X-3 and X-4 may be thought of as strategy #3 vs strategy #4 for picking different cpu registers before the ADD instruction and Y-C is the "a=b+c" syntax. In assembly, you manually pick the registers but in C Language you don't because gcc/clang/MSVC compilers do it. Because there are multiple ways in assembler to add numbers that collapse to the equivalent "a=b+c", there is information loss. Most of the time, C Language programmers don't care about registers which is why the C Language abstraction is useful but sometimes you do, and that's why raw assembly is still used. You can't make OTL with the syntax that handles both semantics of assembly and C. If you argue that C can have "inline assembly", that doesn't cover the situation of not having the C Runtime loaded at all that runs prior to "main()". Also, embedding asm in C is still considered by programmers as 2 languages rather than one unified one.

Or we can also think of 'X' as the low-level C/C++ language that has numeric data types "short, int, long, float, double". And 'Y' is the higher-level Javascript that only has 1 number type, an IEEE 754 double-precision float, which maps to C's "double". This means that Javascript's "information loss" is the fine-grained usage of 8-bit ints, 16-bit ints, and 32-bit ints.

If programmer John attempts to design an OTL, he will have to choose which information in the lower layer is "lost" in the runtime assumptions of the higher-level OTL. Since John's surjection can't cover all scenarios, it motivates another programming language being created. An assumption of GC in the language runtime creates some information loss. Even an optional GC is an abstraction that also creates information loss of how to manually manage memory at a lower level of abstraction.


OTL does not need to be surjective onto the set of all binary programs. You only get "information loss" when you try to go backwards, from the end result to the intent. That's reverse engineering, not programming. Now, during translation, the compiler might fill in some information you didn't care about. If you do care about specific instructions and registers for some part of your program, supply them. You probably want to have an assembly DSL that knows about how to integrate with the other code rather than embedding strings. You probably can generate any assembly this way, if just by writing exclusively in the assembly DSL, but the actual requirement is to correctly translate all valid specs.


> You only get "information loss" when you try to go backwards, from the end result to the intent. That's reverse engineering, not programming.

Instead of "information loss", another way to put it is "deliberate reduced choices to make the abstraction useful to ease cognitive burden". That way, it doesn't have connotations about reverse engineering because limitations of surjective mapping is very much about forward engineering.

E.g. I look at Javascript and think forward to engineer how I want to use integers that are larger than 2^53. Javascript's "simpler abstraction of 1 number type" loses the notion of true 64-bit int with a range up to 2^64. Therefore, I don't use Javascript if I need that capability. This means Javascript can't be the OTL for all situations. Your suggestion of Racket-like language as a candidate for OTL has the same problem: it will always have gaps in functionality/semantics/runtime that make others not want to use it and therefore they create Another Language with the desired semantics.

Supplementing the gaps via the ability to write custom DSLs and macros don't solve it. Lisp already has that now and it's not the OTL. If programmer John extends Lisp with macros to simulate monads, he'll spell the macro his way but programmer Bob will spell his macro differently. Now they've created 2 personal dialects of Lisp instead of a larger unified One True Language.

Rereading your comments, I think you're really saying that it's possible to invent the OTL for you, andrewflnr. That's probably true, but unfortunately, that's not a useful answer when the programming community is confused as to why there isn't a universal OTL yet. They're talking about the OTL that everybody can use that covers all scenarios from low-level embedded C Language to scripting to numeric computing to 4GL business languages where SQL SELECT statements are 1st class and don't require double quotes or parentheses or loading any database drivers. Such a universal programming language, if it could exist, would make the "One" in "One True Language" actually mean one.


Most/all languages today take away options. Any OTL would just provide defaults. Details are optional but always possible. I thought I was pretty clear about that re assembly. That's barely even one of the hard parts.

I'm well aware of what it means to have a language for everyone to use. I'm thinking of everything from bootloaders to machine learning to interactive shells. The reason there isn't one yet is that it's really hard. Lots of basic theory about how to think about computation is still being sounded out. Unifying frameworks have been known to take a few decades after that. Still no reason to think it's impossible.

You're just repeating that there will always be gaps, with no evidence except gaps in languages produced by today's rushed, history-bound ecosystem. You're trying to use JS as an illustration of an OTL, which is baffling. Having a limited set of integer sizes would obviously not fly.

I'm apparently not getting the vision across. This is not even a type of thing that exists today, which is why I keep saying to use more imagination. Racket is only close due to its radical flexibility in inputs and outputs.


>Any OTL would just provide defaults.

And you will inevitably have defaults that contradict each other which motivates another language.

Another way of saying "default" is "concepts in the programming language we don't even have to explicitly type by hand or have our eyeballs look at."

What should the OTL default be for not typing any explicit datatype in front of the following _x_ that works for all embedded, scientific numeric, and 4GL business?

  x = 3
Should the default _x_ be a 32-bit int, a 64-bit int, or a 128-bit int? Or a 64-bit double-precision float? Or an arbitrary-precision decimal (512+ bits, memory expandable) or an arbitrary-size integer (512+ bits, expandable)?

Should the default for x be const or mutable? Should the default for x have overflow checks or not? Should default for x be stored in a register or on the stack? Should the name 'x' be allowed to shadow an 'x' defined at a higher scope? What about the following?

  x = 3/9
Should x be turned into approximation of 0.3333... or should x preserve the underlying symbolic representation of 2 rationals with a divide operator (3,div,9)?

The defaults contradict each other at a fundamental level. The default for x cannot be simultaneously be both a 32-bit int and a 512-bit arbitrary precision decimal at the same time. We don't need a yet-to-be-discovered computer science breakthrough to understand that limitation today.

If we go meta and say that the default interpretation for "x = 3" is that it's invalid code and the programmer must type out a datatype in front of x to make it valid, then that choice of default will also motivate another language that doesn't require manually typing out an explicit datatype!

Therefore, we can massively simplify the problem from "One True Language" to just the "One True Datatype" -- and we can't even solve that! Why is it unsolvable? Because the OTD is just another way of saying "read the mind of the programmer and predict which syntax he doesn't want to type out explicitly for convenience in the particular domain he's working in". This is not even a well-posed question for computer science research. Mind-reading is even more intractable than the Halting Problem.

As another example, the default for awk language -- without even manually typing an explicit loop -- is to process text line-by-line from top-to-bottom. This is not a reasonable default for C/Javascript/Racket/etc. But if you make the default in the proposed OTL to not have implicit text processing loop in the runtime, it motivates another language (such as awk) that allows for it. You can't have a runtime that has both simultaneous properties of implicit-text-loop and text-loop-must-be-manually-coded.

Whatever choice you make as the defaults for OTL, it will be wrong for somebody in some other use case which motivates another language that chooses a different default.

>Details are optional but always possible.

Yes, but any extra possibilities will always require extra syntax that humans don't want to type or look at. Again, it's not what's possible. It's what's easy to type and read in the specific domain that the programmer is working in.

>You're just repeating that there will always be gaps, with no evidence except gaps in languages produced by today's rushed, history-bound

Are you saying you believe that abstractions today have gaps but tomorrow's yet-to-be-invented abstractions can be created without gaps and we just haven't discovered it yet because it's really hard with our limited imagination? Is that a fair restatement of your position?

Gaps don't just exist because of myopic accidents of history. Gaps must exist because they are fundamental to creating abstractions. To create an abstraction is to create the existence of gaps at the same time. Gaps are what make the abstraction useful. A map (whether fold paper map or online Google maps) is an abstraction of the real underlying territory. The map must have gaps of information loss because otherwise, the map would be the same size and same atoms as the underlying territory -- and thus the map would no longer be a "map".

The mathematical concept of "average or mean" is an abstraction tool of summing a set of numbers and dividing by its count. The "average" as one type of statistics shorthand, adds power of reasoning by letting us ignore the details but to do so, it also has gaps because there is information loss of all the individual elements that contributed to that average. The unavoidable information loss is what makes "average" usable in speech or writing. You cannot invent a new mathematical "average" which preserves all elements with no information loss because doing so means it's no longer the average. We can write "the average life expectancy is 78.6 in the USA". We can't write "the average life expectancy is [82,55,77,1,...300 million more elements divided by 300 million] in the USA" because that huge sentence's text would then be a gigabyte in size and incomprehensible. You can invent a different abstraction such as "weighted average" or "median" or "mode" but those other abstractions also have "information loss". You're just choosing different information to throw away. We can't just say we're not using enough imagination to envision a new type of mathematical "average" abstraction that will allow us to write an alternative sentence that preserves all information of individual age elements without the sentence being a gigabyte in size.

>JS as an illustration of an OTL, which is baffling.

No, I was using JS as one example about surjection that affects forward engineering. When I say "this means Javascript can't be the OTL for all situations", it's saying all programming languages above NAND gates will have gaps and thus you can't make a OTL.

What's baffling is why anyone would think Racket's (1) defaults + (2) DSLs + (3) macros would even be a realistic starting point for the universal OTL. The features (1,2,3) you propose as ingredients for a universal OTL are the very same undesirable things that motivate other alternative languages to exist! Inappropriate defaults motivate another language with different defaults. The ability to write DSLs motivates another language that doesn't require coding that DSL. The flexibility of coding macros motivates another language that doesn't require coding the macro.


I don't think syntax is the big limitation here; it's library and behavior design.

The One Language concept could still be considered to be accomplished by a language with two syntaxes provided they have a low friction to interoperability.


This sounds more like a you-problem than a programming language problem.

The fact of the matter is that there can be no "perfect" programming language in the sense that it perfectly fits all possible use cases.

So rather than trying to develop or hoping for such language to be developed, a programmer should become multi-lingual. Experiencing first-hand how different programming languages and paradigms approach problems not only broadens the horizon, but also helps with choosing the right tool for the job.

No sane contractor would build a house using nothing but a hammer after all.


A programming language is not a tool, like a hammer, with which you build a house. It's a truck full of toolboxes, containing sets of tools made to work in harmony, that you use to build the parts of the house that require a human to deal with them manually.


> I'm hoping the technology will advance enough so that we eventually get one language that can navigate both sides of the spectrum.

Me too! I'd love a language at the JavaScript/Python/PHP/Perl level, but in a Swift/Rust style. Possibly with some kind of gradual typing. TypeScript is pretty close to this, but alas its type system isn't sound. And it has to deal with the legacy of JS semantics (like exceptions).


Perhaps Raku (https://raku.org) could hit your sweet spot?


I think Swift more or less hits this sweet spot for me. As mentioned Kotlin is also very close but comes with some baggage. Crystal and Nim are on the horizon and are promising this kind of combination of ergonomics, correctness and performance.


Swift has a long road ahead for the very "low" end, i.e. replacing C or even C++. It's missing, or has extremely awkward versions (`UnsafeOMGPointer`) of various pieces right now, [and some may never even be added][0].

[0]:https://forums.swift.org/t/bit-fields-like-in-c/34651/7


Yeah that makes sense, I guess in my mind I don't see it as a C or C++ replacement as with Rust. To me it fits as a slightly higher level, safe, general purpose language with pretty good performance for most tasks you throw at it. After working with it for a year I feel very productive and that I can trust my code will work if it compiles. A similar feeling to Elm or maybe Rust but yet to spend a longer time with Rust.


Yeah, fair, and I agree with your take; there's just this longstanding idea/goal that Swift can (or will be able to) do it all.


Have you tried Kotlin?


Sounds like you'd like Haxe.


Instead of a reaction to scripting languages, or maybe in addition to, I think the current trends of shepherding languages are reacting to the flexibility of C and, even more so C++. C++ in particular is such a mind-boggling huge language. It presents so many choices that designing anything new involves searching a massive solution space. A task better left to experts.

Newbies (speaking from experience) need a framework to lean on. Something that provides a starting point for solving problems. Opinionated languages provide that out of the box.


I think the "C++ is huge" complaints are a bit overblown. C++ is huge, but most of its new features are designed with backwards compatibility in mind - if the size of the language bothers you, then you can write the limited subset of whatever C++ you know, or even just straight-up C, while making use of new features (auto, foreach, smart pointers) as you see fit. It's an all-you-can-eat standard library buffet.


> Both Rust and Haskell wrap values that can fail in little boxes, and to get them out you have to check which type of value it is, and in C# there's nothing stopping you from returning null and not telling anyone that you can return null, and just assuming people will check for null all the time.

F# is the .NET citizen that does the equivalent of the Rust or Haskell stuff.

Either you use an option type (https://fsharpforfunandprofit.com/posts/the-option-type/) which is an easy way of making a function that says 'user says to find a record with the name of Bob, and you will either get a return type of Some record(id:1,name:bob), or you will get a return type of None'

     let GetThisRecord(name) =
          let result = SomeDatabaseLookup(name)
          if result.IsSome then
               Some result.Value // not idiomatic, but it works; a match would be cleaner
          else
               None
Or you use the Success/Failure type (see railway oriented programming)


The Haskell situation sounds like generally a good thing, but I am not sure I would like it very much if this also applies to logging... It does not sound like great fun to have to change the signature of a function when it needs to log something, and then change it again if it no longer needs to.


For logging, you can use unsafePerformIO. Of course, you would call it inside a special function that can do logging. In fact, there are functions in Debug.Trace that do exactly that (to standard error).
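A sketch of what that looks like with Debug.Trace (the function here is made up):

    import Debug.Trace (trace)

    -- trace writes its message to stderr via unsafePerformIO,
    -- so the function keeps its pure type signature.
    expensiveStep :: Int -> Int
    expensiveStep x = trace ("expensiveStep called with " ++ show x) (x * x)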

Similarly, I used unsafePerformIO (again wrapped in a convenient function) to save checkpoint data in a large computation. The computation is defined as pure, but it calls the function to checkpoint, which in fact does IO under the covers.

Remember, type safety is there to help you. As long as the function performing the I/O doesn't affect the outcome of the computation, everything is safe.


> As long as the function performing the I/O doesn't affect the outcome of the computation, everything is safe.

except it's not! Your IO action may not affect the outcome of the computation, but it may launch the missiles in the background, which changes everything. A less contrived example would be: "the computation is fine and is not affected, yet we have our [production cluster deleted / disk space run out / money sent to wrong recipients] by the IO action".


Despite the name, unsafePerformIO isn't automatically akin to undefined behavior in C. It can cause undefined behavior if misused, the most obvious example being the creation of polymorphic I/O reference objects which act like unsafeCoerce—but that would be affecting the outcome of the computation. If the value returned from unsafePerformIO is a pure function of the inputs then the only remaining risk is that any side effects may occur more than once or not at all depending on how the pure code is evaluated. As long as you're okay with that there isn't really any issue with using something like Debug.Trace for its intended purpose, debugging.

There are better ways to handle logging, of course—you generally want your log entries to be deterministic, and the ability to produce log entries (as opposed to arbitrary I/O actions) should be reflected in the types.


Debug.Trace doesn't launch missiles or delete clusters or send money. It might run you out of disk space, but so can safe IO.


Honestly, that will depend on what exactly both you and the GP are calling "logging".

Usually "logging" is semantically relevant, and it had better be reflected in the return type. But well, it's pretty useless to log the execution of pure code anyway.

I agree that the GP seems to be talking about print-debugging (one doesn't go changing one's mind about semantically relevant logging), so everything in your comment is spot on, but generalizing this can lead to confusion.


Standard functional programming methods apply, in this case you would use inversion of control to limit the access to I/O.

If you need to do "semantically relevant" logging from a pure function, just create a pure function to process the semantically relevant part into something generic (like a Text), and call the simple unsafe logging function on the generic result.


Thinking about the systems I work with I can only think of a few cases where logging is semantically relevant (the way I understand it).

One is replaying critical failed requests when a downstream was offline and the other is gathering tracking statistics from apache access logs.

Everything else I would classify as diagnostic, wondering if you would consider that semantic as well.


To clarify: what I mean by semantically relevant is that it is part of the user requirements. It's not semantically relevant if it's there just to make the developer's life easier. So, it seems we are using the same definition.

Every kind of software has some error log, long lived servers tend to have some usage log too, databases tend to have journaling logs, and distributed computing tends to have a retry log. There are other kinds of them, like all those lines a compiler outputs when it tries to work on a program, or the ones a hardware programmer shows while working. Every one of those are there for the user.


Okay, that does sound like a workable solution.


There's a tendency to be very idealistic when talking about IO in Haskell; people talk about launching missiles when you ask to print a string, and it makes you think we're purist fools. For debugging you can easily drop print statements in without affecting type signatures (with the Debug.Trace package), and this is really helpful, but in production you almost never want logging inside pure functions. Think about it: why would you want to log runtime information inside a function that does arithmetic or parses a JSON string? The interesting stuff is when you receive a network request or fail to open a file.


If you have a large application written in Haskell, you're probably already using some sort of abstract or extensible monad for your "business logic", and that means it's usually not hard (in practice) to add a MonadLogger instance to your code.
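A minimal sketch of that pattern, assuming the monad-logger package:

    {-# LANGUAGE OverloadedStrings #-}

    import Control.Monad.Logger (MonadLogger, logInfoN, runStdoutLoggingT)

    -- The business logic only states that it needs *some* logger...
    step :: MonadLogger m => Int -> m Int
    step n = do
      logInfoN "running step"
      return (n + 1)

    -- ...and the concrete logger is chosen at the edge of the program.
    main :: IO ()
    main = runStdoutLoggingT (step 41) >>= print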

Also, when you've written Haskell for long enough, you start to write your code in such a way that it's astronomically unlikely that you need to add logging to a pure function. I haven't found myself wanting to do that in years. Haskell has a library to do logging in pure code, but it's unpopular for a reason.


You generally would not put logging into pure functions as that would be fairly pointless. You only log in the IO actions where you can log freely anyway.


In my experience, it actually is a good thing to have to do that, especially in a context-logging world. The actual refactoring is rarely at all difficult in my experience, and by doing so you can make it so logging context is automatically threaded everywhere more ergonomically than other languages even!

And usually when you're logging, it's near other IO anyways. So that makes it even easier.


This.

People just want to get things done, and at some point you start fighting the language, except that the language wins and you lose.

One thing I like about PowerShell is that functions are surprisingly complex little state machines with input streams, begin/process/end pipeline handling, and multiple output streams.

Everything is optional and pluggable. So if you want to intercept the warnings of a function, you can, but it won't pollute your output type.

So in Haskell and Rust, you have "one channel" that you have to make into a tuple. E.g. in Rust syntax:

   fn foo() -> (data,err)
Imagine if you wanted verbose logs, info logs, warnings, errors, etc! You'd have to do something psychotic like:

   fn foo() -> (data,verbose,info,warn,err)
In PowerShell, a function's output is just the objects it returns. E.g. if you do this:

    $result = Invoke-Foo
The $result will contain only your data. Warnings and Errors go to the console. But you can capture them if you want:

    $result = Invoke-Foo -WarningVariable warn -ErrorVariable err
    if ( $warn ) { ... }
In some languages, like Java, strongly typed Exceptions play a similar role. You can ignore them if you like and let them bubble up, or you can capture them, or some subtree of the available types. The only issue is that this mechanism is intended for "exceptional errors" and is too inefficient for general control flow.

There have been proposals for extensible, strongly-typed control flow where functions can have more than just a "return". They can also throw exceptions, raise warnings, yield multiple results, log information, etc... The calling code can then decide how to interact with these in a strongly typed manner, unlike the PowerShell examples above which are one-way and weakly typed.

I'm a bit saddened that Rust didn't go down this path, instead preferring to inherit the current style of providing only a handful of hard-coded control flows, some of which are weakly typed. For example, there's only one "panic", unlike typed exceptions in Java.


> You'd have to do something psychotic like:

You wouldn't have to do this. First of all, if you're talking about something that can error, you'd use a Result, not a tuple (I'm going to use Rust names here):

  fn foo() -> Result<Data, Error> {
Note that you choose both of these types. You can make them do whatever you want. If you wanted to be able to stream those non-fatal things back to the parent, you'd either enhance the Data type to hold them, in which case there'd be no changes, or you'd create a wrapper type for it. You still end up with Result.

Rust also doesn't like globals as much as many languages, but doesn't hate them as much as haskell. Most logging is sent to a thread-local or static logger, so you don't tend to have this in the signature.

In general, many people consider the Result-based system Rust has to be much closer to Java's checked exceptions than most other things. I don't personally because the composability properties feel different to me, but it's also been a long time since I wrote a significant amount of Java code.


> People just want to get things done

If you let people "just get things done", they usually do a shitty job, as we've seen from the last 50 years of software development. People need at least one of unfailing mechanical guidance or impressive levels of restraint. Most people don't have that much restraint (and it's exhausting to keep it up all the time), so the practical option is to have the compiler keep us in check.

If I'm not using Haskell (or equivalent), I usually end up thinking "eh, a quick print statement in the middle of this function won't hurt anybody" and before I know it I've lost the compositionality that makes me love Haskell programming.

> strongly-typed control flow where functions can have more than just a "return"

This sounds to me like what monads give you. ContT, MTL stacks, effect monads, take your pick. There are several ways to get strongly-typed advanced control flow in Haskell.
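For instance, with an mtl-style stack (a sketch, made-up types) the signature itself advertises "can fail with ParseError, can emit warnings", and the caller decides what to do with each channel:

    import Control.Monad.Except (ExceptT, runExceptT, throwError)
    import Control.Monad.Writer (Writer, runWriter, tell)

    newtype ParseError = ParseError String deriving Show

    parseAge :: String -> ExceptT ParseError (Writer [String]) Int
    parseAge s = do
      tell ["parsing " ++ show s]
      case reads s of
        [(n, "")] -> return n
        _         -> throwError (ParseError s)

    -- Returns (Either ParseError Int, [String]): the result plus the warning log.
    demo :: (Either ParseError Int, [String])
    demo = runWriter (runExceptT (parseAge "42"))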


Hum... Haskell is the one language where people use pluggable middleware everywhere.

But if you program like in C#, you really won't be able to.


> literally any function that has an IO action taints the value returned from that function, causing it to be an IO value, and trying to pass that IO value into another function makes the return type of that function IO, too. Parametric polymorphism is the default, too, so it also shepherds you into writing general purpose code. Haskell is full of these little decisions where it just won't let you do something because it's not "correct" code, and they kind of don't care if that makes coding in it a fight against the compiler.

From a Haskell perspective, and a correctness perspective, and also Rust with its pointer tracking, all this makes sense. It's very helpful for correctness.

Yet, the IO monad "virality" reminds me of Java checked exceptions. Checked exceptions mean every function type signature includes the set of exceptions that function might throw.

When that was introduced, it was thought to be a good idea because it's part of the type-safety of Java and will ensure programmers write code that deals with exceptions correctly, one way or another.

But some years later, people started to argue that listing exceptions in the type signature is causing more software engineering problems than it solves (and C# designers took the decision to not include checked exceptions). Googling "checked exceptions harmful" yields plenty of essays on this.

For checked exceptions, there are people arguing both sides of it. Yet they are pretty much all fans of static typing for the rest of the language; it isn't an argument between people who favour static vs. dynamic typing.

So why are checked exceptions considered harmful by some? On the face of it, there's an argument against verbosity. But the deeper one is about software engineering. What I call "type brittleness".

When you have a large codebase, beautifully and carefully annotated with exact, detailed checked-exception signatures, then one day you have to add a trivial little something to one little function that might throw an exception not already in that function's signature... You may have to go through the large codebase, updating signatures on hundreds or thousands of functions which use the first little function indirectly.

And that's if you have the source. When you have libraries you can't change, you have to wrap and unwrap exceptions all over the place to allow them to propagate via libraries which call back to your own code. Sometimes there is no exception type explicitly allowed by the libraries, so you wrap and unwrap using Java's RuntimeException, the one which all functions allow.

The "viral effect" of so much effort for sometimes tiny changes is a brittleness issue. It leads people to resort to "catch and discard all" try-blocks, to confine the virality Sometimes it's "temporary", but you know how it is with temporary things. Sometimes it isn't temporary because the programmer can't find another clean way to do it while not modifying things they shouldn't or can't.


> When you have a large codebase, beautifully and carefully annotated with exact, detailed checked-exception signatures, then one day you have to add a trivial little something to one little function that might throw an exception not already in that function's signature... You may have to go through the large codebase, updating signatures on hundreds or thousands of functions which use the first little function indirectly.

And you know what? That's probably a good thing. How else can you be sure that all those functions can deal with that exception correctly? If you're adding a new exceptional case to an operation, and rather than handle it locally you decide to punt the issue up the call stack, you should expect that to have far-ranging effects on the rest of the codebase. At that point you have two options for limiting the impact: you can handle the error close to the source, or you can rethrow it as a more general-purpose exception type which is already part of the function's signature (i.e. RuntimeException in Java) with the understanding that any handling of that exception will likewise be generic—typically cancelling or retrying the entire operation.

Of course, libraries which call back in to the user's code can be an issue. (More so in Java than Haskell—so far as I know Java doesn't have any way to make library functions polymorphic in the kinds of exceptions they can throw, whereas in Haskell the exceptions are just part of the type signature so there's no issue with saying "this function throws the same exceptions as the callback".) You may need to temporarily convert the exception into a return value or even provide some out-of-band channel to smuggle it across the library boundary.


> Java checked exceptions

Actually CLU checked exceptions, Modula-3 exception sets, C++ exception specifications.


Good points, all.

I thought of Java only because I'd been reading essays about Java exceptions considered harmful, and then one day I recognised the problem it described, where to change one small function I had to do an absurd number of boilerplate-like edits elsewhere.

I found it quite thought-provoking about "type brittleness" with regard to aspects of the dynamic vs. static typing debate.

I've written in Haskell and SML too, where it didn't feel like the same level of brittleness. Perhaps it's to do with the size of applications and libraries, and how they evolve.

That's why I think of it as a software engineering getting-the-balance-right thing, rather than a correctness vs. prototype-in-a-hurry thing as static-vs-dynamic is often portrayed.


I jump between Java and .NET languages depending on the project/customer, and one thing that bothers me in .NET land, or in JVM guest languages, is having to hunt for exceptions, because documentation in some libraries is hardly up to date.

So one ends up putting a couple of catch all handlers in critical code paths, just in case.


"Rust makes it quite hard to do things" generally as a result of that decision. Even just syntactically it's a large overhead. It does force you to explicitly manage lifetimes at every place in your code. Which is a good example of the wrong implementation of the wrong objective.


I agree with your assessment 100%. Does anyone else out there get frustrated with "bare hands" conventions? That's where you have to follow a verbose convention or write glue code by hand, when the compiler/runtime could do more of the heavy lifting for us.

For example, say we want to hide low-level threading primitives due to their danger. So we implement a channel system like Go. But we run into a problem where copying data is expensive, so the compiler/runtime has an elaborate mechanism to pass everything by reference and verify that two threads don't try to write the same data. I'm glossing over details here, but basically we end up with Rust.

But what if we questioned our initial assumptions and borrowed techniques from other languages? So we decide to pass everything by value and use a mechanism like copy-on-write (COW) so that mutable data isn't actually copied until it's changed. Now we end up with something more like Clojure and state begins to look more like git under the hood. But novices can just be told that piping data between threads is a free operation unless it's mutated.
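For what it's worth, the copy-on-write half of this is expressible in today's Rust as well; here's a rough sketch using the standard library's `Arc::make_mut` (not a claim about how Clojure or any game engine actually implements it):

    use std::{sync::Arc, thread};
    fn main() {
        // Handing the data to another thread is just a refcount bump...
        let data = Arc::new(vec![1, 2, 3]);
        let reader = {
            let data = Arc::clone(&data);
            thread::spawn(move || data.iter().sum::<i32>())
        };
        // ...and the deep copy only happens at the moment shared data is
        // actually mutated; the reader keeps seeing the original values.
        let mut mine = Arc::clone(&data);
        Arc::make_mut(&mut mine).push(4);
        println!("reader sum = {}", reader.join().unwrap());
        println!("writer sees = {:?}", mine);
    }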

To me, the second approach has numerous advantages. I can't prove it mathematically, but my instincts and experience tell me that both approaches can be made to have nearly identical performance. So on a very basic level, I don't quite understand why Rust is a thing. And I look at tons of languages today and sense those fundamental code smells that nobody seems to talk about like boxing, not automatically converting for to foreach to higher level functions (by statically tracing side effects), making us manually write prototypes/headers, etc etc etc.

I really feel that if we could gather all of the best aspects of every language (for example the "having the system on hand" convenience of PHP, the vector processing of MATLAB, the "automagically convert this code to SIMD to run on the GPU" of Julia <- do I have this right?), then we could design a language that satisfies every instinct we have as developers (so that we almost don't need a manual) while at the same time giving us the formalism and performance of the more advanced languages like Haskell. What I'm trying to say is that I think that safe functional programming could be made to look nearly identical to Javascript, or even some of the spoken-language attempts like HyperTalk.

The handwaving around the bare hands stuff is what tires me out as a coder today because fundamentally I just don't view it as necessary. I really believe that there is always a better way, and that we can evolve towards that.


This is my main issue with C++. For a while my job was to get game engine codebases running, integrate tools and move on. So I saw a lot of big C++ codebases. Nearly every one had the same bad behaviors. Tons of globals. Configuring build options from code. Header mazes that made it clear people didn't actually know what code their classes needed.

I then worked for a while developing a fairly fresh C++ code base. The programmers I worked with were very willing to write maintainable code and follow a standard, and it was still really damn hard to keep up things like header hygiene.

When I go back to the language I can't believe how much time I spend dealing with minor issues that stem from the bad habits it builds. For years I would refuse to say any language was good or bad. I always insisted you use the right tool for the job. And there are some features of C++ that, when you need them, mean you have to use that language or maybe C in its place. But the shortcomings are unrelated to those features; the language's issues largely seem to come from a focus on backward compatibility. And so even used in its right application it seems incredibly flawed. And I pretty much believe it's a bad language now.

Disclaimer: I learned to program with C++, I understand its power and for years I loved the language. I also understand there are situations where despite its shortcomings it is the right choice.


Why are globals considered bad? I'm seriously asking. I, too, have been told hundreds of times over the course of my career, and I never questioned it. I want to question it now, because I've never understood why people work SO HARD to remove and avoid globals. I seriously doubt that the time and effort I've seen spent on removing and avoiding globals has been time well spent. And I'm quite sure that the effort spent on that is not comparable to the amount of problems prevented by not having globals. There's just no way globals can be dangerous enough to justify the size of globals-cleansing efforts I've seen.

Game development often has a very large global state, and game problems are often inherently global state manipulation problems; you need globals in order to even have the game in many cases.


Imagine a kitchen where a hundred cooks are trying to make the same pot of soup, same pile of ingredients and utensils. Now imagine they all have telekinesis. That’s global state.

The problem is that when disparate bits of code directly affect the details or internals of a state machine, it's pretty much impossible to ever maintain a valid state at all times. Throw in threading and the whole mess becomes nondeterministic to boot.

All state management tools and procedures seek to handle this by encapsulating details and establishing rules for updates. Some, like finite state machines, are more fixed and formalizable. Some, like Redux, are looser but stay deterministic.


That is such a fun image! Are any teachers taking note? I think this is a fun metaphor to use in a classroom setting.


As you mentioned, state machines and patterns like reducers allow you to make state changes deterministic, solving the 'telekinesis' problem for global state. Conclusion?


There isn’t really a conclusion - each solution pattern allows you to trade off progressively less control for more rigidity and determinism. Pick a system that matches your use case the best.

Think of all your state as a state machine. Is there a finite number of possible states you can be in, with clearly defined ways to go to each from each one? You have a finite state machine. Lots of libraries will be available in your language.

Are your state combinations unbounded and unknowable, but still subject to validation and sequencing? This is pretty much any UI - a Redux style system helps you organise changes and make them linear. Any number of states are possible, but they’re all deterministic and can be reproduced.

Can’t linearise the states but still have validation rules for correctness? Sounds like an RDBMS-type system - set up your constraints and foreign keys and go to town with any number of threads.

There’s really no right answer. I just try to understand the problem as well as I can and see if the solution presents itself.


There’s also one step after RDBMSs, which is the Redis style key value or data structure stores that allow some level of client based or cooperative structuring, using conditional sets and gets or CAS operations.

Then finally there’s the Wild West of everyone do whatever they want.


Global state is nearly impossible to test in any decent automated fashion. When writing unit tests, globals are the bane of your existence.

If you’re relying on globals for passing data, they are also difficult to reason about in multithreaded code.

There are means by which you can share data: if it's instantiated at the code entry point, it can be shared in such a way that you never need globals, relying on decent patterns for sharing between code points.

Yes, there is a trade-off in adding parameters to functions and references in classes, but these can be avoided by adopting patterns like inversion of control, etc.

Basically globals are a bad pattern because they make it hard to test and hard to reason about data access patterns.


This is only true in a case where you don't spin up and tear down your program per test case. And I don't want to defend globals.

Globals are bad because they are just often used poorly. In large part because they require you to think about the whole system as you make changes.

Ironically, the best changes are done with the whole system in mind. Such that sometimes establishing a few core globals and some rules for how they will be treated can actually help your logic. So it really is a tradeoff. With a great slogan of "think globally, but act locally."


It's about scope. The "ideal" design pattern is supposed to be separation of concerns - the devolution of performance and responsibility into units that can be built and tested independently.

This is fine when that design pattern fits the domain. But some domains require global context, and it isn't useful or possible to strictly enforce separation - because you end up passing parameter bundles around, and managing all those local scopes introduces more bugs than implementing a global context.

Multithreading is a different issue, and is a different kind of domain requirement. If you need multithreading and have a global context, you have a very interesting problem to solve.


> This is only true in a case where you don't spin up and tear down your program per test case.

Well, yes, but then the unit tests end up taking twenty minutes.


Not only that, you also can't trust that the test results will apply to any situation where the user doesn't restart the program after every action—i.e., to normal operation.

Don't restart the program between tests. Randomize the order of the test cases between runs. Try running the same test multiple times on occasion.


You shouldn't have your tests artificially limit the life cycle of parts. Either for artificial reuse or artificial termination.

To that end, if you have variables that live as long as your program, or longer, have your tests reflect that.


I agree that globals are usually a bad pattern, but there are situations where judicious use of globals is warranted and can actually improve readability.

An example is small scripts, where the scope of the script limits the scope of the global. The overhead of an abstraction doesn't pay off in that case.

Another example are "near constants" like a locale setting, an environment variable that gets detected once at startup, or a development feature flag. The "proper" way to structure those is to create a settings object and pass it to every function that needs them, but judicious use of a well-documented global can prevent a lot of boiler plate code.

Of course, as soon as the code base needs to be touched by many devs, especially less experienced ones, it's safer to say "never do it" than "judiciously use", so I understand why most textbooks say this.


> Another example is "near constants" like a locale setting, an environment variable that gets detected once at startup, or a development feature flag. The "proper" way to structure those is to create a settings object and pass it to every function that needs them, but judicious use of a well-documented global can prevent a lot of boilerplate code.

In small programs, globals are ok, but in larger programs a better approach would be a global accessor that gives you read-only access:

    printf("%s\n", Environment()->Host);
This doesn't require passing an object to every function, and the application still can't trample on these variables.
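A similar read-only accessor in Rust might look like the sketch below, using the standard library's `OnceLock` (the Environment struct and the HOST variable are just placeholders): the value is detected once at startup and every reader gets an immutable reference, so nothing can trample on it.

    use std::{env, sync::OnceLock};
    struct Environment {
        host: String,
    }
    // Read-only accessor: initialized once, then only shared references.
    fn environment() -> &'static Environment {
        static ENV: OnceLock<Environment> = OnceLock::new();
        ENV.get_or_init(|| Environment {
            host: env::var("HOST").unwrap_or_else(|_| "localhost".into()),
        })
    }
    fn main() {
        println!("{}", environment().host);
    }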


I don't understand. If you have a good understanding of the code you're writing, you won't put yourself into a position where globals cause problems unless you're being very stupid, and if you do, normal use of the program will detect those problems, right? Certainly bug reproduction steps and a debugger will figure out what's going on.

You mentioned unit tests, and these are another thing I don't fully understand. Obviously testing your code is important, and automated tests are good. My beef with unit testing comes with the requirement that all methods and functions have multiple tests each for success and failure conditions, and that results in test code which outweighs tested code by several times. When you discover that the architecture you've been putting together isn't going to work, which is something that happens approximately 100% of the time if you're doing anything real, you now have (say) 5,000 lines of code that needs rework, and 50,000 lines of test code that need to be thrown away and rewritten.

That is A LOT of effort to shove onto yourself to avoid a few global variables, to me. That's so much effort that many projects will just not make the change and ship software that they know is insufficient, and then they'll graft on whatever functionality can't be attained natively with the given architectural decisions rather than redesign.

The ability to paralyze yourself with the weight of unit test code seems like an extremely high price to pay to avoid some global variables.


I think that globals are not a problem when "you have a good [enough] understanding of the code you're writing." The problem is that when code bases grow, references to globals can start to appear in a lot of different places, and the exact use of a particular one can be hard to reason about. (Strictly talking about mutable global state here.)

As code bases grow, and developers come and go, eventually no one will have a "good [enough] understanding." Mutable global state is fundamentally hard to reason about since it can be changed at any time by any part of the program. When you first start out the codebase, you can just remember where all the usages are. But eventually that is not a good approach.

I consider the testing stuff orthogonal and muddying the issue. Mutable globals are hard to reason about, therefore they can make code hard to debug. Thus they should be avoided. No need to bring testing into the picture.


I think most of the problems with globals can be solved at the language level. Immutable references to globals are practically never a problem, so if your language forces you to explicitly mutably borrow a reference to a global variable, you force the programmer to think about every place in the code where they modify the global.

This also enables tooling to, for example, syntax-highlight these things differently. An immutable global looks like any other variable, but a mutable global is bold red.

You bet your ass that people will think about whether they really need it mutable in that case, and they'll know everywhere it's made mutable and therefore error prone.

Again this comes back to shepherding. Globals in Rust aren't the same as globals in C++ because the languages shepherd you differently.
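A tiny sketch of that shepherding in current Rust (no extra tooling assumed): the immutable global reads like any other variable, while the mutable one has to go through a lock, an atomic, or `unsafe`, so every mutation site stands out.

    use std::sync::Mutex;
    static GREETING: &str = "hello";            // immutable: freely readable
    static COUNTER: Mutex<u64> = Mutex::new(0); // mutable: must be locked
    // static mut RAW: u64 = 0;                 // raw mutable global: every
    //                                          // access would need `unsafe`
    fn bump() -> u64 {
        let mut c = COUNTER.lock().unwrap();    // the mutation is explicit
        *c += 1;
        *c
    }
    fn main() {
        println!("{GREETING}: {}", bump());
    }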


Remember the old phrase, “imagine the person who maintains your code is an axe wielding murderer who has your home address”?


One of the problems I’ve encountered in the wild is that globals often mean that you have to check the entire program when things go weird. You’re right: if you have a complete understanding of the entire codebase, then it probably won’t be an issue. But software grows and ages; globals won’t hurt you (much) early on in the project, but they start to in the long term. Your coworker modifies it in a place where you don’t realize it’s being modified, and things that worked fine yesterday stop working. The coworker might be you when you’re tired :)


Not sure what your gripe on testing has to do with what the comment is saying. Globals make testing hard.

The simple answer is that globals are expensive. Literally, they cost a lot of money. They introduce bugs that are harder to find, reproduce, and fix. That means introducing a global is a high risk, since it increases the expected value of your non-recoverable engineering costs.

Rejecting globals is about lowering risk and cost because it's so easy to not use them and toss them out of code review, and it's really easy to work around that limitation.

Gonna remove my more uncivil remark. Basically relying on bug reports and debugging is the software equivalent of waiting for your engine to seize before you change the oil.


> When you discover that the architecture you've been putting together isn't going to work

One of the underappreciated benefits of unit tests is that they quickly teach you how to write good code. It turns out testable code is also code that tends to be well architected and doesn't need to be rewritten. Basically, writing tests leads to you being a better programmer.


Unit tests are perhaps good for instilling a decent sense of function decomposition, but make no mistake, you can go too far in this direction and not develop the sense of an integrated system. It's a hard problem to avoid, especially when starting out. That's one of the reasons I generally find type-driven development better for seeing how parts are actually interacting.

Not to discount testing, naturally, but I also prefer property-based testing to unit testing for the same reason (i.e. a function can be a mini-system with relationships between internal values that may not be exposed by unit tests).


That's a myth. It teaches you to write code which is easily unit testable. That may be a better architecture than the one you would have used, but often it's just a different architecture, sometimes even markedly worse.

I have seen far too many code bases with simple things chopped up beyond recognition to make the code unit testable.


"[S]imple things chopped up beyond recognition" sounds like a hyperbolic argument to me. What is an example? When is the maxim "A function should do one thing well" not applicable?


> One of the underappreciated benefits of unit tests is that they quickly teach you how to write good code. It turns out testable code is also code that tends to be well architected and doesn't need to be rewritten. Basically, writing tests leads to you being a better programmer

In the majority of situations, this holds (apart from the "doesn't need to be rewritten" part!). But there's a large minority of situations where it doesn't.


I have never witnessed that in 15 years of working at places which write unit tests. I've witnessed a LOT of unit tests which test nothing and manually return the pass/fail result desired so the indicators stay green.


That's very unfortunate.


I think the "leave the site better than you found it" advice applies here. Whenever you need to touch a piece of code, write the proper tests (hell, add some fuzzy testing if you don't want to write them by hand) and then improve that code.


Try reasoning about a code base that is 5 million+ LOC, with tens of thousands or hundreds of thousands of functions, and 300+ people working on it. This:

>If you have a good understanding of the code you're writing, you won't put yourself into a position where globals cause problems

Becomes basically impossible.


Because of spooky action at a distance.

Consider the following code fragment:

  glob = 5;
  f();
  // what is the value of glob here?
The problem with globals is that you can't know. f might change glob, directly or indirectly, and there is no way you can keep in mind all possible changes (especially with multiple people working on the same codebase).

(The same problem can happen, on a more limited scale, with class fields - which is why some of us insist on requesting that classes are kept small and cohesive.)

Note that this does not happen as often with database values (which are also globals that can be changed from any point in the program) because of expectations. When using those, we have all kinds of mechanisms - like transactions and isolation modes - that let us specify how much we want a value we have written to stay like that until we're done with it; when we don't use those mechanisms we generally expect that "this value could change between one statement and the next".


I think one of the main complaints about global variables is that because you can change them from anywhere within the code, you are tempted to actually do so, which can get into some pretty nightmare debugging scenarios. If you truly have global state, I think the preferred solution is to have one piece of code which changes/updates the state, but everywhere else may simply read it. Then you at least know where the problem has to be if your state updates are buggy.


A loss of local reasoning. You don't know which functions will touch the variable and when.

You might know at a high level on paper, but you won't have clean, easy-to-read and easy-to-predict life times. Then you'll have race conditions.


Related to this: the great advantages of pure functions.


> You don't know which functions will touch the variable and when.

How do you not know? You have the source code. You can run it through a debugger. "grep" can find where that variable is used. Of course you know...


I don't think that person means you literally can't know, just that it increases the difficulty of reasoning through the code.

I was debugging some code earlier today. Someone had put a global variable that is either altered or used in 4 or 5 different functions across our codebase. I had to literally draw out the paths a user could go down to figure out what the value of this global variable would be at the time I was trying to call one of those functions. It was not awesome.

I figured it out, so you're right. I do know which functions touch the variable and NOW I know when. But I still can't guarantee the value of the variable.

Needless to say, tomorrow will see a little refactoring.


I was dealing with a hard problem earlier this week, which I'm pretty sure was causing a thread to crash without logging anything, but the program to stay running. Unfortunately, only seen in production and only once every few days.

The program does several stages of data processing in parallel batches, initially loading and eventually saving to a database. It's basically a "continuous" and complicated ETL.

There is effectively a set of global state variables to track progress of each input item through the stages. The values in this global state can depend on the data, execution order, and can be modified from a dozen places in the code.

I narrowed down several potential crash points, which were basically things like: if the global state contains x and a db lookup in thread 2 times out, and thread 3 accesses the value before thread 2 starts the next batch, it could get a null reference. Another was based on a decision to insert or update: in theory, the two global state values that effectively made this decision could never be set to states where it would do the wrong thing (getting either a foreign or duplicate key error), but that state is still possible to represent.

If I were to run in a debugger using the massive production data stream I might eventually get lucky and see the data that triggers this. However, I could also sit for days and get nowhere, or the act of debugging and inspecting might be enough to prevent a race condition and not trigger the bug.

I still don't know for sure what's happening (though now there's instrumentation and better error handling in those spots so hopefully I will), but the point here is it's nearly impossible to reason about in a definitive way.


This works fine on small scales.

When dealing with millions of lines of code, I do not have the time to read the whole thing and internalize its whole state. Understanding the call graph can help, but diving through every abstract interface and callback and abstraction is a non-starter. Even if I had time to read the entire codebase line by line, I wouldn't be able to fit it all in my head, and I often have enough coworkers that changes are occurring faster than I can read and understand them all.

Even the codebases I work on are dwarfed by much larger ones.


> How do you not know? You have the source code.

For instance, concurrent accesses and modifications could occur in any order.


You lose local reasoning, as was already said.

In theory you have the source code and you can know everything just by reading it all and debugging it all. In practice it becomes overwhelming.

Even intelligent people can only fit a little bit of information into working memory in their heads at a time. Mere mortals have no chance. We need things to be bite size and local and simple so we can fit it in our heads and reason about it.

Global variables force you to do global reasoning, which a human mind just doesn't have the capacity to do.


There are lots of ingenious ways to accidentally hide where a variable is used. Start passing some pointers around and storing them off under different names.

And of course with a race condition in a multithreaded context knowing where a variable is accessed is about 1% of the battle.


Reasoning about code requires reasoning about relevant state. On the one extreme, you have pure functional programming, where all state is passed in and returned out - all relevant state is explicit and "obvious". On the other extreme, you might use global state for everything - relevant state requires diving into all your code. This sounds unthinkable in the modern era, but similar styles aren't entirely uncommon in sufficiently old codebases that didn't really bother to use the stack.

This is part of the reason why memory corruption bugs can be so insidious in large codebases - if anything in your codebase could've corrupted that bit of memory, and your codebase is millions of lines of code, you have a large haystack to find your bugs in, and your struggle will be to narrow down the relevant code to figure out where the bug actually is. This isn't hypothetical - I've had system UI switch to Chinese because of a use-after-free bug relating to gamepad use in other people's code, for example.

(EDIT: Just to be clear - globals don't particularly exacerbate memory corruption issues, I'm just drawing some parallels between the difficulty in reasoning about global state and the difficulty in debugging memory corruption bugs.)

> Game development

John Carmack on the subject, praising nice and self contained functional code and at some point mentioning some of the horrible global flag driven messes that have caused problems in their codebase, mirroring my own experiences: https://www.youtube.com/watch?v=1PhArSujR_A&feature=youtu.be...

> you need globals in order to even have the game in many cases.

Simply untrue unless you're playing quite sloppy with the definition of "globals" and "global state". The problem isn't that one instance of a thing exists throughout your program, it's that access is completely unconstrained. Game problems do often involve cross cutting concerns that span lots of seemingly unrelated systems, but globals aren't the only way to solve these.


> Game development often has a very large global state

Not any more!

I'm currently playing Doom Eternal, and I've got to take my hat off to its developers: It's ridiculously well optimised! I played the previous version of Doom on the same hardware, and it was a stuttering mess at 4K, but now it's silky smooth with Ultra Nightmare quality settings. Wow.

They achieved this by breaking up the game into about a hundred "tasks" per frame, and each task runs in parallel across all available cores. These submit graphics calls in parallel using the Vulkan API.

There is just no way to write an engine like this with a "very large global state". No human is that good at writing thread-safe data structures.

The only way to do it is to separate the data and code, making sure each unit does its own thing, independently of the others as much as possible.


Hum... May I ask how those different tasks communicate with each other?


I have no idea about how Doom Eternal does it, but John Carmack has some ideas on how to parallelize game engines here: https://www.youtube.com/watch?v=1PhArSujR_A&feature=youtu.be...


Haha, love that game, but it gives me high blood pressure ;-).


Where would be a good place to read about that?


I'll try to address things other replies haven't. Global variables are not just a problem for understanding code; they also have a large potential for causing incredibly hard-to-debug bugs. Say you're writing a parser and decide to use `strtok`, which uses global variables. Everything works fine, but then you try to improve performance using multiple threads and suddenly your Linux and Mac users are seeing all kinds of weird incorrect behavior. It turns out strtok uses thread-local storage on Windows but not on other platforms, so your parallel strtok calls were all overwriting each other's state.


It's a good question, and we should always question our assumptions.

Avoidance of globals and singletons stems from long-term experience. Their design tends to lead to write-only code: because any part of the code can at any time access and modify them, globals quickly become disconnected from your main program flow!

This property makes them more complex to reason about, while overall codebase complexity tends to increase as well. From a complexity standpoint, at some point globals become an untenable nightmare to develop further and maintain. Because of the lack of foresight and design, you get stuck with too much code that people are scared to properly refactor. The tunnel to clean up the "mess" will be long and dark. Bugs may also be introduced, making it tempting to rebuild everything from scratch, which comes with its own caveats and troubles. If you lacked design the first time, how sure are you that you'll hit the nail the second time? It's costly and doesn't benefit from an iterative approach with a rapid feedback cycle.

For small scripts / one-offs, globals and singletons are OK. Good coders know they're there, how to remove them, and that nobody else is going to build air traffic control software on top of them.

Btw, encapsulating globals/singletons with OO CRUD or REST doesn't make them any less distasteful. You end up exporting complexity to all the different parts of the whole codebase, instead of encapsulating behaviour within its own domain.


A simple, quick answer that I'm sure you've heard is that global variables pollute namespaces. The qualities of design choices are rarely apparent outside the real world.

A big problem with globals is that they're often abused as a workaround. Restricting access is an abstraction: the user isn't expected to alter this value, so why should they be allowed to? What's more critical is that the programmer might not realize that what they wanted to be simply accessible is in fact static as well. So you now have a variable that's not only accessible, but state-dependent, and anyone using it has to be mindful of that.

Unless there is C code being called as well, in C++ you should rarely use globals. It's much more manageable to have a game class object where, inside it, what used to be global could now just be a private member that's global to that class.

Your team put in all that effort to remove globals because it takes even more effort to get rid of all the trivial errors tied to that choice to begin with.

It all comes down to writing reusable code, objects that manage themselves. People shouldn't have to be cautious when reusing code. This doesn't only apply to other programmers but to yourself 6 months down the line.


> It's much more manageable to have a game class object where, inside it, what used to be global could now just be a private member that's global to that class.

That's still a global, except now it has lipstick.


If it's private in the class then why are you saying it's global?


"a game class object" is 99.95% likely to contain the whole game. Doesn't matter that it's labeled private, it's basically global to all the code of the ... game.


> 99.95% likely to contain the whole game

That doesn't sound like a wise assumption. It is common to have the actual engine of the game, and even the game itself separate from other architectural components.


At first blush a lot of games systems and code look like they're global but aren't really. If you think about a game as the flow through a frame you can break things down and it turns out a lot of things are not as global as you first think.

For a naive example game flow is basically:

- Get input.

- Update game state.

- Render.

If each stage only consumes data generated by the prior stage then it doesn't need to be dependent on how that data was generated. Nothing needs to be global in this case.
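A sketch of that flow with no globals at all (Input, GameState and Frame are placeholder types, not anything from a real engine): each stage is a function that only consumes the previous stage's output.

    struct Input { jump_pressed: bool }
    struct GameState { player_y: f32 }
    struct Frame { draw_player_at: f32 }
    fn get_input() -> Input {
        Input { jump_pressed: true } // would poll the OS/controller for real
    }
    fn update(state: GameState, input: &Input) -> GameState {
        let dy = if input.jump_pressed { 1.0 } else { 0.0 };
        GameState { player_y: state.player_y + dy }
    }
    fn render(state: &GameState) -> Frame {
        Frame { draw_player_at: state.player_y }
    }
    fn main() {
        let mut state = GameState { player_y: 0.0 };
        for _ in 0..3 {
            let input = get_input();          // stage 1
            state = update(state, &input);    // stage 2: consumes stage 1
            let frame = render(&state);       // stage 3: consumes stage 2
            println!("drawing player at {}", frame.draw_player_at);
        }
    }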

There's nothing inherently wrong with modelling this using globals, though; it's just that they require more discipline on the part of programmers to stick to the application design. It's sorely tempting to just reach in and tweak something when it's easy, and then suddenly your entire application is a spiderweb of little tweaks. Not using globals, and only having available the systems and data that you need to use, makes the design harder to break and makes it much easier to detect the spiderweb creeping in.

This isn't limited to globals, though; dependency injection, IoC and other application patterns suffer from the same problems as well. Lots of software ends up passing around a 'context' or injecting de facto globals everywhere, which results in the same spiderweb except you can't even navigate the codebase sanely.

The problem with the spiderweb is that it's harder to maintain and can make things more difficult down the line if you want to re-architect things for example to make the game multithreaded.

More generally the harder we make it to mess stuff up the less stuff will get messed up and the easier it will be to find. That's partly why static types, lifetimes and immutability are popular. They of course come with tradeoffs in performance or ease of use that need to be weighed. Software design choices are just a less strict version of the same.


One fun class of bugs that occurs on 8-bit systems is when you have a 16-bit global variable (C makes this easy), and a read access is actually 2 separate reads (one for each 8-bit half). This is invisible from the C code. Now let's say there is a separate thread or an interrupt that writes the variable in between the two read phases. Most of the time it's fine, but every so often you get garbage (often double or half the value you expected).


The more local the state is, the easier it is to reason about since less code can reach it. And invalid state is a major source of bugs.

And the more global the state is, the less modular the system becomes, which makes it more difficult to test and adapt to new requirements.

That being said, it's more important to know why than memorizing/following rules, every good decision is some kind of compromise.


It's usually a sign that you don't have clear component boundaries or well-defined interfaces, which means your code is going to be harder to test and harder to debug. Every place you read and write global state is also a potential race condition in multi-threaded code.

Of course there are places where global state is unavoidable (even if it's just "the filesystem"), but by confining your global state to a small corner of your codebase and having the rest of the code interact with this component instead of touching all the global variables directly, you can reduce the number of potential problem spots.


Coupling. If any part of the program can touch a global variable, then the only way to understand how that global is used and when and why it changes is by understanding the entire program. Limiting the variable’s scope (e.g. to a module or class) makes it easier to reason about, as there’s less to learn and mentally model all at once.

Have you got Steve McConnell’s Code Complete? Read the chapter on coupling and cohesion. If not, you should. (You can nab a first edition off eBay for a few bucks.) Good for the “Why”s of software construction.


Global state == shared state if threading (and you probably will be eventually) == a mess.

Global state == lots of refactoring if you want to make your program a library (OpenSSH is a poster child for this).

Write it like it needs to be a library. Write it to be thread-safe. Write it to use async I/O. Do these things and you'll save yourself a ton of work later. Learn to do these well and you'll always do this from the get-go.


> Why are globals considered bad? I'm seriously asking.

I think outside of special cases they're bad. I use globals for embedded code because I don't have a heap.

What I've found is that as long as globals are used to hold state and not to pass data via spooky action at a distance, they're okay. A good test: if you can trivially refactor them out, they're okay.

Example: you have one UART.

   UartInit(baud_rate, bits, parity, stop);
Now you have two, so it gets refactored:

   UartInit(port, baud_rate, bits, parity, stop);
What's terrible is shit code like this:

   foo.bar = 2;
and somewhere else in the code

   if(foo.bar == 2)
   {
      foo.bar = 0;
      ...
   }
A note: Game programs to me look like really big embedded programs.


Games and UIs are special when it comes to state. They're weird in the space of all programs because their domain specifically concerns itself with maintaining and transforming a bunch of state over time. State is the point, in a way that it isn't for the vast majority of programs.

There are still lots of cases in games where state shouldn't be global, but there are also lots of cases where it's very natural and legitimate.


In addition to the reasons mentioned, one reason is that, by design, you can only have one instance of a global variable.

That might be fine today, but who knows what tomorrow's requirements might entail.

At the very least, put global variables in a context object, and pass that around. Then it's clear what is affected by and can affect the "global" state, and it's easy to create multiple context instances if you suddenly find you need to.
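As a small hypothetical sketch (Context and format_price are made-up names): the "globals" live in one struct, anything that needs them takes it as a parameter, and a second instance costs nothing.

    struct Context {
        locale: String,
    }
    fn format_price(ctx: &Context, cents: u64) -> String {
        if ctx.locale == "de-DE" {
            format!("{},{:02} EUR", cents / 100, cents % 100)
        } else {
            format!("${}.{:02}", cents / 100, cents % 100)
        }
    }
    fn main() {
        let us = Context { locale: "en-US".into() };
        let de = Context { locale: "de-DE".into() };
        println!("{}", format_price(&us, 1999));
        // A second "global" context needs no refactoring at all:
        println!("{}", format_price(&de, 1999));
    }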


In my view it makes the code extremely difficult to understand when someone other than the original author tries to read/modify.

The side effects of changing a global variable's value are very difficult to glean from the code.

It is as if some inputs to a function are getting passed to it implicitly, and it isn't obvious what value it has, who has set it, and what effect will be produced if you change its value.


Globals, similarly to "goto", are considered bad, because people tend to abuse them. But, same as with goto, they are not inherently bad and have their use. There are just lot of bad programmers who have been told that using globals (or goto) is dangerous and take it as "NEVER USE GLOBALS (or GOTO)" and spread this warped message further.


Probably it takes a lot of experience to use both correctly. If we are talking about small programs, no threads, no chance of reuse (no modularity) - in other words "Keep It Simple and Stupid" - then it is fine.

But KISS is difficult to achieve: there's Hubris that pushes you to do "powerful" things rather than getting the job done, there's "anticipatis" that pushes you to have an answer ready for all future changes instead of solving the problem you actually have now, there's deadlines, and there's invasions of external unwanted complexities (silly requirements, interfacing with buggy software/hardware...).

That's why generally speaking "don't" is the safe piece of advice. But those who think they have the basics down can try it (in a harmless context like personal tools) and see what happens for themselves.


One example of a good use of globals is very lightweight pub/sub, where you keep the rule that only one place can write to the global (preferably with something like an atomic write) and every other place only reads.
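A minimal sketch of that rule in Rust, using a standard-library atomic (LATEST_PRICE and the numbers are made up): one thread is the sole writer, everyone else only loads.

    use std::{sync::atomic::{AtomicU64, Ordering}, thread, time::Duration};
    static LATEST_PRICE: AtomicU64 = AtomicU64::new(0);
    fn main() {
        let writer = thread::spawn(|| {
            for price in 1..=5u64 {
                LATEST_PRICE.store(price, Ordering::Release); // the only writer
                thread::sleep(Duration::from_millis(10));
            }
        });
        let reader = thread::spawn(|| {
            for _ in 0..5 {
                let p = LATEST_PRICE.load(Ordering::Acquire); // readers never write
                println!("observed price: {p}");
                thread::sleep(Duration::from_millis(10));
            }
        });
        writer.join().unwrap();
        reader.join().unwrap();
    }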


I use this kind of blackboard system still when I can't avoid globals. The main thing I find it helps with is that you still have to know when, and in what order, your systems are being set up.

Ran into so many bugs from people creating static instance globals and thinking it was good that they didn't have to care when systems were set up.

I hate create-on-access with a passion. For god's sake, just new the damn thing at the beginning of main if nowhere else.


Singletons (globals) considered stupid, a Yeggie classic: https://sites.google.com/site/steveyegge2/singleton-consider...


If your game state is a global because every action in the game changes the global state, then that's great, your game will be alive. There are so many valid states and perhaps rather few invariants, or you are okay with invariants being enforced once every few seconds. You do what you have to.

Not every program is like this. Consider something like TeX, whose goal is perfect bit for bit reproducibility of documents across every run on every machine. Same with a compiler.

When you say globals, I imagine those kinds of programs having code like this:

    add_to_current_doc_index();
    fn document_map_overflow() {
        ERROR_CODE = 46;
        print_current_error();
        exit();
    }
    literally_every_function_could_write_an_error_code();
    do_not_call_another_one_without_checking_it();
    if ERROR_CODE != 0 { return ...; }
This is what XeTeX code looks like. You don't have to do it like that! You can write this:

    pub struct EnforcesOwnInvariants { private_data: [u8; 65536] }
    impl EnforcesOwnInvariants {
        fn get_first_something(&self) { ... }
        fn flush_cache(&mut self) { ... }
    }
    static GLOBAL_DATA: Mutex<EnforcesOwnInvariants> = ...;
The second kind is better, because you can at least say that any internal invariants in the global data should be upheld in very specific code.

But when you use the first kind, you're completely giving up on being able to point to the line of code responsible for global data having bugs in it. Obviously you can use globals without this problem if you encapsulate them effectively, but you'd need your language to "shepherd" you towards this. All the C codebases that had no such shepherding seem to end up looking exactly like this, and it is truly awful trying to find the source of bugs. You'll notice the languages that shepherd you away from globals (Rust) do so because they want your programs to work when you decide one thread is not enough. This has the side benefit of shepherding you away from global data generally, and mutability rules restrict which code can modify, so there is a huge impact overall on how you look for bugs.

Essentially, you're having the same discussion the original article is saying is fruitless. Globals can be good or bad! You can make them accessible everywhere without actually accessing them everywhere and causing debug problems. But do they make good code easy to write and bad code hard? Absolutely not. They are bad shepherds, pied pipers that offer you easy solutions that make your codebase worse.


You will end up with code like this:

https://github.com/elonafoobar/elonafoobar/blob/develop/src/...

A lot of those variables could have been grouped into a struct, like all those key_<action> variables. Even if you think global state is fine, you'd at least want only one way of accessing it. It would be closer to this:

game_instance.key_mapping.charainfo

but I never see things like that. All I ever get to see is projects with almost thousands of global variables.


Globals are bad when they are used together with the include pattern. So you are reading code and see variable foo and have no idea what it does, can't find it when searching in the file, then you find it in an include file two levels down. You try to refactor, only to find it's used elsewhere too, and sometimes included twice, and sometimes overwritten (but you are not sure if that is a bug or not).


In any decent IDE the operations you mentioned are one keyboard shortcut each - go to definition and find all uses. This is really a non-problem.


IDEs are good at treating the symptoms. But it's also possible to write the code so that you don't need an IDE to untangle it: for example, keeping all variables within (file) scope, and abstracting out into reusable (reusable elsewhere) libraries.


Do you never have to check all the calls to a given function in large-ish programs?


You can make functions pure and specific so they rarely need to change. And use namespaces and naming conventions so the variables can be found with grep (find in files).

Let's say you are upgrading an API, let's call it "HN", to a new major version, which has made a breaking change by renaming HN.foo to HN.bar. Now, if you have always named the API "HN", you can just do a "replace in files" operation where you replace HN.foo with HN.bar - after you have already checked that there is no HN.foobar (to prevent HN.barbar).

Even sophisticated IDEs will have trouble following functions in a dynamic language when they are passed around, renamed, returned, etc. So I would never trust an IDE to find all call sites.

Heavily depending on an IDE or tooling can also lead to over-use of patterns and boilerplate that the IDE handles well. And unnecessary work like adding annotations just to satisfy the tooling.


Not GP, but for most programs I write myself I cannot find all the call sites of a certain function because of using first class functions a bunch. When I worked in nginx I had a smaller amount of similar trouble, since nginx frequently but not pervasively uses function pointers to decide what to do next.


Globals are not inherently evil, but "shared, mutable state" is, basically if any part of the code is able to scribble over any global at any time.

If your globals are constants, or the globals are only visible inside a single compilation unit where it's easy to keep the situation "contained", they are perfectly fine.



We cannot discuss globals without pinning own exactly what we mean by globals.

Is a global a piece of information of which there is one instance?

Or is it a variable which is widely scoped: it is referenced all over the place without module boundaries?

See, for instance, in OOP there is the concept of singletons: objects of which there is one instance in the system. These objects sometimes have mutable state. Therefore, that state is global. Yet, the state is encapsulated in the object/class, so it is not accessed in an undisciplined way by random code all over the place. On the other hand, the reference to the object as a whole is a plain global: it's scoped to the program, and multiple modules use it. Ah, but then the reference to the singleton is not a mutable global; it is initialized once, and points to the same singleton. Therefore, singletons represent disciplined global state: a singleton is an immutable reference to an object (i.e. always the same object), whose mutable state (if it is mutable) is encapsulated and managed. This is an example of a "good" global variable.

Another form of "good" global variable is a dynamically scoped variable, like in Common Lisp. Reason being: its value is temporarily overridden in on entry into a dynamic scope and restored afterward (in a thread-local way, in multithreaded implementations). Moreover, additional discipline can be provided by macros. So that is to say, the modules which use the variable might not know anything about the variable directly, but only about macro constructs that use the variable implicitly. Those constructs ensure that the variable has an appropriate value, not just any old value.

Machine registers are global variables; but a higher level language manages them. A compiler generates code to save and restore the registers that must be restored. Even though there is only one "stack pointer" or "frame pointer" register, every function activation frame has the correct values of these whenever its code is executing. Therefore, these hardware resources are de facto regarded as locals. For instance, a C function freely moves its stack pointer via alloca to carve out space on the stack, as if the stack pointer register belonged only to it.

Global variables got a bad name in the 1960's, when people designed programs the Fortran and COBOL way. There is some data, such as a bunch of arrays. These are global. The program consists of a growing number of procedures which work on the global arrays and variables. These procedures communicate with each other by the effect they have on the globals. The globals are the input to each procedure and its output. When one procedure finishes, it places its output into the globals, and then when the next one is called, it picks that up, and so on.

The global situation was somewhat tamed by languages that introduced modules. A module could declare variables that have program lifetime, but are visible only to that module, even if they have the same name as similar variables in another module. In C, these are static variables. C static variables and their ilk are considerably less harmful than globals. A module with statics can be as disciplined as an OOP singleton. The disadvantage it has is that it cannot be multiply instantiated, if that is needed in the future, without a code reorganization (moving the statics into a structure).


> See, for instance, in OOP there is the concept of singletons: objects of which there is one instance in the system. These objects sometimes have mutable state. Therefore, that state is global. Yet, the state is encapsulated in the object/class, so it is not accessed in an undisciplined way by random code all over the place. On the other hand, the reference to the object as a whole is a plain global: it's scoped to the program, and multiple modules use it. Ah, but then the reference to the singleton is not a mutable global; it is initialized once, and points to the same singleton. Therefore, singletons represent disciplined global state: a singleton is an immutable reference to an object (i.e. always the same object), whose mutable state (if it is mutable) is encapsulated and managed. This is an example of a "good" global variable.

Lol no it's not. It has all the problems of any other global: unsafe to use concurrently, difficult to test, difficult to reason about.


It's best not to conflate global variables and their problems with the issues of shared, mutable state.

The difficulties caused by global variable are related to them being shared, mutable state. But global variables are recognized as causing additional problems, in the context of programming with shared, mutable state. So that is to say, practitioners who accept the imperative programming paradigm involving shared mutable state nevertheless have identified global variables as causing or contributing to specific problems.

In an OOP program based on shared mutable state, singleton objects having shared mutable state do not introduce any new problem. The global variable they are bound to doesn't change, so the variable per se is safe.

(There can be thread-unsafe lazy initializations of singleton globals, of course, which is an isolated problem that can be addressed with specific, localized mechanisms. Global shutdown can be a gong show also.)

A singleton could be contrived to provide a service that is equivalent to a global variable. E.g. it could just have some get and set method for a character string. If everyone uses singleton.get() to fetch the string, and singleton.put(new_string) to replace its value, then it's no better than just a string-valued global. That's largely a strawman though; it deliberately wastes every opportunity to improve upon global variables that is provided by that approach.


I disagree; as far as I know the specific problems of global variables (over and above shared mutable state in general) are things that apply just as much to singletons. Things like absence of scoping, lack of clear ownership, and as you mentioned initialisation and shutdown, are just as much a problem for singleton objects as they are for non-object global variables.

Objects containing mutable state have some advantages over plain mutable variables (e.g. the object can enforce that particular invariants hold and invalid states are never made visible), but as far as I know those are just the generic advantages of OO encapsulation, and there's not really any specific advantage to encapsulating global variables in a singleton that doesn't equally apply to encapsulating a bunch of shared scoped variables into an object.


I generally strive to avoid singletons but there are cases of API usability where they're useful. If you can carve out the responsibility of what state is being tracked in the singleton then it's useful.

It's also not difficult to test as long as you write it to be testable. It may be more verbose & cumbersome but it's not actually difficult. That means you provide hooks testing the singleton implementation to bypass the singleton requirement but in all other cases it acts like a singleton.
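A rough sketch of what such a hook can look like (hypothetical Python, not the Android code discussed below):

    class Telemetry:
        _instance = None

        @classmethod
        def instance(cls):
            if cls._instance is None:
                cls._instance = Telemetry()
            return cls._instance

        # test-only hooks that bypass the singleton requirement
        @classmethod
        def _set_instance_for_testing(cls, fake):
            cls._instance = fake

        @classmethod
        def _reset_for_testing(cls):
            cls._instance = None

        def record(self, event):
            pass  # talk to the real backend in production

    # in a test:
    #   Telemetry._set_instance_for_testing(FakeTelemetry())
    #   ... exercise the code under test ...
    #   Telemetry._reset_for_testing()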

As an example, consider Android JNI. The environment variable is very cumbersome to deal with in background threads & to properly detach it on thread death. It also requires you to keep track of the JavaVM & pipe it throughout your program's data flow where it might be needed. It's doable but it's conceptually simpler to maintain the JavaVM object in a global singleton and have the JNIEnv in a thread-local singleton with all the resource acquisition done at the right time. It's still perfectly testable.


> It's also not difficult to test as long as you write it to be testable. It may be more verbose & cumbersome but it's not actually difficult. That means you provide hooks testing the singleton implementation to bypass the singleton requirement but in all other cases it acts like a singleton.

At that point you're adding complexity that has a real risk of bringing in bugs in the non-test case. Nothing is impossible to test if you try hard enough, but the more costly testing is, the less you'll end up doing.

> As an example, consider Android JNI. The environment variable is very cumbersome to deal with in background threads & to properly detach it on thread death. It also requires you to keep track of the JavaVM & pipe it throughout your program's data flow where it might be needed. It's doable but it's conceptually simpler to maintain the JavaVM object in a global singleton and have the JNIEnv in a thread-local singleton with all the resource acquisition done at the right time. It's still perfectly testable.

Not convinced - to my mind the conceptually simple thing is for every function to be passed everything it uses. If you instead embed the assumption that there's a single global JavaVM that could be touched from anywhere, then that adds complexity to potentially everything, and any test you write might go wrong (or silently start going wrong in the future) if the pattern of which functions use the JavaVM changes (or else you treat every single test as a JavaVM test, and have the overhead that goes with that). For some codebases that might be a legitimate assumption, just as there are some cases where pervasive mutable state really does reflect what's going on at the business level, but it's certainly not something I'd introduce lightly.


> If you instead embed the assumption that there's a single global JavaVM that could be touched from anywhere, then that adds complexity to potentially everything, and any test you write might go wrong (or silently start going wrong in the future) if the pattern of which functions use the JavaVM changes (or else you treat every single test as a JavaVM test, and have the overhead that goes with that)

Not sure I follow. If you expect any code to invoke JNI then you are still responsible for explicitly initializing the singleton within the JNI_OnLoad callback. If you don't, the API I have will crash, so it's definitely not a silent failure. There's no external calling pattern to this API that can change to break the way this thing works. As for why this is needed, it has to do with the arcane properties of JNI:

1. Whatever native thread you use JNI on, the JNIEnv must be explicitly attached (Java does this automatically for you when jumping from Java->native as part of the callback signature).

2. Attaching/detaching native threads is a super expensive operation. You ideally only want to do it once.

3. If you don't detach a native thread before it exits, your code will likely hang.

4. If you detach prematurely you can get memory corruption accessing dangling local references.

5. It's not unreasonable to write code where you have a cross-platform layer that then invokes a function that needs JNI.

If you're avoiding all global state you only have the following options:

A. Attach/detach the thread around every set of JNI operations. This stops scaling really quick & gets super-complicated for writing error-free composable code (literally manifests as the problem you're concerned about with code flow changes resulting in silent bugs).

B. Anytime you might need to create a native thread, you need to pass the JNIEnv to attach it. If the native thread is in cross-platform code, suddenly you're carrying two callback function pointers + state as a magic invocation: the first thing to do on new thread creation & the last thing to remember to do just before thread exit. Also you have to suddenly carry that opaque state through to any code that may be invoking callbacks that require JNI on that platform. This hurts readability & risks not being type-safe.

At the end of the day you're actually also lying to yourself and trying to fit a square peg in a round hole. JNI is defined to use global state implicitly throughout its API - there's defined to be 1 global JavaVM single instance. Early on in Java days JNI was in theory designed to allow multiple JVMs in 1 process but that has long been abandoned (the API was designed poorly & in practice it's difficult to properly manage multiple JVMs in 1 process correctly with weird errors manifesting). This isn't going to be resurrected. In fact, although not implemented on Android, there's a way to globally, at any point in your program, retrieve the JVM for the process.

In principle we're in agreement that singletons & globals shouldn't be undertaken lightly but there are use-cases for it. It's fine if you're not convinced.


> A. Attach/detach the thread around every set of JNI operations. This stops scaling really quick & gets super-complicated for writing error-free composable code (literally manifests as the problem you're concerned about with code flow changes resulting in silent bugs).

Sounds like a monad would be a perfect fit, assuming your native language is capable of that. That's how I work with e.g. JPA sessions, which are intended to be bound to single threads.

> At the end of the day you're actually also lying to yourself and trying to fit a square peg in a round hole. JNI is defined to use global state implicitly throughout its API - there's defined to be 1 global JavaVM single instance.

Of course if you're using an API that's defined in terms of globals/singletons then you'll be forced to make at least some use of globals/singletons, but I wouldn't say that's a case of singletons being "useful" as such. And if you're making extensive use of such a library, then I'd look to encapsulate it behind an interface that offers access to it in a more controlled way (using something along the lines of https://github.com/tpolecat/tiny-world).


For many singletons it does not matter at all. E.g. 99% of all desktop gui programs and 99.9995% of games have a single main window by design - trying to abstract that with an API that simulates that you could have more than one just makes the code harder to read for no benefit (as no widget system except beOS' can be used outside the main thread anyways)


> E.g. 99% of all desktop gui programs and 99.9995% of games have a single main window by design - trying to abstract that with an API that simulates that you could have more than one just makes the code harder to read for no benefit

Being able to test UI behaviour is a huge difference maker. (Also even if you do believe that a singleton is ok in this case, it's clearly no different from a global variable).

> as no widget system except beOS' can be used outside the main thread anyways

Which is a problem in itself.


> Being able to test UI behaviour is a huge difference maker.

obviously UI tests are being run today so this is not really an issue, right ?

> (Also even if you do believe that a singleton is ok in this case, it's clearly no different from a global variable).

yes, that's global state all the same

> Which is a problem in itself.

maybe, does not prevent writing a lot of very useful apps.


> obviously UI tests are being run today so this is not really an issue, right ?

UI tests are notoriously slow, flaky and generally worse than other kinds of tests. They're absolutely a significant pain point in software development today.

> maybe, does not prevent writing a lot of very useful apps.

People write useful code with global state. People wrote useful code with gotos, with no memory safety... that computers are useful does not mean there isn't plenty of room for improvement.


Avoiding singletons in the app implementation will not put a dent in UI testability. If you instantiate the MainWindow as a local variable in the top-level function, and pass that object everywhere it is required as an argument, external testing of your UI is not any easier.


It's a step in the right direction, and it gives some immediate value: you can see which functions don't actually need the MainWindow and can therefore be tested conventionally (you might argue that those were never actually UI tests, but in practice you'll end up using your UI testing techniques for things that don't actually use UI if you can't tell), and you're nudged towards only passing it where it's needed; also you could try to mock or stub it, which might cover at least some of the simple cases.
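A small Python-flavoured sketch of the idea (all names hypothetical):

    class StubWindow:
        def __init__(self):
            self.title = None
        def set_title(self, text):
            self.title = text

    def format_title(doc_name, dirty):
        # needs no window at all, so it can be tested conventionally
        return doc_name + (" *" if dirty else "")

    def refresh_title(window, doc_name, dirty):
        # the window dependency is explicit, so a stub is enough in tests
        window.set_title(format_title(doc_name, dirty))

    def test_refresh_title():
        stub = StubWindow()
        refresh_title(stub, "notes.txt", dirty=True)
        assert stub.title == "notes.txt *"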


Global variables are fine, mutable global state is considered bad style.


You've heard it hundreds of times over the course of your career and yet you never once questioned it? Either you're exaggerating to make a rhetorical point, or you have such an apathetic attitude towards the issue that you can't (or haven't tried to) reason about why it polarizes people.

Taking your comment in good faith, not all global state manipulation is equivalent. Depending on how you do it, structured global state manipulation could have you end up with something like Postgres, where you have orderly read and write interactions that you can reason about with set theory and transaction monotonicity. It could also mean something like using an in-memory cache or session store to persist temporarily durable data. Any kind of structuring like this around what you can read and write and for how long gets you further away from the idea of globals, and that's the point. It's a tool that doesn't reward reaching for it prematurely.


I write or deal with a lot of C, unfortunately. I try very hard to not have global variables, and to minimize sharing between threads. When I pick up a C codebase, one of the first things I do is build it and inspect the object files to see what globals exist. The same can be done in C++, and should be. Use inheritance sparingly. Don't use exceptions if at all possible. Use modern C++ as much as possible, and borrow ideas from Haskell/Rust as much as possible. I'm thinking of https://stackoverflow.com/questions/9692630/implementing-has...


I am so glad that when I got to use C, I already had a good schooling in modular programming languages behind me.

On my own projects, like university assignments, I would treat each translation unit as a kind of module: anything that for whatever reason could not be in a handle structure would be an internal static (years later I started using TLS instead), and in some cases I used incomplete structs as a means to avoid the temptation to directly access internal data.


Game development is a rapid prototyping adventure that is fuelled by the fact that what you are producing is ultimately a form of art. Architectures are based on abstraction, and abstraction is ultimately mindful ignorance; in this case of specific requirements or specific goals, which are going to change because you are creating art. You are going to find out as you continue to develop that technical debt builds because the changing requirements create conflicting workflows, which is why you get spam in the header. It's a lot faster to prototype something through duplication, cobbling, or refactoring, then later on use automation to remove the chunks of code that are not used and reduce line count by creating utility functions, because at that point part of the project is set in stone and the project is going in one direction. Things will gyrate back and forth between messy and clean, and hopefully you have the budget to refactor to clean before you ship, as modders don't like dirty game code.

Games are a simulacrum of reality and reality doesn't say properties of two different objects can never, ever, interact with each other; that's why you have the abuse of global variables to store state and also why there's a rich speedrunning community using all sorts of hacks in games to speed up their playtime due to unforeseen edge cases. If you build a model of reality, you're going to be doing R&D learning how it interacts with itself, just like we do today!

Nobody wants to play a game with a static workflow.


What stands out is how apologetic you are for pointing out that a language might be worse (gasp!) than another language. When did this "all languages are roughly equal, and if you say anything else you're a zealot" ideology get so widely entrenched in our industry?


As someone who came up as a C++ game dev, I ran into tons of people who acted like people using managed languages were automatically inferior programmers. I even imbibed this belief a bit myself.

This was a view purely sourced from ignorance. There were people creating awesome things with Java and Python at the time that I and my contemporaries could probably never have coded up.

It was quite embarrassing when I came to realize the combo of ignorance and arrogance I was working from. So now I tend to bias toward assuming most languages people are working with are useful and warrant some amount of respect. I try to only criticize languages I'm extremely familiar with and have had the opportunity to see bad patterns repeatedly emerge from in a variety of code bases.

Basically, I think we can call some languages "bad" or "good"; it just takes a lot of evidence, and I'd rather avoid ranking them altogether.


> I also understand there are situations where despite its shortcomings it is the right choice.

Would you say the reasons for choosing it are not inherent to the lang itself but come down to things like experience of the team, availability of libraries/ecosystem, and the need for mature/fast compilers?


Can't speak for parent, but in our case it's the only choice with zero-overhead abstractions and good cross platform support (Obj-C++, Android NDK, WebAssembly, Linux for tests). I wish Rust were there, but it's not.


Exactly. I remember having discussions with folks about using D for games like 10 years ago and it's never gotten there either.


That's a great metaphor for language smells! Some more anecdotes:

- Python shepherds you into using list comprehensions, even when it's almost always premature optimization and much harder to read than a loop. As a language smell that's not bad, it's just the worst I could think of in a post-v2 world. Luckily there's `black`, `flake8`, `isort` and `mypy`.

- Bash shepherds you into using `ls` and backticks, useless `cat`s, treating all text as ASCII, and premature portability in the form of "POSIX-ish". Luckily `shellcheck` takes care of several common issues.

- Lisp shepherds you into building your own language.

There's also tool shepherding:

- IDEA shepherds you into refactoring all the time, since it's the only IDE which does this anywhere near reliably enough. (At least in Java. In other languages renaming something with a common name is almost guaranteed to also rename unrelated stuff.)

- Firefox shepherds you into using privacy extensions.

- Chrome shepherds you into using Google extensions.

- Android shepherds you into installing shedloads of apps you hardly ever use.

- *nixes other than Mac OS shepherd you into using the shell and distrusting software by default.

- Windows and Mac OS shepherd you into using GUIs for everything and trusting software by default.


> Lisp shepherds you into building your own language.

the feeling of the racket community is that you build a DSL or DSLs in your code all the time, in any language, so why not take it seriously and codify your DSLs?


My personal feeling about this is that shared base language mechanisms such as functions, classes, control structures, interfaces, imports, properties etc. allow you to reason locally about some files. You don't need to read the whole library in order to understand one piece of it.

With macros this goes out of the window. You have to read all the custom macros before you can understand what their behavior is.

With Haskell the same issue exists with complex monad stacks and control libraries like lens.

The ability to analyse a tiny piece of a big system is a major factor in building those in a manageable way.


The beauty of macros is that you can just expand them in-place and read the expanded code.

Of course, for really complex macros the expanded code might be hard to read, but I guess that means "write nice macros".


I don't think list comprehensions are used to improve performance. One reason to use them is to improve readability, as the execution doesn't jump around with continue/break etc.
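For example, these two produce the same result, but the comprehension tells you up front that it only builds a new list, with no breaks, continues or other side effects to look out for:

    words = ["foo", "bar", "baz", "quux"]

    # comprehension: a single expression, no control flow to trace
    short_upper = [w.upper() for w in words if len(w) <= 3]

    # loop: you have to read the whole body to see nothing else happens
    short_upper_loop = []
    for w in words:
        if len(w) <= 3:
            short_upper_loop.append(w.upper())

    assert short_upper == short_upper_loop == ["FOO", "BAR", "BAZ"]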


> - Android shepherds you into installing shedloads of apps you hardly ever use.

I disagree with you on this one but I could be in the minority. I have about 8 apps that I trust and that rarely change. I don't go looking for new apps to install and I resist attempts to use the app version of a website.

> Lisp shepherds you into building your own language

I see the propensity of developers to build DSLs in all languages. I think the act of programming shepherds us into creating elaborate abstractions.


> and I resist attempts to use the app version of a website.

If you weren't being shepherded into installing apps, why are you resisting?

I could install apps on my old Nokia dumb phone, but I never resisted installing apps. It never really seemed like it was worth the trouble to install one.

I actually looked into it once, despite the system shepherding me away from installing apps.


Bash shepherds you into ignoring errors and using maybe-blank values everywhere.


You can improve that situation somewhat by starting scripts with `set -o errexit -o noclobber -o nounset -o pipefail` and `shopt -s failglob` to fail fast.


Yes, this makes things better.

But then you will have to deal with the non-negligible number of programs that use the exit value to return information. It is still better to place exceptions on failing code than to ignore errors that can wipe out your entire system, but it's not really good either way.


I never missed any IDEA features in Eclipse and NetBeans, including refactoring.

And yes, I do know IntelliJ, as I have to put up with it on Android Studio.


>since it's the only IDE which does this anywhere near reliably enough.

Have a look at the Language Server Protocol implementations for java (eclipse jdtls and to a lesser extent boot-ls). Not all the features from IDEA are there, but the gap is closing.


> Python shepherds you into using list comprehensions

I'd argue LCs have more limits than loops, and hence are easier to read (because you can make more assumptions about what it does). That said, I find nested comprehensions harder to read.


> Windows and Mac OS shepherd you into using GUIs for everything and trusting software by default.

Not sure how familiar you are with modern Windows, but you can try PowerShell. You can manage everything in Windows with it, and it’s a very cool shell language with a lot of features.

(It’s also open source and multi platform)


I agree with what you said, but I also agree that Windows shepherds you into using GUIs for everything. It's possible to use powershell, but Windows makes GUIs seem like the "natural" way to do things.


Do you have any good sources for places to start learning PowerShell? I'm pretty comfortable in bash, but PowerShell scares me.


As someone who used PowerShell for a year and has used Bash for 10+, PowerShell is much less scary than Bash. The scariest thing about PowerShell is .NET, which (at least 8 years ago or so, so cue someone correcting that below) had extensive but often low quality documentation, with undocumented features you basically had to use to write useful code, uselessly trivial code examples (think "`0 + 0 == 0`" as an example of arithmetic) and some bad names. That's not to say Bash is better, just that they still had some way to go.


Microsoft online documentation is very good. I initially spent just a weekend trying to create a nice prompt, just because I was annoyed by the ugly “PS >” when using Windows. The online documentation had everything I needed, and I enjoyed it so much that I switched all my systems to using PowerShell as their default shell.

You can start at https://docs.microsoft.com/en-us/powershell/scripting/learn/....

A quite nice feature is the command Get-Help:

Get-Help <a command> -Online

That will direct you to the documentation page specific to a command. That way you can discover the environment little by little by experimenting.

Also, the auto completion for all commands and their arguments is helpful to learn what is possible.

Also, if you want to keep your bash habits, be sure to install powershell 7, and enable the emacs edit mode (which is similar to bash defaults, with C-a, C-e, etc):

Set-PSReadlineOption -EditMode Emacs

https://docs.microsoft.com/en-us/powershell/module/psreadlin...


The term for this in the field of human machine interaction is (perceived) affordance, popularized by Norman.

This sounds like less active guidance than nudging or shepherding. Creating affordances is still an active design choice though.

https://en.wikipedia.org/wiki/Affordance

http://johnnyholland.org/2010/04/perceived-affordances-and-d...


Not sure the concept applies cleanly here. Affordance is about perceiving possibilities from an interface or environment.

For the subject of programming language idioms, the main factors are restrictions or patterns the language offers, and how naturally they fit within their context, not just perception by the user.


I’m curious, how do you make the distinction that restrictions/patterns/context are not just as much perceived properties of the interface to the device you are programming?

Are you perhaps perceiving perception as just visual perception? (See what I did there? :)


Paul Dourish reviews some work on this topic in his book Stuff of Bits, see pages 8-9: https://mitpress.mit.edu/books/stuff-bits

It's related to, but definitely not the same as, linguistic relativism. Programming "language" might be a bit of a misnomer, because it creates a false equivalency to natural language. Just as different subfields of mathematics were created to solve different problems, so too were different programming "languages" inspired by different subfields and their notations. With that view, it's unsurprising that some ways of doing things highlight certain methods of solving problems and obscure or impede others.


Working in a Java/Kotlin environment, everyone always handles all null cases when working in Kotlin, but they are frequently overlooked in the Java applications. Many of the Java apps compensate with more levels of catch-all exception handlers targeting unexpected NPEs. The only time we get NPEs in Kotlin is when Kotlin allows them because of the Kotlin/Java interop problem.

Working with Javascript/Typescript, we need to rely on linters to enforce safe practices in Javascript.


Something I like about Rust is that it shepherds you to fast-running programs and away from null pointer errors.

Something I like about Go is that it shepherds you to write code any other Go programmer can follow easily.

Something I dislike about C# is that it has the tools to let you write very, very fast code but shepherds you to use non-devirtualized interfaces over heap-allocated classes tied together with LINQ overhead.


> Something I like about Go is that it shepherds you to write code any other Go programmer can follow easily

Sure, the syntax and indentation levels are all the same, but those aren't really the difficult parts of programming. The difficulty comes from abstractions, indirections and other abstract things that Go, just like any language, lets you do however you want.

There are of course codebases made in Go where the indirections makes no sense and are hard to follow, just as in any language.

What Go shepherds you into is making really verbose code (well, compared to most languages except Java I guess), where everything is explicit, unless hidden by indirection. This is both a blessing and a curse.


Go limits the number of available abstractions--yes, at the cost of verbosity--but compared to a language like C++, it's minuscule in size. The end result is that you can keep the entire language in your head, and you don't have to go digging into the bowels of the internet to figure out how "Turing-complete template metaprogramming" works since the last developer decided to use that cool feature they just discovered.


> Go limits the number of available abstractions

Again, it doesn't, as abstractions are not built from syntax of the language but from the indirections developers create with the syntax provided. I agree that the abstractions the standard library provides are smaller than in other languages, but outside of that, anyone can create their own abstractions (as it should be).


I would say Go doesn't limit the number of abstractions, it limits the number of ways in which abstractions can be created. Or maybe better put, it doesn't provide a very rich set of abstraction facilities.


Anecdotally I always find Go code bases very easy to read compared to almost any other language. Is it not reasonable to assume this is because of "shepherding"?

Otherwise, what other effect would cause this? Perhaps my sample is unrepresentative, or perhaps Go programmers are somehow more competent?


Do you find it easier to read or to understand? I find Brainfuck code incredibly easy to read (there are only eight characters/commands in the language!) but almost impossible to understand.


I find it both easier to read and to understand.


That is the beauty of languages like C#: productivity and security first, while providing the tools to go down the performance well if actually needed.


To allocate things on the stack you have to either use only value types or use unsafe code, which is fine for small performance-critical sections but will introduce bugs and hinder productivity a lot if used across large code bases.


You no longer need unsafe to stack allocate if using one of the Span types.


I spend a fair amount of time in C# and don’t think about performance a lot unless it’s obvious, O(N^2) type of stuff. I’m always trying to level up so I would appreciate some tips.

What tooling are you referring to that will make C# really fast?

Also, what are you referring to with non-devirtualized interfaces vs heap classes with LINQ?


He talks mainly about stack vs heap allocation. Stackalloc, pointers and such.


What's the best way to get started becoming a C# developer? What kinds of C# programmers are there? I know fullstack web stuff and some Python, but no idea where to start with more proper languages.


There's a crazy amount of high quality guidance out there, a mere Google away. It's a popular language!

Step one: Visual Studio Community Edition is free. Go install it. Don't waste time with Visual Studio Code, it's a toy compared to the proper VS.

When I learn a language, I like to start with the basic primitives: functions, loops, variables, etc...

Then, explore the types and the standard library. Note that the .NET Framework has a fantastically huge base library, way bigger than other languages with the exception of Java. It's already a bit of a task just to flip through the list of available classes let alone functions!

Do programming challenges like Advent of Code.

Start poking away at Real Problems.

Go from there...


C# does these things out of the box:

- Desktop application development (Windows and macOS)

- Web Development (API's with .NET Core, traditional Rails-style stacks with ASP.Net)

- Game development (Unity)

If you want to get into C#, traditionally, these are the 3 things you can pick up and find lots of resources on.

Keywords to search on:

- Desktop Apps: WPF, XAML, Xamarin

- Web Dev: ASP.Net, .NET Core, Blazor

- Game Dev: Unity Game Engine

Good luck and happy learning/hunting!


It is the language plus the community. And not just the language.

As an example, there is nothing about Ruby that makes it more or less prone to monkey-patching than many other dynamic languages. But once a certain number of popular frameworks did that, there was no getting away from that. (Rails even has a convention around where you put your monkey patches.)


> there is nothing about Ruby that makes it more or less prone to monkey-patching than many other dynamic languages.

Python disallows making changes to fundamental types like `int` and `list`. It’s not possible for a Python framework to support something like Rails’ `2.days.ago`.

Interestingly, I don’t think this was an explicit decision made when designing Python - it’s just a side effect of the built-in types being written in C rather than in Python itself.


> Python disallows making changes to fundamental types like `int` and `list`.

Not directly via Python, but you can achieve it if you really want: https://github.com/clarete/forbiddenfruit :D


True, more things are objects in Ruby than Python. But you can still monkeypatch classes in Python.
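For example (contrived), patching a class defined in Python works just like in Ruby; it's only the C-implemented built-ins such as int and list that refuse:

    class Greeter:
        def greet(self):
            return "hello"

    # monkey patch: swap the method on the class at runtime
    Greeter.greet = lambda self: "hello!!!"
    assert Greeter().greet() == "hello!!!"

    # ...whereas the C-level built-ins reject attribute assignment:
    try:
        int.days = property(lambda self: self)
    except TypeError as e:
        print(e)  # setting attributes on built-in/extension types fails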


For anyone (like me) who doesn't know what monkey patching is, wikipedia says it is "dynamic modifications of a class or module at runtime, motivated by the intent to patch existing third-party code as a workaround to a bug or feature which does not act as desired"

https://en.wikipedia.org/wiki/Monkey_patch


For anyone further curious, this is how Python's `mock` module works - structured application and removal of monkeypatches.
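Roughly like this (a small illustrative sketch using the stdlib's unittest.mock):

    from unittest import mock
    import os

    def entry_report():
        return "%d entries in cwd" % len(os.listdir("."))

    # mock.patch swaps os.listdir out and guarantees it is restored
    # afterwards -- structured application and removal of a monkeypatch
    with mock.patch("os.listdir", return_value=["a.txt", "b.txt"]):
        assert entry_report() == "2 entries in cwd"

    # here the original os.listdir is back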


You're talking about ActiveSupport probably, and I really love it. It augments Ruby in a very beautiful way.

I really like 2.days.ago

There were zero times where I wished it wouldn't do that.

But to each his own.


Supporting postfix operators, extension methods, etc. is different from monkey patching. Scala has "2 days ago" conversion to Duration type via implicits, but it doesn't monkey patch Integer for that. C# has extension methods and it can do the same without monkey patching.

And monkey patching is not necessarily the end of the world, it's just error prone if multiple libs try to do it (on the same targets) without being careful.


Never said you can't reach the same outcome with other languages. I guess C# can do DateTime.Today.AddDays(-3). It's all turing complete, so yes, you can do it in assembly as well.

I just really love Ruby's (Rails actually) 2.days.ago and never saw the harm in that. It's readable, easy and nice.


Readable yes. I don't feel it is writable. Discovery only happens when you read what someone else did or you guess unless you read over the entire docs related to $thing. It is a major reason I dislike Ruby as these things are encouraged. In testing, it is even worse.

Made-up DSL: Expect(myFunc).WithParams(4).To().BeCalled(1).Returns(8).InUnder(2).Milliseconds().WithEpsilon(30).Microseconds().

You have to be fully familiar with a custom DSL to write this kind of test. I prefer Go's principle of "tests are just code." It is more verbose to write, but no custom DSL to learn. The more verbose code is often not harder to read either, as it is the same code style as the application too.


You'd be lucky if it was all just a chain of methods. In reality one of those links would be replaced by a space, which is an implicit method call (invisible parentheses).

I find this infuriating because the DSL isn't discoverable. You have to know what each method returns. Is it self, or is it a value.

Spot the difference: Expect(myFunc).WithParams(4).To BeCalled(1).Returns(8).InUnder(2).Milliseconds().WithEpsilon(30).Microseconds().


Why isn't it discoverable? You know what type is returned by "To()" and if that has a BeCalled() method you can call it :)

Now, the problem that requires an IDE, and that many curse Scala for, comes in when you have implicit things. So even if "To()" returns a WhateverTestDslElement, that might not tell you anything, because there might be a hundred implicit functions that convert WhateverTestDslElement to OtherFancyDslThingWithTheMethodYouAreLookingFor.

Of course, at this point you are basically back to reading the documentation, which is kind of what C programming looks like.

That said efficient and useful and really "domain specific" layers are great, and many systems would benefit from one. (Which is sort of what the whole clean code "entities" is about - https://blog.cleancoder.com/uncle-bob/images/2012-08-13-the-... )


Ruby's default test suite is minitest, not RSpec. I agree that RSpec can be overkill sometimes. And well, libraries and frameworks expect you to go over their docs if you wanna use them, that's kinda universal. Can you write a Spring MVC app without going over the docs?


Ever encountered code where you have to load 2 modules in a particular order or else they don’t work? Or maybe you didn’t figure out why a module that works fine for others doesn’t for you.

Monkey patching is one of the top causes of that. And is part of why Ruby projects tend not to scale in complexity as well as Python.


It's quite rare to be honest, so no. And I've been doing Ruby for a long time. It's not like every Ruby developer out there tries to monkey patch esoteric things just to confuse the enemy. And again, monkey patching is possible in every dynamic language.


I"ve seen a developer break 20000 websites at once because he monkey patched a jquery method in a widget our company built, so lets not make this out to be a Ruby thing.


If it is rare, then I must be unlucky. Because I'm not a Ruby developer, but I have had to help people diagnose exactly that problem more than once.


I like the concept and I particularly like the way I feel Nim shepherds me:

* I very rarely need to come up with a name for a function or other identifier. The correct name can be reused for multiple use cases thanks to the type system and proc overloading

* to spend a little time designing the interface before jumping into the code

* but also to think about what I really need to accomplish and get to it instead of building a grandiose architecture

* to have consistent APIs

* to steer away from OOP

* to rely on minimal dependencies and to be kind of minimal in general

* to use the correct tool for the problem (macro are not easy to write and that’s good otherwise you will abuse them. Instead they are great to use)

* to build maintainable code

* ...

I would be interested in what other nimmers consider good shepherding.

One might also think of what the bad shepherding of Nim is, although nothing comes to mind at the moment.


> * to use the correct tool for the problem (macro are not easy to write and that’s good otherwise you will abuse them. Instead they are great to use)

Just wanted to add: Nim has Macros (which are comparable to lisp forms) and Templates, which are closer to C and Lisp macros and are much harder to abuse; It also has built-in inlines and generics, which are essentially the prime use for templates and macros in languages that lack those.

It also has term rewriting macros, which let you state things like "whenever you see x*4 and x is an integer, use x shl 2 instead", so that you CAN apply your knowledge in the places where you know better than the compiler, while still writing the code you meant to write (multiply by 4) and not the assembly you wanted to generate.

Right tool for the right job is a very good description, which I don't think many languages can claim - definitely not minimalist ones like K, nor kitchen sink ones like C++ (no, an accidentally turing complete template expansion system is NOT the right tool for compile time computation).


The author is fairly insightful on various human behaviors in computing systems.

You might enjoy his Meson build system[0], which has a full manual [1]. It's already used in a bunch of high-visibility projects [2].

[0] https://mesonbuild.com/

[1] https://meson-manual.com/

[2] https://mesonbuild.com/Users.html


When I program in C++, there are lots of things one must consider. Should this be const, public/private, virtual, should I create a class, should I first create an abstract base class, should I create a factory, should I implement the PIMPL idiom, should this be a template function. The list of concerns is nearly endless. When I write in Python I tend to mainly think about solving my problem. In C++ I will naturally think more about performance and in Python that concern comes only if something seems slow. I make no claims about which is better, just that the language definitely affects me and the approach I take.


This also can change over time. For example, 15 years ago PHP shepherded you to include every file you were using explicitly, making it hard to reason about a given project if you weren’t the creator.

A big effort ensued to change that — class autoloading became the standard, and a large community arose around that standard.

Similarly, JavaScript shepherded you towards some bad practices that the community has now found ample remedies for.


> This also can change over time. For example, 15 years ago PHP shepherded you to include every file you were using explicitly, making it hard to reason about a given project if you weren’t the creator.

Huh, that seems backwards to me? Wouldn't the explicit approach make it more obvious what scripts were relevant?


That's been my experience; in the before times, I could step in and look at a server with issues and easily find the source (as long as nobody was too clever about paths); with autoloading I just have to grep the whole thing and hope.

I don't like a lot of what 'Modern PHP' has become though, so clearly the community is going a different direction without me, and I guess it's working for them.


As someone who has written a lot of PHP and is still maintaining a few PHP codebases... I don't miss the old ways at all. But I can see how the parent comment can be misleading for someone who's not too familiar with PHP (apologies if my assumption is wrong).

TL;DR in many ways modern PHP is more explicit than old PHP

The old way wasn't as explicit as the comment makes it. It was just painful. Imports in PHP are global, so every file can use any function/class already imported. You could explicitly define all dependencies at the top, but that's not what the language shepherds you to. In reality most imports were implicit (explicit in ANOTHER file), and you'd just import what you needed that hadn't been already imported (I don't think I've ever worked in a codebase where every file has all dependencies declared at the top). The old ways had so many downsides:

- moving things around could produce fatal errors just because the implicit imports changed

- any framework/CMS using require instead of require_once would limit your ability to import files (including the same class twice creates a fatal error)

- very poor support for IDEs

The modern way is quite pleasant to work with:

- everything should be in a namespace

- every used class should be declared at the top of the file (like Java)

- great IDE support (auto imports, auto complete, click to go to definition)

Yes there's some extra complexity in the autoloader, but in my experience it's negligible. If you use an IDE you might never even see it.

So yeah I don't think PHP got any less explicit. The opposite actually, modern libraries/frameworks tend to be much easier to navigate.


Yeah, my explanation wasn't good. Yours is much better – thanks!

If anyone's curious what the old way looks like, have a look at WordPress's codebase – lots of imports scattered around everywhere with no one single class-loading system.


Yes, modern PHP is different and better than the PHP everyone cranked out in the 00s, but auto-importing is orthogonal to the improvements. The big change is to use classes everywhere.


Using classes everywhere was made much easier with the auto-importing.

In fact the auto-importing means that regular functions (which have no auto-importing mechanism) are now used less than they probably should be.


Oftentimes not, because you'd end up importing everything as a precautionary measure, and there was a big temptation to mix together many classes and functions in the same file.


Yes! Another example of this is C# and F#. Both build on the same .NET base and both are Turing-complete languages with elements of OOP and FP, but wow, the code that those communities write is completely different. They each steer you in different directions.


Usually people say language X "encourages" you to do Y, or "biases" you toward it, but "shepherds" is a fine word too.

C, C++, and Rust shepherd the programmer to use array-of-structures memory layout (though it's usually not great for vectorization).


Do you have examples of languages which don't encourage you to use array of structures?


An imo really nice but lengthy explanation of this is given by Robert Martin here: https://cleancoders.com/video-details/clean-code-episode-0

He calls this restriction by paradigms.


Thank you for sharing this. This is a much more thoughtful overview of the topic.


Is shepherding present in every aspect of life?

My guess would be that it is and that shepherding is what we talk about when we say that we can learn from every aspect of life.

If what I wrote is correct, then shepherding is the teacher of reality. But I guess it's on us to decide when we've learned enough and move on.

I was at first wanting to ask if shepherding is present in video games as well, but then I realized what shepherding could actually be.

SHEPHERDING; The part of an aspect that _can_ teach you something.

_can_, because it's up to you to decide if you'll learn anything.

Is shepherding always negative or can it be positive?

Also, if we would always strive to fix what shepherding teaches us, would that mean that in infinite amount of time we would reach perfection?

And, last question I swear, is shepherding subjective or objective. Or is it both?


Shepherding is when some behavior is encouraged by being made easy/rewarded, or by other behavior being made harder/punished. So shepherding absolutely happens in every aspect of life - systems move toward low energy states, water flows downhill and living creatures tend to follow the path of least resistance. When parents do it, we call it parenting. When governments or companies do it (via taxes/subsidies or pricing), we call it nudging. When groups do it, we call it socialization.

It's a really useful tool to analyze systems and organizations with - not by looking at what they make possible, but also by what they encourage and discourage - most of the time, the latter are what really matter (the average case is usually what determines the long-term impact of something, not the best or the worst case). When Netflix has auto-play at the end of a stream, they shepherd you toward binging. When Animal Crossing has things you need to wait wall-clock time for, they encourage you not to binge. When free-to-play games come with loot boxes, they don't force you to do anything, but they might still be shepherding gambling-like behavior.


https://en.wikipedia.org/wiki/Nudge_theory

"Nudge is a concept in behavioral science, political theory and behavioral economics which proposes positive reinforcement and indirect suggestions as ways to influence the behavior and decision making of groups or individuals. Nudging contrasts with other ways to achieve compliance, such as education, legislation or enforcement. "

"A nudge makes it more likely that an individual will make a particular choice, or behave in a particular way, by altering the environment so that automatic cognitive processes are triggered to favour the desired outcome."


This is also why CMake, even Modern CMake, is so bad. You have to fight tooth and nail for something even remotely maintainable.


Title reminds me of "guide you to the pit of success" (i.e., a slippery slope with a positive ending), which IIRC I first encountered in a post by Zeit cofounder G Rauch, writing about NextJS.


That's exactly what I thought about. I think this is the blog post that made this phrase popular: https://blog.codinghorror.com/falling-into-the-pit-of-succes...


This is exactly what I thought of while reading the OP article. I love this metaphor and use it often


" Yet, in practice, many Perl scripts do XML (and HTML) manipulation with regexes, which is brittle and "wrong" for lack of a better term. This is a clear case of shepherding. Text manipulation in Perl is easy. Importing, calling and using an XML parser is not."

Importing, calling, and using an XML parser is straightforward in Perl. The same for HTML: think about what you want to process, what you want to ignore, and write your callbacks accordingly.


This struck me as a clueless part of the article.

Perl coding was a significant portion of my career, and much of that involved web services and XML. I've never come across code that attempted to parse XML with regexes in Perl. It was always done with easy-to-import-and-call XML DOM or SAX parsers.


You read the code. The computer programs you.


There is probably more truth in this than people realise at first glance.


This is a very good take. I've had a whiff of this idea for a while now, but never fully formed it. As the author says, it's something that's been missing in lots and lots of language and framework debates and has caused people to talk past each other over and over. It's also a very helpful tool when examining why the cultures around certain technologies are the way they are.


I believe these things, and that is why I think different programming languages are good for different purposes and why I use different programming languages rather than only one kind. (For example, Haskell isn't a programming language I use much, although I do use it sometimes, when what I'm making seems like a good fit for it.)


Haha, the idea that a "proper" scripting language is more portable than a shell pipeline is amusing. Sure it is, if you can guarantee everyone has the same version of Python, and all the right libraries at the right versions, etc. But if you can do that, you can probably have them all install the GNU userland on their Macs, too.


Be the shepherd, use Lisp :D

Seriously though, what other languages other than Lisp (all the mainstream ones at least) give you the freedom to change the language and/or create DSL's with the same ease? And you can still do your 'bare metal' in C if you really really need to and bring it in.


I actually think the OP perfectly explains the core problem with Lisp: Part of the value of shepherding is getting everyone on the same page. When everyone's their own shepherd, nobody is on the same page.


Javascript is this on steroids: It wants to be inspired by Lisp, by Self, by Java, and the resulting melange smells like every paradigm while fitting none of them. Trying to write applications in Javascript is trying to corral the minds of hundreds of people who each had a different idea of what Javascript is and what it does and what it wants you to do. The resulting system is doomed to total incoherence.


JavaScript started out as a sort of "Java-flavored Scheme", actually

However, I'd say that modern JavaScript - the language - is much more like C++ than Lisp. It isn't a void of shepherding, so much as shepherding-by-committee.

However, due to how easy its ecosystem has made package management (in contrast with C++), much of that quagmire has been papered over with much more strongly-opinionated (shepherding) frameworks and dialects. This hasn't completely solved the "C++ problem", but it's gone a long way towards mitigating it. Working in modern JavaScript may be wildly different between frameworks, but it's reasonably consistent between projects that use the same frameworks.


In other programming languages there is a consistency between frameworks and a default way to do things: there's no shitload of frameworks each reinventing the same wheel.

Some things are easily doable without frameworks.

In Javascript land you don't learn the language, you learn frameworks. And, some "JS frameworks" like Angular do not even promote JS, due to how terrible the language is.

If your current framework goes out of fashion in favor of the next shiny thing, you are out of luck.

I was unfortunate enough to have had to learn one of the JS frameworks, because our web apps are APIs in the backend, I have to do the frontend too, and the people who started the projects were fans of a particular JS framework.

For upcoming projects I'll use Blazor after it becomes production ready, no more JS frameworks for me.

I don't dislike JS; I quite enjoyed old-school ES6 + jQuery. I even like Vue because it is very customizable. I have a strong dislike for big opinionated frameworks like Angular.


What's mostly resulted in Javascript land is people using opinionated (and often severely constrained) boilerplate generators like create-react-app, with opinionated linters like Prettier (alongside transpiling Javascript from other languages like TypeScript), that force a coding style despite the myriad ways to achieve things. I've actually found that to mitigate a lot of these issues well, but it can't deal with the boatload of "bad" advice online to wade through.


Even with opinionated generators, what happens when you pull in a library that comes from a very functional mindset, a second that treats JS like Smalltalk, and a third that was thrown together by a novice and has become the de facto standard for its purpose?

The issue doesn't just lie within the baseline of the code you write, but in how many disjoint dialects that code must interact with and partially conform or contort to in order to engage with.


If you have a say in the matter and you have to use a framework, take a look at Vue. It's not opinionated, it doesn't get in the way, and it doesn't come with the kitchen sink.


>> When everyone's their own shepherd, nobody is on the same page.

I think that's more of a management problem than a language problem. cheers


It's been said, "Lisp takes things that are technical problems in other languages, and turns them into social problems." Don't underestimate the human factor of team-based development.


Haskell, OCaml, probably F# too.

Can also do 'bare metal' as well.

I think the advantage to Lisp is that the programmer can generate and evaluate arbitrary expression trees at run time.

I'm not sure about the others but I recall Haskell has some difficulty with this. It's possible but it's not supported and not trivial to do.


>> I think the advantage to Lisp is that the programmer can generate and evaluate arbitrary expression trees at run time.

That's the one! Code is data - data is code. While this can be done in other languages, it isn't done without considerable effort or going 'off road' so to speak; macros are Lisp. cheers.


Isn't eval part of the cause of many security holes in other languages, e.g. Javascript? That is, not eval alone but combined with an unintended path from unvalidated user input to the snippet that gets eval'ed.

Is there anything special Lisp does to prevent user input from getting into eval'ed data-code? Or any sandboxing provided by Lisp's eval?


You never ever eval user input.


I thought this was super common knowledge, yet I reviewed a co-worker's pull request the other day and found files filled with eval (easily 30 usages) as well as global state (treated and mutated as if it were thread-local).

The most annoying part was that what they wanted to do didn’t even require eval and they refused to fix it even after I’d found a safe, non-eval way to do it.


This rule is true for Javascript and PHP too, yet it happened all the time. If Lisp relies heavily on eval, what specific protection measures does it employ?


Mainstream Lisp dialects do not "rely heavily" on eval. Its use is broadly discouraged in favor of other mechanisms like apply or macros.

Newbies sometimes learn about backquote before learning about apply, and when they need to pass a dynamic list of arguments to a function, they end up writing (eval `(fun 1 2 ,@args)) instead of (apply fun 1 2 args). Or doing some metaprogramming using (eval `(defun ...)) in the middle of a function, instead of making a function-defining macro.

In JavaScript and PHP, not to mention numerous other languages, eval is the only meta-programming you have. If you need to generate code, you end up using eval. Moreover, eval is textual. Textual code requires very careful escaping to avoid injection problems. Lisp's eval is AST-based, so it doesn't suffer from that.

If I have a user-data variable that holds untrusted user data and put it into an evaluated code template, as in (eval `(list ... (some-api ',user-data) ...)), I can completely trust that user-data will not "go live". It's inserted under a quote, and that's that. There is no way that content can bypass the quote, no matter what object it is.


> Haskell, OCaml, probably F# too.

Lisp has macros, read-macros, and eval. These things enable DSLs. You're saying strongly-typed functional languages have features of equivalent power for this purpose?


Yes, for example F# has quotations, type expressions and type providers.

Also, .NET has attributes, compiler plugins and expression trees, which allow you to do similar stuff even in C#, although not as straightforwardly as in Lisp.

On the Java side, you also get the attributes (aka annotations), compiler plugins, AOP (yay CLOS interceptors).

Haskell has Template Haskell and OCaml has ppx extensions.


Those are so distant from what Lisp macros achieve. Also, I can't believe you brought up Java annotations in this discussion.


They are, but as they say back home, those who don't have dogs, hunt with cats.


Writing EDSLs (E for embedded) is fairly common in Haskell. I don't know if it's as easy as with Lisp macros though since I have no exposure to that.

Many DSLs are written in Haskell (Elm and PureScript come to mind) as well.


It's all about ergonomics though. In Lisp, it's super easy, and even fun with macros. They can probably do roughly equivalent things in the other languages, but it's so fucking painful it's relegated to rare use and for relatively short code sections.

For Lisp users, it's like holding the knife by the handle. For users of the other languages, it's like holding the knife by the blade.


Hum... Yes.

Not exactly equivalent, but monad-based DSLs have almost the same power as Lisp macros.

You can't escape the syntax restrictions, the same way you can't escape Lisp's syntax restrictions. And it's a bonus for Lisp (in power) that its syntax is much more flexible. But in semantics they are equivalent.


> what other languages [...] give you the freedom to change the language and/or create DSL's [...]

FORTH. It's a very similar language in that respect. One of the first things you learn how to do as a FORTH developer is to rewrite the interpreter/compiler words.


I really like Forth, and if I ever get into IoT that's what I would use over C any day. Forth works best from a clean slate though, I think (from my admittedly limited experience); working with a host OS takes a bit of the fun away, I found.


almost all languages can pull in c, so that doesn’t differentiate.

being able to dsl, you have to ask how often is that useful? what happens when 10 dsls are built into a code base and you hire a new person? how hard is it to make sense of everything?


The comment about C was more to do with perceived speed issues with Lisp, but for most situations, Lisps that compile to native code are more than fast enough for general application programming.

The point of the article rings true: languages like C/C++ and those based on them do shepherd you into how they work, and if you're doing low-level programming for drivers etc. then they work well; that's their domain (and you wouldn't need half of C++ if it stayed in that arena!). When trying to 'express' or abstract a problem, you have to shoehorn your thoughts into the language.

>> being able to dsl, you have to ask how often is that useful? what happens when 10 dsls are built into a code base and you hire a new person? how hard is it to make sense of everything?

Probably not as hard as trying to decipher swaths of source code in a language that doesn't make it easy to create the DSLs; instead you end up with APIs and code that try to hide the ugliness of being pushed around by the language.

With Lisp you are not so much creating APIs or DSLs as extending the language to suit the problem, instead of waiting for the language to catch up. This is freedom!

cheers.


This is freedom, and it does enable creativity and can indeed increase productivity. But I've also seen it hurt maintainability: the original programmer moves on and now nobody can decipher their genius DSL.


That's a shame. You hear this a lot about Lisp code, and I think the problem is that the people who really grok Lisp and write 'genius' code can be a bit blasé about documenting it.

Simple documentation for standard function definitions is fine, but macros definitely need special attention. Some say macros are over-used, but I think it's more that they're under-documented. Even if the name of the macro gives you a fair idea, documenting how it works and what it generates, with examples, goes a long way to deciphering them for maintenance. cheers.
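
For instance, a hypothetical macro documented that way might look like this (WITH-RETRIES and FETCH-PAGE are both made up for the example):

    (defmacro with-retries ((n) &body body)
      "Run BODY, retrying up to N times if it signals an ERROR.
    Returns NIL when every attempt fails.  Example:
      (with-retries (3) (fetch-page url))
    expands into a DOTIMES wrapping a HANDLER-CASE around the body."
      (let ((i (gensym "I")))
        `(dotimes (,i ,n)
           (handler-case (return (progn ,@body))
             (error () nil)))))

The docstring states the contract, and the example shows roughly what the expansion does, which is usually the part a maintainer has to reverse-engineer otherwise.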


Absolutely. In fact I am currently writing a library in Elixir (for internal company usage) that mandates generating boilerplate to help a user project. It's a really fine balance between (a) not being too clever, keeping the macro code readable, and (b) documenting the intent of the macro, its input and output, and why it is actually useful.

Many people are like "I'll document it later" and it just almost never happens. Which is, as you said, a shame.


> what other languages [...] give you the freedom to change the language and/or create DSL's [...]

And I would add Rebol & Red.


Working on a project with a C# DSL really makes you appreciate the magic of Lisp.


An even broader look at this is which run-time properties the resulting _typical_ code has.

Most languages tend to produce distinct behavior patterns and properties, such as (in no particular order):

- multi-threaded / single-threaded / multi-process

- frequency of memory safety bugs

- broken locale / unicode handling

- broken or non-standard network settings handling

- only synchronous I/O calls (or in contrast only async I/O)

- releasing memory back to OS (or not)

- fragmenting heap over time (or not)

- fast / slow text processing

- fast / slow binary codecs


"shepherding: An invisible property of a programming language and its ecosystem that drives people into solving problems in ways that are natural for the programming language itself rather than ways that are considered "better" in some sense"

Seems like most could lend a little more weight to: 'does this language align with how you want to think about/represent the problems you're solving'


The corollary for language and API designers: whatever you make easiest is what people will do.

If there is a 'right' way to do something, make that the default or the simplest calling pattern. If there is a new 'right' way, don't route the new way directly through the old way. People will think they're cutting out the middleman by keeping the old calling convention as long as possible.


> If there is a new 'right' way, don't route the new way directly through the old way. People will think they're cutting out the middleman by keeping the old calling convention as long as possible.

It's best if the old way becomes a shim onto the new way: that shows the new way works and encourages moving to it. Nobody wants to keep a shim they don't need, and if the new way can do everything, the compatibility shim isn't needed.
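
A sketch of that pattern (names invented, Lisp only for illustration):

    ;; The new entry point carries the real behaviour...
    (defun render-v2 (thing &key (stream *standard-output*) (pretty t))
      "New API: explicit stream and pretty-printing options."
      (write thing :stream stream :pretty pretty))

    ;; ...and the old one survives only as a compatibility shim over it.
    (defun render (thing)
      "Old API; deprecated, forwards to RENDER-V2 with the legacy behaviour."
      (render-v2 thing :pretty nil))

Callers of the old name keep working, while every code path exercises the new implementation.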


I totally agree.

I'd also like to add that a lot of the time in open source, it's as simple as having the examples be full examples with the error handling needed to use it in a production system. Quite often the examples shown are the "simple" ones, but they leave a lot out. I wish it weren't the case but those are often also pasted into production code.


JavaScript shepherds you into monkey patching


Shimming and polyfilling (i.e. ensuring standard functions are present and whatnot), sure. But I think most codebases avoid monkey patching (i.e. adding new functionality to built-ins).

Maybe it's up for debate if the language itself guides people in that direction, but no one seems to do it anymore, so I personally don't think it does.


The new culture around modules actively shepherds people away from monkey-patching. Whether or not Webpack is considered "part of the language" for these purposes is fairly academic, but yes, the reality is that people don't really do that any more.


There's a similar blog post by Gabriel Gonzalez which focuses on the incentives set in Haskell:

http://www.haskellforall.com/2016/04/worst-practices-should-...


I would have liked to have seen more examples of what different languages shepherd you to do.

Python, in my experience, shepherds you to import a lot of modules rather than write your own. There are so many helpful ones available, and that's kind of a reality shock if you switch to some other languages.


I like this. The insight is true of tools generally, not only languages. It's true at the feature/affordance level. Also at the tool level. "To a man with a hammer everything looks like a nail."


My take was always that language quality is not about expressive power, because it's fairly difficult to come up with a non-Turing-complete system anyway; it is about what is easy to understand and modify.


Isn't it a balance?

VB6 code was easy to understand on one level, but its lack of expressiveness led to a lot more code. On the other hand, when I look at Ruby code there is too much syntactic sugar for non-Ruby programmers to understand it without looking some things up.


ZX Spectrum Basic shepherded you into using the GOTO keyword, which I've heard is the single most evil thing to do in software development.


In Speccy Basic a GOTO jumped to a line number, so if you added code that changed your numbering, the GOTO broke. That was managed by using numbers that left space (10, 20, and so on), but it was still horrible in a big program. Another issue was that you couldn't come back (that's what GOSUB was for). GOTOs also made code hard to follow because the execution jumped around all over the place, but that was less of a problem.

These days languages that have GOTO usually jump to a label so it's not quite as bad. They're still likely to end up as spaghetti though.

And yes, I am old.


Modern languages usually only support goto within a single function. This significantly reduces the potential for spaghetti, compared to 8-bit BASICs that allowed you to jump anywhere within a program.
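
Common Lisp's GO is one concrete example of that restriction: the target tag must sit in a lexically enclosing TAGBODY, so a jump can never leave the function.

    (defun count-up (limit)
      (let ((i 0))
        (tagbody
         again
           (when (< i limit)
             (incf i)
             (go again)))   ; legal: AGAIN is in the enclosing TAGBODY
        i))

    ;; (count-up 5) => 5
    ;; (go elsewhere) with no enclosing TAGBODY tag is simply an error.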


While this topic is fun, it smells like complete BS to me.

Why? Because the problem of people choosing the wrong tool (not the tool "shepherding" the person into using it) is older than programming [1].

We can do better than to (once again) make up an elaborate way of blaming our tools.

[1] https://en.wikipedia.org/wiki/Law_of_the_instrument


What is the right tool in each of these cases? If you have a Perl codebase, do you tackle XML with a totally different language?


From 2004 to 2008 I worked at The Press Association - we were processing gigabytes of XML a day, much of it in real-time feeds pushed out to the likes of the BBC, large newspaper publishers, and betting shops.

All of this was done with Perl. None of it was done with regular expressions - it was done through XML::LibXML, and thus libxml. Whenever I hear other Perl devs complain about XML processing my answer is always the same - just use XML::LibXML.

Choosing a different language might gain some advantages, but if you still choose the wrong libraries you're still going to have pain.


But if "choose the right tool" includes "use the right libraries", how does it differ from "write good code"? At that point arguing that you should use the right tool is arguing about being disciplined?


Typo

> shepherding An invisible property of a progamming language


If all you have is a hammer, everything looks like a nail.


My favourite when working in teams is: "use the right tool for the right job", which presupposes you know many tools and have experience working with them ... and that the shared set of tool knowledge in the team is more than just a hammer and a nail.


Facts.


I agree with all the points brought up in this article except for this curveball:

> Turing complete configuration languages shepherd you into writing complex logic with them

Everything discussed in this post is a consequence of language design or best practices. I don't think shoving complex logic into YAML files would be considered either. That is more a possibility that falls out of Turing completeness; I don't think the language design has anything to do with it. All the other "shepherding" examples are clearly intentional choices by the language designers.


Although not obvious from the context, the writer of the article is the author of a build system (https://mesonbuild.com/) used in a bunch of large non-trivial high profile projects (https://mesonbuild.com/Users.html) including systemd, Nautilus and so on.

While I dislike proof-by-eminence, the author has a history as an original, recognized contributor in this space.


I think that the author of the article is specifically speaking of configuration DSLs that were intended to be Turing-incomplete but somehow grew full Turing-completeness and therefore became bastard languages. YAML as a config DSL is sub-par (but I'll take it over JSON any day), and truly declarative, Turing-incomplete config languages are great, as is just using a fully-featured scripting language such as a Lisp or Python, but the middle ground of a custom config language that is also Turing-complete is usually just bad.

Not sure I have any examples of what that might be. Perhaps nginx? Or Apache? Lots of complex software has very complex config languages that might accidentally be Turing-complete, but only fools would actually use them like that.


Often the Turing completeness comes not from the config _format_ but from the semantics. A good example of this is Logstash config, where your config is an array of steps to apply, each of which can filter things in certain ways based on certain conditions, and before you know it you've used 100 lines to badly and buggily implement a 5-line script in a real language.


It is a funny choice of examples. I configure Emacs with script. And, while I certainly have some bitrot in my config, it is still way more manageable than any other config I have had to deal with.


Fwiw, Emacs is usually single-user, and self-selects users better able to understand and manage their own complexity.


I view it as yet another thing I have to maintain. Show some frugality in how you do it. And recognize it is a yak.



