This is my main issue with C++. For a while my job was to get game engine codebases running, integrate tools and move on. So I saw a lot of big C++ codebases. Nearly every one had the same bad behaviors. Tons of globals. Configuring build options from code. Header mazes that made it clear people didn't actually know what code their classes needed.
I then worked for a while developing a fairly fresh C++ codebase. The programmers I worked with were very willing to write maintainable code and follow a standard, and it was still really damn hard to maintain things like header hygiene.
When I go back to the language I can't believe how much time I spend dealing with minor issues that stem from the bad habits it builds. For years I refused to say any language was good or bad; I always insisted you use the right tool for the job. And there are some features of C++ that, when you need them, mean you have to use that language or maybe C in its place. But those features are separate from the language's issues, which largely seem to come from a focus on backward compatibility. So even used in its right application it seems incredibly flawed, and I pretty much believe it's a bad language now.
Disclaimer: I learned to program with C++, I understand its power and for years I loved the language. I also understand there are situations where despite its shortcomings it is the right choice.
Why are globals considered bad? I'm seriously asking. I, too, have been told hundreds of times over the course of my career, and I never questioned it. I want to question it now, because I've never understood why people work SO HARD to remove and avoid globals. I seriously doubt that the time and effort I've seen spent on removing and avoiding globals has been time well spent. And I'm quite sure that the effort spent on that is not comparable to the amount of problems prevented by not having globals. There's just no way globals can be dangerous enough to justify the size of globals-cleansing efforts I've seen.
Game development often has a very large global state, and game problems are often inherently global state manipulation problems; you need globals in order to even have the game in many cases.
Imagine a kitchen where a hundred cooks are trying to make the same pot of soup, same pile of ingredients and utensils. Now imagine they all have telekinesis. That’s global state.
The problem is that when disparate bits of code directly affect the details or internals of a state machine, it's pretty much impossible to ever maintain a valid state at all times. Throw in threading and the whole mess becomes non-deterministic to boot.
All state management tools and procedures seek to handle this by encapsulating details and establishing rules for updates. Some, like finite state machines, are more fixed and formalizable. Some, like Redux, are looser but stay deterministic.
As you mentioned, state machines and patterns like reducers allow you to make state changes deterministic, solving the 'telekinesis' problem for global state. Conclusion?
There isn’t really a conclusion - each solution pattern trades away progressively more control in exchange for more rigidity and determinism. Pick the system that matches your use case best.
Think of all your state as a state machine. Is there a finite number of possible states you can be in, with clearly defined ways to go to each from each one? You have a finite state machine. Lots of libraries will be available in your language.
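For instance, a minimal sketch in C++ (the states and events are made up for illustration):

    #include <stdexcept>

    // Every state and every legal transition is spelled out, so an
    // invalid state change simply can't be expressed.
    enum class State { Disconnected, Connecting, Connected };
    enum class Event { Dial, Established, HangUp };

    State transition(State s, Event e) {
        switch (s) {
            case State::Disconnected: if (e == Event::Dial) return State::Connecting; break;
            case State::Connecting:   if (e == Event::Established) return State::Connected;
                                      if (e == Event::HangUp) return State::Disconnected; break;
            case State::Connected:    if (e == Event::HangUp) return State::Disconnected; break;
        }
        throw std::logic_error("invalid transition");
    }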
Are your state combinations unbounded and unknowable, but still subject to validation and sequencing? This is pretty much any UI - a Redux style system helps you organise changes and make them linear. Any number of states are possible, but they’re all deterministic and can be reproduced.
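Sketched roughly in C++ (the action types are invented): the state is only ever replaced by what a pure function returns, so replaying the same sequence of actions reproduces the same state.

    #include <string>
    #include <variant>
    #include <vector>

    struct AddItem  { std::string text; };
    struct ClearAll {};
    using Action = std::variant<AddItem, ClearAll>;

    struct UiState { std::vector<std::string> items; };

    // Pure reducer: new state = f(old state, action).
    UiState reduce(UiState s, const Action& a) {
        if (auto* add = std::get_if<AddItem>(&a)) s.items.push_back(add->text);
        else                                      s.items.clear();
        return s;
    }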
Can’t linearise the states but still have validation rules for correctness? Sounds like an RDBMS-type system - set up your constraints and foreign keys and go to town with any number of threads.
There’s really no right answer. I just try to understand the problem as well as I can and see if the solution presents itself.
There’s also one step after RDBMSs, which is the Redis style key value or data structure stores that allow some level of client based or cooperative structuring, using conditional sets and gets or CAS operations.
Then finally there’s the Wild West of everyone doing whatever they want.
Global state is nearly impossible to test in any decent automated fashion. When writing unit tests, globals are the bane of your existence.
If you’re relying on globals for passing data, they are also difficult to reason about in multithreaded code.
There are means by which you can share data: if that data is instantiated at the code entry point, it can be shared in such a way as to never need globals, relying on decent patterns for sharing between code points.
Yes, there is a trade-off in adding parameters to functions and references in classes, but these can be avoided by adopting patterns like inversion of control, etc.
Basically globals are a bad pattern because they make it hard to test and hard to reason about data access patterns.
This is only true in a case where you don't spin up and tear down your program per test case. And I don't want to defend globals.
Globals are bad because they are just often used poorly. In large part that's because they require you to think about the whole system as you make changes.
Ironically, the best changes are done with the whole system in mind. Such that sometimes establishing a few core globals and some rules for how they will be treated can actually help your logic. So it really is a tradeoff. With a great slogan of "think globally, but act locally."
It's about scope. The "ideal" design pattern is supposed to be separation of concerns - the devolution of performance and responsibility into units that can be built and tested independently.
This is fine when that design pattern fits the domain. But some domains require global context, and it isn't useful or possible to strictly enforce separation - because you end up passing parameter bundles around, and managing all those local scopes introduces more bugs than implementing a global context would.
Multithreading is a different issue, and is a different kind of domain requirement. If you need multithreading and have a global context, you have a very interesting problem to solve.
Not only that, you also can't trust that the test results will apply to any situation where the user doesn't restart the program after every action—i.e., to normal operation.
Don't restart the program between tests. Randomize the order of the test cases between runs. Try running the same test multiple times on occasion.
I agree that globals are usually a bad pattern, but there are situations where judicious use of globals is warranted and can actually improve readability.
An example is small scripts, where the scope of the script limits the scope of the global. The overhead of an abstraction doesn't pay off in that case.
Another example is "near constants" like a locale setting, an environment variable that gets detected once at startup, or a development feature flag. The "proper" way to structure those is to create a settings object and pass it to every function that needs them, but judicious use of a well-documented global can prevent a lot of boilerplate code.
Of course, as soon as the code base needs to be touched by many devs, especially less experienced ones, it's safer to say "never do it" than "judiciously use", so I understand why most textbooks say this.
> Another example is "near constants" like a locale setting, an environment variable that gets detected once at startup, or a development feature flag. The "proper" way to structure those is to create a settings object and pass it to every function that needs them, but judicious use of a well-documented global can prevent a lot of boilerplate code.
In small programs, globals are ok, but in larger programs a better approach would be a global accessor that gives you read-only access:
printf("%s\n", Environment()->Host);
This doesn't require passing an object to every function, and the application still can't trample on these variables.
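One possible shape for that accessor, sketched in C++ (the fields and the startup detection are made up; the point is that callers only ever get a pointer-to-const):

    #include <cstdio>
    #include <string>

    struct EnvironmentInfo {
        std::string Host;
        std::string Locale;
    };

    // Everyone can read through the const pointer; nobody can assign.
    const EnvironmentInfo* Environment() {
        static const EnvironmentInfo env = [] {
            EnvironmentInfo e;
            e.Host   = "example.local";  // hypothetical: detected once at startup
            e.Locale = "en_US";
            return e;
        }();
        return &env;
    }

    int main() {
        std::printf("%s\n", Environment()->Host.c_str());
    }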
I don't understand. If you have a good understanding of the code you're writing, you won't put yourself into a position where globals cause problems unless you're being very stupid, and if you do, normal use of the program will detect those problems, right? Certainly bug reproduction steps and a debugger will figure out what's going on.
You mentioned unit tests, and these are another thing I don't fully understand. Obviously testing your code is important, and automated tests are good. My beef with unit testing comes with the requirement that all methods and functions have multiple tests each for success and failure conditions, and that results in test code which outweighs tested code by several times. When you discover that the architecture you've been putting together isn't going to work, which is something that happens approximately 100% of the time if you're doing anything real, you now have (say) 5,000 lines of code that needs rework, and 50,000 lines of test code that need to be thrown away and rewritten.
That is A LOT of effort to shove onto yourself to avoid a few global variables, to me. That's so much effort that many projects will just not make the change and ship software that they know is insufficient, and then they'll graft on whatever functionality can't be attained natively with the given architectural decisions rather than redesign.
The ability to paralyze yourself with the weight of unit test code seems like an extremely high price to pay to avoid some global variables.
I think that globals are not a problem when "you have a good [enough] understanding of the code you're writing." The problem is that when code bases grow, references to globals can start to appear in a lot of different places, and the exact use of a particular one can be hard to reason about. (Strictly talking about mutable global state here.)
As code bases grow, and developers come and go, eventually no one will have a "good [enough] understanding." Mutable global state is fundamentally hard to reason about since it can be changed at any time by any part of the program. When you first start out the codebase, you can just remember where all the usages are. But eventually that is not a good approach.
I consider the testing stuff orthogonal and muddying the issue. Mutable globals are hard to reason about, therefore they can make code hard to debug. Thus they should be avoided. No need to bring testing into the picture.
I think most of the problems with globals can be solved at a language level. Immutable references to globals are practically never a problem, so if your language forces you to explicitly mutably borrow a reference to a global variable, you force the programmer to think about every piece of code where they are modifying the global.
This also enables tooling to, for example, syntax-highlight these things differently. An immutable global looks like any other variable, but a mutable global is bold red.
You bet your ass that people will think about whether they really need it mutable in that case, and they'll know everywhere it's made mutable and therefore error prone.
Again this comes back to shepherding. Globals in Rust aren't the same as globals in C++ because the languages shepherd you differently.
One of the problems I’ve encountered in the wild is that globals often mean that you have to check the entire program when things go weird. You’re right: if you have a complete understanding of the entire codebase, then it probably won’t be an issue. But software grows and ages; globals won’t hurt you (much) early on in the project, but they start to in the long term. Your coworker modifies it in a place where you don’t realize it’s being modified, and things that worked fine yesterday stop working. The coworker might be you when you’re tired :)
Not sure what your gripe on testing has to do with what the comment is saying. Globals make testing hard.
The simple answer is that globals are expensive. Literally, they cost a lot of money. They introduce bugs that are harder to find, reproduce, and fix. That means introducing a global is a high risk, since it's increasing the expected value of your non recoverable engineering costs.
Rejecting globals is about lowering risk and cost because it's so easy to not use them and toss them out of code review, and it's really easy to work around that limitation.
Gonna remove my more uncivil remark. Basically relying on bug reports and debugging is the software equivalent of waiting for your engine to seize before you change the oil.
> When you discover that the architecture you've been putting together isn't going to work
One of the underappreciated benefits of unit tests is it quickly teaches you how to write good code. It turns out testable code is also code that tends to be well architected and doesn't need to be rewritten. Basically writing tests leads you to being a better programmer
Unit tests are perhaps good for instilling a decent sense of function decomposition, but make no mistake, you can go too far in this direction and not develop the sense of an integrated system. It's a hard problem to avoid, especially when starting out. That's one of the reasons I generally find type-driven development better for seeing how parts are actually interacting.
Not to discount testing, naturally, but I also prefer property-based testing to unit testing for the same reason (i.e. a function can be a mini-system with relationships between internal values that may not be exposed with unit tests.)
That's a myth. It teaches you to write code which is easily unit testable. That may be a better architecture than the one you would have used, but often it's just a different architecture, sometimes even markedly worse.
I have seen far too many code bases with simple things chopped up beyond recognition to make the code unit testable.
"[S]imple things chopped up beyond recognition" sounds like a hyperbolic argument to me. What is an example? When is the maxim "A function should do one thing well" not applicable?
> One of the underappreciated benefits of unit tests is it quickly teaches you how to write good code. It turns out testable code is also code that tends to be well architected and doesn't need to be rewritten. Basically writing tests leads you to being a better programmer
In the majority of situations, this holds (apart from the "doesn't need to be rewritten" part!). But there's a large minority of situations where it doesn't.
I have never witnessed that in 15 years of working at places which write unit tests. I've witnessed a LOT of unit tests which test nothing and manually return the pass/fail result desired so the indicators stay green.
I think the "leave the site better than you found it" advice applies here. Whenever you need to touch a piece of code, write the proper tests (hell, add some fuzzy testing if you don't want to write them by hand) and then improve that code.
The problem with globals is that you can't know. f might change glob, directly or indirectly, and there is no way you can keep in mind all possible changes (especially with multiple people working on the same codebase).
(The same problem can happen, on a more limited scale, with class fields - which is why some of us insist on requesting that classes are kept small and cohesive.)
Note that this does not happen as often with database values (which are also globals that can be changed from any point in the program) because of expectations. When using those, we have all kinds of mechanisms - like transactions and isolation modes - that let us specify how much we want a value we have written to stay like that until we're done with it; when we don't use those mechanisms we generally expect that "this value could change between one statement and the next".
I think one of the main complaints about global variables is that because you can change them from anywhere within the code, you are tempted to actually do so, which can get into some pretty nightmare debugging scenarios. If you truly have global state, I think the preferred solution is to have one piece of code which changes/updates the state, but everywhere else may simply read it. Then you at least know where the problem has to be if your state updates are buggy.
I don't think that person means you literally can't know, just that it increases the difficulty of reasoning through the code.
I was debugging some code earlier today. Someone had put a global variable that is either altered or used in 4 or 5 different functions across our codebase. I had to literally draw out the paths a user could go down to figure out what the value of this global variable would be at the time I was trying to call one of those functions. It was not awesome.
I figured it out, so you're right. I do know which functions touch the variable and NOW I know when. But I still can't guarantee the value of the variable.
Needless to say, tomorrow will see a little refactoring.
I was dealing with a hard problem earlier this week: something I'm pretty sure was causing a thread to crash without logging anything while the program stayed running. Unfortunately, it's only seen in production and only once every few days.
The program does several stages of data processing in parallel batches, initially loading and eventually saving to a database. It's basically a "continuous" and complicated ETL.
There is effectively a set of global state variables to track progress of each input item through the stages. The values in this global state can depend on the data, execution order, and can be modified from a dozen places in the code.
I narrowed down several potential crash points, which was basically stuff like: if the global state contains x and a db lookup in thread 2 times out, then if thread 3 accesses the value before 2 starts the next batch it could get a null reference. Another was based on making a decision to insert or update: in theory, the two global state values that effectively made this decision could never be set such that it would do the wrong thing (getting either a foreign or duplicate key error), but that state is still possible to represent.
If I were to run in a debugger using the massive production data stream I might eventually get lucky and see the data that triggers this. However, I could also sit for days and get nowhere, or the act of debugging and inspecting might be enough to prevent a race condition and not trigger the bug.
I still don't know for sure what's happening (though now there's instrumentation and better error handling in those spots so hopefully I will), but the point here is it's nearly impossible to reason about in a definitive way.
When dealing with millions of lines of code, I do not have the time to read the whole thing and internalize its whole state. Understanding the call graph can help, but diving through every abstract interface and callback and abstraction is a non-starter. Even if I had time to read the entire codebase line by line, I wouldn't be able to fit it all in my head, and I often have enough coworkers that changes are occurring faster than I can read and understand them all.
Even the codebases I work on are dwarfed by much larger ones.
In theory you have the source code and you can know everything just by reading it all and debugging it all. In practice it becomes overwhelming.
Even intelligent people can only fit a little bit of information into working memory in their heads at a time. Mere mortals have no chance. We need things to be bite size and local and simple so we can fit it in our heads and reason about it.
Global variables force you to do global reasoning, which a human mind just doesn't have the capacity to do.
There are lots of ingenious ways to accidentally hide where a variable is used. Start passing some pointers around and storing them off under different names.
And of course with a race condition in a multithreaded context knowing where a variable is accessed is about 1% of the battle.
Reasoning about code requires reasoning about relevant state. On the one extreme, you have pure functional programming, where all state is passed in and returned out - all relevant state is explicit and "obvious". On the other extreme, you might use global state for everything - relevant state requires diving into all your code. This sounds unthinkable in the modern era, but similar styles aren't entirely uncommon in sufficiently old codebases that didn't really bother to use the stack.
This is part of the reason why memory corruption bugs can be so insidious in large codebases - if anything in your codebase could've corrupted that bit of memory, and your codebase is millions of lines of code, you have a large haystack to find your bugs in, and your struggle will be to narrow down the relevant code to figure out where the bug actually is. This isn't hypothetical - I've had system UI switch to Chinese because of a use-after-free bug relating to gamepad use in other people's code, for example.
(EDIT: Just to be clear - globals don't particularly exacerbate memory corruption issues, I'm just drawing some parallels between the difficulty in reasoning about global state and the difficulty in debugging memory corruption bugs.)
> Game development
John Carmack on the subject, praising nice and self contained functional code and at some point mentioning some of the horrible global flag driven messes that have caused problems in their codebase, mirroring my own experiences: https://www.youtube.com/watch?v=1PhArSujR_A&feature=youtu.be...
> you need globals in order to even have the game in many cases.
Simply untrue unless you're playing quite sloppy with the definition of "globals" and "global state". The problem isn't that one instance of a thing exists throughout your program, it's that access is completely unconstrained. Game problems do often involve cross cutting concerns that span lots of seemingly unrelated systems, but globals aren't the only way to solve these.
> Game development often has a very large global state
Not any more!
I'm currently playing Doom Eternal, and I've got to take my hat off to its developers: It's ridiculously well optimised! I played the previous version of Doom on the same hardware, and it was a stuttering mess at 4K, but now it's silky smooth with Ultra Nightmare quality settings. Wow.
They achieved this by breaking up the game into about a hundred "tasks" per frame, and each task runs in parallel across all available cores. These submit graphics calls in parallel using the Vulkan API.
There is just no way to write an engine like this with a "very large global state". No human is that good at writing thread-safe data structures.
The only way to do it is to separate the data and code, making sure each unit does its own thing, independently of the others as much as possible.
I'll try to address things other replies haven't. Global variables are not just a problem for understanding code; they also have a large potential for causing incredibly hard to debug bugs. Say you're writing a parser and decide to use `strtok`, which uses global variables. Everything works fine, but then you try to improve performance using multiple threads and suddenly your Linux and Mac users are seeing all kinds of weird incorrect behavior. Turns out strtok uses thread-local storage on Windows, but not on other platforms, so your parallel strtok's were all overriding each other.
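For what it's worth, the usual fix is strtok_r (POSIX; Microsoft's CRT spells it strtok_s), which keeps the parser position in a caller-owned pointer instead of hidden shared state. A quick sketch:

    #include <cstdio>
    #include <cstring>

    // strtok_r stores its position in 'save', owned by the caller, so two
    // threads tokenizing different buffers can no longer stomp on each other.
    void parse(char* line) {
        char* save = nullptr;
        for (char* tok = strtok_r(line, ", ", &save); tok != nullptr;
             tok = strtok_r(nullptr, ", ", &save)) {
            std::printf("token: %s\n", tok);
        }
    }

    int main() {
        char buf[] = "a, b, c";
        parse(buf);
    }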
It's a good question, and we should always question our assumptions.
Avoiding globals and singletons stems from long-term experience. Their design tends to lead to write-only code: because any part of the code can access and modify them at any time, globals quickly become detached from your main program flow!
This property makes them more complex to reason about, while overall codebase complexity tends to increase as well. From a complexity standpoint, at some point globals become an untenable nightmare to develop further and maintain. Because of the lack of foresight and design, you get stuck with too much code everyone is too scared to touch to properly refactor. The tunnel to clean up the "mess" will be long and dark. Bugs may also be introduced, making it tempting to rebuild everything from scratch, something with its own caveats and troubles. If you lacked design the first time, how sure are you that you'll hit the nail on the head the second time? It's costly and doesn't benefit from an iterative approach with a rapid feedback cycle.
For small scripts / one-offs, globals and singletons are OK. Good coders know they're there, know how to remove them, and nobody else is going to build airport traffic controller software on top of them.
Btw, encapsulating globals/singletons with OO CRUD or REST, doesn't make them any less distasteful. You end up exporting complexity to all the different parts of the whole codebase, instead of encapsulating behaviour within its own domain.
A simple, quick answer that I'm sure you've heard is that global variables pollute namespaces. The qualities of design choices are rarely apparent outside the real world.
A big problem with globals is that they're often abused as a workaround. Restricting access is an abstraction: if the user isn't expected to alter this value, why should they be allowed to? What's more critical is that the programmer might not realize that what they wanted to be simply accessible is in fact static as well. So you now have a variable that's not only accessible, but state-dependent. Now anyone using this variable has to be mindful of this.
Unless there is C code being called as well, in C++ you should rarely use globals. It's much more manageable to have a game class object where, inside it, what used to be a global can now just be a private member that's "global" to that class.
Your team put in all that effort to remove globals because it takes even more effort to get rid of all the trivial errors tied to that choice to begin with.
It all comes down to writing reusable code, objects that manage themselves. People shouldn't have to be cautious when reusing code. This doesn't only apply to other programmers but to yourself 6 months down the line.
> It's much more manageable to have a game class object where, inside it, what used to be a global can now just be a private member that's "global" to that class.
That's still a global, except now it has lipstick.
"a game class object" is 99.95% likely to contain the whole game. Doesn't matter that it's labeled private, it's basically global to all the code of the ... game.
That doesn't sound like a wise assumption. It is common to have the actual engine of the game, and even the game itself separate from other architectural components.
At first blush a lot of games systems and code look like they're global but aren't really. If you think about a game as the flow through a frame you can break things down and it turns out a lot of things are not as global as you first think.
For a naive example, game flow is basically:
- Get input.
- Update game state.
- Render.
If each stage only consumes data generated by the prior stage then it doesn't need to be dependent on how that data was generated. Nothing needs to be global in this case.
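As a toy sketch (type names made up), each stage is just a function of the previous stage's output:

    #include <cstdio>

    struct Input     { float move_x = 0; bool quit = false; };
    struct GameState { float player_x = 0; };

    // Each stage only consumes what the prior stage produced; nothing is global.
    Input     get_input()                          { return {1.0f, true}; } // stub
    GameState update(GameState s, const Input& in) { s.player_x += in.move_x; return s; }
    void      render(const GameState& s)           { std::printf("x=%.1f\n", s.player_x); }

    int main() {
        GameState state{};
        for (;;) {
            Input in = get_input();
            state    = update(state, in);
            render(state);
            if (in.quit) break;
        }
    }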
There's nothing inherently wrong with modelling this using globals though, it's just that they require more discipline on the part of programmers to stick to the application design. It's sorely tempting to just reach in and tweak something when it's easy, and then suddenly your entire application is a spiderweb of little tweaks. Not using globals and only having the systems and data available that you need to use makes the design harder to break, and it's much easier to detect the spiderweb creeping in.
This isn't limited to globals though; dependency injection, IoC and other application patterns suffer the same problems as well. Lots of software ends up passing around a 'context' or injecting de facto globals everywhere, which results in the same spiderweb except you can't even navigate the codebase sanely.
The problem with the spiderweb is that it's harder to maintain and can make things more difficult down the line if you want to re-architect things for example to make the game multithreaded.
More generally the harder we make it to mess stuff up the less stuff will get messed up and the easier it will be to find. That's partly why static types, lifetimes and immutability are popular. They of course come with tradeoffs in performance or ease of use that need to be weighed. Software design choices are just a less strict version of the same.
One fun class of bugs that occurs on 8-bit systems is when you have a 16-bit global variable (C makes this easy), and a read is actually 2 separate reads (one for each 8-bit half). This is invisible from the C code. Now let's say there is a separate thread or an interrupt that writes the variable in between the two read phases. Most of the time it's fine, but every so often you get garbage (often double or half the value you expected).
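A sketch of the failure mode and the classic fix (disable interrupts around the copy; disable_irq/enable_irq stand in for whatever intrinsics the toolchain actually provides):

    #include <stdint.h>

    volatile uint16_t tick_count;     // written from a timer interrupt

    // Stubs for illustration; real code would use the MCU's interrupt intrinsics.
    void disable_irq(void) {}
    void enable_irq(void)  {}

    // On an 8-bit CPU this is two byte loads; if the interrupt fires between
    // them you read half of the old value and half of the new one.
    uint16_t read_ticks_racy(void) {
        return tick_count;
    }

    // Fix: make the two-byte copy atomic by masking interrupts around it.
    uint16_t read_ticks_safe(void) {
        disable_irq();
        uint16_t copy = tick_count;
        enable_irq();
        return copy;
    }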
It's usually a sign that you don't have clear component boundaries or well-defined interfaces, which means your code is going to be harder to test and harder to debug. Every place you read and write global state is also a potential race condition in multi-threaded code.
Of course there are places where global state is unavoidable (even if it's just "the filesystem"), but by confining your global state to a small corner of your codebase and having the rest of the code interact with this component instead of touching all the global variables directly, you can reduce the number of potential problem spots.
Coupling. If any part of the program can touch a global variable, then the only way to understand how that global is used and when and why it changes is by understanding the entire program. Limiting the variable’s scope (e.g. to a module or class) makes it easier to reason about, as there’s less to learn and mentally model all at once.
Have you got Steve McConnell’s Code Complete? Read the chapter on coupling and cohesion. If not, you should. (You can nab a first edition off eBay for a few bucks.) Good for the “Why”s of software construction.
Global state == shared state if threading (and you probably will be eventually) == a mess.
Global state == lots of refactoring if you want to make your program a library (OpenSSH is a poster child for this).
Write it like it needs to be a library. Write it to be thread-safe. Write it to use async I/O. Do these things and you'll save yourself a ton of work later. Learn to do these well and you'll always do this from the get-go.
> Why are globals considered bad? I'm seriously asking.
I think that outside of special cases they're bad. I use globals for embedded code because I don't have a heap.
What I've found is that as long as globals are used to hold state, and not to pass data via spooky action at a distance, they're okay. A good test: if you can trivially refactor them out, they're okay.
Example: you have one UART.
UartInit(baud_rate, bits, parity, stop);
Now you have two, so it gets refactored:
UartInit(port, baud_rate, bits, parity, stop);
Terrible is shit code like this:
foo.bar = 2;
and somewhere else in the code
if(foo.bar == 2)
{
    foo.bar = 0;
    ...
}
A note: Game programs to me look like really big embedded programs.
Games and UIs are special when it comes to state. They're weird in the space of all programs because their domain specifically concerns itself with maintaining and transforming a bunch of state over time. State is the point, in a way that it isn't for the vast majority of programs.
There are still lots of cases in games where state shouldn't be global, but there are also lots of cases where it's very natural and legitimate.
In addition to the reasons mentioned, one reason is that, by design, you can only have one instance of a global variable.
That might be fine today, but who knows what tomorrows requirements might entail.
At the very least, put global variables in a context object, and pass that around. Then it's clear what is affected by and can affect the "global" state, and it's easy to create multiple context instances if you suddenly find you need to.
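A rough sketch of that shape (names invented): the would-be globals live in one struct, anything that needs them takes it explicitly, and a second instance costs nothing.

    #include <cstdio>
    #include <string>

    // All the would-be globals, gathered in one place.
    struct Context {
        std::string locale = "en_US";
        bool        debug_overlay = false;
    };

    // Callers state the dependency explicitly instead of reaching for a global.
    void draw_hud(const Context& ctx) {
        if (ctx.debug_overlay) std::printf("debug hud (%s)\n", ctx.locale.c_str());
    }

    int main() {
        Context main_ctx;
        Context test_ctx;             // suddenly needing a second one is trivial
        test_ctx.debug_overlay = true;
        draw_hud(main_ctx);
        draw_hud(test_ctx);
    }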
In my view it makes the code extremely difficult to understand when someone other than the original author tries to read/modify.
The side effects of changing a global variable's value are very difficult to glean from the code.
It is as if some inputs to a function are getting passed to it implicitly, and it isn't obvious what value it has, who has set it, and what effect will be produced if you change its value.
Globals, similarly to "goto", are considered bad, because people tend to abuse them. But, same as with goto, they are not inherently bad and have their use. There are just lot of bad programmers who have been told that using globals (or goto) is dangerous and take it as "NEVER USE GLOBALS (or GOTO)" and spread this warped message further.
It probably takes a lot of experience to use both correctly. If we are talking about a small program, no threads, no chance of reuse (no modularity) - in other words "Keep It Simple and Stupid" - then it is fine.
But KISS is difficult to achieve: there's Hubris that pushes you to do "powerful" things rather than getting the job done, there's "anticipatis" that pushes you to have an answer ready for all future changes instead of solving the problem you actually have now, there's deadlines, and there's invasions of external unwanted complexities (silly requirements, interfacing with buggy software/hardware...).
That's why generally speaking "don't" is the safe piece of advice. But those who think they have the basics down can try it (in a harmless context like personal tools) and see what happens for themselves.
One example of a good use of globals is very lightweight pub/sub, where you keep to the rule that only one place can write to the global (preferably with something like an atomic write) and every other place only reads.
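A minimal C++ version of that rule (sketch; assumes the published value fits in an atomic):

    #include <atomic>
    #include <cstdio>

    // The blackboard: exactly one system ever stores to it, everyone else loads.
    std::atomic<int> g_current_frame{0};

    void simulation_tick(int frame) {          // the single writer
        g_current_frame.store(frame, std::memory_order_release);
    }

    void audio_system() {                      // one of many readers
        int frame = g_current_frame.load(std::memory_order_acquire);
        std::printf("audio synced to frame %d\n", frame);
    }

    int main() {
        simulation_tick(42);
        audio_system();
    }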
I still use this kind of blackboard system when I can't avoid globals. The main thing it helps with, I find, is that you still have to know when and in what order your systems are being set up.
I ran into so many bugs from people creating static instance globals and thinking it was good that they didn't have to care when systems were set up.
I hate create-on-access with a passion. For god's sake, just new the damn thing at the beginning of main if nowhere else.
If your game state is a global because every action in the game changes the global state, then that's great, your game will be alive. There are so many valid states and perhaps rather few invariants, or you are okay with invariants being enforced once every few seconds. You do what you have to.
Not every program is like this. Consider something like TeX, whose goal is perfect bit for bit reproducibility of documents across every run on every machine. Same with a compiler.
When you say globals, I imagine those kinds of programs having code like this:
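Something along these lines, to make it concrete (a made-up C++ sketch) - the first kind pokes at the global directly from anywhere, the second only ever touches it through a couple of functions:

    #include <cassert>

    // First kind: a raw global; any line in any file can assign anything to it.
    int g_line_width = 72;

    // Second kind: still one global piece of data, but every change funnels
    // through one function that can uphold the invariants.
    namespace layout {
        static int line_width = 72;
        void set_line_width(int w) {
            assert(w > 0);            // the one place the invariant is checked
            line_width = w;
        }
        int get_line_width() { return line_width; }
    }

    int main() {
        g_line_width = -3;            // nothing stops this
        layout::set_line_width(80);   // the only door into the second kind
    }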
The second kind is better, because you can at least say that any internal invariants in the global data should be upheld in very specific code.
But when you use the first kind, you're completely giving up on being able to point to the line of code responsible for global data having bugs in it. Obviously you can use globals without this problem if you encapsulate them effectively, but you'd need your language to "shepherd" you towards this. All the C codebases that had no such shepherding seem to end up looking exactly like this, and it is truly awful trying to find the source of bugs. You'll notice the languages that shepherd you away from globals (Rust) do so because they want your programs to work when you decide one thread is not enough. This has the side benefit of shepherding you away from global data generally, and mutability rules restrict which code can modify, so there is a huge impact overall on how you look for bugs.
Essentially, you're having the same discussion the original article is saying is fruitless. Globals can be good or bad! You can make them accessible everywhere without actually accessing them everywhere and causing debug problems. But do they make good code easy to write and bad code hard? Absolutely not. They are bad shepherds, pied pipers that offer you easy solutions that make your codebase worse.
A lot of those variables could have been grouped into a struct. Like all those key_<action> variables. Even if you think global state is fine you would only have one way of accessing it. It would be closer to this:
game_instance.key_mapping.charainfo
but I never see things like that. All I ever get to see is projects with almost thousands of global variables.
Globals are bad when they are used together with the include pattern. So you are reading code and see variable foo and have no idea what it does, can't find it when searching in the file, then you find it in an include file two levels down. You try to refactor, only to find it's used elsewhere too, and sometimes included twice, and sometimes overwritten (but you are not sure if that is a bug or not).
IDE's are good at treating the symptoms. But it's also possible to write the code so that you don't need an IDE to untangle it: For example keeping all variables within (file) scope, and abstracting out into reusable (reusable elsewhere) libraries.
You can make functions pure and specific so they rarely need to change. And use name-spaces and naming conventions - so the variables can be found with grep (find in files).
Let's say you are upgrading an API, let's call it "HN", to a new major version, which has made a breaking change by renaming HN.foo to HN.bar. Now if you have always named the API "HN" you can just do a "replace in file" operation where you replace HN.foo with HN.bar - after you have already checked that there is no HN.foobar (to prevent HN.barbar).
Even sophisticated IDE's will have trouble following functions in a dynamic language that is passed around, renamed, returned, etc. So I would never trust an IDE to find all calls-sites.
Heavily depending on an IDE or tooling can also lead to over-use of patterns and boilerplate that the IDE handles well. And unnecessary work like adding annotations just to satisfy the tooling.
Not GP, but for most programs I write myself I cannot find all the call sites of a certain function because of using first class functions a bunch. When I worked in nginx I had a smaller amount of similar trouble, since nginx frequently but not pervasively uses function pointers to decide what to do next.
Globals are not inherently evil, but "shared, mutable state" is, basically if any part of the code is able to scribble over any global at any time.
If your globals are constants, or the globals are only visible inside a single compilation unit where it's easy to keep the situation "contained", they are perfectly fine.
We cannot discuss globals without pinning down exactly what we mean by globals.
Is a global a piece of information of which there is one instance?
Or is it a variable which is widely scoped: it is referenced all over the place without module boundaries?
See, for instance, in OOP there is the concept of singletons: objects of which there is one instance in the system. These objects sometimes have mutable state. Therefore, that state is global. Yet, the state is encapsulated in the object/class, so it is not accessed in an undisciplined way by random code all over the place. On the other hand, the reference to the object as a whole is a plain global: it's scoped to the program, and multiple modules use it. Ah, but then the reference to the singleton is not a mutable global; it is initialized once, and points to the same singleton. Therefore, singletons represent disciplined global state: a singleton is an immutable reference to an object (i.e. always the same object), whose mutable state (if it is mutable) is encapsulated and managed. This is an example of a "good" global variable.
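In C++, that disciplined shape is roughly the classic Meyers singleton - a sketch: the reference callers get never changes, and the mutable state behind it is only touched through the object's methods.

    #include <mutex>
    #include <string>

    class Settings {
    public:
        // The one instance; the reference itself is effectively immutable.
        static Settings& instance() {
            static Settings s;        // constructed once; thread-safe since C++11
            return s;
        }

        void set_locale(std::string l) { std::lock_guard<std::mutex> g(m_); locale_ = std::move(l); }
        std::string locale() const     { std::lock_guard<std::mutex> g(m_); return locale_; }

    private:
        Settings() = default;         // nobody else can construct one
        mutable std::mutex m_;
        std::string locale_ = "en_US";
    };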
Another form of "good" global variable is a dynamically scoped variable, like in Common Lisp. Reason being: its value is temporarily overridden in on entry into a dynamic scope and restored afterward (in a thread-local way, in multithreaded implementations). Moreover, additional discipline can be provided by macros. So that is to say, the modules which use the variable might not know anything about the variable directly, but only about macro constructs that use the variable implicitly. Those constructs ensure that the variable has an appropriate value, not just any old value.
Machine registers are global variables; but a higher level language manages them. A compiler generates code to save and restore the registers that must be restored. Even though there is only one "stack pointer" or "frame pointer" register, every function activation frame has the correct values of these whenever its code is executing. Therefore, these hardware resources are de facto regarded as locals. For instance, a C function freely moves its stack pointer via alloca to carve out space on the stack, as if the stack pointer register belonged only to it.
Global variables got a bad name in the 1960's, when people designed programs the Fortran and COBOL way. There is some data, such as a bunch of arrays. These are global. The program consists of a growing number of procedures which work on the global arrays and variables. These procedures communicate with each other by the effect they have on the globals. The globals are the input to each procedure and its output. When one procedure finishes, it places its output into the globals, and then when the next one is called, it picks that up, and so on.
The global situation was somewhat tamed by languages that introduced modules. A module could declare variables that have program lifetime, but are visible only to that module, even if they have the same name as similar variables in another module. In C, these are static variables. C static variables and their ilk are considerably less harmful than globals. A module with statics can be as disciplined as an OOP singleton. The disadvantage it has is that it cannot be multiply instantiated, if that is needed in the future, without a code reorganization (moving the statics into a structure).
> See, for instance, in OOP there is the concept of singletons: objects of which there is one instance in the system. These objects sometimes have mutable state. Therefore, that state is global. Yet, the state is encapsulated in the object/class, so it is not accessed in an undisciplined way by random code all over the place. On the other hand, the reference to the object as a whole is a plain global: it's scoped to the program, and multiple modules use it. Ah, but then the reference to the singleton is not a mutable global; it is initialized once, and points to the same singleton. Therefore, singletons represent disciplined global state: a singleton is an immutable reference to an object (i.e. always the same object), whose mutable state (if it is mutable) is encapsulated and managed. This is an example of a "good" global variable.
Lol no it's not. It has all the problems of any other global: unsafe to use concurrently, difficult to test, difficult to reason about.
It's best not to conflate global variables and their problems with the issues of shared, mutable state.
The difficulties caused by global variable are related to them being shared, mutable state. But global variables are recognized as causing additional problems, in the context of programming with shared, mutable state. So that is to say, practitioners who accept the imperative programming paradigm involving shared mutable state nevertheless have identified global variables as causing or contributing to specific problems.
In an OOP program based on shared mutable state, singleton objects having shared mutable state do not introduce any new problem. The global variable they are bound to doesn't change, so the variable per se is safe.
(There can be thread-unsafe lazy initializations of singleton globals, of course, which is an isolated problem that can be addressed with specific, localized mechanisms. Global shutdown can be a gong show also.)
A singleton could be contrived to provide a service that is equivalent to a global variable. E.g. it could just have some get and set method for a character string. If everyone uses singleton.get() to fetch the string, and singleton.put(new_string) to replace its value, then it's no better than just a string-valued global. That's largely a strawman though; it deliberately wastes every opportunity to improve upon global variables that is provided by that approach.
I disagree; as far as I know the specific problems of global variables (over and above shared mutable state in general) are things that apply just as much to singletons. Things like absence of scoping, lack of clear ownership, and as you mentioned initialisation and shutdown, are just as much a problem for singleton objects as they are for non-object global variables.
Objects containing mutable state have some advantages over plain mutable variables (e.g. the object can enforce that particular invariants hold and invalid states are never made visible), but as far as I know those are just the generic advantages of OO encapsulation, and there's not really any specific advantage to encapsulating global variables in a singleton that doesn't equally apply to encapsulating a bunch of shared scoped variables into an object.
I generally strive to avoid singletons but there are cases of API usability where they're useful. If you can carve out the responsibility of what state is being tracked in the singleton then it's useful.
It's also not difficult to test as long as you write it to be testable. It may be more verbose & cumbersome but it's not actually difficult. That means you provide hooks testing the singleton implementation to bypass the singleton requirement but in all other cases it acts like a singleton.
As an example, consider Android JNI. The environment variable is very cumbersome to deal with in background threads & to properly detach it on thread death. It also requires you to keep track of the JavaVM & pipe it throughout your program's data flow where it might be needed. It's doable but it's conceptually simpler to maintain the JavaVM object in a global singleton and have the JNIEnv in a thread-local singleton with all the resource acquisition done at the right time. It's still perfectly testable.
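Concretely, the shape is something like this (a sketch, error handling trimmed): the JavaVM* is stashed once in JNI_OnLoad, and a thread_local wrapper attaches on first use and detaches in its destructor when the thread exits.

    #include <jni.h>

    static JavaVM* g_vm = nullptr;    // the one process-wide VM

    extern "C" JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM* vm, void*) {
        g_vm = vm;                    // set exactly once, before any other JNI use
        return JNI_VERSION_1_6;
    }

    namespace {
    struct ThreadEnv {
        JNIEnv* env = nullptr;
        bool attached = false;
        ThreadEnv() {
            if (g_vm->GetEnv(reinterpret_cast<void**>(&env), JNI_VERSION_1_6) == JNI_EDETACHED) {
                g_vm->AttachCurrentThread(&env, nullptr);   // Android's (JNIEnv**, void*) signature
                attached = true;
            }
        }
        ~ThreadEnv() { if (attached) g_vm->DetachCurrentThread(); }  // runs at thread exit
    };
    }

    JNIEnv* GetJniEnv() {
        thread_local ThreadEnv te;    // attach at most once per thread, detach automatically
        return te.env;
    }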
> It's also not difficult to test as long as you write it to be testable. It may be more verbose & cumbersome but it's not actually difficult. That means you provide hooks testing the singleton implementation to bypass the singleton requirement but in all other cases it acts like a singleton.
At that point you're adding complexity that has a real risk of bringing in bugs in the non-test case. Nothing is impossible to test if you try hard enough, but the more costly testing is, the less you'll end up doing.
> As an example, consider Android JNI. The environment variable is very cumbersome to deal with in background threads & to properly detach it on thread death. It also requires you to keep track of the JavaVM & pipe it throughout your program's data flow where it might be needed. It's doable but it's conceptually simpler to maintain the JavaVM object in a global singleton and have the JNIEnv in a thread-local singleton with all the resource acquisition done at the right time. It's still perfectly testable.
Not convinced - to my mind the conceptually simple thing is for every function to be passed everything it uses. If you instead embed the assumption that there's a single global JavaVM that could be touched from anywhere, then that adds complexity to potentially everything, and any test you write might go wrong (or silently start going wrong in the future) if the pattern of which functions use the JavaVM changes (or else you treat every single test as a JavaVM test, and have the overhead that goes with that). For some codebases that might be a legitimate assumption, just as there are some cases where pervasive mutable state really does reflect what's going on at the business level, but it's certainly not something I'd introduce lightly.
> If you instead embed the assumption that there's a single global JavaVM that could be touched from anywhere, then that adds complexity to potentially everything, and any test you write might go wrong (or silently start going wrong in the future) if the pattern of which functions use the JavaVM changes (or else you treat every single test as a JavaVM test, and have the overhead that goes with that)
Not sure I follow. If you expect any code to invoke JNI then you are still responsible for explicitly initializing the singleton within the JNI_OnLoad callback. If you don't the API I have will crash so definitely not a silent failure. There's no external calling pattern to this API that can change to break the way this thing works. As for why this is needed it has to do with the arcane properties of JNI:
1. Whatever native thread you use JNI on, the JNIEnv must be explicitly attached (Java does this automatically for you when jumping from Java->native as part of the callback signature).
2. Attaching/detaching native threads is a super expensive operation. You ideally only want to do it once.
3. If you don't detach a native thread before it exits your code will likely hang
4. If you detach prematurely you can get memory corruption accessing dangling local references.
5. It's not unreasonable to write code where you have a cross-platform layer that then invokes a function that needs JNI.
If you're avoiding all global state you only have the following options:
A. Attach/detach the thread around every set of JNI operations. This stops scaling really quick & gets super-complicated for writing error-free composable code (literally manifests as the problem you're concerned about with code flow changes resulting in silent bugs).
B. Anytime you might need to create a native thread, you need to pass the JNIEnv to attach it. If the native thread is in cross-platform code, suddenly you're carrying 2 callback function pointers + state as a magic invocation as the first thing to do on a new thread creation & the last thing to remember to do just before thread exit. Also you have to suddenly carry through that opaque state to any code that may be invoking callbacks that require JNI on that platform. This hurts readability & risks not being type-safe.
At the end of the day you're actually also lying to yourself and trying to fit a square peg in a round hole. JNI is defined to use global state implicitly throughout its API - there's defined to be 1 global JavaVM single instance. Early on in Java days JNI was in theory designed to allow multiple JVMs in 1 process but that has long been abandoned (the API was designed poorly & in practice it's difficult to properly manage multiple JVMs in 1 process correctly with weird errors manifesting). This isn't going to be resurrected. In fact, although not implemented on Android, there's a way to globally, at any point in your program, retrieve the JVM for the process.
In principle we're in agreement that singletons & globals shouldn't be undertaken lightly but there are use-cases for it. It's fine if you're not convinced.
> A. Attach/detach the thread around every set of JNI operations. This stops scaling really quick & gets super-complicated for writing error-free composable code (literally manifests as the problem you're concerned about with code flow changes resulting in silent bugs).
Sounds like a monad would be a perfect fit, assuming your native language is capable of that. That's how I work with e.g. JPA sessions, which are intended to be bound to single threads.
> At the end of the day you're actually also lying to yourself and trying to fit a square peg in a round hole. JNI is defined to use global state implicitly throughout its API - there's defined to be 1 global JavaVM single instance.
Of course if you're using an API that's defined in terms of globals/singletons then you'll be forced to make at least some use of globals/singletons, but I wouldn't say that's a case of singletons being "useful" as such. And if you're making extensive use of such a library, then I'd look to encapsulate it behind an interface that offers access to it in a more controlled way (using something along the lines of https://github.com/tpolecat/tiny-world).
For many singletons it does not matter at all. E.g. 99% of all desktop gui programs and 99.9995% of games have a single main window by design - trying to abstract that with an API that simulates that you could have more than one just makes the code harder to read for no benefit (as no widget system except beOS' can be used outside the main thread anyways)
> E.g. 99% of all desktop gui programs and 99.9995% of games have a single main window by design - trying to abstract that with an API that simulates that you could have more than one just makes the code harder to read for no benefit
Being able to test UI behaviour is a huge difference maker. (Also even if you do believe that a singleton is ok in this case, it's clearly no different from a global variable).
> as no widget system except beOS' can be used outside the main thread anyways
> obviously UI tests are being run today so this is not really an issue, right ?
UI tests are notoriously slow, flaky and generally worse than other kinds of tests. They're absolutely a significant pain point in software development today.
> maybe, does not prevent writing a lot of very useful apps.
People write useful code with global state. People wrote useful code with gotos, with no memory safety... that computers are useful does not mean there isn't plenty of room for improvement.
Avoiding singletons in the app implementation will not put a dent in UI testability. If you instantiate the MainWindow as a local variable in the top-level function, and pass that object everywhere it is required as an argument, external testing of your UI is not any easier.
It's a step in the right direction, and it gives some immediate value: you can see which functions don't actually need the MainWindow and can therefore be tested conventionally (you might argue that those were never actually UI tests, but in practice you'll end up using your UI testing techniques for things that don't actually use UI if you can't tell), and you're nudged towards only passing it where it's needed; also you could try to mock or stub it, which might cover at least some of the simple cases.
You've heard it hundreds of times over the course of your career and yet you never once questioned it? Either you're exaggerating to make a rhetorical point, or you have such an apathetic attitude towards the issue that you can't (or haven't tried to) reason about why it polarizes people.
Taking your comment in good faith, not all global state manipulation is equivalent. Depending on how you do it, structured global state manipulation could mean you end up with something like Postgres, where you have orderly read and write interactions that you can reason about with set theory and transaction monotonicity. It could also mean something like using an in-memory cache or session store to persist temporarily durable data. Any kind of structuring like this around what you can read and write and for how long gets you further away from the idea of globals, and that's the point. It's a tool that doesn't reward reaching for it prematurely.
I write or deal with a lot of C, unfortunately. I try very hard to not have global variables, and to minimize sharing between threads. When I pick up a C codebase, one of the first things I do is build it and inspect the object files to see what globals exist. The same can be done in C++, and should be. Use inheritance sparingly. Don't use exceptions if at all possible. Use modern C++ as much as possible, and borrow ideas from Haskell/Rust as much as possible. I'm thinking of https://stackoverflow.com/questions/9692630/implementing-has...
I am so glad when I got to use C, I already had a good school of modular programming languages behind me.
On my own projects, like university assignments, I would treat each translation unit as a kind of module: anything that for whatever reason could not live in a handle structure would be an internal static (years later I started using TLS instead), and in some cases I used incomplete structs as a means to avoid the temptation to directly access internal data.
Game development is a rapid prototyping adventure, fuelled by the fact that what you are producing is ultimately a form of art. Architectures are based on abstraction, and abstraction is ultimately mindful ignorance; in this case, of specific requirements or specific goals which are going to change because you are creating art. You will find, as you continue to develop, that technical debt builds because the changing requirements create conflicting workflows, which is why you get spam in the header. It's a lot faster to prototype something through duplication, cobbling, or refactoring, then later on use automation to remove the chunks of code that are not used and reduce line count by creating utility functions, because at that point part of the project is set in stone and the project is going in one direction. Things will gyrate back and forth between messy and clean, and hopefully you have the budget to refactor to clean before you ship, as modders don't like dirty game code.
Games are a simulacrum of reality, and reality doesn't say properties of two different objects can never, ever interact with each other; that's why you have the abuse of global variables to store state, and also why there's a rich speedrunning community using all sorts of hacks in games to speed up their playtime due to unforeseen edge cases. If you build a model of reality, you're going to be doing R&D learning how it interacts with itself, just like we do today!
Nobody wants to play a game with a static workflow.
What stands out is how apologetic you are for pointing out that a language might be worse (gasp!) than another language. When did this "all languages are roughly equal, and if you say anything else you're a zealot" ideology get so widely entrenched in our industry?
As someone who came up as a C++ game dev I just ran into tons of people that acted like people using managed languages were automatically inferior programmers. Even imbibed of this belief a bit myself.
This was a view purely sourced from ignorance. There were people creating awesome things with Java and Python at the time that my contemporaries and I could probably never have coded up.
It was quite embarrassing when I came to realize the combo of ignorance and arrogance I was working from. So now I tend to bias toward assuming most languages people are working with are useful and warrant some amount of respect. I try to only criticize languages I'm extremely familiar with and have had the opportunity to see bad patterns repeatedly emerge from in a variety of code bases.
Basically, I think we can call some languages "bad" or "good" it just takes a lot of evidence and I'd rather avoid ranking them altogether.
> I also understand there are situations where despite its shortcomings it is the right choice.
Would you say the reasons for choosing it are not inherent to the lang itself but come down to things like: experience of the team, availability of libraries/ecosystem, need for mature/fast compilers?
Can't speak for parent, but in our case it's the only choice with zero-overhead abstractions and good cross platform support (Obj-C++, Android NDK, WebAssembly, Linux for tests). I wish Rust were there, but it's not.