What Is Null? (2010) (c2.com)
125 points by tosh on Aug 3, 2019 | 97 comments



Non-nullability is probably one of the most important concepts I've learned. People can talk your ear off about macros or about borrow checking or whatever cool feature is in their favorite languages. But non-nullability isn't a feature so much as the removal of a terrible one: types being automatically unifiable with null.

What does that mean? Well basically that every single (reference) type in most languages comes with an implicit "and null". String in Java? More like "String and null". Car object? Actually it's "Car and null".

Why is this a bad thing? Well, null is a very specific type with a very specific meaning, but because it's automatically included with every type, people end up using it for a bunch of situations where a different type or value would work. Let's take a simple parser. A naive implementation, upon reaching the end of the string, might just return null. After all, nothing has been found. But that's not a null value, that's an EndOfString value! The moment you pass that value out of the context of the function, you need to remember that null means EndOfString. Or maybe the string you're passing in is a null value in the first place. It'd be tempting to return null, right? Except you've now lost information on whether the string itself was null, or whether something happened in the parse function that caused it to return null.

That isn't to say null is wholly evil. There are certainly uses for null. But it's often way better to contain its use with Option or Maybe, essentially wrappers that say "hey, this value could be null". These wrappers are not unifiable with regular values, which forces you to think about where values can and cannot be null.
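To make that concrete, here's a minimal sketch in Haskell (all names hypothetical): a sum type names each outcome explicitly, and pattern matching makes every case visible to the caller instead of hiding them all behind one null.

    -- hypothetical parser result: each failure mode gets its own name
    data ParseResult
      = Token String     -- something was successfully parsed
      | EndOfString      -- input was exhausted (what "return null" often hides)
      | NoInput          -- the input itself was missing

    describe :: ParseResult -> String
    describe r = case r of   -- each case handled explicitly
      Token t     -> "parsed " ++ t
      EndOfString -> "hit end of string"
      NoInput     -> "no input was given"

    main :: IO ()
    main = putStrLn (describe EndOfString)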

I totally understand if language designers want to omit features that they deem unnecessary or overcomplicated. I get it if you want a language sans generics or sans macros. But I don't understand keeping a feature that has caused far too much pain and encouraged far too many bad practices.


If you don't have a null, then you have to set aside some value for the type as the default value. With null, you get a seg fault if you try to use it.

Without a null, you'll need to create a special default value which performs the same function - giving an error if you try to use it.

Floating point values have NaN for this, UTF-8 code units have 0xFF. The D programming language uses NaN and 0xFF to default initialize these types. NaN is good because it is "sticky", meaning if a result is computed that depended on a NaN, the result is NaN as well.

Some people complain about this, wanting floats to be 0.0 default initialized. But then it would be nearly impossible to tell if an unintended 0.0 crept into the calculation.
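A small sketch of that stickiness; this is IEEE 754 behaviour, shown here in Haskell for brevity rather than D:

    main :: IO ()
    main = do
      let nan = 0 / 0 :: Double      -- one way to produce a NaN
      print (nan + 1)                -- NaN: anything computed from it stays NaN
      print (isNaN (sqrt nan * 3))   -- True: the taint survives more arithmetic
      print (0.0 + 1 :: Double)      -- 1.0: a stray 0.0 default would hide silently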

> More like "String and null"

Null's replacements, Maybe and Optional, still have the extra check.


Another way to address that problem is to not default initialise at all. There's then no need to worry about a default value for pointers, or about NaN vs. 0.0 for floats.

In addition, null doesn't always give a segfault, as I'm sure you're aware. In C and C++, using (dereferencing) null is undefined behaviour, and a segfault is the best-case result. The code may have been optimised so that there's not a direct dereference (which would segfault) but instead other operations that rely on the pointer being non-null.


Not to mention "fun" cases like this (contrived) example code:

  // Read sparse array from file

  uint64_t len = read_u64();
  uint8_t *buf = malloc(len);   /* result never checked for NULL */
  while (more_data_remaining_in_input_file()) {
    uint64_t pos = read_u64();
    if (pos >= len) { abort(); /* hackers detected! */ }
    buf[pos] = read_u8();
  }
which will very reliably compile into a write-what-where primitive when passed a gigantic `len`. malloc() fails and returns NULL (yes, it'll do this even on Linux when virtual address space is exhausted), and nasal demons emerge rapidly from there.


While this can happen, it's theoretical. In 40 years of dealing with null pointers, I've never seen one happen where the offset is outside the protected null page.

The reason is simple - very few allocated objects are that large, and for the arrays that are, they get filled from offset 0 forwards.

The real problem with C is not null pointers, it's buffer overflows caused by C arrays inescapably decaying to pointers, so array overflow checks can't be done:

https://www.digitalmars.com/articles/b44.html


Out of curiosity, have you been trying to make things work, or have you been trying to break things?

Because- here's the thing. I've been messing around with software security stuff for 5 years or so, and I've seen exploitable bugs related to a pointer unexpectedly being null twice.

There's a big difference between the kind of bugs you find "organically", when somebody's trying to use the software normally, and the kind of bugs you find when you're crafting (or fuzzing) absurd inputs that make no sense and that no ordinary software would produce. Perhaps this is why I've seen more of these bugs despite my much shorter career?


> I've seen exploitable bugs related to a pointer unexpectedly being null twice.

I've never heard of one, what I hear about endlessly are buffer overflows.

Can you give more information about these? I want to learn more.


Yeah, definitely! The one that comes to mind was in a game's texture loader - because of players being able to use custom "spray paint" images, this was exposed to untrusted code.

It unpacked mipmaps from a packed texture file into a malloc'd buffer, with an option to skip the first N mipmaps. (If I remember correctly, it'd then go back and upscale the largest mipmap to fill in the ones it skipped.)

Mipmaps were stored largest first, consecutively- so the first, say, 512x512xsizeof(pixel) would be the biggest mipmap, then you'd have 256x256xsizeof(pixel) bytes for the second-biggest one, etc, down to some reasonable (i.e. not 1x1px) minimum size.

The issue came when a texture's dimensions were specified as being so large that malloc'd fail and return NULL. Normally, this wouldn't be an issue (besides a denial of service) - but by skipping the first N mipmaps, you'd instead write to (where x and y are the dimensions of the texture)

  # offset of the n-th mipmap within the buffer
  def addr(n, x, y, pixel_size_in_bytes=3):
    out = 0
    for _ in range(n):
      out += x*y*pixel_size_in_bytes
      x >>= 1  # each mip level halves each dimension (512 -> 256 -> ...)
      y >>= 1
    return out
By choosing x, y, and N carefully (I used an SMT solver to help) you could overwrite a function pointer and get it called before the later upscaling operation ran (since that would access 0x0 and crash).

It's definitely a unique bug, but this sort of thing does happen in real code.

Making malloc() and friends panic or similar on failure instead of returning NULL would fix most of these bugs- but it does sort of seem like the whole idea of sentinel values and in-band signalling is hazardous.

Go-style 'f, err := os.Open("filename.ext")' has its appeal from that perspective- you can forget to check "err" before doing things with "f", but I assume the Go ecosystem has good tooling to catch that.

Also probably worth noting that arguably this bug is related to C arrays being just pointers wearing fancy pants- as long as you can't get a slice where the pointer is NULL but the length is nonzero.


Yup, exploitable all right. Thanks for the explanation!

Usually, what I do is just make a shell around malloc() that checks for failure and aborts the program with a message. Then, and only then, it gets converted to a slice.

I'll still maintain, however, that buffer overflows are a couple orders of magnitude more common. Your case involves loading a file essentially backwards, which seems awfully obscure. When I load files piecemeal, I'm using memory mapped files, not a malloc'd buffer.


What? I don't get why you're saying "you have to". Besides, with null you don't necessarily get a segfault; in C you get undefined behavior.

Rust skips nulls & statically disallows use of variables before initialization


> Rust skips nulls & statically disallows use of variables before initialization

That doesn't mean you won't need a "not a valid object" for a type. That's why there are all these Optional and Maybe constructions - you're still checking for it.


In my experience, Option/Optional/Maybe of a type T comes up a lot less than the type T itself. That is, needing "not a valid object" is rarer than only requiring objects that are always valid.

Those constructions exist exactly so that you have to check for it. They force the programmer to think about the "maybe invalid" state, whereas built-in null doesn't prompt this (and, indeed, since might-be-null is the rare state, people often just assume they're working with a valid instance of the type and don't handle null at all).


On the other hand, Rust doesn't let you use an `Option<T>` as if it was a `T`, so you can't just forget to check. And most of your code won't be using Option, it's only used right where it's needed.


> That doesn't mean you won't need a "not a valid object" for a type.

This confuses two different concepts, and that confusion is precisely the problem with "null".

Most types neither need, nor have, a "not a valid object" value. The "not a valid object" value is typically disjoint from the type.

Null can't be used instead of some other type, as though it were that type. Instead, you have to perform a comparison to decide whether you're dealing with null, or a value of the expected type.

The problem with null, in languages that have it, is the inability to statically prove that terms cannot be null.

This means null is a pervasive possibility that almost always has to be checked for, which is an unnecessary source of bugs. Strictly speaking, you always have to check for null, except in those cases where you can prove (informally, since the compiler doesn't help with this) that you already did that.

> That's why there are all these Optional and Maybe constructions - you're still checking for it.

The point is that option types allow you to choose whether you want to allow non-values at a particular term. If you don't - e.g. if a function returns a value not wrapped in an option type - then you're statically guaranteed to be able to use that value without having to check for null.

The only place where you need option types are where you're explicitly allowing for an optional value. And in that case, the option type helps because it statically disallows you from treating the value as though it were an instance of the underlying type.

So while it might seem like "you're still checking for it," there's an important difference in the context of that check - the compiler will give you an error if you try to use the value without unwrapping the option.


> you're still checking for it

Where it makes sense to, because you actively want it; nowhere else. And you can't accidentally forget it without creating code smells bad enough to set off the fire alarm.


Equating Option<T> to T is like equating T[] to T, so you're saying everything should be arrays (& in fact, C goes that far by making it unclear whether int* is a reference to one int or many ints)


Null certainly must exist at the target level. I'm not denying that. But languages can be designed to abstract nullability into controlled forms like Option/Maybe.


> Null certainly must exist at the target level.

Why? If anything, null must not exist at the target level- having a dedicated region of memory just for "accesses here are bad" at 0x0 isn't ubiquitous, and even having that at all isn't guaranteed (especially in embedded, where you might not get memory protection!).

Modern operating systems have to keep userland from mapping the 0x0 page because it turns out programmers just cannot keep themselves from dereferencing NULL, and that's often exploitable when done from the kernel.


>especially in embedded, where you might not get memory protection!

Or where there's perfectly valid memory mapped to address 0


The 8086 put the interrupt vector table at address 0, in retrospect a truly terrible choice. Any null pointer writes would trash the operating system.

A much better design would have been to put the system ROMs at 0, where at least DOS would survive a null pointer.


Sometimes you do need a marker for "not initialized", but the problem is languages where every type includes this "not initialized" value. It would be like a language where every value had to be a Maybe (except that would be an infinite regress).

For all variables which are definitely assigned (e.g. "int x = 17") you don't need a special "uninitialized" value, and the type system should be able to reflect that.

I guess floating point numbers are a special case, since the standard mandates that they include the NaN value.


> I guess floating point numbers are a special case, since the standard mandates that they include the NaN value.

NaN is more useful than that. Consider if you've got a sensor array. The data from the array is crunched to produce a result. Suppose there are a few failed sensors in the array. By using NaN's for those sensors' data, you can then tell what part of your crunched output is dependent on that bad data.

If you'd just used 0.0 for the bad sensor data, you'd have no idea.


Oh I agree about that, and I agree that NaN is a good default for an uninitialized floating point variable.

My point was that you don't always, with every variable, need to be able to express an "undefined" or "invalid" value. The type system should be able to express if a given value is nullable or not.


> Null's replacements, Maybe and Optional, still have the extra check.

That's a straw man. You would also have to check for Null / NaN / 0xFF to see if you're dealing with a valid object.


That's true. Though for Null and NaN, the hardware does that for you.


> The moment you pass that value out of the context of the function, you need to remember that null means EndOfString

Good point, and I don't disagree, but it's also interesting to note that once you return 4 from CountChildren(), you similarly have to remember that it represents the number of children.


Except you can always assign the result of CountChildren to a `childrenCount` variable. Is there a variable name you could give to the result of `parse` that'd accurately describe the error states?


I don't see a valid use case for null. It's just an easy-to-make billion-dollar mistake.

Null exists because people are lazy and don't want to think of a reference type T as T | null. Instead, in most cases they ignore null unless they know it will crash.

It seems that every use case for null could be handled well in Haskell, which means null is not mandatory in language design. Every use of null can be replaced with Maybe or a concrete union type like Int | NaN just fine.


No. The problem is conflating "logical null" (the absence of data) with "technical null" (a pointer to nothing).

Logical null is often essential for expressing business logic in code. Technical null is (as far as I can tell) a C implementation detail that probably never should exist in a higher-level language.

The Haskell equivalent of logical null is Nothing. There is no technical null because there is no need to represent that one specific implementation detail in Haskell.

As for using NaN to represent logical null instead of a special reserved value, that has a lot of ergonomic problems. Anyone who has used Python for data analysis can attest.


I think we need more nulls then. In a sense. Consider a ‘cell’ (value box).

1. Null for no value there. But it can be set, that’s our conventional null.

2. Null for many values from a filter that cannot fit into a single cell. Can be set all at once. Not very null, but cell’s situation-wise it is.

3. Null for no value that cannot be set, e.g. filter produced no values to show or edit at all.

4. Null that is not equal to itself for relational joining in non-SQL. (3=4? Maybe to some extent)

5. Null for a value that must be there, but is not yet (Promising Null). Cannot be set and throws on any comparison, since semantics are unclear until resolving. (4,5) require special isoftype() tests to check their type.

5a. Null for a value that is Promised only on access. Doesn’t throw, but instead yields until it’s done (or throws on failure). You can still check if it is already here, e.g. to choose a better access pattern.

6. Null for a missing key, as opposed to a key holding any of the null values above. Absence of the cell itself. Not really a 6, but another dimension (remember Perl exists $hash{key}, Js, Python ‘key’ in object, etc, that is hard to test from the object itself).

I’m not a type scientist, but it is clear that the semantics of a value in reality decompose in many more ways than just null/non-null. Objects (including primitives) should have at least this many null values when used not in a vacuum, but in the cross-referenced dataset aggregations that appear in our programs everywhere. I’m not really making any counter-argument here, just my daily job observations to discuss!

But I believe that a language that could cook it all right at the core level and remain convenient to program in will be a million years’ win.


Which is why variants/discriminated unions are so powerful. A common "hump", if you will, that beginners get over with OCaml is an over-reliance on booleans. A beginner will write a function that takes an isOpen argument. A more advanced programmer will model the state as a discriminated union that is either Open or Closed.

Combine that with generics and you can get generic unions like Maybe which allow you to create sets of states that can be composed with any type.
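A sketch of that progression, written in Haskell syntax since OCaml variants work analogously (names made up):

    -- model the state directly instead of passing a bare isOpen boolean
    data DoorState = Open | Closed

    describe :: DoorState -> String
    describe Open   = "the door is open"
    describe Closed = "the door is closed"

    -- add a type parameter and you get generic unions; Maybe is exactly
    -- this shape:  data Maybe a = Nothing | Just a
    main :: IO ()
    main = putStrLn (describe Open)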


Also from Ward Cunningham is the CHECKS Pattern Language (published as far back as 1994), which developed novel ideas about value objects well beyond the mainstream of the day. The examples are in Smalltalk but the uniform abstraction level for values is an instantly applicable concept, allowing generic treatment of value objects, and is probably more accessible to modern audiences comfortable with option types and so on than it was at the time.

http://c2.com/ppr/checks.html


One thing that intrigues me about null is that conceptions of null seem to divide into two main families.

In the first, a type nullable(T) is constructed from a type T by simply adjoining a "Null" value to it. So, a nullable integer type could take the values {Null, 0, 1, -1, 2, ...}

In the second, nullable(T) is more like a "box" that you have to get the value out of. This is what Haskell does with Maybe and its Just & Nothing constructors. Here the values of our nullable int would be {Nothing, Just 0, Just 1, ...}

That's actually a fairly big difference. For instance, in the first model of null, nullable(nullable(T)) = nullable(T), but in the second, those are distinct types.

Another big distinction is that in the first model, nullable(T) is a superset of T, where in the second, it's not.
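A sketch of the boxed model in Haskell, where the layers stay visibly distinct:

    nested :: Maybe (Maybe Int)
    nested = Just Nothing    -- "a box that is present but empty"

    flat :: Maybe (Maybe Int)
    flat = Nothing           -- "no box at all": a different value entirely

    main :: IO ()
    main = print (nested == flat)   -- False; the first model cannot express this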

I haven't seen anyone give a name to this distinction or talk about when the first is more useful vs. the second.


Yeah, more practically speaking, this "flattening" behavior of nullable(nullable(foo)) = nullable(foo) makes type checking much harder. This is one of the things that's subtle and hard to get right when you tack a type system onto an existing dynamic language. The general case also induces horrible type-checking performance. Another example of flattening would be JS' promises, where awaiting nested promises "flattens" every level into one (in FP jargon, it's map and bind combined into one function).

With the right sugar, the non-flattening version, the option type, is actually _much_ better not just for type checking but also for developer ergonomics. Consider if you want to distinguish between a map missing a value and a map having a value that happens to be null. Or an iterator that asks you to return null to signal termination, when you just happen to want to yield the value null.


The first is a union type, and the second is a disjoint union type. The second is also known by the names of tagged unions, sum types, and discriminated unions.

For the first to be useful, there needs to be some hidden disjoint union to implement it (how else would you know it's an int vs the adjoined null?) but the type equivalence rules would let unions collapse: union(T,Null,Null) = union(T,Null).

Here's an illuminating example I saw once: consider a class with a member that represents a value that is known or unknown, and it might take some time to initialize the value. There are three states: uninitialized, initialized but unknown, and initialized and known. This can be modeled by nullable(nullable(T)). If the nullable collapses, you can't tell the difference between uninitialized and unknown anymore.
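A sketch of those three states in Haskell, where the nullable doesn't collapse (names hypothetical):

    type Field = Maybe (Maybe String)

    uninitialized, unknown, known :: Field
    uninitialized = Nothing            -- not initialized yet
    unknown       = Just Nothing       -- initialized, value known to be absent
    known         = Just (Just "42")   -- initialized and present

    main :: IO ()
    main = mapM_ print [uninitialized, unknown, known]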


I like having a semantic difference for async values (maybe not computed yet) and for optional values (maybe no value).

Thus you get Future<Optional<T>>. It is of course helpful since the "Do this when a value happens, do that when a failure happens" algorithms can be encoded onto the future. Obviously you don't want people implementing busy waits.

Edit: as to your overall point, I agree that this just cannot be modeled with null. The intention is just not capturable.


The second looks like an option type (and the first, a nullable type), are these the names you are after?

https://en.wikipedia.org/wiki/Option_type


>That's actually a fairly big difference. For instance, in the first model of null, nullable(nullable(T)) = nullable(T), but in the second, those are distinct types.

The difference is basically whether you use the Monad structure explicitly or not. Whereas Nullable(Nullable(T)) might not equal Nullable(T) if you use the Monad definition, there is a natural transformation between the two (the monad's join), which the other method just uses implicitly to make the two equal.

It's basically the difference between having a function which maps 'Just Nothing' to 'Nothing', or adding a rule to your programming language that 'Just Nothing == Nothing'.

Now the advantage of using the Monad structure is that you've got an easy way of chaining multiple functions that depend on non-nullable arguments, but might not return a value. So if you've got f: X -> Maybe Y and g: Y -> Maybe Z, the monad structure makes it easy to chain the two into a function h: X -> Maybe Z (this operation is basically the definition of the Monad). With the implicit definition you need to put a check in between to ensure the first function doesn't return Null. This difficulty is basically the reason for the recent addition of the ?. operator to C#, which first checks for null before accessing a member of an object.
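A minimal sketch of that chaining in Haskell; f and g here are made-up stand-ins:

    import Control.Monad ((>=>))

    f :: Int -> Maybe Int
    f x = if x > 0 then Just (x * 2) else Nothing

    g :: Int -> Maybe String
    g y = if even y then Just (show y) else Nothing

    h :: Int -> Maybe String
    h = f >=> g        -- the in-between Nothing check happens inside the monad

    main :: IO ()
    main = do
      print (h 3)      -- Just "6"
      print (h (-1))   -- Nothing: f failed, so g never ran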

Of course once you know it's a Monad you can also (ab)use any special notation a language might have for handling them. In python it's pretty tempting to just use:

    def h(x):              # assuming f and g each yield at most one value
        for y in f(x):
            for z in g(y):
                yield z
to handle this particular situation. You just need to realise the equivalence between a function returning a nullable and a generator of at most 1 value. It doesn't really matter if you handle things explicitly or implicitly for this, but making the structure explicit helps take advantage of coincidences like this.


In C++ that's the distinction between pointers and references (a reference cannot be null; creating a null reference is an undefined operation). Although it is much more clunky than in languages with algebraic data types. Who doesn't love the clarity of algebraic data types ...


Common Lisp addresses some of these problems out of the box. It returns a second value from some functions to distinguish between "nothing was found" vs. "A value was found, and it is nil."

CL also uses a special reserved value to mean "unbound" so it's always clear when a symbol or instance is uninitialized vs. initialized with nil. It's not possible for the programmer to portably see this value or assign it but there are functions to find out if a container is unbound and set it to such.

Having said that, problems remain. I prefer to use maybe and error monads in my CL programs now rather than just returning nil. That solves most of the remaining issues.


I regularly need two types of "null", and both JSON and XML, among others, provide them:

1. Unknown value: the user (or whatever) could not provide a value so it was left blank, or it had a value that was purposely removed (as in an update); either way, it was set by the other side explicitly to null.

2. The value was not provided, that is, it was not set by the other side: a missing property or element (undefined). This usually means "do not modify the value" when doing an update, distinct from setting it to null.


For the record, I get this all the time in my work and it's super frustrating that none of the "science or stats languages" (with the possible exception of SAS of all things) natively support multiple types of nil/missing data.

I often need not applicable, unknown, unsupplied, zero types, other, undefined/ nonsense/error, theoretically knowable but not currently present, missing, etc, depending on the context.


Also for the record, Julia has all of 0, Missing (indicating data that is unknown), Nothing (indicating data non-existence), floating point NaN (as well as +Inf and -Inf of course), and exceptions for actual errors in Base and the Standard Library. If you need more than that, user-defined types are just as performant and relatively trivial to implement.

Not sure if you were including Julia in "science or stats languages", but there it is anyway.


Somewhat off topic, but my main problem with Julia is that my colleagues/correspondents won't understand it, it's not installed anywhere I need it, and my impression is they tried to make it MATLAB'y as though that was a positive rather than a negative.

If I wanted a performant, compiled, solution that allowed me to program up the answer myself that wasn't installed anywhere and everyone else couldn't understand, I'd just cut out the MATLAB syntax and install SBCL lisp :p


If it's a helpful perspective to you, here's an economics researcher who had been using common lisp for scientific computing and why he switched to Julia. I found it helpful for choosing between the two when first selecting a language for personal use:

https://tamaspapp.eu/post/common-lisp-to-julia/


R does have both NULL and NA, but you can't easily use NULL in all the places where you can use NA.


I’m sad this didn’t include the mongo query language. My favorite Mongo WAT is the query {x: null}. It will return all documents where x:

1. Is not in the document

2. Is equal to the literal null value

3. Is a list that contains a literal null value


I ran into a very frustrating but unsurprising Null case last week: a value on the client having three states: "not retrieved yet", "no value", and "value".

Javascript's ugly parts called to me, "use undefined!" but that's such a bear trap. I ended up with a separate enum of the possible states for the value's variable.
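For what it's worth, that separate enum is exactly a three-case sum type; here's the same shape sketched in Haskell (names hypothetical):

    data Remote a = NotRetrievedYet | NoValue | Value a

    render :: Show a => Remote a -> String
    render NotRetrievedYet = "loading..."
    render NoValue         = "(empty)"
    render (Value x)       = show x

    main :: IO ()
    main = mapM_ (putStrLn . render)
                 [NotRetrievedYet, NoValue, Value (42 :: Int)]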


It's a trap!

You think it's safer to return null than raise an exception.

After all, the caller just needs to check. Of course, sometimes they don't.

So if you're lucky it blows up in some completely unrelated part of your application. Being null, it carries no information as to where or what the error was.

If you're not lucky, it's saved to disk. Now you have a silent null somewhere in your data that blows up potentially months after the actual error.


Well... No null check is essentially a tech debt that you are going to pay.

The beauty of tech debt is that most of the time the universe is responsible for making you pay up for not making the right decisions in the first place.

Sometimes you get lucky (we find other ways of dealing with the issue)... Sometimes you don’t.


The thing which is not.

In most mathematical logic systems "false" or "⊥" has the nice property that anything can be derived from it (⊥ -> p, for any sentence p). I find it funny that the undefined behavior of null dereferencing works the same way in C - literally anything can happen to the program, so the compiler is free to assume (derive about the current state of execution) anything it wants.


It's not so much "funny" as a rather obvious consequence of the principle.


Anything can be derived from p or not p, not from false.


The sentence "p or not p" is a tautology ("true", ⊤) in classical logic, so you can't derive anything useful from it. It can carry some power in non-classical logics, like the intuitionistic logic, where it's not true that "p or not p" for any sentence p.

You can derive anything from false, this is known as the principle of explosion: "ex falso quodlibet", "from false anything":

https://en.wikipedia.org/wiki/Principle_of_explosion


This is just a corollary of the fact that "p or not p" is false (for any value of p).


"p and not p" is false, "p or not p" is always true.


Oops. Went a bit on autopilot with that one.


Then everything is true. :)


I remember having a realisation regarding null in SQL data models that is probably obvious to other people who paid attention to the "History of SQL" parts of the lecture, but it was along these lines: tables are sets, and all relations are sets, so where a relation has no value, you are really just talking about the empty set, and there is only one of those. i.e. you can remove all null "values" from your tables by normalising on that column - null was just where the join is now empty. (But obviously don't actually do that; it was more a thought experiment for a null-less nirvana.)


Also, there is a QL that treats all values as sets, EdgeQL[1], and they get null for free.

[1]: https://edgedb.com/docs/edgeql/overview#everything-is-a-set


Yup, you get it for free if you normalize to 6th normal form. As you indicated, it's not a great idea in SQL DBMS's.

https://en.wikipedia.org/wiki/Sixth_normal_form


In Python we sometimes differentiate meanings like "whatever", "default" or "not set yet" by defining a singleton object and comparing by identity, because None would be an acceptable value and we need to differentiate the meaning. Examples:

https://github.com/marshmallow-code/marshmallow/blob/c1506cc...

https://github.com/python-attrs/attrs/blob/eda9f2de6f7026039...


I adore SQL NULL.

The problem is that people don't use it correctly & the empty string can be a nightmare.

Sorry, not adding much to this, but it's a significant problem in the real world. People often don't understand NULL.


How do you use it? I'm often in a situation where I'm not sure whether it's best to allow NULL, "", or both. If a string column is nullable, should you always disallow ""?

Numbers don't have this problem so much, it's really just strings and other types that can be "empty" without being null. Arrays, I guess, though they're not nearly as common.


There are many forms of non-existence. Maybe there should be more than one NULL.


Since null doesn’t equal null, and is also not not-equal to null in SQL, there are effectively unlimited nulls, all possibly different.


Algebraic data types!


Great talk by Sandi Metz on the topic - Nothing is Something https://www.youtube.com/watch?v=OMPfEXIlTVE


I consider null to properly mean 'a value outside some value domain', and nil to mean 'zero value in some value domain', and both these values are useful, and can be assigned to variables of that value domain. So C/C++'s definition of NULL pointers is a historical transgression that has now stuck, as far as I am concerned.

Examples: p=null-pointer should mean invalid address, p=nil-pointer should mean zero offset/address. l=nil-list should be an empty list, l=null-list an invalid list. So null-list could also be called undefined-list. c=null-character is correct as a character, and works well as a terminator. s=null-string is an undefined string, s=nil-string the empty string "".


> Nothing: In HaskellLanguage the other possible value of the 'Maybe' datatype.

Wouldn't null be bottom?


No.

First, null is generally a value, not a type.

Second, think of bottom types as "does not return".

Taking C as an example: C doesn't have a bottom type, only void which is a unit type (it has a single value, which is anonymous in C). However you can tell the compiler that you _actually_ meant the bottom type:

    /* the attribute goes before the definition, not after the declarator */
    __attribute__((noreturn)) void panic(void) {
        while (1) {
        }
        __builtin_unreachable();
    }
Scala is also interesting, in that it supports all three (Null, Unit, Nothing).


Bottom should rather be the empty type, and one consequence of that is that if you happen to build an x of type bottom, then x sort of belongs to every type.


That's the formal definition, and a corollary with interesting consequences (from a typing perspective, a function of any return type can always opt to halt instead of returning; note that in a lazily evaluated language like Haskell the caller can still continue, as long as it does not evaluate the result).

But I didn't go with that because it doesn't answer the question (is null the bottom type? no), and in the context of return types (the main use case, though not the only one) it's equivalent to what I said (no possible return value -> cannot return).


They later mention Bottom and get into some discussion of a null value vs. an empty type.

Interestingly, Haskell also has Data.Void.
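A small sketch of what Data.Void gives you (the function name is made up):

    import Data.Void (Void, absurd)

    -- Void has no values at all, so a function out of it can promise any
    -- result type: the hallmark of an empty/bottom-like type
    impossible :: Void -> Int
    impossible = absurd

    main :: IO ()
    main = putStrLn "no Void value can ever exist to call `impossible` with"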


Baby don’t hurt me


The two biggest PL design flaws are implicit conversions, and nulls.

They are mostly the same flaw.


What ever happened to c2.com ?

It used to be one of the best sites on the internet. Now I cannot even read it with an ad blocker.


It was reworked to use so-called "modern web technologies".


SecondSystemEffect


Works fine for me. You have to enable JS now tho.


Enable JS to, LITERALLY, just display black text on a white background. No image that loads when you scroll or whatever, no fancy sidebar that scrolls weirdly and annoys you, nothing. Just plain text. And you need JavaScript.

That’s absurd.


Not really. I am using umatrix, and after "enabling everything", all that it shows is the spiral loading thingy.


Alternative way to access the content:

1) curl -s http://c2.com/wiki/remodel/pages/WhatIsNull | jq -r .text

or:

2) Go to http://c2.com/wiki/remodel/pages/WhatIsNull

In the web console :

    document.body.textContent = JSON.parse(document.body.textContent).text
    document.body.style.whiteSpace = "pre-wrap"
---

I hope one day we will have some kind of standardized markup language that allows presenting text in a browser in a plug and play way… text/plain for the most basic stuff would do, and we could add some kind of tags to have stuff like italic / bold parts and titles, and links, too.


> I hope one day we will have some kind of standardized markup language that allows presenting text in a browser in a plug and play way… text/plain for the most basic stuff would do, and we could add some kind of tags to have stuff like italic / bold parts and titles, and links, too.

You should mark this as sarcasm or some young "data architects" will not understand your point here.


This reminds me of a comment from StavrosK 4 months ago:

"I will never stoop so low as to telegraph my own joke."

https://news.ycombinator.com/item?id=19508172 (funny discussion by the way - that was when Cisco "fixed" a vulnerability by blocking curl's user agent).


This "markup language" idea sounds intriguing. Perhaps we should form a committee to design it, and then a consortium to standardize it. It could even be promulgated World Wide!


I think we're going to need styles too, so that individual roles (like headings and paragraphs) don't have to constantly restate their font info.

But it's absolutely essential that the styling language not be Turing equivalent! That would be a nightmare. So it must have a minimalist design.


Please add simple scripting for form validation.


Good idea. Scheme would be a good choice.


You mean you want to run arbitrary code on a user's computer behind their back? Sounds terrible.

Forming the committee does not need script validation.


We could run the code in a chroot environment that simulates the whole machine. It would be a safe place to play. We could call it something cute like "sandbox." Naturally it would be resource limited so code couldn't consume all your memory or CPU cycles. That would be completely under user control. People might want code to show ads, but the user could limit them to 100 ms of load time, 5 seconds of wall clock time, and no more than 10% of the screen.


What browser would go to great lengths to implement such a sandbox to run code which the user does not care about and for which they have no use anyway?

People put "no ads" signs on their mail boxes. Do you really think they'd see such a thing as a feature? Crazy stuff, no browser on their right mind would shoot themselves such a bullet in the foot. People would immediately turn to the competition, should such a thing happen.

What next? Document viewers weighing several megabytes? Containing a full-fledged garbage collector?

I'd say let's keep things simple. People like simplicity, and stuff that works and that does not freeze or crash every other hour.


[flagged]


[flagged]


I don't know, why you're not there

Results of the query, unaware

So is it true, is it false?

It's undefined...


[flagged]


I can tell HN has no sense of humor by the unreadability of the comment.


It isn't that people here have no sense of humor, it's that internet humor tends to grow like weeds, and everyone overestimates how funny their jokes are. scott_s expressed this well a long time ago: https://news.ycombinator.com/item?id=7609289.

Exceptionally witty comments tend to do well here, but they're rare.


We need a discussion of null on this website once every 3 months. This is no laughing matter.


Null is the absence of something.

Typically it is given as a sentinel value, i.e. a union with the set of possible non-null values.

Null references were called by their creator "the worst mistake in computer science".

Those who share this belief prefer the more composable Option/Maybe pattern rather than a sentinel value.

https://www.lucidchart.com/techblog/2015/08/31/the-worst-mis...




