My point is that if you are not chaining `Maybe` then the utility of employing t...

lkitching · on Aug 2, 2021

The purpose of Maybe is to explicitly represent the possible non-existence of a value which in Haskell is the only option since there's no null value which inhabits every type. The existence of the monad instance is convenient but it's not fundamental. The type of getConfigurationDirectories could be changed to MaybeT IO (NonEmpty FilePath) to avoid the match but I don't think it would make such a small example clearer.

kingdomcome50 · on Aug 2, 2021

There are numerous ways to redesign the function signatures, but I would imagine the simplest would be (again, idk Haskell syntax):

    getConfigurationDirectories: unit -> Maybe [FilePath]
    nonEmpty: [a] -> Maybe [a]
    head: [a] -> Maybe a
    initializeCache: FilePath -> unit

Notice `nonEmpty` isn't really necessary because `head` could do the work. The above could be chained into a single, cohesive stack of calls where the result of each is piped through the appropriate `Maybe` method into the next call in a point-free style. I cannot imagine how this wouldn't be clearer. e.g:

    maybeInitialized <- (getConfigurationDirectories >>= head >> initializeCache)

That's the whole thing. Crystal clear. The big takeaway of "Parse don't validate" should be about the predominant use of the `Maybe` monad as a construct to make "parsing" as ergonomic as possible! Each function that returns `Maybe` can be understood as a "parser" that, of course, can be elegantly combined to achieve your result.

My critique is exactly that unwrapping the `Maybe` immediately in order to throw an exception is kind of the worst of both worlds. I mentioned this in a sibling comment, but my sense is that the author is more concerned with having a concrete value (`configDirs`) available in the scope of `main` than best-representing the solution to the problem in code. It is a shame because I agree with the thesis.

lkitching · on Aug 2, 2021

On the contrary, the NonEmpty type is fundamental to the approach in that example since it contains in the type the property being checked dynamically (that the list is non-empty). The nonEmpty function is a simple example of the 'parse don't validate' approach since it goes from a broader to a more restricted type, along with the possibility of failure if the constraint was not satisfied. The restriction on the NonEmpty type is what allows NonEmpty.head to return an a instead of a (Maybe a) and thus avoid the redundant check in the second example. The nonEmpty in your alternative implementation is only validating, not parsing, since after checking the input list is non-empty, it immediately discards the information in the return type. This forces the user to deal with a Nothing result from head that can never happen. Attempting to clean the code up by propagating Nothing values using bind is just hiding the problem that the parsing approach avoids entirely.

kingdomcome50 · on Aug 3, 2021

You are misunderstanding the system. You can organize the logic into whatever containers you want, but the essence of the system cannot be changed.

You are already handling a `Maybe` type because it's possible for your input to not exist. Because the first implementation of `head` also returns a `Maybe`, it is possible to "bind" them together (I'm leaving out `IO` because I am both unsure of the syntax[0] and it is immaterial to the example):

    head :: [a] -> Maybe a
    head (x:_) = Just x
    head []    = Nothing

    getConfDirs :: Maybe [FilePath]

    initializeCache :: FilePath -> Cache
    
    useCache :: Cache -> Value

    main :: ()
    main = do

      // you don't need concrete values here
      maybeCache <- (getConfDirs >>= head >> initializeCache) // Maybe Cache
      
      // one option
      case maybeCache of
        Just c -> useCache c
        Nothing -> error "CONFIG_DIRS cannot be empty"

      // another option
      maybeValue <- (maybeCache >> useCache) // Maybe Value

[0] I have never written Haskell, so the above is my best-guess at the syntax given the snippets available (and no extra research)

The two functions `head` and `getConfDirs` are "parsers" because they both return `Maybe`. Contrary to

> Returning Maybe is undoubtably convenient when we’re implementing head. However, it becomes significantly less convenient when we want to actually use it!

It is trivial to use a reference to `Maybe` because it is a monad that is specifically designed to be used more conveniently than the alternative approaches in the case when a value may (or may not) exist.
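For reference, in actual Haskell `>>` discards its left-hand result rather than mapping over it, so the chain above would be spelled with `>>=` for the `Maybe`-returning step and `fmap` (`<$>`) for the pure one. A minimal sketch, leaving out `IO` as the comment does; `Cache`, `safeHead` and `cacheFrom` are hypothetical stand-ins, not names from the post:

```haskell
import Data.Maybe (listToMaybe)

-- Hypothetical stand-in for the thread's Cache type (an assumption).
newtype Cache = Cache FilePath
  deriving (Eq, Show)

-- Total version of head; this is just listToMaybe from base.
safeHead :: [a] -> Maybe a
safeHead = listToMaybe

-- Assumed to be a pure function, as in the comment's signature.
initializeCache :: FilePath -> Cache
initializeCache = Cache

-- Bind (>>=) sequences the two Maybe-returning steps;
-- fmap (<$>) applies the pure function to the result, if there is one.
cacheFrom :: Maybe [FilePath] -> Maybe Cache
cacheFrom confDirs = initializeCache <$> (confDirs >>= safeHead)
```

Note that `cacheFrom (Just [])` is still `Nothing`: the empty-list case is handled by `safeHead`, not ruled out by the input type.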

lkitching · on Aug 3, 2021

This line:

    maybeCache <- (getConfDirs >>= head >> initializeCache)

is doing exactly what the post is arguing against. getConfDirs is validating the list is non-empty but the [FilePath] list it contains does not encode that information. Now you immediately have to handle the possibility of a missing value from head that you already know cannot happen. This isn't too apparent here since you've combined it into a single expression but if you need to pass the confDirs list to any other part of the program they will also have to continually handle the possibility of the list being empty even though you already checked for that possibility. Now every function that interacts with the confDirs list will have to include (Maybe a) in its return type unnecessarily. The post is not suggesting you can remove Maybe entirely but it has moved it to a single point in the program (the point where the config dirs list is checked for emptiness) and removed it everywhere else. Your approach must continually guard against an impossible condition everywhere the dirs list is accessed because you discard the property you checked for in getConfDirs.

The monadic operators make it convenient to propagate missing values through a chain of operations but they are not the primary benefit of an explicit Maybe type. Much like IO, the benefit of having an explicit Maybe type is when you _don't_ have it since its absence represents more information at that point in the program. Likewise a (NonEmpty a) contains more information than [a] which consequently makes the implementation of head more informative.

The parsers in this approach have types like

    a -> Maybe b

where type b contains the extra information extracted by the parser. Your getConfDirs function only contains a function with type

    [a] -> Maybe [a]

so isn't parsing in the same way.
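The distinction between the two shapes can be sketched as follows. This is only an illustration under assumptions: `validateNonEmpty`, `toNonEmpty`, `firstValidated` and `firstParsed` are hypothetical names, and only `Data.List.NonEmpty` from base is used.

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

-- Validating: checks the property, then hands back the same type,
-- so later code cannot see that the check ever happened.
validateNonEmpty :: [a] -> Maybe [a]
validateNonEmpty [] = Nothing
validateNonEmpty xs = Just xs

-- Parsing: the same check, but the result type records the outcome.
toNonEmpty :: [a] -> Maybe (NonEmpty a)
toNonEmpty = NE.nonEmpty

-- Downstream of the validator, taking the first element is still partial
-- in the types, so it has to return Maybe.
firstValidated :: [a] -> Maybe a
firstValidated []      = Nothing
firstValidated (x : _) = Just x

-- Downstream of the parser, taking the first element is total.
firstParsed :: NonEmpty a -> a
firstParsed = NE.head
```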

kingdomcome50 · on Aug 3, 2021

I understand what the author is doing. I said this earlier but it bears repeating, the author seems to be more concerned with having a concrete type than simpler code. A reference to `Maybe Cache` is good enough (and preferred). The top-level of your program is precisely where you want to have the flexibility to deal with the above.

Furthermore, my example is a much better illustration of the axiom ("Parse don't validate") than what the author is doing -- which is more like "Parse and validate".

You need to clarify "continuously guard". Sure you have to invoke methods like:

    maybeCache >> useCache // map

instead of:

    maybeCache |> useCache // not sure how Haskell pipes

Is that too difficult? The `Maybe` monad is specifically designed so that you don't have to continuously guard against the possibility of the value not existing. That is, you can "map", "bind" and "apply" functions to the value as if it always exists (and it handles the situation when the value doesn't). I also included a `case` block within which you can be statically certain a value of type `Cache` is available if you really need it.

The purpose of `Maybe` is to simplify code that needs to deal with a value that might not exist. Attempting to organize your code to avoid using `Maybe` is, by definition, going to be more cumbersome than simply leaning into the construct (that's what it's for!). It also better-illustrates how "parse don't validate" should work. Using an exception to guard against an invariant is... validating not parsing.

You don't need to defend the author here. It's just a matter of fact that the code provided could be organized differently according to a more idiomatic usage of `Maybe`, and therefore a more illustrative example of their own point. The choice to exemplify something else is unfortunate and the thrust of this entire comment thread -- I felt like I had to say something now seeing that link a second time.

lkitching · on Aug 3, 2021

The author explains what they mean by parsing in the post:

> Really, a parser is just a function that consumes less-structured input and produces more-structured output. By its very nature, a parser is a partial function—some values in the domain do not correspond to any value in the range—so all parsers must have some notion of failure. Often, the input to a parser is text, but this is by no means a requirement, and parseNonEmpty is a perfectly cromulent parser: it parses lists into non-empty lists, signaling failure by terminating the program with an error message.

So the properties checked by the parser are reflected in the output type. Reifying these properties in the type is what allows the validation to be done once at the top level and avoided throughout the rest of the program. Your complaint about throwing exceptions is focusing on an irrelevant detail in a small example - yes this could have been moved into the main function but doesn't affect the overall behaviour.

However, your argument that propagating Maybe values is more idiomatic than parsing into a more precise type is one that I, and I assume most static typing advocates, would disagree with. Given the choice you would always prefer an 'a' over a 'Maybe a' since a Maybe represents a point of uncertainty which you would rather not have. As a result, having to chain this imprecision using various combinators is inherently more complex than not having to do so. Yes, using bind etc. is preferable to manually destructing Maybe values but avoiding Maybe is more preferable still.

kingdomcome50 · on Aug 3, 2021

> but avoiding Maybe is more preferable still

You can't avoid `Maybe` in this system. It is in the nature of the problem (as it is designed) that the input might not exist (and therefore a list might be empty). The question isn't one of avoidance, rather, integration. How do we deal with problems like the example?

"Parse don't validate" is a great way to deal with it! Even more convenient is the existence of a tool that can be used to offload all of the redundancy involved when choosing to parse instead of validate (i.e. throw an error).

It is the author's prerogative to value having a concrete value at one specific point in the program (`main`) over demonstrating how using `Maybe` can make parsing a breeze. Clearly you also value (for whatever reason) knowing that a variable contains a value at some specific, rather arbitrary point in the example program[0]. But it is an unfortunate choice given the title of the post.

Not only does the example code in the post not illustrate "parse don't validate" very well, it convolutes the solution considerably. My example above is able to achieve identical behavior in an easier-to-digest flow while also illustrating how parsing instead of validating can be done.

[0] Of course we know that any function to which we `map` to our `maybeCache` will for sure be invoked with an instance of `Cache`.

lkitching · on Aug 3, 2021

Your example does not achieve identical behaviour at all since it 'parses' an [a] to another [a] and therefore throws away the very property you've just checked. The (NonEmpty a) property encodes the non-emptiness of the list in the type which is then known at every point the list is accessed throughout the entire rest of the program. The point is not just to check the non-emptiness in main as you appear to be implying. Any use of head on an [a] must continually deal with a (Maybe a) even though this possibility has been ruled out. In contrast NonEmpty.head returns an element directly so removes entire chains of Maybes that would be propagated, conveniently with map/bind or otherwise. Parsing allows you to replace N + 1 instances of Maybe with just 1 so you can't claim your approaches are the same just because it hasn't been eliminated entirely.

kingdomcome50 · on Aug 4, 2021

I think you are confusing implementation with behavior. That is, I am achieving the same result through different means. I am mostly uninterested in specifically how the configuration string is parsed. It's not really important.

What is important is that we know we will have to deal with the possibility of something not existing. That is where the complexity lies, and where we want to take care to make our program as sensible as possible. Validating your input to throw an exception or return is one way to satisfy the compiler, another way is to use `Maybe` as intended. The author's "solution" is simply a poor illustration of parsing over validation (read that sentence again).

I suspect, and this applies to you as well, that they are just not comfortable working with the `Maybe` construct. Adding extra ceremony to remove a `Maybe` is simply not worth the trouble, and your idea of "continuously propagating" is severely overblown. Again, we can write every single line of the rest of our program as if `Cache` exists. You don't need to "handle" anything extra (other than holding the concept of a slightly more complex value in your mind).

lkitching · on Aug 5, 2021

The difference between parsing and validation in the author's formulation is not between returning Maybe and throwing an exception, it's between returning a more precise type and not. Here are the types of the two versions of `getConfigurationDirectories`:

    getConfigurationDirectories :: IO [FilePath]
    getConfigurationDirectories :: IO (NonEmpty FilePath)

The second version is preferred because the (NonEmpty FilePath) encodes the property that was checked in the type which means it doesn't have to be handled repeatedly throughout the entire rest of the program.

Yes the second version could have been changed to one of:

    getConfigurationDirectories :: IO (Maybe (NonEmpty FilePath))
    getConfigurationDirectories :: MaybeT IO (NonEmpty FilePath)

but this would only have moved the error reporting up one level to the main function. I would guess the existing version was chosen to simplify the types for a non-Haskell audience.

Your attempted 'improvement' of using

     getConfigurationDirectories :: IO (Maybe [FilePath])

is NOT an example of parsing because [FilePath] does not remove the possibility (in the types!) of the list being empty. When you later attempted to use

    maybeCache >>= useCache

this requires the type of useCache to have type

    [FilePath] -> IO a

for some output type a. This function must deal with the possibility of the input list being empty because the type allows it. Every call to `head` returns (Maybe FilePath) and must handle the Nothing case. Neither I nor the author is unaware that there are many combinators that make this more convenient than explicit matching against Just/Nothing but doing so is strictly worse than returning a FilePath directly. Presumably none of the lower-level functions will be able to provide a default FilePath to use so every single one will be forced to return a Maybe somewhere in their return type (or use fromJust which is very ugly). This affects every single one of their callers which will again be forced to propagate Maybe up to their callers etc. To reiterate: the issue is not the possible non-existence of Cache, which can be handled in main. It's that the representation of Cache forces every single operation on it (of which head is just one simple example) to potentially have to represent conditions that should not actually be possible. This is a failure to 'make invalid states unrepresentable', which most proponents of static types aspire to.
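The `IO (Maybe (NonEmpty FilePath))` variant mentioned above might look roughly like this. It is a sketch under assumptions: the `CONFIG_DIRS` variable name is taken from the error message quoted earlier in the thread, the comma-separated format is assumed, and `splitDirs` and `parseDirs` are hypothetical helpers rather than code from the post.

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE
import System.Environment (lookupEnv)

-- Hypothetical helper: split on commas and drop empty entries.
splitDirs :: String -> [FilePath]
splitDirs s = filter (not . null) (go s)
  where
    go str = case break (== ',') str of
      (chunk, [])       -> [chunk]
      (chunk, _ : rest) -> chunk : go rest

-- The parsing step: the non-emptiness check ends up in the result type.
parseDirs :: String -> Maybe (NonEmpty FilePath)
parseDirs = NE.nonEmpty . splitDirs

-- Failure (unset variable or no directories) is reported as Nothing
-- and left for the caller, e.g. main, to handle once.
getConfigurationDirectories :: IO (Maybe (NonEmpty FilePath))
getConfigurationDirectories = do
  raw <- lookupEnv "CONFIG_DIRS"
  pure (raw >>= parseDirs)
```

Everything downstream of a successful parse receives a `NonEmpty FilePath`, so only the caller of `getConfigurationDirectories` ever sees the `Maybe`.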

kingdomcome50 · on Aug 5, 2021

You have the signature for `useCache` wrong. I defined it above (`Cache -> a`). Notice the concrete type...

I cannot stress this enough. You do not need to remove the possibility of a value not existing in order to compose a simple, coherent program. This is because `Maybe` is designed to handle all of the extra ceremony involved with utilizing such values. You only need to use `>>` (map) instead of `|>` (pipe) when invoking your functions. That is it.

All of the above is really beside the point though, because I am not arguing that one way is necessarily better than the other. I am arguing that the author's post is titled "Parse don't validate", that the perfect construct is right there to exemplify how parsing unstructured data into/through a system can be done, but then the author eschews it in favor of... validation (with what appears to be some tricks to fool the compiler)!

If your guard against an invalid state is to throw an exception you are validating. Attempting to redefine the terms to fit a particular narrative is a distraction that serves no one.

> Neither I nor the author is unaware that there are many combinators that make this more convenient than explicit matching against Just/Nothing but doing so is strictly worse than returning a FilePath directly

I'd like you to define "strictly worse" here. In order for "strictly worse" to make any sense we would need to define "strictly better" to mean something like: "to have a reference to a variable in this particular scope that is definitely a `FilePath`". But why are variables in this scope (`main`) so important? You can get a reference to a `FilePath` directly whenever you need it through a `Maybe`:

    useFilePath :: FilePath -> a

    maybeFilePath <- (getConfigDirs >>= head)

    maybeFilePath >> useFilePath

This is opposed to something like:

    filePath <- getConfigDir // might throw

    filePath |> useFilePath

There is no difference in behavior and only a slight difference in implementation. I suppose if you really really wanted to `print` the value of `FilePath` from `main` (and not some other function), the second version would be preferred (though you could still match in the first version to create a block in main where `FilePath` is statically defined). Pretty arbitrary though.

lkitching · on Aug 5, 2021

> You have the signature for `useCache` wrong. I defined it above (`Cache -> a`)

Yes, sorry it's actually the line

    maybeCache <- (getConfDirs >>= head >> initializeCache)

which shows the issue.

> but then the author eschews it in favor of... validation (with what appears to be some tricks to fool the compiler)!

I think the author is pretty clear about how they're using the terms 'validation' and 'parsing' in the post - validation functions do not return a useful value while parsers refine the input type and carry a notion of failure. The first two examples of parsers they give are:

    nonEmpty :: [a] -> Maybe (NonEmpty a)
    parseNonEmpty :: [a] -> IO (NonEmpty a)

You seem to be arguing that parseNonEmpty is validating because it throws an exception instead of returning Maybe (NonEmpty a), but this isn't true here: Maybe signals failure by returning Nothing, while errors within IO can be signaled with exceptions. The author hints at how these two parser types are related later on with:

    checkNoDuplicateKeys :: (MonadError AppError m, Eq k) => [(k, v)] -> m ()

There are MonadError instances for both IO and Maybe so the general parser type is something like

    MonadError e m => a -> m b

Admittedly this could have been made clearer if it was the intention, and returning Maybe is preferable to throwing exceptions in languages like Haskell.

If you were translating this approach to other languages like Java or C# though, you probably would throw exceptions to indicate failure, e.g.

    interface Parser<A, B> { B parse(A input) throws ParseException; }

so I don't think your objection holds in general.

> I'd like you to define "strictly worse" here

I'm saying you would always prefer to be handed an instance of an `a` instead of a (Maybe a) since it's more precise. You can trivially construct a (Maybe a) from an a but you can't easily go in the other direction. You either need to produce a default value or use a partial function like fromJust to obtain an 'a' from a 'Maybe a'. The motivation for the post is to show how using a more precise data type allows you to remove these from the rest of the code.

> But why are variables in this scope (`main`) so important

The issue doesn't happen in main, it happens throughout the rest of the program. The high-level structure is something like:

    main :: IO ()
    main = do
      maybeDirs <- getConfigDirs
      maybeDirs >>= restOfProgram

main only has to handle the parse failure and report any errors which will look similar regardless of whether getConfigDirs has type Maybe (NonEmpty FilePath) or IO (NonEmpty FilePath) (and throws an exception). But the representation of the directory list could be used anywhere in restOfProgram. Given a chain of applications fun1 -> fun2 -> ... -> funN, if funN accesses the file list with head and receives a (Maybe FilePath) there are three options:

1. Use fromJust since the list should be non-empty
2. Produce a default value
3. Propagate the Maybe in the return type of funN

Option 1 is messy, 2 is also unlikely for a low-level function and 3 forces fun1 through fun(N - 1) to either handle or propagate the partiality. Yes, using >>= and <=< etc. can hide this plumbing, but the plumbing can be made unnecessary in the first place.
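A small sketch of option 3 next to the NonEmpty alternative, with a two-function chain standing in for fun1 ... funN. The function names and the "/cache" suffix are made up for illustration:

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE
import Data.Maybe (listToMaybe)

-- With a plain list the innermost function has no FilePath to return
-- for the empty case, so Maybe appears in its type...
primaryDirL :: [FilePath] -> Maybe FilePath
primaryDirL = listToMaybe

-- ...and every caller has to propagate it (option 3).
cacheFileL :: [FilePath] -> Maybe FilePath
cacheFileL dirs = fmap (++ "/cache") (primaryDirL dirs)

-- With NonEmpty the same chain has no Maybe anywhere.
primaryDirN :: NonEmpty FilePath -> FilePath
primaryDirN = NE.head

cacheFileN :: NonEmpty FilePath -> FilePath
cacheFileN dirs = primaryDirN dirs ++ "/cache"
```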

kingdomcome50 · on Aug 5, 2021

> I'm saying you would always prefer to be handed an instance of an `a` instead of a (Maybe a) since it's more precise.

I disagree with this. `Maybe a` is more precise because it more closely represents the actual system within which we are working. It is simply a fact that our configuration directories might not exist. It is only within the author's own head that they prefer a concrete type because they value being able to point to their variable and say, "look I have this value! It's right here!" in a procedural sense, more than adopting a more functional approach.

> You can trivially construct a (Maybe a) from an a but you can't easily go in the other direction. You either need to produce a default value or use a partial function like fromJust to obtain an 'a' from a 'Maybe a'

Again, the above is just not accurate! Or it is accurate in a very specific - "I want this particular value in this particular scope" - kind of way. Even in your example, we can be statically certain that `restOfProgram` will receive a value of type `[FilePath]`[0].

This is starting to feel like a waste of time. You are very much hung up on trying to defend the idea that using `Maybe` is something to be avoided. I understand where you are coming from. I really do. But you are simply not going to convince me because I prefer to model systems as a whole and I prefer to avoid doing extra gymnastics to solve already-solved problems. Throwing an exception? C'mon... we both know that example sucks.

My critique of the post really has nothing to do with choosing `Maybe` vs validating. My critique is that the author's code is utterly failing to exemplify parsing over validation! Using `Maybe` to chain parsers together in order to build an input would have been perfect. Unfortunately, they kind of mucked it up halfway through because they appear to be afraid of `Maybe`. It's a shame given that the post seems to have gotten around.

[0] This whole `NonEmpty` non-sense is a sideshow that's not worth discussing (other than to further illustrate how `Maybe` can be used to simplify multi-step parsing). What happens when you need the Nth element? You just keep re-defining the type to include more values? When we get to `NonEmpty6` I think maybe we will have realized we are on the wrong path. For our purposes it's better to think of `[FilePath]` as `Input` and not get bogged down in the specifics of its shape. The important bit is that it might not exist.

lkitching · on Aug 6, 2021

The entire point of Maybe is to imbue some type 'a' with an extra value - Nothing - along with a tag about which case you have. So (Maybe a) is always inhabited by more values than 'a', and that is the sense in which a variable of type a is 'more precise' than one of (Maybe a). I'm not saying Maybe is bad in any way - as you point out sometimes you do have to deal with the possibility of not having a value e.g. looking up a key in a map, looking up a user from a database etc. In Haskell there's no 'null' value which inhabits each type so you have to use Maybe, but even in languages like C# or Java where reference types all contain null I would still prefer to use Maybe/Optional to be explicit about the possibility. I don't think we disagree here. But at any point in a program you would always prefer to receive an 'a' over a (Maybe a) if you had the choice since there are fewer cases to deal with. This is the same reason languages like C# are adding support for non-nullable reference types.

Type-driven design is based around encoding invariants as much as is practical in the type system (what constitutes 'practical' is constrained by the type system you're using). The (NonEmpty a) type is just used to demonstrate a very simple example of this principle. In the same way that type 'a' is smaller than the type (Maybe a), so (NonEmpty a) is smaller than [a], which means the operations on it are similarly more precise, which shows up in the two versions of head:

    head :: NonEmpty a -> a
    head :: [a] -> Maybe a

But this is just one example - you could replace it with different representations of a user in a web service

    data User = User {name :: String}
    type User = JsonValue

and the consequent difference in the types of the accessor for the name:

    getName :: User -> String
    getName :: JsonValue -> Maybe String

Far from being a 'sideshow' this is the main point of the approach - using a more precise representation makes all the operations on it similarly more precise globally throughout the program.

In your post the argument to restOfProgram has type [FilePath] but in the post it is (NonEmpty FilePath) so you need to handle the potential emptiness of the list everywhere you try to access it, either by propagating missing values to a higher level or using 'unsafe' functions like fromJust. It's defensible to prefer using a simpler representation type and dealing with the imprecision, but it's not doing the same thing - the types for a lot of the internals of your program will be quite different. This is probably the main philosophical difference with Clojure which prefers to use a small number of simple types along with dynamically checking desired properties at the point of use, something which tools like spec and schema make quite convenient. But people use static languages because of the global property checking, so it seems odd to me to endorse explicit modelling of missing values with Maybe while rejecting doing the same thing for non-emptiness since they are both lightweight approaches.

The insight of the original post is that if you choose to try to make your types precise in this way (and most Haskell programmers would, I believe) then the process of checking the properties you want to enforce from a less-precise representation is inseparable from the process of converting into the narrower representation. This narrowing process could fail and must therefore encode the representation for the failure case. Your insistence that Maybe should be used as the one true failure representation is wrong, I think: throwing exceptions in Haskell is rare, but they could have also chosen (Either String), for example. Maybe isn't even a particularly good representation since it doesn't contain any way of describing the reason for the failure, just that it happened. I agree it would have been useful to see an example of parser composition using <=< etc. there, but it's not the main point of the article.
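As a rough illustration of the (Either String) option and of composing parsers, here is a sketch using `>=>` (the left-to-right counterpart of `<=<`) from Control.Monad. All three function names and the error messages are invented for the example, and splitting via `words` is a simplification that would mishandle paths containing spaces:

```haskell
import Control.Monad ((>=>))
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

-- First step: raw string to a list of paths, with a reason on failure.
parseDirList :: String -> Either String [FilePath]
parseDirList "" = Left "CONFIG_DIRS is empty"
parseDirList s  = Right (words (map commaToSpace s))
  where
    commaToSpace c = if c == ',' then ' ' else c

-- Second step: narrow the list to NonEmpty, again naming the failure.
requireNonEmpty :: [FilePath] -> Either String (NonEmpty FilePath)
requireNonEmpty = maybe (Left "no directories listed") Right . NE.nonEmpty

-- Kleisli composition chains the two parsers; the first Left wins.
parseConfigDirs :: String -> Either String (NonEmpty FilePath)
parseConfigDirs = parseDirList >=> requireNonEmpty
```

Unlike the Maybe version, a caller can report which step failed without re-inspecting the input.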

kingdomcome50 · on Aug 6, 2021

I understand the purpose of `Maybe`.

> In your post the argument to restOfProgram has type [FilePath] but in the post it is (NonEmpty FilePath) so you need to handle the potential emptiness of the list everywhere you try to access it

This is what I'm talking about. You are wasting energy on this line of thinking. Sure the author chose to parse a string into a list which then introduces the possibility of that list being empty. But we could have just chosen a different abstraction to hold our configuration that didn't suffer from this problem. Say:

    getConfiguration :: () -> Maybe { cache :: FilePath }

Now it's always non-empty. Don't get stuck on some intermediate representation. Again, I am uninterested in the details of the particular format of some input. My interest (and the thrust of this discussion) is about how to handle an input that might not exist. Specifically in terms of "parsing instead of validating".

> Your insistence that Maybe should be used as the one true failure representation

I cannot stress this enough (I've said this at least twice now), I am not arguing that `Maybe` is "the one true way". I am arguing that the author is failing to exemplify how to parse your inputs vs. validating them. I am arguing that the code they wrote to help substantiate and illustrate their point about parsing accomplishes no such thing. It actually shows how to validate an input in a way that is confusing and no different than (in TS):

    // returns a non-empty list of string
    getConfigurationDirectories: () => [string, ...string[]] = () => {

         const dirs = getEnv("dirs").split(",");

         if (dirs.length < 1) throw "ERROR";

         return dirs;
    }

The above is not best-understood as a "parser". The above is validating the input. Trying to redefine "parsing" to mean "the result has a different return type" helps no one, and introducing `Maybe` into their example (while on the right track) isn't really necessary because they aren't using the `Maybe` (other than maybe as a crutch to satisfy the compiler).

lkitching · on Aug 6, 2021

The requirement from the post is for a non-empty list of file paths - your Cache representation only contains a single item so is obviously not suitable. In case it's not obvious: head is not the only operation that might be required and the author isn't using (NonEmpty a) as a wrapper around a single value. The requirements for the configuration are stronger than those provided from the input and the configuration type used by the program encodes that property in the type. That property is then enforced globally throughout the entire program and only needs to be checked once at the top level.

> The above is not best-understood as a "parser". The above is validating the input

Yes the example you gave is an example of validating in the author's formulation because it does not enforce the property it's checking in the return type. You check the list is non-empty but this information is not available anywhere else in the program. A parser would return a (NonEmpty String) as its result since that does enforce the constraint.

> Trying to redefine "parsing" to mean "the result has a different return type"

That's not a redefinition of parsing, that's what a parser is.

kingdomcome50 · on Aug 10, 2021

> The requirement[0] from the post is for a non-empty list of file paths

I'm sure you could imagine my example record containing more keys no?

The requirements from the post are arbitrary and could (should) be anything that best-illustrates the thesis of the post. For example by choosing a representation of their "configuration" that suffers from this silly problem of containing unknown content after parsing, the author introduces the whole `NonEmpty` gymnastics. It's totally avoidable. The irony is that the author was so close to getting it right!

> You check the list is non-empty but this information is not available anywhere else in the program

The function in my example does statically define that the returned list is non-empty. A "parser" would maybe return the non-empty list if parsing was successful. That's how parsers work.

[0] The requirement from the post is, in fact, to only have a single file path. That is the only data actually being used (i.e. required). The other intermediate data structures are a choice of the author.

lkitching · on Aug 17, 2021

> The requirements from the post are arbitrary and could (should) be anything that best-illustrates the thesis of the post

The post is showing how using more expressive types can simplify your code through a very simple example. You started out saying the NonEmpty type is unnecessary and now you're complaining it's not complicated enough, so it's not clear what your actual objection is here.

> The function in my example does statically define that the returned list is non-empty

The type given in your example was

    getConfDirs :: Maybe [FilePath]

this is inhabited by (Just []) so no it doesn't statically guarantee that.
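
The same loophole can be sketched in TypeScript. This is only an illustration: the tagged `Maybe` type and the sample path are assumptions, not anything from the post.

```typescript
type NonEmpty<T> = [T, ...T[]];

// Hypothetical stand-in for Haskell's Maybe.
type Maybe<T> = { tag: "just"; value: T } | { tag: "nothing" };

// Type-checks: an empty list wrapped in "just" is a legal Maybe<string[]>.
const loophole: Maybe<string[]> = { tag: "just", value: [] };

// Would not type-check if uncommented, because [] has no first element:
// const rejected: Maybe<NonEmpty<string>> = { tag: "just", value: [] };

// Type-checks: at least one element is present.
const fine: Maybe<NonEmpty<string>> = { tag: "just", value: ["/etc/app"] };
```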

> A "parser" would maybe return the non-empty list if parsing was successful

That's exactly what the nonEmpty function in the post does:

    nonEmpty :: [a] -> Maybe (NonEmpty a)

kingdomcome50 · on Aug 17, 2021

> The post is showing how using more expressive types can simplify your code

This is called "using a type system" and has nothing to do with parsing or validation. You just don't "get it" I suppose. I feel like we are talking right past each other. You are so hung up on the types that you are just failing to take in the essence of what's going on.

I'll say this one more time: the specific type returned by `getConfDirs` is completely irrelevant -- other than whether it is wrapped in `Maybe` or not (because this best-illustrates parsing vs. validation). The returned type is an implementation detail that is chosen by the author. It is not necessary or "required" that we parse a string into a list (read that again). It's really quite simple:

    -- this is parsing
    getConfDirs :: Maybe Config

    -- this is validation
    getConfDirs :: Config -- might throw

How these functions are implemented is not relevant. The irony is that choosing a list representation makes for a good example of parsing precisely because we can't know how many elements are in the list. That is, it gives the author the opportunity to further-illustrate parsing by:

    -- more parsing
    maybeCacheDir <- (confDirs >>= first) -- or second, third, fourth, etc.
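
For readers who don't know Haskell, the chaining above might look roughly like this in TypeScript. Everything here is a hypothetical sketch: the tagged `Maybe`, `bind` (standing in for `>>=`), the `CONFIG_DIRS` key, and the decision to drop empty entries are all assumptions.

```typescript
// Hypothetical stand-in for Haskell's Maybe.
type Maybe<T> = { tag: "just"; value: T } | { tag: "nothing" };

function just<T>(value: T): Maybe<T> {
    return { tag: "just", value };
}

function nothing<T>(): Maybe<T> {
    return { tag: "nothing" };
}

// bind plays the role of >>=: run f only when a value is present.
function bind<A, B>(m: Maybe<A>, f: (a: A) => Maybe<B>): Maybe<B> {
    return m.tag === "just" ? f(m.value) : m;
}

// Returns the first element if there is one.
function first<T>(xs: T[]): Maybe<T> {
    return xs.length > 0 ? just(xs[0]) : nothing<T>();
}

// Assumed config source: a comma-separated CONFIG_DIRS entry, empty entries dropped.
function getConfDirs(env: Record<string, string | undefined>): Maybe<string[]> {
    const raw = env["CONFIG_DIRS"];
    if (raw === undefined) return nothing<string[]>();
    return just(raw.split(",").filter((s) => s.length > 0));
}

const maybeCacheDir = bind(
    getConfDirs({ CONFIG_DIRS: "/etc/app,/tmp/app" }),
    (dirs) => first(dirs),
);
```

A missing variable and an empty list both end up as "nothing", so the caller handles one failure case at the end of the chain.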

The author's example code is NOT illustrating parsing. Period. They are essentially illustrating a constructor named `getConfigurationDirectories` -- which is the most classic case of validation imaginable (TS):

    // 1. This is their first example
    class ConfigurationDirectories {

        dirs: FilePath[];

        constructor(env) {
         
            let dirs = env("config").split(",");

            if (dirs.length < 1) throw "Error!";

            this.dirs = dirs;
        }

        get cacheDir() {

            if (this.dirs.length < 1) throw "Cannot happen!"; 

            return this.dirs[0];
        }
    }

    // 2. This is their "fixed" example with all sorts of unnecessary "NonEmpty" gymnastics because they have chosen the wrong abstraction
    type NonEmpty<T> = [T, ...T[]];

    const eg: NonEmpty<string> = []; // error 

    class ConfigurationDirectories {

        dirs: NonEmpty<FilePath>;

        constructor(env) {
         
            let dirs = env("config").split(",");

            if (dirs.length < 1) throw "Error!";

            this.dirs = dirs as NonEmpty<FilePath>;
        }

        get cacheDir() {
            
            // now the compiler also knows dirs is not empty
            return this.dirs[0];
        }
    }

    // 3. This is BETTER than their "fixed" example because it is even simpler
    class ConfigurationDirectories {

        cacheDir: FilePath;

        constructor(env) {
         
            let dirs = env("config").split(",");

            if (dirs.length < 1) throw "Error!";

            this.cacheDir = dirs[0];
        }
    }

    // usage
    const config = new ConfigurationDirectories(getEnv());

    const cache = initializeCache(config.cacheDir);

You see how the above translates to the author's code? Nothing above qualifies as "parsing". Their examples are defining a function with a guard that validates the input. The author then gets confused and decides to go on this side-quest of how to trick the compiler because the first function wasn't actually accomplishing their goal (they have to validate twice!). But you know what would have avoided all of it? Parsing (i.e. actually using the types to simplify their code). Simple. Linear. Fail-safe. Parsing.

I'm not really sure why you are dying on this hill. You can't "win" this argument. The best you can accomplish is to learn something yourself through this discussion. It's just a matter of fact that the author is (mis)using `Maybe` to the detriment of their examples. And that isn't an attack on the author! You needn't defend them. We've all written code that wasn't perfect. This is just another instance.

Maybe if the title were something like, "How to validate with static guarantees in Haskell" I wouldn't have said anything...

* I thought you were referring to my TS example (which does statically define the return-type to be non-empty)

lkitching · on Aug 20, 2021

> You are so hung up on the types that you are just failing to take in the essence of what's going on.

The post is about type-driven design, which is about representing invariants in the type system where possible. The post is very clear about this. The example chosen to illustrate it is a very simple one where a list is augmented with an extra property (non-emptiness). The exact representation is not important, so yes they could have created their own custom type instead of using (NonEmpty a) but this is beside the point. You have now made two attempts to 'improve' this representation, first by just using a plain list (which the post explicitly rejects) and now by just using a single 'cache dir' instead. You can't 'simplify' the solution by just throwing away half the requirements - it's a collection of items which must also be non-empty.

> The author's example code is NOT illustrating parsing. Period

Once again, the author is very clear about what they mean by 'validation' and 'parsing':

    The difference lies entirely in the return type: validateNonEmpty always returns (), the type that contains no information, but parseNonEmpty returns NonEmpty a, a refinement of the input type that preserves the knowledge gained in the type system.

The entire point of 'parsing' in this approach is to obtain a refinement of the input type in the representation. Your 'better' example is not a refinement of a list.

You appear to be insisting that validation is just anything that throws exceptions, but this is wrong - validation is when the properties being checked are not reflected in the return type. Parsers have to be able to signal errors, and exceptions are one of the ways of doing that. This is why your previous example of 'parsing' is just validating:

    const dirs = getEnv("dirs").split(",");
    if (dirs.length < 1) throw "ERROR";
    return dirs;

The type of this expression is just `[String]` which does not guarantee the non-emptiness being checked. If you have another definition of validation vs parsing you need to state it clearly, because your counterexamples do not contradict the definition in the post.

> Their examples are defining a function with a guard that validates the input. The author then gets confused and decides to go on this side-quest of how to trick the compiler because the first function wasn't actually accomplishing their goal (they have to validate twice!). But you know what would have avoided all of it? Parsing

This is literally what the post is showing by switching

    getConfigurationDirectories :: IO [FilePath]

with

    getConfigurationDirectories :: IO (NonEmpty FilePath)

you keep insisting this is 'not parsing' but the post explains why they think it qualifies and you haven't given your own definition which contradicts it.

> It's just a matter of fact that the author is (mis)using `Maybe` to the detriment of their example

There's no mis-use of Maybe in the post, and as I've already explained, the point of the post is to eliminate Maybes. It's only used in two places - to represent the partiality of the head and nonEmpty functions.
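
As a rough illustration of what "partiality" means here, a TypeScript sketch follows. It is an assumption-laden analogue rather than the post's code: `undefined` stands in for `Nothing`, and `headNE` and `isNonEmpty` are made-up names.

```typescript
type NonEmpty<T> = [T, ...T[]];

// Hypothetical type predicate used to narrow T[] to NonEmpty<T>.
function isNonEmpty<T>(xs: T[]): xs is NonEmpty<T> {
    return xs.length > 0;
}

// Partial: an empty array has no first element, so the result is optional.
function head<T>(xs: T[]): T | undefined {
    return xs.length > 0 ? xs[0] : undefined;
}

// Partial: not every array is non-empty, so the result is optional too.
function nonEmpty<T>(xs: T[]): NonEmpty<T> | undefined {
    return isNonEmpty(xs) ? xs : undefined;
}

// Total: the argument type already rules out the empty case.
function headNE<T>(xs: NonEmpty<T>): T {
    return xs[0];
}
```

Once `nonEmpty` has succeeded a single time, every later call to `headNE` needs no optional result and no second check.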