
Parsing output is where the Unix philosophy always breaks down for me. Today a pipeline looks like this:

  {program -> human-text -> parser}+
When it could look like this:

  program
    -> {struct-text -> program}+
    [-> formatter] -> human-text .
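To make the second diagram concrete, here is a minimal sketch in Python. It assumes the struct-text is one JSON object per line, which is only one possible choice; the three stage functions and the sample values are hypothetical and stand in for separate programs connected by pipes.

```python
import json
from typing import Iterable, Iterator


def produce() -> Iterator[str]:
    """Hypothetical program: emits struct-text, one JSON object per line."""
    for cpu, kw_read_s in [(0.2, 0.12), (1.3, 0.4)]:
        yield json.dumps({"cpu": cpu, "kw_read_s": kw_read_s})


def busy_only(lines: Iterable[str], threshold: float = 1.0) -> Iterator[str]:
    """Hypothetical middle stage: reads struct-text, writes struct-text."""
    for line in lines:
        record = json.loads(line)
        if record["cpu"] >= threshold:
            yield json.dumps(record)


def format_for_humans(lines: Iterable[str]) -> Iterator[str]:
    """Optional last stage: the only place human-oriented text is produced."""
    for line in lines:
        record = json.loads(line)
        yield f"cpu={record['cpu']:.1f} kw_read_s={record['kw_read_s']:.2f}"


# Equivalent to: produce | busy_only | format_for_humans
report = list(format_for_humans(busy_only(produce())))
```

The point of the shape is that no stage ever parses human text: the middle stages only exchange structured records, and formatting happens once, at the end, if at all.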



I don’t think your approach is a problem. I do think that JSON might not be the ideal format for struct text, especially if you really just use it as a flat dict of key-value pairs.


Maybe not, but more often than not you really have an array of objects, and that's where JSON helps.


Look at my original example. The JSON punctuation just muddled the text format without adding any additional value. The point of JSON is the (very limited) type system, nested schema, and heterogeneous format (if array A has two elements B and C, they don’t need to be the same type). None of that seems applicable to what you want out of a UNIX utility.


What I was getting at is:

    cpu: 0.2
    kw_read_s: 0.12
is describing a single 'object'. How do you describe an array or list of these objects? Something like:

    cpu: 0.2
    kw_read_s: 0.12

    cpu: 1.3
    kw_read_s: 0.4
Do you use a blank line to denote a new 'object'? In JSON it would be written as below, which matches the shape of most command output (rows of columnar data):

    [
      {
        "cpu": 0.2,
        "kw_read_s": 0.12
      },
      {
        "cpu": 1.3,
        "kw_read_s": 0.4
      }
    ]
The other benefit of JSON here is that the layout doesn't matter. The same data could also be written on a single line, with no newlines or indentation:

    [{"cpu": 0.2,"kw_read_s": 0.12},{"cpu": 1.3,"kw_read_s": 0.4}]
Finally, this can also be streamed, using JSON Lines:

    {"cpu": 0.2, "kw_read_s": 0.12}
    {"cpu": 1.3, "kw_read_s": 0.4}
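For what it's worth, a rough Python sketch of both claims, using only the standard `json` module. The helper name `read_json_lines` is made up for illustration, and `io.StringIO` stands in for a real pipe such as `sys.stdin`.

```python
import io
import json
from typing import Iterator, TextIO

pretty = """
[
  {
    "cpu": 0.2,
    "kw_read_s": 0.12
  },
  {
    "cpu": 1.3,
    "kw_read_s": 0.4
  }
]
"""
compact = '[{"cpu": 0.2,"kw_read_s": 0.12},{"cpu": 1.3,"kw_read_s": 0.4}]'

# Whitespace between tokens does not change the parsed value.
same_data = json.loads(pretty) == json.loads(compact)


def read_json_lines(stream: TextIO) -> Iterator[dict]:
    """Yield one record per non-empty line, without reading the whole input first."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# io.StringIO is a stand-in for sys.stdin so the sketch is self-contained.
stream = io.StringIO(
    '{"cpu": 0.2, "kw_read_s": 0.12}\n'
    '{"cpu": 1.3, "kw_read_s": 0.4}\n'
)
records = list(read_json_lines(stream))
```

With the array form a consumer generally has to see the closing `]` before it knows the input is complete, whereas with the line-per-record form each record can be handled as soon as its line arrives.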


The OP specifically advocates against your proposed pattern and points out why nesting isn’t ideal. Also, JSON doesn’t support fixed-point numbers, so JSON.stringify(JSON.parse(input)) != input even if your notion of equality ignores formatting.
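The round-trip point can be illustrated in Python, with the caveat that this shows what one common parser does rather than something the JSON grammar itself requires. The input string is made up; it reuses the field names from upthread with trailing zeros added.

```python
import json
from decimal import Decimal

text = '{"cpu": 0.20, "kw_read_s": 0.120}'

# Default behaviour: numbers become binary floats, so trailing zeros are lost.
round_tripped = json.dumps(json.loads(text))

# Opt-in alternative: keep the decimal digits exactly as written.
exact = json.loads(text, parse_float=Decimal)
cpu_as_written = str(exact["cpu"])
```

Here `round_tripped` comes back as `{"cpu": 0.2, "kw_read_s": 0.12}`, which differs from the input, while `cpu_as_written` is still `0.20`. So the digits can be preserved, but only if every program in the pipeline opts in, and `json.dumps` will not serialize `Decimal` values without extra help.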


You're not wrong, but sometimes simple formats aren't as simple as they look. It sounds like what you want is basically a format like /etc/os-release:

    NAME="Rocky Linux"
    VERSION="8.5 (Green Obsidian)"
    ID="rocky"
    ID_LIKE="rhel centos fedora"
    VERSION_ID="8.5"
    PLATFORM_ID="platform:el8"
    PRETTY_NAME="Rocky Linux 8.5 (Green Obsidian)"
    ANSI_COLOR="0;32"
    CPE_NAME="cpe:/o:rocky:rocky:8.5:GA"
    HOME_URL="https://rockylinux.org/"
    BUG_REPORT_URL="https://bugs.rockylinux.org/"
    ROCKY_SUPPORT_PRODUCT="Rocky Linux"
    ROCKY_SUPPORT_PRODUCT_VERSION="8"
On the surface this seems great, but those quotation marks are kind of annoying. Is there an escape syntax for when the value itself includes quotes, e.g. VERSION="8.5 (Green \"Aqua\" Obsidian)"? Can you embed newlines between the quotes? Who knows... Thankfully, JSON has a simple spec.
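As a rough illustration of how much hides behind those quotes, here is a sketch that parses this kind of file in Python by treating each line as a shell-style assignment. That treatment is an assumption (as far as I know the os-release man page describes shell-like quoting, but check it before relying on this), and `parse_os_release` is a hypothetical helper, not a standard function. It works line by line, so it would not cope with a newline embedded inside quotes.

```python
import shlex


def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=value lines, approximating shell quoting rules with shlex."""
    result: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # shlex.split strips the quotes and resolves backslash escapes;
        # a well-formed assignment comes back as exactly one token.
        tokens = shlex.split(line)
        if len(tokens) != 1 or "=" not in tokens[0]:
            raise ValueError(f"not a simple assignment: {raw!r}")
        key, _, value = tokens[0].partition("=")
        result[key] = value
    return result


sample = "\n".join([
    r'ID="rocky"',
    r'VERSION="8.5 (Green \"Aqua\" Obsidian)"',
    "VERSION_ID=8.5",
])
parsed = parse_os_release(sample)
```

Under these assumptions `parsed["VERSION"]` is `8.5 (Green "Aqua" Obsidian)`. On Python 3.10 and later, `platform.freedesktop_os_release()` reads the real file for you, which is probably the better option when it is available.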



