Show HN: I wrote a program to convert lines of text into trees

petethepig · on March 29, 2021

Maybe this is not super relevant, but my favorite hack is that any tree-like structure like this can be browsed with ncdu. Here's a gist for breaking down redis traffic by command for example: https://gist.github.com/petethepig/0f33c910fb2edad8969a5775e...

loevborg · on March 29, 2021

Whoa, I didn't know that! This is super useful as a general-purpose tree viewer!

For the record, the relevant commands are

    ncdu -o /tmp/files.json

and

    ncdu -f /tmp/files.json

rakoo · on March 29, 2021

Neat! Is there some documentation on the format?

mellosouls · on March 29, 2021

https://dev.yorhel.nl/ncdu/jsonfmt
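
For anyone who wants to try the ncdu trick without reading the whole format page first, here is a minimal sketch of turning "path plus count" data into an ncdu-style export. The layout is my reading of the format page linked above (a directory is a list whose first element describes the directory, a leaf is a plain object), so please check it against that page. The names `to_ncdu` and `lines2ncdu` are made up for illustration, and stuffing a count into `asize`/`dsize` is just the hack described upthread.

```python
import json
import time


def build(counts):
    """Fold {"a/b/c": weight} entries into a nested node structure."""
    root = {"weight": 0, "children": {}}
    for path, weight in counts.items():
        node = root
        for part in filter(None, path.split("/")):
            node = node["children"].setdefault(
                part, {"weight": 0, "children": {}}
            )
        node["weight"] += weight
    return root


def emit(name, node):
    """Leaf -> info object; directory -> [info object, child, child, ...]."""
    info = {"name": name, "asize": node["weight"], "dsize": node["weight"]}
    if not node["children"]:
        return info
    return [info] + [emit(n, c) for n, c in sorted(node["children"].items())]


def to_ncdu(counts, root_name="/", timestamp=None):
    """Wrap the tree as [major, minor, metadata, root directory]."""
    header = {
        "progname": "lines2ncdu",  # hypothetical program name
        "progver": "0.1",
        "timestamp": int(time.time()) if timestamp is None else timestamp,
    }
    root = emit(root_name, build(counts))
    if isinstance(root, dict):  # the top-level item has to be a directory
        root = [root]
    return [1, 0, header, root]


if __name__ == "__main__":
    demo = {"GET/user": 3, "GET/session": 2, "SET/user": 1}
    print(json.dumps(to_ncdu(demo)))
```

Redirect the output to a file and open it with the `ncdu -f` command quoted above.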

birchb · on March 29, 2021

Author here: Often I have to digest log files and lists of Azure resource names. I prefer to work with hierarchies of things, so I wrote this simple filter. Turns out to be quite handy, especially when combined with awk and its friends.

jarmitage · on March 29, 2021

Great, very handy! Just FYI `readlink -f` doesn't work on macOS

May I suggest the name txtree?

birchb · on March 30, 2021

Sorry macOS people, I don't own an Apple. (Not strictly true, I have an iMac rescued from a dumpster, but it's running Ubuntu now). Apparently greadlink is available https://stackoverflow.com/a/4031502 for you.

jedberg · on March 29, 2021

Very cool tool! Any chance that you're going to add it to Homebrew?

eevilspock · on March 30, 2021

https://github.com/birchb1024/frangipanni/issues/2

birchb · on April 6, 2021

Mac binary https://github.com/birchb1024/frangipanni/releases/download/...

birchb · on March 30, 2021

I refer you to the answer I gave earlier.

inadequatespace · on April 6, 2021

Ah yes, I was just thinking that this could be useful for `aws s3 ls` and similar.

(Really I just want `tree` on `s3`. There's `s3-tree` but that uses the entire bucket rather than a prefix)

clankyclanker · on March 30, 2021

I love this as a post-processor for log-files. We actually have something like this at work. Unfortunately, it's applied to log-files by default. It's terrible: it breaks grep, and I haven't been able to figure a way around that.

Have you found any solutions for grepping on matching lines? Presumably, one could use awk, but awk takes a lot of cycles.

birchb · on March 30, 2021

Grepping the _input_ is, like, Standard. Knowhatimean bruv?

mklein994 · on March 29, 2021

Neat. I'll add this to my toolbox.

Somewhat unrelated: I discovered some time ago that the column command (from util-linux) can print trees of hierarchical data (up to 2 levels deep).

From the man page:

  $ echo -e '1 0 A\n2 1 AA\n3 1 AB\n4 2 AAA\n5 2 AAB' | column --tree-id 1 --tree-parent 2 --tree 3
  1  0  A
  2  1  |-AA
  4  2  | |-AAA
  5  2  | `-AAB
  3  1  `-AB

column(1): https://github.com/karelzak/util-linux/blob/master/text-util...

birchb · on March 30, 2021

I did not know that. But then it's BSD, right? http://harmful.cat-v.org/cat-v/.

breck · on March 29, 2021

Awesome. You really nailed it. In my experience the output for spreadsheets turns out to be key so good job highlighting that (https://github.com/birchb1024/frangipanni#output-for-spreads...).

One suggestion: it may help to generalize the newline as the node separator. You may already be doing this (my Go is rusty) but instead of https://github.com/birchb1024/frangipanni/blob/7543b4ee15ae7... be able to override the "newline" as the node separator, like you've done with the "spacer" param.

What do you do when the same line is encountered?

birchb · on March 30, 2021

OK good idea, I was coming round to that myself. https://github.com/birchb1024/frangipanni/issues/5

shok3001 · on March 29, 2021

really cool! But I think you should have called it "birch"! :)

Any way you could add build instructions to the README?

Y_Y · on March 30, 2021

I also wanted to vote for "birch". Apparently "frangipanni" is a nice flower, but it reminds me of an unpleasant (imho) German dessert ("frangipane").

birchb · on March 30, 2021

Sure. https://github.com/birchb1024/frangipanni/issues/6

birchb · on April 6, 2021

AnonHP · on March 29, 2021

This is really very nice and useful! I read through the examples and thought about something that would be a good addition and then see that “-skip” was just implemented! I can skip (pun intended) using an additional layer of cut or awk because of this.

The detailed examples are great too, showing different use cases and features.

Thank you very much for creating this tool and sharing it.

softwaredoug · on March 29, 2021

TIL from the README, you can insert `img` tags into a markdown file, and github will honor the align tag, etc

eevilspock · on March 29, 2021

I love what it does for log files. Abbreviated example from the README:

    May 10 03:17:06 localhost systemd: Removed slice User Slice of root.
    May 10 03:17:06 localhost systemd: Stopping User Slice of root.

becomes:

    May 10
     03:17:06 localhost systemd
      : Removed slice User Slice of root
      : Stopping User Slice of root
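
To make the folding idea concrete, here is a rough sketch of one way to get this kind of output. It is not frangipanni's actual algorithm: it assumes tokens are split on whitespace only (the real tool also breaks on punctuation, so the README output above looks different), and the names `build_trie` and `render` are mine. The lines go into a prefix tree, and any chain of single-child nodes is then printed as one row.

```python
def build_trie(lines):
    """Insert each line into a nested dict keyed by whitespace-separated token."""
    root = {}
    for line in lines:
        node = root
        for token in line.split():
            node = node.setdefault(token, {})
    return root


def render(node, depth=0, out=None):
    """Return indented rows, folding chains of single-child nodes into one row.

    Simplification: no end-of-line marker is stored, so a line that is a
    strict prefix of another line is absorbed into the longer one.
    """
    if out is None:
        out = []
    for token, child in node.items():
        label = [token]
        while len(child) == 1:  # only one continuation: keep it on this row
            (next_token, grandchild), = child.items()
            label.append(next_token)
            child = grandchild
        out.append(" " * depth + " ".join(label))
        render(child, depth + 1, out)
    return out


if __name__ == "__main__":
    log = [
        "May 10 03:17:06 localhost systemd: Removed slice User Slice of root.",
        "May 10 03:17:06 localhost systemd: Stopping User Slice of root.",
    ]
    print("\n".join(render(build_trie(log))))
```

With just those two lines the shared prefix collapses into a single parent row and the two messages hang off it as children.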

quietbritishjim · on March 30, 2021

Definitely very interesting! But I think I'd often end up stuck in the middle of a file with no context about the time of the current line, and struggling to scroll up to capture the parent without shooting past it.

Maybe that could be resolved if it were combined with some sort of GUI / TUI that collapsed subtrees by default (maybe with child count and total ancestor count), e.g. folding by indentation in Vim / Emacs.

timonoko · on March 30, 2021

Seen this about 50 years ago. It was a method of saving expensive Teletype ink ribbons. Instead of printing the same repeating messages on the mainframe log, it printed only those parts that were different from the previous line.

And of course an empty line was "ditto". Except this wasted paper and was later replaced with "...".

oandrew · on March 30, 2021

There is also

  tree --fromfile

e.g.

  echo a/b1 a/b2/c | xargs -n1 | tree --fromfile

  .
  └── a
      ├── b1
      └── b2
          └── c

 2 directories, 2 files

mdeck_ · on March 29, 2021

In retrospect, it’s not clear to me why I somehow expected this project to be something more literally arboreal. I even got my hopes up further when I clicked the link and saw that photo of plumerias.

Yet, this project is cool enough that I’m not even disappointed.

birchb · on March 30, 2021

Um you can get actual trees with graphviz, or my favourite, YeD. I will add an example of that. https://github.com/birchb1024/frangipanni/issues/7

motohagiography · on March 29, 2021

This looks like what I always wanted for normalizing data sets and grepping logs but couldn't articulate. Thank you!

kevinmgranger · on March 29, 2021

I've been thinking about making something like this for a while, this is great!

I always thought it was weird that "do one thing and one thing well" stopped short of dealing with tree representations on the commandline.

globular-toast · on March 29, 2021

What do you mean? The only reason this didn't exist is nobody did it yet. It doesn't contradict "do one thing and do it well".

macintux · on March 29, 2021

For me, the problem isn’t that it doesn’t do one thing well, which it clearly does, it’s that there are no other tools to process trees.

So it fits well with the UNIX ethos of a small tool doing one thing well, but not so much the concept of pipelines, which is closely related.

Obviously not this tool’s fault.

eevilspock · on March 30, 2021

It does take advantage of pipelines, as long as it is on the end. You can use awk, sed, cut, sort, grep and a host of other tools to massage the data before it gets put into graph form.

To become the front-end of a pipe requires that its or some other hierarchical format become an effective standard shared by a number of tools. It can happen.

amelius · on March 29, 2021

Looks nice. Perhaps a next step could be an ncurses program that allows you to fold/unfold the trees at arbitrary places, and select entries to reveal their full path (useful for copy+paste).

roydivision · on March 29, 2021

I find I think about things a lot in tree structures, documentation, todo lists, technical information. I’m really keen to give this a go in my work. Thanks!

densekernel · on March 29, 2021

What a beautiful idea. Like the application of quick analysis of ls or logs. Although now we are typically outputting JSON and collecting for Kibana.

birchb · on March 30, 2021

did you try the '-format json' option?

kazinator · on March 29, 2021

  $ find /etc/network | ./frangi.tl
  etc:
      network:
          if-post-down.d:
              wireless-tools wpasupplicant avahi-daemon 
          if-down.d:
              resolvconf wpasupplicant avahi-autoipd 
          interfaces.d interfaces if-pre-up.d:
              wireless-tools wpasupplicant ethtool 
          if-up.d:
              ntpdate wpasupplicant 000resolvconf openssh-server ethtool avahi-autoipd slrn avahi-daemon 

  $ find /etc/network | ./frangi-cheat.tl 
  /etc/network
              /if-post-down.d
                             /wireless-tools
                             /wpasupplicant
                             /avahi-daemon
              /if-down.d
                        /resolvconf
                        /wpasupplicant
                        /avahi-autoipd
              /interfaces.d
              /interfaces
              /if-pre-up.d
                          /wireless-tools
                          /wpasupplicant
                          /ethtool
              /if-up.d
                      /ntpdate
                      /wpasupplicant
                      /000resolvconf
                      /openssh-server
                      /ethtool
                      /avahi-autoipd
                      /slrn
                      /avahi-daemon

  $ cat frangi-cheat.tl
  #!/usr/bin/env txr
  (let (old-path)
    (whilet ((line (get-line)))
      (whenlet ((path (tok #/[^\/]*/ line))
                (canon `@{path "/"}`)
                (pos (mismatch path old-path)))
        (let ((cpos (max 0 (+ pos -1 [sum [path 0..pos] len]))))
          (put-line `@{"" cpos}@{canon [cpos..:]}`))
        (set old-path path))))


  $ cat frangi.tl
  #!/usr/bin/env txr

  (defstruct (node name) list-builder
    name

    (:method equal (me) me.name)

    (:method ensure-child (me child-name)
      (let ((children me.(get)))
        (or (find child-name me.(get))
            (let ((new-child (new (node child-name))))
              me.(add new-child)
              new-child))))

    (:method print (me stream : pretty-p)
      (let* ((old-im (set-indent-mode stream indent-code))
             (old-id (get-indent stream))
             (children me.(get))
             (is-bottom (none children .(get))))
        (unwind-protect
          (cond
            (children
              (put-line `@{me.name}:` stream)
              (set-indent stream (+ old-id 4))
              [mapdo (op print @1 stream) children]
              (when is-bottom
                (put-char #\newline stream)))
            (t (put-string `@{me.name} ` stream)))
          (set-indent-mode stream old-im)
          (set-indent stream old-id)))))

  (let ((supernode (new (node :root))))
    (whilet ((line (get-line)))
      (let ((path (tok #/[^\/]+/ line))
            (node supernode))
        (each ((comp path))
          (set node node.(ensure-child comp)))))
    (each ((top-child supernode.(get)))
      (pprinl top-child)))

kazinator · on March 29, 2021

C version. Maybe this logic should be built into GNU find as an option!

  $ find /etc/network | ./frangi-cheat 
  /etc/network
              /if-post-down.d
                             /wireless-tools
                             /wpasupplicant
                             /avahi-daemon
              /if-down.d
                        /resolvconf
                        /wpasupplicant
                        /avahi-autoipd
              /interfaces.d
              /interfaces
              /if-pre-up.d
                          /wireless-tools
                          /wpasupplicant
                          /ethtool
              /if-up.d
                      /ntpdate
                      /wpasupplicant
                      /000resolvconf
                      /openssh-server
                      /ethtool
                      /avahi-autoipd
                      /slrn
                      /avahi-daemon

  $ cat frangi-cheat.c
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(void)
  {
    char old_line[FILENAME_MAX] = "", line[FILENAME_MAX];

    while (fgets(line, sizeof line, stdin)) {
      char *p = line, *o = old_line, *nl = strchr(line, '\n');

      if (nl)
        *nl = 0;

      while (*p && *o) {
        char *op = p;

        if (*o == '/' && *p == '/')
          o++, p++;

        size_t lp = strcspn(p, "/");
        size_t lo = strcspn(o, "/");

        if (lp == lo && !strncmp(p, o, lp)) {
          p += lp;
          o += lp;
          printf("%*s", (int) (p - op), "");
          continue;
        }

        p = op;
        break;
      }

      puts(p);
      strcpy(old_line, line);
    }

    return feof(stdin) ? EXIT_SUCCESS : EXIT_FAILURE;
  }

Note: yes, we could swap pointers between two buffers instead of strcpy.

kazinator · on March 30, 2021

New version:

- swap pointers to flip buffers instead of strcpy

- handle corner case of directory printed after contents (find -depth) by avoiding printing nothing but spaces, or nothing but spaces followed by a slash

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(void)
  {
    typedef char buf_t[FILENAME_MAX];
    buf_t buf[2] = { "" };
    buf_t *pline = &buf[0], *line = &buf[1];

    while (fgets(*line, sizeof *line, stdin)) {
      buf_t *tmp;
      char *l = *line, *p = *pline, *nl = strchr(l, '\n');

      if (nl)
        *nl = 0;

      while (*l && *p) {
        char *op = l;

        if (*p == '/' && *l == '/')
          p++, l++;

        size_t lp = strcspn(l, "/");
        size_t lo = strcspn(p, "/");

        if (lp == lo && !strncmp(l, p, lp)) {
          l += lp;
          p += lp;
          continue;
        }

        l = op;
        break;
      }

      if (l[0] && (l[1] || l[0] != '/'))
        printf("%*s%s\n", (int) (l - *line), "", l);
      else
        puts(*line);

      tmp = pline, pline = line, line = tmp;
    }

    return feof(stdin) ? EXIT_SUCCESS : EXIT_FAILURE;
  }

birchb · on March 30, 2021

I love your enthusiasm, and thanks for the code. But 1) the above only works with sorted input and 2) adding it to find would be like adding '-v' to 'cat', which as we know is considered harmful. http://harmful.cat-v.org/cat-v/

kazinator · on March 30, 2021

Actually, what it works with is a partially sorted input: an input in some tree order, which is what any recursive file system traversal will always put out. It will work with depth-first or breadth-first order, with usefully different results, and that order preserved.

All it is doing is hiding the redundancy with spaces to improve the human readability of find's output.

Here it is on the same data in breadth-first. Note the lack of any lexicographic sort: "interfaces" is flanked by "if-down.d" and "if-pre-up.d":

  /etc/network
              /if-post-down.d
              /if-down.d
              /interfaces.d
              /interfaces
              /if-pre-up.d
              /if-up.d
              /if-post-down.d/wireless-tools
                             /wpasupplicant
                             /avahi-daemon
              /if-down.d/resolvconf
                        /wpasupplicant
                        /avahi-autoipd
              /if-pre-up.d/wireless-tools
                          /wpasupplicant
                          /ethtool
              /if-up.d/ntpdate
                      /wpasupplicant
                      /000resolvconf
                      /openssh-server
                      /ethtool
                      /avahi-autoipd
                      /slrn
                      /avahi-daemon

My first program in TXR Lisp in the grandparent comment builds the tree structure from the paths in any order and then prints that, so the paths could be scrambled into random order, yet it will recover the tree structure.

However, not munging the output in that way and just tweaking the original output for readability has some advantages.
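
For comparison, here is a small sketch of the order-independent approach in Python rather than TXR Lisp. It only illustrates the idea and is not a port of the program above; `build_tree` and `format_tree` are names picked for this example. Paths are inserted into a nested dict, so the input order stops mattering, and the price is that the printer has to choose its own order (sorted here), which throws away whatever order the traversal produced.

```python
import random


def build_tree(paths):
    """Insert slash-separated paths into a nested dict, in any order."""
    tree = {}
    for path in paths:
        node = tree
        for part in (p for p in path.split("/") if p):
            node = node.setdefault(part, {})
    return tree


def format_tree(tree, indent=0):
    """Return one row per node, children indented by four spaces, sorted by name."""
    rows = []
    for name in sorted(tree):
        rows.append(" " * indent + name)
        rows.extend(format_tree(tree[name], indent + 4))
    return rows


if __name__ == "__main__":
    paths = [
        "/etc/network",
        "/etc/network/if-up.d",
        "/etc/network/if-up.d/ntpdate",
        "/etc/network/interfaces",
    ]
    random.shuffle(paths)  # scramble the input; the printed tree stays the same
    print("\n".join(format_tree(build_tree(paths))))
```

The prefix-hiding filters never hold more than two lines in memory and keep find's order, while this version has to read everything before it can print anything.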

shipit · on March 29, 2021

Exceptional!

I am a heavy user of `find <> | xargs grep` -- this makes my life so much sweeter. Thank you @birchb!

jedberg · on March 29, 2021

It's like a super-enhanced tree command! Very cool!

kevmoo1 · on March 29, 2021

Super cool, yo! Very unix – do one simple thing well.

lnenad · on March 29, 2021

This looks awesome, great job, thanks for sharing.

dvirsky · on March 29, 2021

This is a really wonderful idea! Thanks, OP.

rurban · on March 29, 2021

[flagged]

dang · on March 29, 2021

You've been breaking the site guidelines badly. If you keep doing this we will have to ban you.

Please review https://news.ycombinator.com/newsguidelines.html and stick to the rules from now on.

rurban · on March 30, 2021

Yes, sorry

birchb · on March 30, 2021

No worries mate. Your point is basically 'What a fuss over a very simple bit of code.'? I'm as surprised as you are.

macintux · on March 29, 2021

> Be kind. Don't be snarky. Have curious conversation; don't cross-examine. Please don't fulminate. Please don't sneer, including at the rest of the community.

https://news.ycombinator.com/newsguidelines.html