Johnny Decimal (johnnydecimal.com)
387 points by ralgozino on June 13, 2023 | 193 comments



If only life were so simple that it could be enclosed in a series of two-digit categories.

The problem with such a strongly hierarchical system is that it fails if there is some document, note, picture, etc. that would be useful to keep in multiple locations. Obviously we can introduce links between objects, but I believe tags are more comfortable to use.

Hierarchical systems and folders are artifacts of the physical world, in which a single object (a tool, pipe, screw, or book) cannot be in two places at the same time. In the abstract world of computers a note about a new game could be in #games, #fun, #to-check, #interesting-ideas, #great-graphics, etc.


Personally - I've come to the absolute opposite opinion. To be overly blunt:

"Tags fucking suck."

They are literally the worst possible way to store and organize your information, and they are only useful when you just want a random sampling of a category - not a specific document or piece of information. Ex: Great for social media or looking at old photos or just playing a song from a genre you like, bad (fucking terrible) for organization and structure.

---

Hierarchical structures have downsides, but the exact thing you complain about (artifacts of the physical world) is exactly their strength... You have a body that is adapted to the physical world - routing and navigation through a series of ordered steps is a VERY well developed human skill. We are primed to be able to remember things like:

- Go left at the tree,

- Straight until you hit road

- Right at the road

- continue until you hit a red house with a big garden

- etc...

That skill set maps directly into the hierarchical system of folders:

- Find the "documents" folder on the desktop

- scroll down to "my super sweet project"

- open that folder

- Find the "icons" folder

- open it and double click "exactly_the_thing_you_wanted.jpg"

------

You can absolutely still make horrible, unorganized messes - but if done well (ex: this article is actually a fairly good system) it's a much, much better system than tags.


Your example about navigating roads has nothing to do with hierarchy. And, in fact, most road networks are not hierarchical and the interconnectedness is their strength:

https://en.wikipedia.org/wiki/A_City_Is_Not_a_Tree

Your brain doesn't organize information hierarchically. Let's say I ask you:

1. Name a band that starts with "B".

2. Name a band from England.

3. Name a rock band.

If your brain stored bands in a hierarchy, you'd only be able to come up with "The Beatles" as an answer for one of those questions. You'd have to figure out whether to categorize the Beatles by name, location, or genre and it would be absent from the other categories.
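A minimal Python sketch of the point above, using a few illustrative bands and attribute names that are my own assumptions rather than anything from the thread. A strict tree has to pick one attribute to file by, while a tag index can answer all three questions for the same band.

```python
from collections import defaultdict

# Illustrative data only: a handful of bands with a few attributes each.
BANDS = {
    "The Beatles": {"starts-with-B", "england", "rock"},
    "The Beach Boys": {"starts-with-B", "usa", "rock"},
    "Radiohead": {"england", "rock"},
    "Kraftwerk": {"germany", "electronic"},
}
GENRES = {"rock", "electronic"}


def file_by_genre(bands):
    """Strict hierarchy: every band lives in exactly one genre folder."""
    tree = defaultdict(set)
    for name, attrs in bands.items():
        (genre,) = attrs & GENRES  # exactly one genre per band in this data
        tree[genre].add(name)
    return dict(tree)


def build_tag_index(bands):
    """Tag index: every attribute points back at every band carrying it."""
    index = defaultdict(set)
    for name, attrs in bands.items():
        for attr in attrs:
            index[attr].add(name)
    return dict(index)


tree = file_by_genre(BANDS)
index = build_tag_index(BANDS)

print(sorted(tree))               # folders exist only for the chosen attribute
print(sorted(index["england"]))   # the index can be asked about any attribute
```

With the tree, "a band from England" needs a scan of every folder; with the index it is a single lookup.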


Or you'd have to do an inefficient search in order to find something that matched, which would be slow, but not impossible.

Or you'd have to maintain several redundant hierarchies.

(I agree with you that our subjective experience and speed in thinking of things is evidence that we probably don't mentally represent things this way.)


Strong disagree. https://blog.oup.com/2016/11/evolution-human-memory/ https://www.scientificamerican.com/article/how-gps-weakens-m... Navigation memory is the most core type of memory- most other forms of memory evolved later. There’s a reason why GPS usage is correlated with dementia. Human memory actually evolved out of a sense of navigation.

I strongly agree with the commentator who likened the hierarchal folder structure to the physical world, it’s a much more direct mapping of how human memory actually works.

Humans aren’t actually magical AI computers of energy floating in midair, they’re made of physical meat. Even if some abstract concepts (like tags) may make more theoretical sense (I agree with people who say that certain things can be classified in 2 different locations), it may not play to the actual structure and advantages of the human brain.


> I strongly agree with the commentator who likened the hierarchal folder structure to the physical world, it’s a much more direct mapping of how human memory actually works.

But the physical world isn't hierarchical at all. It's spatial. It's much more like a graph than a tree where there are usually multiple paths between any two points.

If you have to pick up your kid from school and stop at the grocery store for milk on the way home from work, you probably do not:

1. Drive to school and get kid.

2. Drive back to work.

3. Drive to grocery store to get milk.

4. Drive back to work.

5. Drive home from work.

Or:

1. Drive to school and get kid.

2. Drive to grocery store to get milk.

3. Drive back to school.

4. Drive back to work.

5. Drive home from work.

If the physical world was hierarchical, all navigation through multiple waypoints would look like this kind of stack pushing and popping.


I'm telling you that all navigation through multiple waypoints DOES usually look like this kind of pushing and popping (just on a massive scale).

So here's a possible day for me:

I work at corporate office A, it's near the highway entrance. I have to pick up my kid - they are at school down the local street heading west. I travel west and pick up my child.

Now I need milk. The closest grocery is back east, just past my office, so I drive back by my office and pull into the grocery.

Then I load up and set off for home. To get there, I need to take the highway to the north, so I head back past my office on that same street and get on the highway using the closest entrance.

I take the highway until I'm home.

---

That sure seems like a normal day to me. It's exactly what you said folks would never do, but it's super common. And it's hardly something the modern world introduced with cars - there's a cost function to travelling anywhere in the world, and people like to connect using low cost paths - which tends to model a folder hierarchy.


Sure, some routes end up being tree-like, because trees are a subset of graphs. But just as often you see waypoints like:

1. Leave the office.

2. Drive to the grocery store.

3. Drive to school.

4. Drive home.

Where there is no backtracking between them.

> And it's hardly something the modern world introduced with cars - there's a cost function to travelling anywhere in the world, and people like to connect using low cost paths - which tends to model a folder hierarchy.

A tree doesn't minimize the cost for any given trip or for the aggregate cost of all trips between pairs of points. Because a tree has only a single path between any two points, it has the highest possible aggregate trip cost for all possible trips while still being connected.

What it does minimize is the cost of building and maintaining the paths. Since there is only a single path between any pair of points, it has the fewest redundant edges. If you were tasked with building a road network for a country and your sole goal was to minimize the amount of concrete used, you'd build a tree.

If your only goal was to minimize the aggregate distance all travellers took, you'd build a fully-connected graph where every pair of destinations has a dedicated road.

In practice, road networks are designed to minimize both road maintenance costs and drive time and balance those opposing forces. The result is more connected than a tree but less connected than a complete graph, something like a semilattice.
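The trade-off described above can be checked on a toy example. This is only a sketch under assumed data: four hypothetical towns on the corners of a unit square, one spanning tree (three sides of the square), and the complete graph. It compares total road length with the summed shortest-path distance over all pairs of towns.

```python
import itertools
import math

# Hypothetical towns on the corners of a unit square.
POINTS = [(0, 0), (1, 0), (1, 1), (0, 1)]


def dist(a, b):
    """Straight-line distance between towns a and b."""
    return math.dist(POINTS[a], POINTS[b])


def road_length(edges):
    """Total length of road that has to be built and maintained."""
    return sum(dist(a, b) for a, b in edges)


def total_trip_cost(edges):
    """Sum of shortest-path distances over all pairs (Floyd-Warshall)."""
    n = len(POINTS)
    d = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for a, b in edges:
        d[a][b] = d[b][a] = dist(a, b)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return sum(d[i][j] for i, j in itertools.combinations(range(n), 2))


tree = [(0, 1), (1, 2), (2, 3)]                        # one spanning tree
complete = list(itertools.combinations(range(4), 2))   # every pair connected

for name, edges in (("tree", tree), ("complete", complete)):
    print(name, round(road_length(edges), 2), round(total_trip_cost(edges), 2))
```

On this layout the tree needs less road but has the larger aggregate trip cost, and the complete graph is the reverse.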


It seems that navigation memory theory should imply not a hierarchical structure, but a wiki-like structure with many links. In a tree, there’s only one path to a given element, which is not the case in the physical world.


> Navigation memory is the most core type of memory- most other forms of memory evolved later. There’s a reason why GPS usage is correlated with dementia. Human memory actually evolved out of a sense of navigation.

That seems very possible, and probably important, but it's hard for me to relate that to the experience (as an "anatomically modern human") of having other kinds of associative memory that are very effective and don't have a discernible spatial or other hierarchical component.


I agree. But are there any better solutions than manually ln -s? I'm in a band, and also manage booking for a venue. I have $venue/poster/$date\ $bands/$posterfile. I also have $band/poster/$date\ $venue

I don't know of any system that lets a single poster be in multiple places at the same time.


If you want to model this using your filesystem, that's exactly why symlinks (shortcuts on Windows and Mac) were invented.

On Mac, you can write tags on files and then use Spotlight to search for them. Pick one (more or less arbitrary) primary category to use as the directory for the file, then write tags for the other ways you want to be able to search for it.
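For the poster question, here is a hedged Python sketch of the symlink approach: the file is stored once under the venue and linked from the band's folder. The function name `file_poster` and the example names are hypothetical, and the layout only loosely follows the `$venue/poster/$date $bands` scheme described earlier. Creating symlinks on Windows may require extra privileges.

```python
import os
import tempfile
from pathlib import Path


def file_poster(root, venue, band, date, poster_name, data=b""):
    """Store the poster under the venue and symlink it under the band."""
    real_dir = root / venue / "poster" / f"{date} {band}"
    real_dir.mkdir(parents=True, exist_ok=True)
    real_file = real_dir / poster_name
    real_file.write_bytes(data)

    link_dir = root / band / "poster" / f"{date} {venue}"
    link_dir.mkdir(parents=True, exist_ok=True)
    link = link_dir / poster_name
    # A relative target keeps the link valid if the whole tree is moved.
    link.symlink_to(os.path.relpath(real_file, link_dir))
    return real_file, link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        real, link = file_poster(
            Path(tmp), "Example Hall", "Example Band", "2023-06-13", "poster.pdf"
        )
        print(link, "->", os.readlink(link))
```

The choice of which side holds the real file is arbitrary; the other side only holds a pointer, so deleting the real file leaves a dangling link.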


Tags are superior because tags can model hierarchies, but hierarchies cannot model tags. There are far too many times when a single document crosses multiple categories that are served by tags. I used Outlook for 15+ years and thought tags were a joke, then moved to GSuite for 13 years and learned to use tags, now I'm back on Outlook and I feel like I'm suffocating without them. That's two decades of experience with both systems. Not to make a fallacy / whizzing contest out of this, but how long have you tried both systems? I'm guessing not as long.


> Tags are superior because tags can model hierarchies

Tags are inferior because tags must be coerced into hierarchies.

Tags are inferior because they do not properly link hierarchies that they model without extensive software support (which is present for file directories by design, and absent for tags). I have yet to see a hierarchical tagging scheme work well when you need to do something like change a mid-level directory name (you end up having to re-write many tags, often without good software support for what you're trying to do)

Tags themselves are fine. It's a perfectly valid way to label data. It is not a good way to organize that data for human recall and reference.
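To make the renaming complaint concrete, here is a small sketch that assumes hierarchical tags are stored as slash-separated strings (an assumption; real tagging tools differ). Renaming a mid-level tag means rewriting every tag that sits under it, whereas renaming a directory is a single operation on the filesystem.

```python
def rename_tag_prefix(items, old, new):
    """Rewrite every tag equal to `old` or nested under `old/`.

    `items` maps a file name to a set of slash-separated tags and is
    updated in place. Returns the number of tags that had to be rewritten.
    """
    rewritten = 0
    for name, tags in items.items():
        updated = set()
        for tag in tags:
            if tag == old or tag.startswith(old + "/"):
                updated.add(new + tag[len(old):])
                rewritten += 1
            else:
                updated.add(tag)
        items[name] = updated
    return rewritten


# Hypothetical files and tags, for illustration only.
files = {
    "a.txt": {"work/clients/acme", "urgent"},
    "b.txt": {"work/clients/globex"},
    "c.txt": {"work/clientside"},
}
count = rename_tag_prefix(files, "work/clients", "work/customers")
print(count, files)
```

The prefix check includes the trailing slash so that a sibling such as `work/clientside` is not touched by accident.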


> It is not a good way to organize that data for human recall and reference.

Yet here I am: using them for recall and reference faster than hierarchies (after 30+ years of using both).


And here I am, using Johnny Decimal for over five years and I can find everything all the time. As Johnny himself said below, if it doesn't work for you - that's cool - use something else. But your assertion that this can't work is not correct. It's just that it can't work for YOU.


Hierarchies are better because they form a natural hypertext.

I'm in my documents folder. I see a list of all the categories of stuff I have. Whatever I'm looking for, it's in one of them. I go into a folder, and I see all the categories in that folder and none of the stuff outside of it. I've narrowed my focus and increased my depth. I can browse.

Sure, tags are more flexible, but (1) I find I almost never actually need them, because in most cases a hierarchy is good enough, and (2) tags don't function as a hypertext and won't let me explore. A big list of tags is much harder to dig through than nested folders.

Granted, it doesn't stop at tags or hierarchy. You can use both—on top of which, there are hierarchical tags, soft links, hard links, and even textual hyperlinks. But out of all of these, I find hierarchy to be the most important one. Given the choice among all of them, I always start with hierarchy and I typically find I don't need anything else.


Pretty sure Categories is what you're talking about for Outlook.


I’ve thought a bit about tags++, that is adding some logical and not-so-logical features to them.

For instance there are ideas from OWL where you could define a category in terms of other categories and their attributes; for instance, tag D could be the union of tag A and tag B and the complement of tag C.

Implication is also useful both as a way to implement subclassing but also containment relationships. For instance on Danbooru a character that has several forms would have the various forms of the character imply that character and the character would imply the media property that the character comes from.
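Both ideas can be sketched with plain Python sets. This is only an illustration: the reading of "union of A and B and the complement of C" as (A ∪ B) minus C is one interpretation, and the tag names in `IMPLIES` are hypothetical, not taken from any real site.

```python
def derived_tag(a, b, c):
    """Items tagged A or B, excluding anything tagged C: (A | B) - C."""
    return (a | b) - c


# Hypothetical implication rules: a form implies its character,
# and the character implies the series it comes from.
IMPLIES = {
    "character-form-x": {"character"},
    "character": {"series"},
}


def expand(tags, implies):
    """Close a set of tags under the implication rules."""
    result = set(tags)
    stack = list(tags)
    while stack:
        tag = stack.pop()
        for implied in implies.get(tag, ()):
            if implied not in result:
                result.add(implied)
                stack.append(implied)
    return result


print(derived_tag({1, 2}, {2, 3}, {3, 4}))
print(expand({"character-form-x"}, IMPLIES))
```

Because `expand` tracks what it has already added, it also terminates if the rules accidentally contain a cycle.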

I am looking at what a tagging system looks like in the transformer age, and one key idea is a kind of three-valued logic around tags, which can be in a “positive”, “indeterminate” or “negative” state. If you are training a machine learning system to auto-tag you will need (1) a number of examples where a tag does not apply (the tag not being applied is not evidence that the tag doesn’t apply; poor coverage of negative examples is one reason why YouTube recommendation is worse than TikTok) and (2) a way to deal with cases where the ML model tags something incorrectly. If the model tagging something puts it in an indeterminate state, and that result can later be switched to negative or positive, that is a great way to manage the situation.
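A rough sketch of how such a three-state tag store might look in Python. The class and method names are hypothetical, and this is one possible shape rather than a description of any existing system. The key property is that a missing entry means "never considered", which is kept distinct from an explicit negative.

```python
from enum import Enum


class TagState(Enum):
    POSITIVE = "positive"
    INDETERMINATE = "indeterminate"
    NEGATIVE = "negative"


class TagStore:
    """Tracks a state per (item, tag); absence means 'never considered'."""

    def __init__(self):
        self._states = {}

    def model_suggests(self, item, tag):
        # A model suggestion never overrides an earlier human decision.
        self._states.setdefault((item, tag), TagState.INDETERMINATE)

    def confirm(self, item, tag):
        self._states[(item, tag)] = TagState.POSITIVE

    def reject(self, item, tag):
        self._states[(item, tag)] = TagState.NEGATIVE

    def state(self, item, tag):
        return self._states.get((item, tag))

    def training_examples(self, tag):
        """Return (positives, negatives); indeterminate items are left out."""
        pos = sorted(i for (i, t), s in self._states.items()
                     if t == tag and s is TagState.POSITIVE)
        neg = sorted(i for (i, t), s in self._states.items()
                     if t == tag and s is TagState.NEGATIVE)
        return pos, neg


store = TagStore()
store.model_suggests("img1", "cat")   # waits for review
store.confirm("img2", "cat")
store.reject("img3", "cat")
store.model_suggests("img3", "cat")   # the human rejection is kept
print(store.training_examples("cat"))
```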


> ideas from OWL

What is OWL? Except for a good lesson in why not to use common and hence impossible to search for words as names for a project.



They used to call the semantic web that OWL is a part of “Web 3.0” which failed to make an impression or was overwritten with the “Web3” moniker for NFT grifts by exceptionally ignorant people.

I learned OWL the hard way: I had been involved with the semantic web for 10+ years on and off and didn’t meet anyone who knew how to do meaningful modeling with OWL until last year, and that even includes famous academics who’ve written books on it.


OWL and RDF interest me immensely, intellectually. I've never been positioned to use either one professionally, but it looks fascinating. Is there a shorter path to successful modeling than the hard way? Is there a good source on this?


RDF is not magic and OWL is… showing its age.

If you are willing to eat the up-front cost of coordinating global resource identification (a daunting task, make no mistake), you get non-trivial dataset integration almost for free. Imagine if concatenating two ginormous JSON documents describing different aspects of the same entity would amount to a useful merge into a single combined JSON. If you Need this with a big N, RDF has no alternative.
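The "merge for free" idea can be illustrated without any RDF library by modeling each dataset as a set of (subject, predicate, object) triples. This is a plain-Python stand-in, not real RDF tooling, and the identifier and predicate names below are hypothetical. Once both datasets agree on the identifier, merging is just set union.

```python
from collections import defaultdict

# Hypothetical shared identifier; real RDF would use IRIs for predicates too.
POSTER = "http://example.org/poster/1"

venue_data = {
    (POSTER, "venue", "Example Hall"),
    (POSTER, "date", "2023-06-13"),
}
band_data = {
    (POSTER, "band", "Example Band"),
    (POSTER, "date", "2023-06-13"),  # same fact, stated by both sources
}

# Merging two triple sets is a union; duplicate facts collapse automatically.
merged = venue_data | band_data


def describe(triples, subject):
    """Collect everything known about one subject as predicate -> objects."""
    out = defaultdict(set)
    for s, p, o in triples:
        if s == subject:
            out[p].add(o)
    return dict(out)


print(describe(merged, POSTER))
```

The hard part the comment points to is not the union itself but getting independent sources to use the same identifier in the first place.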

The rise of SSDs has also more or less obviated the need for clustered indexes as a practical performance consideration. For the small price of trebling your storage footprint, commodity RDF triplestores will index _all_ your attributes/columns without a schema (usually red/black or equiv). Will it scan an integer PK over 100b records as fast as postgres? No. Is that use case in your hot path? Also no (most likely).

Edit: as for OWL, just take the plunge into rule based inference directly. From forward chaining inference (if you want performance and decidability guarantees) all the way up to full blown prolog or [miniKanRen](http://minikanren.org/) (if you want it in a library in your runtime of choice)


I strongly disagree.

Everywhere where you have a lot of stuff to manage (photos, music, videos, documents, links) hierarchies don't work and only tags can tame all the chaos.

The analogy to "path finding" doesn't hold, imho. That's not how our brains organize information! We organize memories by association and not by some hierarchical structures.


there have been many, MANY historical attempts to organize the world's knowledge hierarchically. They have all failed to achieve their goals spectacularly.

some of the most common reasons

- things exist in multiple categories that aren't in the same branch of the tree

- different state of mind during data retrieval means you expect the same item to be in different categories.

- different humans think the same thing belongs in different hierarchical locations

there's also been a LOT of scientific research around informational organization. It all came to the same conclusion. Hierarchies have interesting promises but fail when they meet the practical reality of the human brain.

in the end hierarchical organization of knowledge is a terrible solution except in VERY restricted cases.


Do you have any suggestions of where to start reading on this? A seminal paper or cluster of papers? I want to deep dive on this not just to map out where it doesn't work but also to get a map of the restrictive cases where it does work.

edit: never mind, I just put your quote into gpt-4 and it passed me on to Eleanor Rosch, prototype theory and some other interesting works. I feel like this is my own modern lmgtfy moment.


Tags are great as an adjunct to a thoughtful folder hierarchy, IMHO.

Links are great as part of that too, they can provide shortcuts.

Real-world use: I am an artist, and I have found that the best way to organize my work is with a series of yearly directories. If I begin a large, multi-year project, it goes in a directory within the year I start it; I'll make a link to it that lives next to all the yearly directories.

I also use OSX's tags a ton. Files get marked as 'in progress', 'complete', 'paid for', 'commission', and 'experiment' (and a few other things). When I want to decide what to work on in any particular day it's super easy to open up the saved search for "everything in progress" that I keep on my desktop; this shows me everything in those yearly directories that's marked as 'in progress', whether it's personal work, client work, whether it's part of a large multi-file project with its own folder hierarchy or just a single file in the yearly directory. I also have a saved search for 'commission'+'in progress' for those days when I know I want to work on clearing the commission queue. And whenever I spend some time just fooling around with different effects to create interesting looks, I'll save my scribblings with the 'experiment' tag; when I decide to use it later I can easily tell Illustrator to open a file, and look through the 'experiment' tag to find the file full of some crazy procedural explorations, regardless of how long ago I did it. This habit has saved me hours of digging for that one file where I did that cool trick once.

Trying to organize all the files in my artwork directory with just tags would be a total fucking nightmare, the subdirectory for a multi-year graphic novel has its own folder hierarchy that's several levels deep, and when I know that what I want to work on today is "getting the prepress files together for book 3 of the graphic novel" it's definitely great to be able to just hit the top-level link to the graphic novel directory, then go into "books", then "3", and have its own little file hierarchy in there.

Tags by themselves are not very good for serious organization, but they can be very good for pulling things out of a hierarchical structure. They take work - I have to remember to mark a new file as 'in progress' and possibly a 'commission', though that's become routine, and changing something from 'in progress' to 'complete' is a pleasure. But it's work well worth doing to create a nice little network of shortcuts and secret passages through the terrain of your thoughtfully-laid-out tree of folders.


> You have a body that is adapted to the physical world - routing and navigation through a series of ordered steps is a VERY well developed human skill.

I find that this skill is better utilized with a system that has hyperlinks like Obsidian.

Also purely hierarchical systems break down over time, they can be supported with tags. https://karl-voit.at/2022/01/29/How-to-Use-Tags/

> To my surprise, we tend to think in hierarchical categories all the time. As I have written in my article on Logical Disjunct Categories Don't Work, the real world does not fit into disjunct categories.

> Therefore, we should embrace multi-classification more often. If you do want to learn more about the rationale, you may as well read the first chapters of my PhD thesis or the book "Everything is Miscellaneous" by David Weinberger, just to give you two resources of many.

> Long story short: tagging does take away the burden of finding one single spot in a strict hierarchy of entities which is actually a heavily intertwined network of concepts we do find in the real world. It's far from being a neat hierarchy. Everybody who tries to put "the world" into a strict hierarchy will fail.


The only reason we're even discussing the topic is because search is so poorly implemented in client operating systems. Tags suck, hierarchical structures suck, everything that isn't search sucks. Search still kind of sucks, but it sucks much more because the search available on your own computer for your own files is about thirty years behind the state of the art.


I hope hierarchical folders aren't disallowed sometime in the future - I could see it happening for phones.


Can't have both ... tags and hierarchal?


Yes you just use a wiki with a traditional tree structure and search. I use Obsidian which lets you do just that.


I've done both as well, tagging everything and then assigning the tags into exclusive hierarchical relationships (for discovery purposes and grouping), but it only works to subdivide within an existing noun like "talent" or "wood panels", without a seed noun tags start becoming too abstract and the object with those tags start to lose all semantic cohesion.

I think once you start talking about unbounded universal tagging with hierarchies, they are not compatible, you need search and weighting or intelligent interfaces.

Search and LLMs really are major organizational improvements in our lifetime imo.


hierarchal tagging is the one true path


I think that's the whole point of this system, when you have infinite tags it's impossible to maintain a correct taxonomy, you add #great-graphics to this game, but now you have to backfill it to all other games, or in the future you may miss them.

They created this so the hierarchy is unambiguous (as much as possible), you want a document, you are two steps away from it in an easy to find way.

tag systems have far too much maintenance and adding a new tag is almost impossible to do exhaustively so you have a lot of partial tags.


> you want a document, you are two steps away from it in an easy to find way

This isn't a response to the parent commenter's point, right? They were describing how many projects have items where a resource easily fits within the scope of N different categories, at which point they become max N steps away from it, not max 2 steps.


Two thoughts:

1. This is much, much less likely with the enforced limits on categorization in the post.

2. No - you are still 2 steps away. Make a choice about where that item lives. If it's shared across many categories, maybe you really need a distinct category like "Ambiguous" or "Shared"


> No - you are still 2 steps away. Make a choice about where that item lives.

You misunderstand. The max N steps are at the point of recall, not categorization decision.


This is a great point about tag maintenance *if you have to make the tags yourself*. However, if you have a simple ML system that you can run to categorize your files and pull out good single word descriptors that have a large explained variance over your files, you can run this and check the tags that are constructed.

I think there's a good way forward that uses typical hierarchical Johnny.Decimal filesystems, with an overlay filesystem with tags that can update the tags every so often based on the content in the files. Obviously letting the user have a hand in this via a TUI/gui would be helpful for choosing tags for which they're comfortable.

Unfortunately I haven't settled on a good filesystem with tags (how to do this with ZFS?) or how to interact with it as a network filesystem served to many different OS (cifs with tags?).


It doesn't seem to me like a simple ML system, it needs to be able to extract tags from all kinds of filetypes (video, games, images, assets, text, ...), at a decent speed and then it has to assign tags to what you would also assign, because if it doesn't do that then it's even worse, because you can never find anything as your mapping and the ML mapping would not be the same.


> However, if you have a simple ML system

Or the old-school method, a community of people with tagging powers and a few moderators to do sanity checks.


The #great-graphics problem is not something a category-based system will solve either, as you have the same problem. Nothing will, to be honest; maybe AI eventually, and even it can't do it for everything.

> you want a document, you are two steps away from it in an easy to find way

This is not how people work in general. This kind of thing might be OK for institutions or for taxonomy-like collections.


>Hierarchical systems and folders are artifacts of the physical world, in which a single object (a tool, pipe, screw, or book) cannot be in two places at the same time.

Many think hierarchies come from limits in the physical world but that's not what's happening. Yes, that's some of the cause but does not explain all of it.

The deeper rooted reason is that hierarchies are a convenience to aid the human mind. Even without any limitations of physical shelves, the brain likes to:

- notice the relationships from the general-to-specific and navigate them with spatial cues of dirs parent-->child-->grandchild-->etc

- group related items together -- using spatial cues of moving file icons into a file system folder

The world the blog essay is working in is the OS file system. The various files have to be put somewhere on the file system. Since putting hundreds/thousands of files into a single flat folder is useless, one creates some child subfolders to organize it in some way.

The tagging system assumes a different mechanism (e.g. a separate "database" of tags which filesystems like Microsoft NTFS and Linux ext4 do not have natively.) This happens above the native filesystem. (Incidentally, by placing a file into a subfolder, the name of that folder and the names of parent folders above it act as an "implied set of tags" for free.)

That said, both hierarchical folders and tags solve different needs. Also, hierarchies simulate/approximate "tags" by "virtual folders" and 1-to-n softlinks. Likewise, tagging can simulate "hierarchies" via compound-multi-word-tags.
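The "implied set of tags" remark can be sketched in a few lines of Python: every folder name on a file's path is treated as a tag for that file. The helper names and example paths are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def implied_tags(path):
    """Treat every folder name above a file as a tag on that file."""
    folders = PurePosixPath(path).parent.parts
    return {part for part in folders if part != "/"}


def find_by_tags(paths, wanted):
    """Return the paths whose folder names include all wanted tags."""
    wanted = set(wanted)
    return [p for p in paths if wanted <= implied_tags(p)]


# Hypothetical paths, for illustration only.
PATHS = [
    "documents/my super sweet project/icons/exactly_the_thing_you_wanted.jpg",
    "documents/taxes/2023/receipt.pdf",
]
print(implied_tags(PATHS[0]))
print(find_by_tags(PATHS, ["documents", "icons"]))
```

Unlike free-form tags, these come in a fixed order and each file gets exactly one such set, which is the limitation the tag advocates in the thread are pointing at.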


Your argument seems to come up a fair amount in these discussions. In the end, you have to deal with storage of many items, and you can either browse or search. The browse approach requires you to know where you'll be browsing in the future. The searching approach requires you to know what to search for. No system is going to deliver all relevant documents, but you can do a good enough job with a hierarchical system plus search.


I think this is exactly right, and it is a facet of the same discoverability issues that crop up when people talk about GUI vs CLI - one is more useful when you're discovering, and one when you are searching. Tags are really set-based search operations like a SQL query, but the 'primary key' is the filename, and if you knew that you'd just search for it. You're rarely going to have a tag or attribute that can pinpoint a single document.
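To illustrate tags as a set-based query, here is a sketch using Python's built-in `sqlite3` module. The table layout (`file_tags` with one row per file and tag) and the sample rows are assumptions made for the example.

```python
import sqlite3


def make_db(rows):
    """Build an in-memory table with one (filename, tag) row per tag."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_tags ("
        "filename TEXT, tag TEXT, PRIMARY KEY (filename, tag))"
    )
    conn.executemany("INSERT INTO file_tags VALUES (?, ?)", rows)
    return conn


def files_with_all_tags(conn, tags):
    """Return filenames that carry every one of the given tags."""
    wanted = sorted(set(tags))
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    query = (
        f"SELECT filename FROM file_tags WHERE tag IN ({placeholders}) "
        "GROUP BY filename HAVING COUNT(*) = ? ORDER BY filename"
    )
    return [row[0] for row in conn.execute(query, (*wanted, len(wanted)))]


# Hypothetical files and tags.
conn = make_db([
    ("cover.ai", "in progress"),
    ("cover.ai", "commission"),
    ("sketch.ai", "in progress"),
    ("logo.ai", "complete"),
])
print(files_with_all_tags(conn, ["in progress", "commission"]))
print(files_with_all_tags(conn, ["in progress"]))
```

As the comment says, a query like this returns a set of candidates; it rarely narrows to a single file the way a filename lookup does.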


The article points out that it is too easy to create duplicate files. Part of that ties into what you're talking about. Part of that deals with how people deal with files (e.g. few people use versioning outside of software development). The article is suggesting that a strong hierarchical system will help to avoid that problem.

Of course the other problem with tags is management. Placing something into multiple relevant categories involves more effort. Failing to place something into a relevant category makes it harder to find since you are now dealing with either a flat file namespace (worse yet, a disorganized one) or a flat tag namespace. In theory, some of this can be handled by letting someone else handle the tags (e.g. the creator, the publisher, or the seller), but that has its own problems since there is frequently a conflict of interest (e.g. irrelevant tags are applied to increase the visibility of a product).

At the end of the day, we have to accept there is no perfect system of categorization. Some will prefer hierarchies. Some will prefer tags. From the tone of the article, it is clear that they prefer hierarchies.


> At the end of the day, we have to accept there is no perfect system of categorization. Some will prefer hierarchies. Some will prefer tags. From the tone of the article, it is clear that they prefer hierarchies.

I’m the Johnny who wrote Johnny.Decimal and this is basically it.

The OP clearly isn’t one of the people for whom finding JD is a massive mental relief. I know those people exist: they write and tell me.

Others find the idea baffling. Stupid, even. That’s fine. If this helps you, enjoy it. If it doesn’t, use something else.


Thank you. "If this helps you, enjoy it. If it doesn’t, use something else." is a sane, humble, and adult attitude. You have my respect.


I built a hierarchical note-keeping system for myself and have been intending to add tags to it, but I've never gotten around to it -- because the hierarchy is generally "good enough" after I added two features: linking, and grep.

Grep is self-explanatory. Linking works like hard links in Unix, where the same note appears as a child of multiple different parents (added a command to find "orphans" in case you unlink it from everywhere).

At this point I might not even bother adding tags.
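A guess at what the linking feature could look like, as a small Python sketch. The class name `NoteTree` and its methods are hypothetical, since the comment does not show its implementation. A note can be a child of several parents, and `orphans` lists notes that are no longer linked from anywhere.

```python
class NoteTree:
    """A hierarchy where one note may appear under several parents."""

    ROOT = "root"

    def __init__(self):
        self.children = {self.ROOT: set()}  # parent -> set of child notes

    def link(self, parent, note):
        """Make `note` a child of `parent`, like adding a hard link."""
        if parent not in self.children:
            raise KeyError(f"unknown parent: {parent}")
        self.children[parent].add(note)
        self.children.setdefault(note, set())

    def unlink(self, parent, note):
        """Remove one link; the note itself is kept."""
        self.children[parent].discard(note)

    def parents(self, note):
        return {p for p, kids in self.children.items() if note in kids}

    def orphans(self):
        """Notes (other than the root) that have no parent at all."""
        return {n for n in self.children
                if n != self.ROOT and not self.parents(n)}


tree = NoteTree()
tree.link("root", "games")
tree.link("root", "ideas")
tree.link("games", "new game note")
tree.link("ideas", "new game note")   # the same note under a second parent
print(tree.parents("new game note"))

tree.unlink("games", "new game note")
tree.unlink("ideas", "new game note")
print(tree.orphans())
```

This version only reports notes with no direct parent; notes that hang below an orphan would need a reachability check from the root.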


> Linking works like hard links in Unix, where the same note appears as a child of multiple different parents (added a command to find "orphans" in case you unlink it from everywhere).

> At this point I might not even bother adding tags.

What you described with hard links is exactly how I use tags, so that would satisfy my need for tags as an organizational tool.


While I haven’t gone so far as to attribute a numbering system to my organization, I have done well at organizing things into red-line distinctive categories. The idea is to create categories that _cannot_ overlap. If there’s any commonality between them that’s not useless, they need to be grouped at a higher level.

As an example, if you’re organizing your toolbox, you don’t mark a drawer “hand tools” because it’s a useless categorization. You mark one “socket tools” which will include everything from the sockets and wrenches themselves to adapters that connect a socket to an impact wrench (but an impact wrench does not go in there because it is not exclusively a socket tool). If it really does come down to something that may really fit in two categories (hey, there’s always exceptions), you put your mindset in the place of yourself when you want to look it up: what’s the most common situation in which you’ll be looking this thing up?


This is the crux of it right here. You need to decide up front, thoughtfully and carefully - where you are going to put something. Just like in the physical world. Then you need to adjust and adapt it as you go. All the benefit comes downstream from those small additions to the workflow when you go to save something digitally.


I spent some time studying the world of professional home organization (as seen on YouTube) and the core concepts always come down to these:

* Allocate space up front in the form of containers

* Position containers around workspaces

* Use containers appropriate to the type of object and its use (e.g. "rounds in rounds" - put round bottles on turntable racks so you can spin to access)

* Duplicate objects you need to use in multiple locations, e.g. scissors for the kitchen and for the office

* Label spaces where things belong

And the key thing to it is that this isn't a hard rule like always organizing hierarchically or always labelling. The hierarchy helps compress space (that's why books and folders are powerful) and the labels help define uses, but in many instances, the level of organization you need is an open bin with some dividers - the drawer organizer, cube storage, cardboard box, book bin, cafe tray etc.

Computer file systems are somewhat resistant to unlabelled open-bin storage because that means you're allocating with less precision, but I think everyone in practice knows that they will shove things in "Documents" or "Downloads" and just periodically purge it.


I solved this problem with hard links. I became a fan of hierarchical systems; they just work.


Workflowy [1] solves this problem by supporting mirrored nodes as well as tags.

[1] https://workflowy.com/


Gödel's incompleteness theorems strike again.


But step 2 is to just "Make sure the buckets are unambiguously different."! How hard could that be? /s


Paraphrasing Greenspun's tenth rule [1]

Any sufficiently complicated library management system contains an ad-hoc, informally-specified, bug-ridden, inconsistent implementation of half of the Dewey System [2].

[1] https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule

[2] https://en.wikipedia.org/wiki/Dewey_Decimal_Classification


I think you're being unfair. There's nothing bug ridden or inconsistent here, just a simple categorization system that looks like it would be pretty decent for small to medium sized projects.

It's also not informally specified. The shared link is literally the specification document. It's written in a kind of informal style, sure, but that's a different kind of informal - Greenspun's informal means "not written down at all".


>I think you're being unfair.

Uncharitable. The fact that this is called "Johnny Decimal" is a nod to Dewey Decimal in the first place


I suspect that rsecora was going for humorous parallelism rather than meaning any dig at the linked project.


This is exactly why the name was chosen (I'm Alex).

https://johnnydecimal.com/00-09-site-administration/01-about...


As this thread currently has a lot of critics, I just want to put in a personal plug for JD. I've been using it for some time now for family and personal data and it has been enormously helpful. It's true that it is occasionally vexing to have to choose one category for a given thing, but (a) it's usually not, (b) it's ok to reorganize categories, (c) I have found that often if there's something important that fits equally well in more than one category, (c1) either I need to refactor my categories, or (c2) it's probably going to be ok if I just pick one and allow myself to recategorize later. This almost never happens anyway.

And in the meantime, all my stuff is searchable, browsable, findable, and tidy.

I'm not saying it will work equally well in all environments or for all purposes, but for mine, it solved many years' worth of stress.


> it's ok to reorganize categories

This is an important point. A person’s interests and areas of responsibility evolve over time; so refactoring is not only permissible; it’s probably also helpful to unload accumulated organizational cruft that’s no longer relevant.

When it comes to indecision about where a file goes, I’ll often just place a .txt file in the “wrong” location pointing to the correct spot. Or an alias.


I have a similar experience for personal life. I use this too - it's imperfect, but it's a nice balance of complexity and utility that doesn't get in the way once you set it up.


When you refactor, do you have to renumber everything in the categories you refactored? Isn't that a lot of work, and doesn't it break the whole system if you can't rely on the JD numbers to be permanent?


If you refactor, how does that affect email searches including pre-refactor subject lines?


I imagine you could just reply to the old thread with the new category.id. Or only reply to yourself if it's only your organization system. Email search should include the email bodies.

(I'm not a user of this; just guessing)


Agreed. I find this useful too. Especially for random stuff I use once a year.


This is effectively how formal military instructions are structured - and generally US code for that matter, with chapters generally reserved for certain functions going down to the .01 decimal specificity [1]

Way back in 2010 or so I published a series of instructions for the 36th Wing that followed this kind of naming/information numbering convention which was frustrating to fit into, but ultimately once you understand the framework it's faster to write.

That isn't to say it isn't confusing and complicated - which happens to everything at scale - simply that this kind of structure for documentation is pretty common and literally battle tested.

[1]https://www.esd.whs.mil/Portals/54/Documents/DD/iss_process/...


Number-based organization systems (e.g. US code) work best when there are frequent references to specific nodes in the hierarchy (e.g. legal citations) and there is no guarantee that they're being accessed digitally.

But there is a good reason why I navigate to news.ycombinator.com and not 209.216.230.240.

For digital resources like URLs or file systems, using numbers as prefixes or primary IDs only makes sense if their ordinal values represent the most important and intuitive way to browse through the hierarchy.

But in most cases, the name rather than the number is the most important thing, and it's very easy to sort or filter by name -- whereas sorting or filtering by number is only useful if there's an inherent ordering (e.g. date modified) to the numbers.


> But in most cases, the name rather than the number is the most important thing, and it's very easy to sort or filter by name

Names can also be difficult if not done correctly / uniformly. For instance, "Category Name", "CategoryName", "category_name", and "category-name" can all return differently through search.
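
One way to sidestep this is to compare names only after normalizing them. A minimal Python sketch, where `normalize_name` is a hypothetical helper and the choice of lowercase-with-hyphens as the canonical form is just an assumption:

```python
import re


def normalize_name(name: str) -> str:
    """Reduce common naming styles to one lowercase, hyphen-separated key."""
    # Split camelCase / PascalCase boundaries: "CategoryName" -> "Category Name"
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Treat runs of spaces, underscores, and hyphens as a single separator
    parts = re.split(r"[\s_\-]+", spaced.strip())
    return "-".join(p.lower() for p in parts if p)
```

With this, all four spellings above map to the same key, so a search or duplicate check can treat them as one name.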

I don't think the key is names vs. numbers vs. whatever else, I think it's more important to pick a system that works for the use case, then define / document / communicate it as wide and loud as possible.


>The Chief, Directives Division (DD) assigns numbers to DoD issuances based on the established subject groups and subgroups provided in Tables 1 through 9 and the recommendations of OSD and DoD Component heads with equity in a particular issuance.

What an opening sentence.


Commercial construction specifications are done in this way as well. So all electrical specifications are in division 16000 (or 26000 nowadays) and subdivided from there.

This method of only being two levels deep is interesting. If it works, that's great, but there's nothing to stop you from going three if required, e.g. 10.20.30. But keeping everything constrained has value in itself, if only in that it forces you to think in larger discrete chunks.


I incorporated some of these ideas like 10-15 years ago.

My top level relations:

* Fun: Sex, drugs, rock & roll

* Home: Rent, buy, interior, yard, cars, places

* Meta: This system

* Mind: Philosophy, language, math, art, music, science

* Money: Accounts, investments, Bitcoin

* People: Family, friends, everyone

* Self: Fitness, health & illness, spirit, food, fashion

* Tools: Computing, devices, productivity, maker, crafts

* Work: Career, job

Roget's original thesaurus, which divides every word into 6 (or something) top-level relations was also an inspiration.

These are my root items in Workflowy (with its infinitely nested bullets).

I star active projects so they show up in the sidebar. I shift-drag (to mirror) items out of projects into the root (above the relations) to serve as my daily todo list. All in all, simple, efficient, and comprehensive.


I was trying to imagine what my 10 categories might look like and it's very similar to this! I tried getting into using Obsidian and used top level categories such as: Ideas, Lists, Learning, My News, Reminders, Misc.


This is great. I noticed I have about 9 or 10 top-level bookmarks folders to do the same. My categories are very similar


This is a great top-level list. You should add it to the JD site via a pull request!


If anyone has implemented this successfully/satisfactorily please post your folder hierarchy so everyone can compare notes and improve their organization.


I have only used this (alone) for a few weeks because it is the first kind of organization that really resonated with me. I understand it may not be for everyone, but when it comes to organizing small to medium projects, it's really good IMHO. I use the standard organization because I'm not creative. Every project has its own directory with a prefix (like "FMW01 xyz" for "firmware, first project, named xyz"), and subdirectories named "00-09 System," "10-19 Project management," and (my choice) "20-29 Data" with "20 Inputs" and "21 Outputs."

I have a template with empty folders and files (like Notes.md, Todo.md, etc.), and I can copy-paste this template for each new project. As long as I improve my template, every future project will have the new structure.
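
The copy-paste step is easy to script. A minimal sketch in Python, assuming a template directory laid out as described; `new_project` and its parameters are hypothetical names, not part of any JD tooling:

```python
import shutil
from pathlib import Path


def new_project(template: Path, projects_root: Path, prefix: str, name: str) -> Path:
    """Copy the template tree to '<prefix> <name>' under projects_root."""
    dest = projects_root / f"{prefix} {name}"
    # copytree raises FileExistsError if dest already exists, so an
    # existing project is never overwritten by accident.
    shutil.copytree(template, dest)
    return dest
```

For example, `new_project(Path("template"), Path("projects"), "FMW01", "xyz")` would create `projects/FMW01 xyz` with the same subfolders and starter files as the template.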

It's like the GTD system (which I also enjoy), but for organizing your thoughts, notes, and files in different projects. It's weird because I'm not fond of naming folders with numbers but this time it seems to work. Every project has the same structure and I'm not lost. I guess it's good for people who need a serious structure, as it forces you to have a good organization.

Interestingly, I had a boss 10 years ago that was using an equivalent method with a template and numbered directories. He was successful at managing projects and I think I discovered his secret.

Last but not least, once a project is done, I can zip it and reuse its number.


We use a loose version of it in my small company.

It helps with two things:

1. A little easier to be consistent across projects so not to reinvent the wheel every time

2. The prefix increments as new folders are added during a project, painting a convenient picture of “progress” as things move along.

We tend to have: 10 to 19 reserved for admin stuff, like Admin, Incoming, Outgoing, Documentation, Meeting notes, etc.

Then anything from 20 onwards is ad-hoc per project

We also timestamp children of Incoming and Outgoing, with an ISO prefix. This is very useful to keep track of what was received and shared and when.
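
The reason an ISO prefix works well is that YYYY-MM-DD names sort alphabetically in chronological order. A minimal Python sketch of creating such a folder; `dated_folder` is a hypothetical helper and the "<date> <label>" naming is an assumption based on the listing in this comment:

```python
from datetime import date
from pathlib import Path
from typing import Optional


def dated_folder(parent: Path, label: str, day: Optional[date] = None) -> Path:
    """Create '<YYYY-MM-DD> <label>' inside parent and return its path."""
    day = day or date.today()
    folder = parent / f"{day.isoformat()} {label}"
    folder.mkdir(parents=True, exist_ok=True)
    return folder
```

A plain name sort in any file browser then doubles as a timeline of what was received or sent.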

Overall the goal is to have as little protocol as possible to prevent total chaos. Anything more than that is usually too much to ask or doesn’t stick longer than a single project.

  10. Admin
  11. Incoming
    2023-10-12 sender, subject
  12. Outgoing
    2023-09-01 Estimate
  13. Documentation
  20. Design
  30. Production
  40. Blah


I also use a variant of incoming/outgoing; it's very convenient.


I've been chipping away at moving to my own flavor of JD over the last year. One of the first things I did was add one higher level with broad categories, numbered as x00. This way things are broadly organized, I still don't have to 'fundamentally' go more than two folders deep or 'have more than 100 folders', but I can use it for my entire work life despite having 100-ish actual technical projects.

Backporting old docs to this system is a real chore and honestly, I haven't been very disciplined about that part, besides moving old Project folders under the top-level Projects folder. But this is always going to be an issue with any new filing system, and I don't think there's a lot of value in doing it. Maybe would be an interesting programmatic exercise. But I, hotsauceror at his keyboard, am NOT going to go and retroactively assign a 753.0026 etc identifier to every document lol...

My rough, rough hierarchy is as follows:

  100 - Administrative
    - 110 Interview Notes
      - 110.001-eng-john-smith.md
    - 120 Onboarding
    - 130 Performance
    - 140 Training + Certification
    - 150 Travel + Expense

  200 - Analysis
    - 210 Code Review
    - 220 Performance Tuning
    - 230 Technical Specs

  300 - Documentation
    - 310 HOWTOs and Runbooks
    - 320 Technical Specifications
    - 330 Environment
    - 340 Processes

  400 - Meetings (this is a catchall)
    - YYYY-MM-DD-annual-project-plan.md
    - YYYY-MM-DD-budget.md
    - YYYY-MM-DD-new-policy-rollout.md

  500 - Operations
    - 510 Stack #1
      - 510.001-turn-it-off-and-back-on-again.md
    - 520 Stack #2
      - 520.001-reset-proxysql-after-network-partition.md
    - 530 ...

  600 - Troubleshooting (another outlier)
    - yyyy-mm-dd-stack-2345-bad-plan
    - yyyy-mm-dd-stack-1234-cpu-peg
    - yyyy-mm-dd-stack-3456-non-yielding-scheduler

  700 - Projects
    - 701 Project 01
    - 702 Project 02
    - 703 Project 03...

  800 - Reports

  900 - Training
    - 901 Brown Bags / Lunch+Learn
    - 902 Terraform Certification
    - 903 AWS Certification

I have recently added a 000 - Logs folder for places like coding journals, another trendy suggestion that pops up here on HN from time to time that I may or may not stick with...


It seems like nothing of use is gained by replacing folder names with numbers that index those names aside from making the path shorter. In a library this is useful because books have to be stored physically in order, but a computer does not have these restrictions. You could just as easily apply the same set of rules without the numbers and see similar results, with the advantage that the names of things reflect what they are. You also wouldn't have to create silly rules like "1- is always project management", because under the new system, "project management" will always be project management.

He does seem to address this at least somewhat[0], but the justification is so flimsy it's hardly worth addressing. In essence, he doesn't like alphabetical ordering because the index can change when something new is added. He would prefer new folders to be inserted at the end of the list. He is evidently unaware that folders can be sorted by creation date.

[0] https://johnnydecimal.com/10-19-concepts/11-core/11.02-areas...


> It seems like nothing of use is gained by replacing folder names with numbers

It forces you to whittle your categories down to ten (and sub categories). I would argue that in and of itself is a useful constraint.


You don't need numbers to do that.


Using numbers makes it easier.


I have invented a superior system - Johnny Binary.

It's basically the same as the described system except you are forced to categorize your files even more severely since every level of the hierarchy only allows two subcategories.

It must therefore be superior, right?


When I first started out I used Johnny Unary. I dumped all my documents into a folder called - get this - "Documents". It actually worked remarkably well for a number of years.


All of my 100+ development projects exist in a single folder. Everything is easy to find because 1) I'm usually looking at half a dozen of these projects actively in any given week, 2) the others have appropriate names that I am able to recall quickly, and 3) modern search functionality is fast and user-friendly.

I've tried organizing by language, target platform etc. and all I ever found was that it did exactly as described elsewhere: a) projects did not fit neatly into one category or another and b) extra clicks were required to navigate to them. It also maps to how they are organized on GitHub and various package repos, which invariably give you a single searchable list.

Adopting modern and quirky organisation systems is, IME, frequently just premature optimisation, and most data in massive amounts, such as photos, is best organized by organic means, e.g. photos are best organized by date.


Yeah, there was a system/OS/UI concept I came across years ago that I can't find anymore, but every document on your system is in one time-ordered stack/stream and then I guess you just have filters and such to manage random access.


Very funny but hopefully we can all see that “constraints are good” does not imply “you should be as constrained as possible”


The most useful part of the process is simply thinking about how to organize your files.

The 03.65 like naming can indeed be switched out for something with words, but I believe the best of both worlds is to make the words "unix-like", i.e. small, and explanatory.

For instance ~10 main directories (code, doc, vid, etc) with ~10 subdirectories (note, tv, movie, etc) is nice to try to fit your data into, but if one of the subdirectories has only 8 things, it's not the end of the world. This tends to work extremely well for "longer term" storage (a drive mounted beside your OS for data when 'finalized' or 'semi-finalized') but the mess of OS and everyday files isn't as appropriate for it.


I find that the numbers are really helpful when trying to find related items. Things that are topically connected sort together and can be filtered by common criteria.

I use SimpleNote a lot for JD content and put the category in the title of each note. I type a piece of the JD number in the search box and it instantly filters down to relevant notes. Sort by title sorts by topic.
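
The same filter-then-sort behaviour is easy to reproduce outside any particular app. A minimal Python sketch, where `filter_notes` is a hypothetical helper and the titles in the usage note are purely illustrative:

```python
def filter_notes(titles: list[str], query: str) -> list[str]:
    """Return the titles containing the query, sorted by title.

    Because each title starts with its JD number, sorting by title
    also groups notes by area and category.
    """
    return sorted(title for title in titles if query in title)
```

Given titles such as "12.28 Spatial Audio", "10.02 Cooking", and "12.05 History", the query "12." keeps only the two category-12 notes, with 12.05 ahead of 12.28.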


I agree, these mental maps you have to create add extra mental overhead, which the author apparently aims to reduce... odd.


Couple caveats that I think should be included:

Use this for your own files where no one else has to find anything.

Avoid reorganizing other people's files.

If you do the organizing, it may make sense to you, but may not for other people.

Adding the decimals has the primary benefit of nothing being recognizable from before, so that new brain maps can be made, not horribly and painfully mangled, warped and twisted from the old maps.

If you have to navigate one of these systems and you didn't create it, use search and hope files are named well, and hope the creator didn't go overboard with making folders. Otherwise, welcome to a little hell of clicking into a million empty folders and never being able to find anything.

Has anyone mentioned Aristotle yet? His abstraction of categorizability works, but is so obviously wrong once you have to accomplish any practical task.

For us, organic folder structure development for as long as possible, or avoiding folders as much as possible is better. Then, some intelligent and pragmatic decision making, and no hard and fast rules. We are human friendly first, where file systems are primarily intended for human navigation.


This pops up on HN regularly. Was extensively discussed at https://news.ycombinator.com/item?id=25398027

Interestingly, we have a similar "BASIC line numbering" system in our company. Allows for easy traversing the directories if you can remember the numbers (I cannot), such as "05_Contracts/15_Employees/041_John_Doe/07_Testemonies".


First time I heard of it.

I like how simple the core concept is explained, but I feel it would box me into categories when I like tags more (categorizing items in multiple orthogonal domains). OTOH maybe well thought-out categories would bring more structure than tags.

My current notes strategy is to prefix the date to markdown filenames (for example '2023-05-31 canvas scan transform matrix.md') and put them into a single dir. These are active journal-style notes that I'm free to update over the next few days while they are still in focus. Every few weeks the list of notes gets busy and I 'archive' older notes into sub-dirs (personal, hobby project, work project) and back up the whole structure. The method requires minimal maintenance and the full text search works well for my needs.

Edit: I like how the author leverages the CLI auto-completion and I try to do the same, but I think Johnny would work against my brain. When naming a directory or a script, I put myself in the frame of mind where I'd want to use it and I'm trying to recall its name. So I give semantic names like 'build-android.sh'. If it's a new thing I try to come up with a short catchy name for it. Having to recall the `10-19` category each time I want to access a specific subscope seems like too much of a cognitive burden. Just theorizing, haven't given it a shot so far.


Where I work we do use that structure. I can never find anything. For me, remembering the numbers is like remembering IPs instead of URLs. The problem is not the naming of the directories; the problem is that the next idea after “johnny decimal” was to make a standard structure. Because this structure has to serve the full company, it is HUGE! So irrespective of project or area size, you have a structure of 10 levels with 30 directories in each level. The names are very generic, and sooner or later somebody has a different interpretation of where document X should be placed… we have lost days searching for lost documents…


I have been using JD for a while now, to the point that I built a CLI for it (using Deno).

But I just enjoy the speed of feeling like I can cd to any directory at any time in like... 8 keypresses (`jd 20.21` is an alias I use to cd).

https://github.com/bpevs/johnny_decimal

https://johnny.bpev.me/
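
For readers curious how a lookup like `jd 20.21` could work in principle, here is a minimal Python sketch. It is not taken from the CLI linked above; `find_jd_dir` is a hypothetical function and the "ID, then a space, then a title" folder naming is an assumption.

```python
from pathlib import Path
from typing import Optional


def find_jd_dir(root: Path, jd_id: str) -> Optional[Path]:
    """Return the first directory under root whose name starts with jd_id."""
    for path in sorted(root.rglob(f"{jd_id}*")):
        # Require the exact ID followed by a space (or nothing), so that
        # "20" does not match "20-29 Projects" or "20.01 bpev.me".
        if path.is_dir() and (path.name == jd_id or path.name.startswith(jd_id + " ")):
            return path
    return None
```

A small shell function would still be needed to `cd` to the printed path, since a child process cannot change the working directory of the shell that started it.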

Edit: I had a separate hierarchy I used on my work machine when I was still working at a larger company, but this is the one from my personal machine (with some redacted)...

  10-19 Notes
    10 Quick [Daily-life kind of stuff]
      10.01 Daily Notes
      10.02 Cooking
      10.03 Listening Notes
      ...
    11 Research
      11.00 Device Setup
      11.01 Project Name 1
      11.02 Project Name 2
    12 Reference [Basically categorizing random notes]
      12.00 Unsorted
      12.05 History and Current Events
      ...
      12.28 Spatial Audio
      12.29 Music, Cognition, and Computerized Sound
    13 Travel
      13.01 中文
      ...
      13.10 Maps
    18 bpev.me
    19 Documents
      [Various documents here]
  20-29 Projects [Active Projects]
    20 Code
      20.00 gists
      20.01 bpev.me
      [insert projects I am committing to often]
    21 Media
      21.01 Music
        [insert Music album work here]
  30-39 Archives
    30 Code
      30.03 favioli
      30.04 johnny_decimal
      .....
      basically, maintenance-mode projects.
      If I start committing on a more regular cadence, I move to `20 Code`
    31 Media
      I have a separate, date-based hierarchy within these...
      31.01 Music
      31.02 Photos
      31.03 Videos
      31.04 Memes
      31.05 Screenshots
    39 Backups
      39.01 Contacts
      39.03 bpev.me
      39.04 Savefiles
      39.05 Applications


> Nothing is more than two clicks away

> An important restriction of the system is that you’re not allowed to create any folders inside a Johnny.Decimal folder.

This being said immediately after a screenshot with three levels of directories confuses me. One problem I immediately identified with this system is that I would have to take extra steps to peek into the applicable directory to see what the current index is...
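
That peeking step could be automated. A minimal Python sketch that scans a category folder and suggests the next unused ID; `next_free_id` is a hypothetical helper, and it assumes category folders are named like "12 Reference", items like "12.05 Something", and that numbering continues from the highest ID in use:

```python
import re
from pathlib import Path

# Matches names that begin with an ID such as "12.05"
ID_PATTERN = re.compile(r"^(\d{2})\.(\d{2})\b")


def next_free_id(category_dir: Path) -> str:
    """Return the next unused 'CC.NN' ID inside a category folder."""
    category = category_dir.name[:2]
    used = []
    for child in category_dir.iterdir():
        match = ID_PATTERN.match(child.name)
        if match and match.group(1) == category:
            used.append(int(match.group(2)))
    next_index = max(used, default=0) + 1
    if next_index > 99:
        raise ValueError(f"category {category} is full")
    return f"{category}.{next_index:02d}"
```

This takes the highest existing number plus one rather than filling gaps, so IDs freed by deleted items are not reused.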

I'm always looking for a good organizational methodology. This seems to be per project, no? Any suggestions for a system for overall data organization?


> This being said immediately after a screenshot with three levels of directories confuses me.

Me too, but reading more I understand this now. A "Johnny.Decimal folder" is a folder that starts with a name like 12.04, meaning it represents a unique item. It will already be inside two other folders, the 12 folder and the 10-19 folder. The point is that while 12.04 can be a folder if the unique item is actually multiple files, you can't have more folders inside 12.04, because that's considered too much nesting.
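
Put another way, the ID alone tells you which two folders it lives in. A minimal Python sketch of that mapping, with `expected_path` as a hypothetical helper name:

```python
import re

# Two digits, a dot, two digits, e.g. "12.04"
JD_ID = re.compile(r"(\d)(\d)\.(\d{2})")


def expected_path(jd_id: str) -> tuple[str, str, str]:
    """Return the (area, category, id) folder prefixes a JD ID implies."""
    match = JD_ID.fullmatch(jd_id)
    if match is None:
        raise ValueError(f"not a Johnny.Decimal ID: {jd_id!r}")
    tens = match.group(1)
    area = f"{tens}0-{tens}9"            # e.g. "10-19"
    category = tens + match.group(2)     # e.g. "12"
    return area, category, jd_id
```

So `expected_path("12.04")` gives `("10-19", "12", "12.04")`: the 10-19 area, then the 12 category, then the item itself, and nothing deeper.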

> This seems to be per project, no? Any suggestions for a system for overall data organization?

Multi-project organization is covered later on: https://johnnydecimal.com/10-19-concepts/13-multiple-project...


I'm enjoying using it for overall family/personal data organization. Took me a year to migrate in, and I allow myself the privilege of reorganizing as needed; but I've found it super simple and super stress-relieving.

I do allow myself subsubdirs wherever it makes sense though. E.g. right now I have a file browser open to "64.05 TV Shows" (60 - 69 is "Media"; 64 is "Video"), and within 64.05 I have one subdir per TV show. I don't feel obliged to give each show a special number, and I also don't feel troubled by each show being a sub(sub)dir. This system is searchable and browsable within my tolerances.


Gosh, that's a really awful explanation. Not sure I get this correctly, but the gist is, you organize things by nesting general Categories with specialized categories and put a number on them. With the "lifehack" that the first digit is the general category, and the second digit is the specialized category? And then every folder under a specialized category gets another number? And this is only meant per Project? Not globally? Meaning every project can have slightly different categories & numbers? Have I understood this correctly?

How does this handle inter-project files? What exactly is a project even in this context? How does it handle things which can be in multiple categories? This smells like someone pressing everything into a hard form to circumvent the flaws of their tools, instead of getting better tooling.


I don't plan to use this, but I think I get why it might work.

(1) Although it's just a hierarchy/tree, which is nothing new, its size and shape is (supposed to be) a sweet spot. There are trade-offs with hierarchy sizes and shapes, so a sweet spot is a plausible idea.

(2) By limiting the size of the tree, you force people all across the organization to share the same parts of it rather than giving them private spaces they control exclusively. This means they are forced to work together on how information is organized. This could encourage there being one coherent idea of how information is organized. Everyone will have to agree on how it's organized, and everyone will be more familiar with how others' stuff is organized.

(3) The numbers are small enough that you can remember them and talk about them. When you ask someone where something is, they can give you the answer directly instead of promising to send you a link. (It's like how you can read an IPv4 address off one screen and go type it into a config file on another computer, whereas unfortunately this is not easy with IPv6.) This increases the odds of success in finding the info.


I wrote an article about this system a while ago: https://www.dsebastien.net/2022-04-29-johnny-decimal/

I rely on it a lot for my personal data and projects. The simplicity and constraints have a positive impact on the usability of the organized information


The older I get the more I appreciate the intrinsic value of constraints.


I think JD's main value resides in the restrictions it suggests. They will work for some people, for others they will not, and others like me will adopt JD in an informal way. For example, my most used folders, loosely corresponding to main areas of focus, have unique numeric prefixes, but inside them the folders do not follow the numeric prefix approach. What I appreciate is having the same numeric prefix in all the applications I happen to use, like GMail labels, task manager projects, Evernote notebooks, and file systems.


It's always boggled my mind how disorganized most companies are with written information. It's always a wiki here, 7 different file shares over there, most of the latest data is on workers' desktops named "mgmt report 04032023 latest jb edits 2.0.doc". Constant stream of "can you send me the thingamajig file?"

And yet, we've all been to a library. Information organized by topic, then by author, and inside the books everything is further organized into chapters, and then there's an index referencing all of that (plus a card catalog/search system).

I use something similar to the Johnny Decimal system described at work, except the high level is by project not by topic. I find chronological filing split into projects (i.e. chunks of time/money/effort) matches my workday better.


Libraries are also easy. Books are done, one dimensional pieces of linear writing. The text itself is the thing you care about.

Companies run on mental models that are occasionally partly solidified (and ultimately ossified) in a textual format.


Libraries have a physical common place where everything always is.

Seems like better tech could improve this.

Arguably Whatsapp history already has, because at least stuff tends to collect in one place and be searchable, as opposed to being on desktops sent to individuals on request and forgotten.


The thing with libraries is that they're full of librarians. For some reason, it has fallen out of vogue for all but the oldest/largest companies (and government agencies) to hire librarians to work outside their libraries.


And, librarians are professionals who are trained specifically in the challenges around managing info, etc...Like many other areas, many corporations don't value long-term attention to things that will help them the most...in the long term. It's just too much short-term thinking...as well as, "oh hell, we don't need to hire librarians...that takes money away from stockholders...Everyone in the company will just figure out how to manage the data at some point in some fashion on their own...etc." :-p



Every time I think about implementing this I realize the categories I have today and the categories I have five years from now are unlikely to mesh well.

At least based on my priorities from five years ago.


I've been thinking about that too. We either make broader categories or allow ourselves to deprecate and refactor category numbers in the future.

To me though the overwhelming benefit of the process is the act of bucketing. Another strategy then would be to bucket down to 8 categories instead of 10 — like line numbering in BASIC you allow yourself a bit of space if needed in the future.


This seems like one particular example of a good general set of principles: organize things intentionally, put things in one place, use hierarchies with a branching factor of about ten. The specifics beyond that are probably not worth arguing about.


Beautiful sentiment, but sadly akin to muttering a poem to a raging river - I can’t think of anything more HN-y than arguing passionately about directory organization systems!


I'm messy, I like being messy.

I cannot follow any of those organizational, rigidly structured methods. They make me anxious, I much rather live in my mess and let it automatically prioritize stuff for me.

Things I don't know where I left are likely unimportant, and no energy should be wasted on them.

I think I finally made peace with my mess.


When I first saw this, I thought it looked silly and too simple to be useful.

The other day I looked at my DEVONthink database I’ve populated over the last 15 years or so, and what do ya know. It has a couple dozen top-level folders, each with a handful of folders inside, and that’s about it. I didn’t deliberately set out to do this, but “Banking/{Bank1|Bank2|Bank3}”, “Medical/{Me,wife,kid}”, “Taxes/{2020,2021,2022}”, and so on evolved that way anyway.

I love the idea of tagging, but turns out nearly all the information I care to store long-term can be filed more easily than it can be tagged. It’s rare that I want to have the same doc in 2 places, mainly limited to when I’m collecting information to send to someone else (e.g. filing taxes, applying for a business loan). When that happens, I just - shocker! - make copies of those docs in a new folder I’ve created to collect everything I need. DEVONthink makes the copy a zero-sized reference to the original doc and gives each copy a special icon so you know it’s a duplicate.

So basically, Johnny Decimal couldn’t possibly work for me, and yet I ended up with a sad version of the exact same thing on my own naturally. Well, huh. Maybe it’s not so silly after all.

(Also, regarding tagging: the idea of a database with a few tens of thousands of files in the same namespace, searchable by tagging, gives me hives. I know people do this all the time, and it’s a “me problem” that it bothers me, but oh, how it bothers me.)


What I like about DEVONthink is actually not the “duplicant”, but the “replicant”.

My organization also evolved to a simple hierarchy over time, but the fact that files can live in several directories at the same time is very useful in some cases. When there is ambivalence where a file should go, it can just go in two places – but it‘s not a duplicate, so you don‘t run into uncertainties which one is the latest version, etc. So it’s a bit like tagging (which in DT you can additionally do), but also not quite…


Yep. I didn't get into the specifics, but I use the replicant feature all the time for the kinds of things I mentioned.

Plus, it's so freaking good at finding stuff wherever you might have happened to have squirreled it away.


I'm going to try this with my Firefox bookmarks. They're already a bloody mess, and I need to get rid of 404ing ones anyway. Except that it'll be Johnny Octal. (reason: https://en.wikipedia.org/wiki/The_Magical_Number_Seven,_Plus... )

This whole discussion of hierarchy vs. tags feels like discussing whether hammers are better or worse than screwdrivers, with each person assuming nails or screws out of nowhere. Some things, for example, organise themselves naturally into hierarchies, such as biological species (both the "old" taxonomy and cladistics are tree-based models); odds are that the same applies to tags, with some junk out there being especially well suited for tagging.

There's also the possibility that different people do work better with one or another.

It would be especially useful to identify corner cases where each fails to deliver. Both systems are bound to have flaws; the "right" one is not the perfect one, but the one with the flaws that are easier to address and/or tolerate.

A few people mentioned items that could be assigned to multiple nodes as a shortcoming of the hierarchical system, but isn't this rather easy to solve with a disambiguation rule? e.g. for Johnny Decimal, "if an item can be reasonably assigned to two numbers, pick the smaller one." I also don't see much of a problem with synonyms, or in this case links.


I was working on something like this but for physical objects.

Fine grained categories take up a lot of space and involve a lot of containers, which then creates more objects to manage. They also take more effort to put things away, for only small gains in retrieval speed unless your memory is good enough to find the category box right away.

My theory is an organization system should be optimized for storage rather than retrieval as that is what takes time and effort.

But I have a lot of trouble with numbers or abstract symbols and don't want to spend forever learning them, so I use three-letter abbreviations.

All the categories are based on observation of what is already close together, rather than by trying to create a system logically, to take advantage of things I've seen in one place long enough to remember, and not have to relearn the location.

So, I have a category BAM, for bulk artificial materials. This exists because there was a bunch of paint, some cleaning supplies, and paper towels stored together.

There's also TAM, tapes, attachments, and materials. This has some screws, some ratchet straps, and some balsa wood and steel wire, some foam tape, and some keyring split rings, and a bunch of other stuff.

If things overflow a container, I split them into subcategories.


I was inspired by Johnny Decimal and developed my own system for organizing my personal/family files. I don't have the same "search for it" requirement, or need to talk to other people about the organization of the files and don't use any of the numbering system they have, which may sound like it's not inspired by Johnny Decimal, but it sure is!

At the root I have a small number of ALLCAPS folders. Those each have a small number of ALLCAPS folders themselves, and nothing else. In a few cases the hierarchy goes a little deeper than that, but not much deeper.

An ALLCAPS folder can either be part of this ALLCAPS hierarchy and contain other ALLCAPS folders and nothing else, or it can be an ALLCAPS leaf: contain normal folders and nothing else.

The final rule is that nothing is allowed to depend on any relative hierarchy until you get into one of the folders inside an ALLCAPS leaf.

What this means is I can reorganize any time I want by moving or renaming any ALLCAPS folders at any time, or any of the folders inside an ALLCAPS leaf. I find this distinction relieving. I don't have to get the perfect organization forever, I just have to organize it in a way that works for me right now, and I can reorganize any part of it at any time without worrying about it.
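For anyone who wants to enforce rules like these mechanically, here is a minimal sketch of a checker. It is only an illustration of the rules as described above: the function names are made up, and it assumes "ALLCAPS" simply means a name with no lowercase letters.

```python
import os


def is_allcaps(name):
    # Assumption: a folder counts as ALLCAPS if it has letters and none are lowercase.
    return name.isupper()


def check_allcaps_folder(path):
    """Return a list of rule violations found in and below one ALLCAPS folder."""
    problems = []
    entries = sorted(os.scandir(path), key=lambda e: e.name)
    files = [e for e in entries if not e.is_dir()]
    caps = [e for e in entries if e.is_dir() and is_allcaps(e.name)]
    normal = [e for e in entries if e.is_dir() and not is_allcaps(e.name)]

    # An ALLCAPS folder holds folders "and nothing else".
    if files:
        problems.append(f"{path}: contains loose files")
    # It is either a branch (only ALLCAPS children) or a leaf (only normal children).
    if caps and normal:
        problems.append(f"{path}: mixes ALLCAPS and normal folders")
    # Only ALLCAPS children are checked further; normal folders are free-form.
    for child in caps:
        problems.extend(check_allcaps_folder(child.path))
    return problems
```

Running it on each ALLCAPS folder at the root and printing the returned list would be enough for a periodic sanity check.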


This is one area where LLMs will help tremendously. I've always hated the Save operation, because it forces you to think about a name that describes what you're working on, even though the idea isn't fully formed yet.

I'm pretty sure Microsoft will integrate LLMs to automate file naming, and I hope other systems follow suit.

More interestingly, LLMs will easily organize data hierarchically based on the contents. I hope this becomes a reality this or next year.

I hate manually organizing a filesystem.


I agree with this. Having an OS option to scan and tag your documents into some taxonomy that is a built-in in your file browser would be quite attractive. I'm sure Microsoft and Google are both working on it.

e-Discovery applications like Relativity have been doing this for years. You run a PCA against a bunch of OCR'd documents, look for correlations between words or phrases within documents, look for repetitions of those particular correlations, call them 'issues' or 'motifs' and slap a label on them. Attorneys used to use it to scan millions of documents in a discovery set and auto-flag them for possible privilege issues for further review, and even automatically mark them as such.


Johnny here.

I was mid-reply and I realised I was typing out my problem statement, so I’ll just paste it here. This is a work in progress.

---

# The problem

When we kept everything on paper, organised people had these things called filing cabinets. They stored all of their documents in them in a structured way so that they could find them again.

Now those same people store all of their files in arbitrarily named folders on their company’s shared drive and wonder why they can’t find anything.

## Information wasn’t always free

When we kept everything on paper, generating information came with a cost. Paper cost money. Typing out a document took real effort. Duplicating a document meant a trip to the photocopier.

Every document produced was a tangible thing. It was there, on your desk. You couldn’t ignore it.

Now anyone can duplicate anything, instantly, invisibly, for free. We assume this is an improvement.

Is it?

## You had to be organised

When we kept everything on paper, you had to be organised. There was no other option.

If you weren’t organised, the information was lost. Not lost as in ‘it’ll take me a while to find it’: lost as in ‘gone forever’.

Now you can be disorganised, but at what cost? The cost is the time it takes you to find a thing; it is the risk that the thing that you find is a duplicate or an old version. It is the constant frustration that comes from knowing that something exists, but having no idea where it is.

We all feel this every day and we have come to believe that it is normal.

It is not normal.

## Why aren’t we given training?

When we kept everything on paper, it was someone’s job to organise it. This was an occupation: you were trained. You became an expert.

Now we employ Gen Z’s who didn’t grow up with the concept of ‘a file’ yet we expect them to navigate the byzantine hierarchy of the company’s SharePoint.[genz]

[genz]: https://www.theverge.com/22684730/students-file-folder-direc...

You work at a keyboard all day, so we make you sit through a module so you know to bend your knees when you lift a box.

But when it comes to information management: you’re on your own.


When it gets bad enough you need this for an organization, you hire a "librarian"[0], it's literally their job to classify and keep track of information. They have a whole degree program called Library and information science.

Let the experts handle this stuff. How many times have you found some super important production piece being handled in a disaster of Excel and 400 different versions all named ridiculous things, and nobody knows which is the right one to use? Why? Because they didn't bring software development in soon enough.

0: Librarian is our commonly understood word for the broad profession of information management, but the experts tend to have many different job titles for their discipline; get a subject matter expert (I'm not one) to help you track down the right job title for your specific project.


I think most people here are applying this to their personal notes/projects.


Johnny here. I think more people should apply this at work.

I work on large IT transformation projects and the disorganisation and resulting waste of time and money is borderline criminal.


I found the latest powerpoint in my emails instead of the sharepoint, can confirm, borderline criminal.


I assume this is sarcastic, so allow me to offer this point directly from recent experience.

I worked on a multi-million dollar transformation project for the federal government.

Nobody had any idea where anything was. Eventually they found the 'latest PowerPoint' in their email, instead of SharePoint.

Except it wasn't the latest. They just thought it was. So they updated it, and distributed it, and then looked stupid and had to do it again. And lost all respect and trust from the customer.

And then when the project failed the auditor came in to try to figure out why, and it was reasonably obvious.

Meanwhile the taxpayer is paying for this. That's me and you. If this mid-sized transformation project has a $10m budget and every person wastes just a minute every hour looking for a file -- conservative, in my experience -- then we're talking hundreds of thousands of wasted dollars.

So, yeah. Borderline criminal in my view.
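A rough back-of-envelope version of that arithmetic, assuming purely for illustration that the whole budget is spent on staff time (the function name and that assumption are not from the project itself):

```python
def wasted_dollars(budget, minutes_lost_per_hour):
    """Share of a labour budget lost if every paid hour loses some minutes."""
    # Assumption: the entire budget is staff time, so lost time maps directly to dollars.
    return budget * minutes_lost_per_hour / 60


# One minute per hour on a $10m budget.
print(f"${wasted_dollars(10_000_000, 1):,.0f}")  # prints $166,667
```

Under that simplification, one minute an hour is already about $167k, and anything closer to realistic search times pushes it well into the hundreds of thousands.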


I like PARA a lot, which has some great ideas: https://fortelabs.com/blog/para/


Never used such a system, but I'm inclined to believe in its promises. In addition to what I've recently commented in another HN post [1], this system also slightly resembles the classification system used in accounting. At first glance those account numbers look cryptic and arbitrary, but soon enough you realize how helpful they are in enabling accountants to communicate and create journal entries.

[1] https://news.ycombinator.com/item?id=36301140


Question aside. Can anyone recommend any opensource de-duplication tool(s)? I've realized that I have the same data over many drives but manually going through them even for a single drive will take a ton of time. I'm wondering if there's something smart enough where you input paths to be scanned and magically outputs de-duplicated data to a single coherent place...

Edit: Some corrections. I forgot to mention which OS: GNU/Linux and/or BSDs.


My research into this many years ago turned out that jdupes was the right / best solution I could find for my usecase.

https://github.com/jbruchon/jdupes

Though that works fine from a script perspective I'd like some more interactive way of sorting directories etc. Identifying is just the first step, jdupes helps with linking the files (both soft and hard links come with caveats though!) but that is mostly to save space, not to help in reorganisation.
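For reference, the identification step on its own is easy to sketch. This is not how jdupes works internally, just a minimal illustration of the usual idea (group by size first, then hash only the candidates); the function names are made up for the example.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files don't fill memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(roots):
    """Return groups (lists of paths) of regular files with identical content."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                # Skip symlinks so one file is not reported as its own duplicate.
                if os.path.isfile(path) and not os.path.islink(path):
                    by_size[os.path.getsize(path)].append(path)

    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        for path in paths:
            by_hash[file_digest(path)].append(path)
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

It only reports groups; deciding which copy to keep and where it should live is the reorganisation part that no script can decide for you.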


It seems to me that is not a trivial problem to solve: de-duplication + reorganization. Maybe I'm incorrect. It also seems like the kind of problem where it could be super-easy to screw it up if you go with a custom-made script plugging different tools together...


I've never tried it myself but the README mentions several other tools.

https://github.com/dpc/rdedup/



I used DupeGuru (https://dupeguru.voltaicideas.net/) in the past but I'm not sure it's the best solution for you. Try it, it's open-source.


rmlint.

I've tried many, but rmlint is the most flexible and reliable. Esp. the tagging works really well.

https://github.com/sahib/rmlint


what os? for use in a console, there's rdfind or fdupes


Ok, this made me look into why not the Dewey.

It seems too hard to memorize the numbers for first time placement.

So let's make a program that asks us when moving it into our collections?

`dewey <file to organize>`

Will then lead you down a tree of decisions. Insta-organized. It's so good I just might try it.

(The file will move to wherever your organized files are specified in your .config/dewey.conf)

On Windows this could be a right-click -> Dewey, where it then pops up a small window to pick the categorization.
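A minimal sketch of what such a tool could look like. Everything here is hypothetical: the tree is a tiny hand-written sample rather than the real Dewey schedule, the function names are invented, and reading the destination from .config/dewey.conf is left out (the library root is just a parameter).

```python
import os
import shutil

# Hypothetical, heavily truncated decision tree for illustration only.
TREE = {
    "000 Computer science, information and general works": {},
    "500 Science": {
        "510 Mathematics": {},
        "590 Animals": {},
    },
}


def choose_path(tree, ask):
    """Walk down the tree, calling ask(options) at each level; return the picks."""
    chosen = []
    node = tree
    while node:  # stop when the chosen class has no subclasses
        pick = ask(sorted(node))
        chosen.append(pick)
        node = node[pick]
    return chosen


def file_into(src, library_root, tree, ask):
    """Move src into library_root/<class>/<subclass>/... and return its new path."""
    # Use only the leading number of each label as the directory name.
    parts = [label.split()[0] for label in choose_path(tree, ask)]
    dest_dir = os.path.join(library_root, *parts)
    os.makedirs(dest_dir, exist_ok=True)
    return shutil.move(src, os.path.join(dest_dir, os.path.basename(src)))
```

For interactive use, `ask` could print the numbered options and read a choice with `input()`; keeping it as a parameter means the same code could sit behind a right-click menu.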


My grandpa was very interested in libraries. He had drawers full of index cards[1] for his personal library, organized using the Dewey decimal system[2].

When he first got a computer, back in Windows 3.11 days, it only seemed natural to use what he was familiar with. So he would store documents and emails in directories based on the Dewey decimal system.

However a problem quickly arose. A document might pertain to multiple topics. With index cards this was simple, you just noted the book or document on each of the relevant index cards.

With files however it was less clear. The only way he found was to save the same file in multiple directories. With the obvious nightmare of keeping it all in sync.

It got somewhat better when I taught him how to make shortcuts to the documents, but still...

[1]: https://en.wikipedia.org/wiki/Index_card

[2]: https://en.wikipedia.org/wiki/Dewey_Decimal_Classification


Universal Decimal Classification solved this issue by being fully built to do faceted classification. It does take more work to create classes though, and the class notation can get very complex.


We implemented this for our shared storage at $DAYJOB. We had a long tail of decade old files on our shared drive, so we started again with the Johnny Decimal system on a new one. It's helped tremendously for us for finding stuff.

I had previously implemented it on my personal Nextcloud instance, but found it to be less impactful, as I already tended to over-organize my digital files.


Oh, now I understand where the directory structure comes from at work.

I hate it.

The problems I have with it (some of them implementation details that can probably be fixed)

- On smaller projects, you have a big directory tree of nothing, with maybe a quarter of the directories being populated. This is because it starts from a template.

- You tend to get long directory paths, enough to get over MAX_PATH in some instances, don't fit in a single line, etc...

- Remembering arbitrary numbers is hard. Try using arbitrary numbers in your code for your variables, I am sure it will be appreciated...

- And especially when there are several number based systems in place. So you have the software version number, the ticket number, the number system used by your customer, etc... Do you really want another number system on top of that?

- The article says there is no overlap. There is never "no overlap" in real life. For example, as a dev, I should have nothing to do in the "sales" folder, except that the technical specifications are here because they are part of the contract. It really belongs in both "sales" and "dev".

- I still use search as my primary tool.

Note that someone mentioned the military. I have worked on defense contracts, they are the worst. Acronyms and codes everywhere, I guess they are too special to name things with regular words. And I am talking about the unclassified stuff, it is even worse when confidential information is involved: "The name should follow the ZB4455 convention, ZB4455 is in document L45.34c, can I have L45.34c? No it is classified, but actually, it just means it should be lowercase and start with an underscore." So I wouldn't take what the military does as a good example.


An IT professional criticizing the military for using acronyms and codes is a pretty bold stroke...

Besides, if you named everything with regular words your poor MAX_PATH would be working overtime. There's a time and a place for abbreviations and codes, and if a multi-theater, technically-advanced military force with global and extraplanetary reach isn't one of those times and places, then I don't know what would be.

But I do agree with you about assigning an arbitrary number to a project. 773.0034 is not that helpful a descriptor and I wouldn't want to see a whole "Downloads" folder full of those. But it does help you find things quickly.


> multi-theater, technically-advanced military force with global and extraplanetary reach

Neither here nor there but god I need a drink. And it’s only 8am. Reading that sentence reminds me of the nationalistic radio broadcasts mentioned in A Canticle for Leibowitz before everyone is annihilated in nuclear fire…


I haven't used this system (thankfully), but my first thought is complementary to the comment you made about small projects: in big projects, it's probably more useful to break down by component rather than type of document. E.g. the front end test spec is better off grouped with the front end architecture diagram rather than with the test specs for a bunch of other individual components.


I love everything about this: the concept, even the name. I feel Johnny Decimal just needs a graphic. From a few minutes of Googling, I think something like this: https://clipart-library.com/img1/1252227.gif


Related:

Johnny.Decimal – A System to Organize Projects - https://news.ycombinator.com/item?id=36300472 - June 2023 (1 comment)

Johnny.Decimal - https://news.ycombinator.com/item?id=25398027 - Dec 2020 (187 comments)

Johnny.Decimal – A system to organise projects - https://news.ycombinator.com/item?id=13770827 - March 2017 (2 comments)


More Karl Voit: (1) "Managing Digital Files (e.g., Photographs) in Files and Folders" at https://karl-voit.at/managing-digital-photographs/

(2) "TagTrees: Improving Personal Information Management Using Associative Navigation" at https://karl-voit.at/tagstore/en/papers.shtml

(3) "TagTree: Storing and Re-finding Files Using Tags" at https://karl-voit.at/tagstore/downloads/Voit2011.pdf


As said before in the post "BIG DATA is just data", a lot of information is worthless after 1 or 2 years and most after 5 years. Long term value data seems to be stored in IT systems' DBs rather successfully.

And I have so far always found important emails (notably because important topics are easily found in email chains and, far more often than not, in the dedicated meeting report).

Structuring data is cultural so you should rather learn to use the system used by your organization. Only super small teams and solo-founders need to think about how to store data. Most workers should follow their community to let other people find the information.

Folders, drawers, and cabinets have been around for 3 centuries at least and imho, we are not gonna reinvent the wheel with this or that way to structure information.


If your organization has a system, by all means use it.

The whole point of Johnny.Decimal is that most organizations have absolutely no system to organize information. It’s tossed into a huge pile.

Even organizations that have systems concern themselves only with organization-wide needs. Individuals still have needs that the organization does not address.


Decimal organization is a good system... but it's explained here in a completely obnoxious way that makes you want to hate it.

Firstly, I strongly recommend just reading up on Dewey Decimal[0] (which is what JD cribs almost everything conceptually from), there's a decent explanation about it on Wikipedia. Should help you "get" the categories you might want to make a bit more.

Secondly, don't marry yourself to JD's limitations. The site likes to evangelize about some things that really aren't as important as you might think. Feel free to ignore something if it doesn't work for you - in particular the "no subfolders" rule might just... not be worthwhile to follow.

Personally I've always pretty much ignored this rule - if you look at Dewey, the left hand of the number is meant to be a classification for the broad category while the number on the right is meant for the broad project. In other words, applying a decimal organization system to specific files? Yeah not what it's meant for, don't do that.

Even in a library, where Dewey is used, an individual book's Dewey Classification isn't actually unique to that book. For example all books on MySQL will have the same Dewey Class.

Build it as a system that works for you, don't try to forcefully refit your system to match the explanation of this website. Also, don't use it for small projects. That'll just make it a bigger mess than it's worth. Stick a small project in a bigger folder system, it'll work way better that way.

As for mental mapping - keep a readme file to just list the broad categories in the top of the structure, it'll help a lot. The site recommends spreadsheets but really, that's wayy overkill and will just cause dumb overhead each time you have to add a file.

[0]: https://en.wikipedia.org/wiki/Dewey_Decimal_Classification


I’m conflicted here, because you made some great points that I’m excited to think about/try, but you’re so angry! I guess I’m just joining an ongoing decimal-themed flame war from many years ago, lol. For example, this:

“ Dewey Decimal[0] (which is what JD cribs almost everything conceptually from)”

seems a little uncharitable! It’s pretty openly a specialization/variation on DD, and I’d be surprised if many people on here (or really in the culture at all) weren’t vaguely aware of DD from their school days. So “crib” seems a little pejorative imo

Re: substance, I’d be interested in a clarification if you find the time: why do codes for individual files bother you so?

You need to differentiate them somehow, and the first pure-DD solution I found doesn’t apply at all:

“ we also add to the end the first three letters of the author's last name (or, if no author is given, then the first three letters of the title). In our example, the author is James Brock, so BRO is added to the end of the Dewey call number to get 595.789/BRO.” - https://www.oakland.edu/Assets/upload/docs/SEHS/ERL/Document...

It just seems plainly helpful to have numbers before files, especially for ones that you’ll be returning to and/or recreating for other projects a lot, e.g. documents within your usual project management system.


Eh, it's by and large annoyance with a lot of these "here's how to get organized" guides. I have a bad tendency to kinda accrue files and as a result need these types of organizational systems to make sense of it all.

The problem is that rather than being descriptive (as in "this works for me, see what works for you"), lots of these organization guides are prescriptive, which helps pretty much only the person who wrote them to begin with. It gets really grating after a while, especially if they offer things like templates that are a pain to actually refit for personal use. (Which to be fair, JD doesn't do, but the author very clearly has that type of workflow in mind - older versions of the JD website straight up recommended using airtable for organizing stuff, template iirc included.)

My annoyance with numbering individual files in JD in this case is pretty much the result of "nobody else works in your Dewey decimal system". Like, start working with any kinda enterprise-y management tool and you'll very quickly learn that a lot of software is not written with JD in mind because they assume control over an entire folder and organize it in a way that makes sense to them. That problem compounds when you start receiving external files that are a folder of dependencies plus one file you can open in the aforementioned tool. Yes, you can often spend time to edit the internals to "correct" that document to the Dewey decimal system, but that creates extra overhead and can also sometimes gravely annoy the other person if the document has to be sent back and forth a couple times.

In that case, it's just way more straightforward to assign a unique ID to the parent folder instead of spending upwards of 30 minutes fiddling with every incoming file.

As for adding author last name - that's just for shelf organization in libraries, libraries sort all books on author/title alphabetical level. DDC just adds another organizational layer on top of that for scientific books (most fiction and (auto)biographies usually end up organized outside Dewey entirely for practical reasons). You can have multiple 595.789/BRO in a single library (dictionaries for example with multiple books will have the same DDC).


Melvil Dewey[1] started this way, but then things got bigger, and a cast of clerks were born to serve the system.

[1] https://en.wikipedia.org/wiki/Melvil_Dewey


I've used this system for a few years in the past. It's definitely handy for some people, but it didn't fit my use case. I now store all important documents/photos/backups in the cloud and consider the computer to be basically throwaway.

One organizational system many programmers may appreciate is keeping your git/GitHub repos in the same place, under `.../g/<username>/<reponame>`. Huge fan of this method.
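A small sketch of how the mapping could be automated. The helper name is made up, the base directory is whatever you use in place of the elided prefix, and it only handles https-style GitHub URLs (not the `git@github.com:...` form).

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_repo_path(base, remote_url):
    """Map https://github.com/<username>/<reponame>(.git) to <base>/g/<username>/<reponame>."""
    parts = [p for p in urlparse(remote_url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected <username>/<reponame> in {remote_url!r}")
    user, repo = parts[0], parts[1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return str(PurePosixPath(base) / "g" / user / repo)
```

Passing the result as the target directory of `git clone` keeps every checkout in a predictable place.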


Hmm this seems unrelated to me - why not implement Johnny decimal around or within git repos? And what about it would change if used for cloud directories instead of local ones?

Probably just missing something obvious!


The only relation was that I personally used to use Johnny decimal to group my different projects, but then moved to the git repo namespace setup and no longer had a need for Johnny decimal. They aren't really substitutes so I understand your confusion!


This can work extremely well for one or two people. It becomes a problem when different people need to agree on what are the 10 things, categorization and maintenance.


And even when defined, at some point some document will be “in the middle”: one coworker will place it in 10, the other in 50. It has happened to me many more times than I can remember.


I have been using JD for several years to organize both my personal documents as well as my business’ documents. I think the system works really well.

One thing I have learned to do which bends the rules a bit is to use date stamped folders in the lowest level instead of XX.YY.

Examples of places where I use this with success is for folders containing: meeting minutes, travel documents, receipts, etc.


If I did this with my e-mail I would have over 1000 in some folders.

"It’s very unlikely you will end up with a hundred categories." -the page

Exactly this will result in about 20-30 folders for most, with any real amount of documents some folders might hold 100-1000 docs.

The advice you should take from this is that forcing structure is useful. Look at large code repos for example.


I'm going to allow that some things, like photos for example, can live in their own folder apart from the Johnny Decimal data hierarchy.

(Also, it would force me to consider ... do I need 1000 files here? I've certainly been known to join related documents into a single PDF, Uber-document, if you will.)


I obsess over this when maintaining my knowledge/artifact base. Currently I keep it in a git repository with a few categories and use 3 mechanisms: tags at the end of file and directory names, ISO 8601 dates as a prefix in some locations, and nothing in others.

So,

1. notes a-la gists use tags:

    'notes/Rsync notes #cli #foss #notes #x-platform.md'
    'notes/Windows initialization #windows #powershell.md'
    'notes/Modafinil notes #medical #nootropic.md'

2. event-like things use both dates and tags

    'work/meetings/2023-01-03 Project XYZ meeting #project-xyz.md'

3. stuff I just collect doesn't use anything, or uses some of the above

    'dms/wallpapers/w1.png; w2.png ...'
    'dms/shopping/2023-06-13 Dyson Absolute 15/README.md; receipt.jpg'

I keep the basic folder hierarchy very limited for now. I use vscode to commit any change on save and pull on folder open, which makes this behave like an always-in-sync cloud a la GitHub Gists, especially together with vscode sync, which brings my plugins, configs and shortcuts everywhere.

CTRL+P to quickly find stuff by name or tags, and vscode's very fast ripgrep search to get files containing any content - so I just need to remember any word or phrase to find it. If I can't remember anything I browse over tags (having a handy script to display all of them) or dates (since I usually know a time range). As another mechanism, I use the Double Commander file manager with its fuzzy file name search to get interactive lists by typing tags or keywords while in a particular folder.

To encrypt some pages I use GPG with vscode extension.

This serves me well, and I don't get lost, either when searching for previous knowledge or when trying to find where the single one is.

I evaluated Johnny Decimal prior to this, and it didn't fit this workflow - seems ad hoc enough so I can live without it and has nothing tags or good search can't solve. Also, it feels not flexible enough, particularly as stuff can't have multiple categories. Tags are a much better mechanism for information organization, you just need to keep them organized, keep their number relatively low, and have a mechanism for delete/merge/move/rename, which is simple enough here as it is all on the file system and is a few shell commands away.
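The tag-listing script can be very small. A minimal sketch, assuming tags are `#word` or `#word-with-dashes` tokens in file and directory names as in the examples above (the function names and the regex are illustrative, not the actual script):

```python
import os
import re
from collections import Counter

# Assumption: a tag is '#' followed by letters, digits, underscores or dashes.
TAG = re.compile(r"#([\w-]+)")


def tags_in(name):
    """Return the tags embedded in one file or directory name."""
    return TAG.findall(name)


def count_tags(root):
    """Count how often each tag appears in names below root, skipping .git."""
    counts = Counter()
    for _dirpath, dirs, files in os.walk(root):
        dirs[:] = [d for d in dirs if d != ".git"]  # prune in place
        for name in dirs + files:
            counts.update(tags_in(name))
    return counts
```

Printing `count_tags(".").most_common()` gives the overview, and tags that appear only once are the obvious candidates for merging.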


Terrible advice. Arbitrary rules (make 10 folders!) are just utterly bonkers for everyone except a small subset of people who could categorise their life in this way.

It really grates on me when people offer solutions that work for them, as if they will work for everyone.

No.


I don't think Johnny Decimal is for you.


The system is spoiled by confusion between division into 10 and division into 100. This creates extra levels so that the implementation does not live up to its "two clicks away" promise.

For instance, in the site's own structure, we have

  11-core/11.01-introduction

But that would leave two digit categories at the top level. The top level is organized by groups of ten and so we need

  10-19-concepts/11-core/11.01-introduction

One question is what if 10/11 gets more than ten items, so there is an 10/11/11, 10/11/12?

Isn't there a division into ten needed there?

If the bottom level never goes beyond 00-09, the zero is redundant. It's actually a three level system with a branching factor of 10, and you might as well just have

  10-concepts/11-core/1-introduction

I would just have

  10/11/1

and have symlinks

  concepts -> 10

  10/core -> 11

  10/11/introduction -> 1

Using the numbers as prefixes for the symbolic names means that someone who remembers the symbolic name but not the number cannot use tab completion nicely. They have to use tab completion to scan the entire directory level, then type the number, then tab complete again.

Symlinks going from symbolic to numeric is probably the right direction. The OS symlink resolution then teaches the users what the categories are:

  $ realpath --relative-to=. concepts/core/introduction
  10/11/1

There could be accelerator symlinks at the top level:

  11.1 -> 10/11/1

Now you get the full benefit. If you remember that introduction is 11.1, you actually have that as an instantly navigable identifier in the system.
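A minimal sketch of that layout, for anyone who wants to try it in a scratch directory. It assumes a POSIX filesystem where unprivileged symlinks work; the function names are made up, and the directory and link names are just the ones from the example.

```python
import os


def build_demo(root):
    """Create the numeric tree plus the symbolic-name and accelerator symlinks."""
    os.makedirs(os.path.join(root, "10", "11", "1"))
    # Symbolic names point at numeric directories, one level at a time.
    os.symlink("10", os.path.join(root, "concepts"))
    os.symlink("11", os.path.join(root, "10", "core"))
    os.symlink("1", os.path.join(root, "10", "11", "introduction"))
    # Accelerator link at the top level: 11.1 -> 10/11/1
    os.symlink(os.path.join("10", "11", "1"), os.path.join(root, "11.1"))


def numeric_path(root, symbolic):
    """Resolve a symbolic path to its numeric form, relative to root."""
    root = os.path.realpath(root)
    return os.path.relpath(os.path.realpath(os.path.join(root, symbolic)), root)
```

After `build_demo(root)`, both `numeric_path(root, "concepts/core/introduction")` and `numeric_path(root, "11.1")` resolve to `10/11/1`, which is the same thing the `realpath` call does.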


> One question is what if 10/11 gets more than ten items, so there is an 10/11/11, 10/11/12?

I’m not following this (and thus, I think, your entire point). I think you might be slightly misunderstanding something: the files inside a category (11-core in the example) would never have a prefix other than the category - 10/11/11 is the only option - 10/11/12 would be breaking the system.

Once you’re inside a category, there is no division into 10 anymore. The 11 category would allow documents from 11.01 to 11.99. And as I believe is mentioned in the spec, if you need more than .99 you likely have too broad of a category or area.
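
A rough sketch of that numbering rule, taking the 11.01 to 11.99 range at face value and assuming areas are always blocks of ten (10-19, 20-29, ...). The function name `parse_id` is hypothetical, not something from the Johnny Decimal site.

```python
import re

# Two digits, a dot, two digits: e.g. "11.01"
ID_PATTERN = re.compile(r"(\d{2})\.(\d{2})")


def parse_id(text):
    """Split an ID such as '11.01' into (area, category, item) strings."""
    match = ID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"expected NN.NN, got {text!r}")
    category, item = int(match.group(1)), int(match.group(2))
    if item == 0:
        # Assumption from the comment above: items run from .01 to .99.
        raise ValueError("items run from 01 to 99")
    # The area is the block of ten that contains the category.
    area_start = category // 10 * 10
    area = f"{area_start:02d}-{area_start + 9:02d}"
    return area, f"{category:02d}", f"{item:02d}"


if __name__ == "__main__":
    print(parse_id("11.01"))  # ('10-19', '11', '01')
```

Note that the item number never affects the category: 11.12 is still in category 11, so a "10/11/12" directory has no meaning in this scheme.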

For what it’s worth, I’ve used this system at work and in my own notes for around 2 years and haven’t run into this problem (yet).


OK. So if the 11 category can go to 99, why can't the top level just go from 00 to 99 as well, without being broken into batches of 10 requiring another level?


Because most of the time the 11.99 is in a date series. So, the 11.01 item is probably not referred to much, and it's easy to zip down to stuff you're doing now. But 100 subfolders are too much to actually go looking through, especially if you're trying to `ls` the folders.


I stumbled on this system several years ago and found it useful as inspiration for organizing my external storage.

My top level categories are `inbox` (stuff that isn't sorted yet), `Media` (stuff that other people made), and `Vault` (stuff that I made).

`Media` contains `Audiobooks`, `Books`, `Courses`, `Films`, `TV`, `Music`, and `Broadway`.

`Vault` contains `Backups`, `Projects`, `Audio`, `Video`, and `Photos`.

Anything one layer deeper is either a file of the type described by the parent folder name or a folder containing related files (ex: `Video/2023-06-12 makers.dev 119` is a folder containing the raw recordings and processed end video and audio for my podcast).

I've got about 10TB and tens of millions of files organized in this system. It works better than anything else I've tried.


I do something similar. Where it always seems to fall apart is with things I collaborate on with others. Sometimes, joint projects get their own home (i.e., they become an organization, or at least get their own public repository of some sort). So in addition to "inbox", "media", and "private" (my version of "vault"), I've also got a "shared" category.

It's still not perfect, because ultimately the subcategories of "shared" need to actually be accessible, or mirrored, or it's not actually true. And sometimes, a project goes into "shared" aspirationally, even if I have no collaborators yet, as a subtle reminder that I might share it someday, so I don't want to put anything in that folder that I'm not comfortable being public or semi-public.


> and the cues that Google uses to determine what’s useful — the links that are the fabric of the internet — just don’t exist at work.

Says someone who’s never worked at Google and used Moma. I still don’t understand why Google doesn’t offer Moma as an on-prem thing to replace JIRA’s suite. Is the market too small? They used to have an on-prem appliance way back when, but surely a container package is all you need these days?


I freaking love this sidebar design


What about project folders like git repos and all that? How do they fit into this system?


I genuinely can't imagine this working at all for any sort of software development project.


Johnny here. Correct: don’t use it to organise anything that smells like a software project.


Absolutely no. For all the reasons listed here: https://heyluddite.com/post/4043411544/how-to-name-folders


There's a lot of nonsense in that article, talking about evolutionary conditioning to alphabetical folder organization. Hundreds of millions of humans can't organize their documents alphabetically because they don't have an alphabet. They don't seem to have a problem with that.


I thought most languages have a collating order — I think even a slightly generous reading is what they mean by 'alphabetical'? Even Chinese (an important edge-case) has the traditional radical-and-stroke ordering mechanism.


Collation order is not necessarily best for organizing. Many of us think spatially. Having related things near each other can be useful. Same with putting the most commonly used or most important things near the top.


The algorithm at the end is what I go through every single time I search for something. I think this article has a point!


The algorithm at the end:

  Start -> Open Folder -> End
In reality, natural language uses synonyms that often start with different letters. So without numbers, I still need to scan every directory one by one.

With numbers, I assign categories according to the phase of the process in which the item occurs. For example,

  1 plans
    |- A first draft
    |- B Lisa's notes
    `- C design
  2 analysis
    |- A exploratory
    `- B design implementation
  3 deliverables
    |- A May 2023 report
    |- B June 2023 presentation
    `- C August 2023 report
I can limit my search to folders and items that are in the low/medium/high range, according to what I am looking for. But alphabetically sorted, this directory structure would look much more ad hoc:

  analysis
    |- design implementation
    `- exploratory
  deliverables
    |- August 2023 report
    |- June 2023 presentation
    `- May 2023 report
  plans
    |- Lisa's notes
    |- design
    `- first draft
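
A small sketch of the difference, assuming a plain case-sensitive sort like Python's `sorted` and using the folder names from the example above; `strip_prefix` and `listing_order` are illustrative helpers only.

```python
def strip_prefix(name):
    """Drop the leading number or letter label, e.g. '1 plans' -> 'plans'."""
    return name.split(" ", 1)[1]


def listing_order(names):
    """Order names the way a plain, case-sensitive sorted listing would."""
    return sorted(names)


folders = ["3 deliverables", "1 plans", "2 analysis"]
plans = ["B Lisa's notes", "C design", "A first draft"]

if __name__ == "__main__":
    # With prefixes, the listing follows the phases of the process.
    print(listing_order(folders))
    print(listing_order(plans))
    # Without prefixes, the order depends only on spelling (and case).
    print(listing_order([strip_prefix(n) for n in folders]))
    print(listing_order([strip_prefix(n) for n in plans]))
```

One caveat: single-character prefixes only sort correctly up to nine numbered entries, since "10" sorts before "2" as a string. Zero-padding (01, 02, ...) avoids that.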


As if Melvil Dewey and William Gibson had a child...


Part of me originally thought that Johnny Decimal would be in groups of 10.

But first visit to their web site shows numberings exceeding 10.

Ok.

Still a novel idea worth pursuing.


The fact that this is indistinguishable from satire is an incredible feat. Very well done.


This seems pretty backwards in the age of AI, where semantic search can ingest and numerically sort embeddings with extraordinary finesse.



