Right there with you. I have this vague dream that one day, the tech will be there to automatically collate and reference all my stored data more easily, sort of like how I work with physical sources.
Even as I type this, though, it feels more and more like a pipe dream.
Minimal: for every page I visit, grab the headings (easy) and major themes (harder). Add these to a local searchable DB, with a type-ahead interface showing summary info.
Bonus points: automatic summarization (GAN-based, perhaps?)
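Something like this is what I have in mind for the minimal version. It's just a sketch, assuming Python with requests + BeautifulSoup and SQLite's FTS5; the database and table names are placeholders, not a real product:

```python
# Minimal sketch: pull headings from a visited page into a local FTS index.
# Assumes requests + BeautifulSoup and an SQLite build with FTS5 (standard in
# most Python distributions). "pages.db" / "visited_pages" are made-up names.
import sqlite3
import requests
from bs4 import BeautifulSoup

db = sqlite3.connect("pages.db")
db.execute(
    "CREATE VIRTUAL TABLE IF NOT EXISTS visited_pages "
    "USING fts5(url, title, headings)"
)

def index_page(url: str) -> None:
    """Fetch a page, keep its title and h1-h3 headings, and index them."""
    soup = BeautifulSoup(requests.get(url, timeout=10).text, "html.parser")
    title = soup.title.string.strip() if soup.title and soup.title.string else url
    headings = " | ".join(
        h.get_text(" ", strip=True) for h in soup.find_all(["h1", "h2", "h3"])
    )
    db.execute("INSERT INTO visited_pages VALUES (?, ?, ?)", (url, title, headings))
    db.commit()

def type_ahead(prefix: str, limit: int = 5):
    """Prefix search over everything indexed so far (the 'type ahead' part)."""
    return db.execute(
        "SELECT url, title FROM visited_pages WHERE visited_pages MATCH ? LIMIT ?",
        (f'"{prefix}" *', limit),
    ).fetchall()
```

A type-ahead UI would just call type_ahead() on each keystroke; summarization could be bolted on later as one more indexed column.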
My pipe dream...
I wonder what quantity of unique text a human actually sees in a lifetime. Could it be easily stored? Like every word I ever read, anytime?
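Quick back-of-envelope, where every number is an assumption rather than a measurement:

```python
# Back-of-envelope only; all of these figures are assumptions.
words_per_minute = 250        # typical silent reading speed
hours_per_day = 4             # generous: work docs, web, books combined
years = 70
bytes_per_word = 6            # ~5 characters plus a space, plain UTF-8 text

words_lifetime = words_per_minute * 60 * hours_per_day * 365 * years
bytes_lifetime = words_lifetime * bytes_per_word

print(f"{words_lifetime:,} words ≈ {bytes_lifetime / 1e9:.0f} GB of raw text")
# -> roughly 1.5 billion words, on the order of 10 GB uncompressed, so storing
#    "every word I ever read" is trivially small by today's standards.
```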
Working with physical sources = being able to spread ten books out on a table and easily flip between sections in a tactile way (physical pages). That’s really what I mean. For some reason physical objects are much simpler for me to work with, and opening ten windows (even on a giant screen) just doesn’t work for me the same way.
The minimum useful version for me would be something that recognizes all media types AND allows commenting/notes in a standard format that links between them and is easily manageable.
In particular that would mean for me:
- read/highlight/annotate/manage/organize functionality for EPUBs and PDFs.
- import physical books via ISBN and allow me to attach my notes there (ideally with a companion phone app that lets me scan/photograph relevant sections).
- photo library (ideally containing the aforementioned book pics while leaving them linked to the notes).
- multiple routes for surfacing old stuff (see the sketch below).
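Under the hood I imagine something like this; the schema and names are purely hypothetical, just to show how the pieces could link:

```python
# Hypothetical schema for the wish list above: one table of media items
# (EPUB, PDF, physical book by ISBN, photo), one table of notes, and a link
# table so a single note can point at any number of items (e.g. a photo of a
# book page staying attached to the same note as the ISBN record).
import sqlite3

db = sqlite3.connect("library.db")
db.executescript("""
CREATE TABLE IF NOT EXISTS media_item (
    id      INTEGER PRIMARY KEY,
    kind    TEXT CHECK (kind IN ('epub', 'pdf', 'physical_book', 'photo')),
    title   TEXT,
    isbn    TEXT,            -- only for physical_book
    path    TEXT             -- local file path for epub/pdf/photo
);
CREATE TABLE IF NOT EXISTS note (
    id      INTEGER PRIMARY KEY,
    body    TEXT,
    created TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS note_link (   -- many-to-many: one note, many items
    note_id INTEGER REFERENCES note(id),
    item_id INTEGER REFERENCES media_item(id),
    locator TEXT             -- page number, EPUB location, photo region, etc.
);
""")
```

"Surfacing old stuff" then becomes queries over note_link (by item, by date, or full-text over note bodies) rather than a separate system.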
But in all honesty: the reason I think it’s a pipe dream is because I’m fairly certain the limiting factor is being human, not the technology. Like I’m kind of hoping for a new paradigm for digesting media, but I also recognize that’s a pretty steep ask.
Relatedly, I’ve been trying to build what I’m talking about out of Emacs since COVID started, and I get some of the way there with org-roam + org-noter, but the difficulty of connecting Emacs with various work cloud services and such has proved quite daunting.
But I’m still at it, because most tools of this nature are dev-centric, whereas I’m prose-centric, so much of my frustration comes from having tools that are CLOSE but fall apart in the last-mile workflow (for my purposes).
Honorable mention to hookapp for Mac, which I’m still wrapping my head around but which may end up solving some of my more niche problem areas.
Thanks for going into detail on that. This makes me wonder if information organization and traversal are truly the killer app of AR + AI. Arrange your data like physical objects. See the connections as literal threads in 3D space. Have a librarian who can actually understand your natural-language queries help you find things.
For me it absolutely would be. Allow me to interact with my (personal, DRM-free, already owned and stored locally) digital media as if it were physical media and you will have my money day one (unless Zuckerberg).
Awesome. Let's go back to "what is the simplest useful implementation"... Let's say you have some static media like PDFs. Let's say you can use VR/AR if you want, but you can also just use your flat screen with a handheld motion controller.
Would it be useful to create a 3D world where those PDFs take up space and stay where they're put? Let's say you open an app on your computer that shows you a virtual room, and there's a menu that lets you select a folder from your computer full of PDFs. You open the folder and a pile of physical documents appears as either a stack, a spread, or something messier.
Viewing from there would be really important. You'd need to bring the camera in really close to be able to read anything. Or magnifying glasses. Maybe the ability to clone and transform pages and parts of pages while leaving them linked to their parent documents.
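To make that concrete, here's roughly the state I'd imagine the app persisting, so documents stay where they're put and a cloned page still knows where it came from (all names here are hypothetical):

```python
# Hypothetical sketch of what the "room" would persist: documents keep a
# placement so they stay where they're put, and a cloned page keeps a link
# back to its parent document. Names and fields are illustrative only.
from dataclasses import dataclass, field
from typing import Optional, Tuple

@dataclass
class Placement:
    x: float            # position in room coordinates
    y: float
    z: float
    yaw: float = 0.0    # rotation, so piles can look "messier"
    scale: float = 1.0  # enlarged clones are just scaled-up placements

@dataclass
class SourceDoc:
    doc_id: str                     # stable id, e.g. a hash of the PDF file
    path: str                       # local file it was loaded from
    placement: Placement = field(default_factory=lambda: Placement(0, 0, 0))

@dataclass
class PageClone:
    parent: SourceDoc               # the link back to the original document
    page: int                       # which page was cloned
    crop: Optional[Tuple[float, float, float, float]] = None  # optional region
    placement: Placement = field(default_factory=lambda: Placement(0, 0, 0))
```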
I dig everything you’re saying and think the way forward is something like that. But right now, implementations are very limited by screens, in my experience. Even if there’s 360° of a virtual space available to me by dragging the screen, it ends up having the same issues; it seems like, for me, a persistent space is needed. There’s also some research to suggest that such permanence helps learning: it’s easier to remember info from a physical book because your brain can associate the CONTENT with the actual physical experience of the book itself, which leads to greater retention (I think it’s similar to how mind palaces are meant to function).
A mix of AR glasses and physical simulacra is where I expect it to go eventually. As in, I have 10 book “blanks”, and when I use AR glasses, they become whatever I wish (from my library). Ideally I could then flip around the blanks and see the content of the books — making it much more like my actual physical workflow.
Tl;dr: you’re 100% right that the means and method of viewing are critical to what I’m after. But I think the main through-line for me is that if I’m having to manipulate the camera AND the content, it leads back to the same issues I currently face. The point for me is being able to interact as I usually do, but with access to “physical” versions of everything I have digitally (albeit with digital conveniences, such as EXIF data readily available on the “back” of a photograph).
Hope that all makes sense and thanks for treating this so seriously!