The authors actually do briefly mention this concern in the section titled 3.7 Algorithm extensions (under Log truncation):
> However, in practice it is easy to truncate the log, because apply_op only examines the log prefix of operations whose timestamp is greater than that of the operation being applied. Thus, once it is known that all future operations will have a timestamp greater than t, then operations with timestamp t or less can be discarded from the log.
The main issue seems to be that we need to know about all the replicas that we want to be able to synchronize with - but I guess there isn't really a way around this.
I did miss this section in my skim, but it's also the first place I went looking. It doesn't seem like you could ever be sure that no old messages would be sent with an earlier timestamp if you don't know the set of nodes participating in the protocol.
Moreover, if you allow nodes in the known set to perform operations maliciously (e.g. a single bad node enters the set and tries to cause conflicts), things get much more complicated, and I don't think you can escape the impossibility results around consensus protocols...
Hi, one of the authors here. You're right that in order to ensure that you won't receive timestamps lower than some threshold, you need to know all the nodes in the system, and you need to hear from all of them (if even just one node is unreachable, that will be enough to hold up the process). That's quite a big assumption to make, but unfortunately there doesn't really seem to be a good way around it. Lots of CRDT algorithms have this problem of requiring causal stability for garbage collection.
Using a consensus protocol is possible, and would have the advantage that it only requires communication with a quorum (typically a majority of nodes), rather than all nodes. However, it has the downside that now a node cannot generate timestamps independently any more — generating a timestamp would then also require a round trip to a quorum, making the algorithm a lot more expensive.
My understanding[1] is that you would not use only a Lamport timestamp but rather a tuple (Lamport, actor ID), where the actor ID is a globally unique per-node ID (say, a UUID), which would then be used as a tiebreaker in cases where comparing by the Lamport timestamp alone would give an ambiguous ordering.
This should not be problematic, since the Lamport timestamps are only equal in cases where there is guaranteed to be no causal relationship between the events (i.e. the user did not see the effects of one event before submitting the next event), so it's fine to pick the ordering arbitrarily.
> This is because sub-par VR technology (e.g. the Quest 2) is simply not good enough for someone wanting to work several hours per day in a VR Computer instead of their laptop -- even if most people don't realize this yet.
Do you mean that the Quest 2 is good enough to do, say, programming work on for several hours a day, or just that it's a decently good gaming headset?
The last VR headset I tried was the Oculus Rift, and that was nowhere near being usable for work. I'm really curious about the SimulaVR, but it's a bit outside my price range. So if you use the Quest 2 for work, I'd love to hear about your experience with it -- what software do you use, is the resolution good enough for working with text for hours at a time, etc.
I use the Quest 2 almost exclusively for my day job as a programmer (any time I don't have to be on camera in meetings), and have been using VR to do this for years - I'm the guy behind this article: https://news.ycombinator.com/item?id=28678041
The Quest 2 is remarkably capable for its form factor, but has some significant limitations and requires a lot of babysitting to get it tuned "just so" to make it that productive. Reaching that flow state, or even making it more productive than a traditional physical screen layout, isn't particularly accessible, certainly not yet on a mass appeal level. So yeah, it can work, but there's a LOT of room for improvement.
I work in VR when I'm not in meetings. I use Immersed for it. I love it.
The text readability isn't perfect, but it's fine and usable. (Others don't consider it very usable, which can mean either they didn't spend the time to figure out the ideal setup for them or it's simply not usable for everyone yet.) There's a lot of after-market customization that helps tremendously: a better headstrap, an upgraded facemask, prescription lens covers.
We're definitely in early adopter territory. It takes tinkering to find the best setup for yourself. Some people don't have the time or desire for that, some people just don't find something that works after trying it out. It's not sustainable for widespread adoption yet, but it'll get there.
It's improving every day as the Immersed team adds new features and the Quest opens up new APIs. For example, right now you cannot see your keyboard. Most users get by with touch typing. You can bring in a VR version of your keyboard calibrated to its position, but it's pretty finicky. Quest is opening up an API soon for what's called "passthrough", which will let the user see the camera view outside of the headset. Once a passthrough keyboard feature is implemented in Immersed, I believe it's going to be a significant feature that makes it even easier to work in VR.
I used a Quest 2 for work for a few weeks while my monitor was being repaired. My biggest problem was not being able to see the keyboard. The display was not a problem for me. I was quite glad to have my monitor back anyway. For that matter, for all of the PCVR games I was so excited to play, I've gone back to playing them mostly on the monitor. I'm quite happy with the Quest 2 visuals, but the comfort (for longer periods) and controls are inferior for anything more complex than Beat Saber and golf.
I was wondering how it compares to Syncthing[0], which I currently use to sync files between my phone and PC.
It looks like the main difference is that Recall doesn't require you to install anything on the receiving device. You can just install Recall on your phone, and then download files via a web page. So basically you get the privacy of Syncthing with the convenience of Dropbox. That's really cool!
The primary difference is that Syncthing focuses on synchronization, meaning that some files/folders take space on both devices and are kept in sync. It's essentially a cron-ed rsync (Dropbox is built around the same concept internally).
Recall is primarily a file server. The analogy I have is that it's a zero-config FTP server with a web client and a nicer UX. Auto-sync of select files is on my roadmap, but at this point it's just 'remote access to your files in a browser'.
Another big difference is that Syncthing is open source, so its security and privacy are verifiable. Recall is closed source at this point (although you could peek at the JS client to figure out how it works), and I'm not yet ready to make a promise to open it in the near future. So, be aware.
Thank you for your kind words, I really appreciate that!
Neat! I enjoyed your thoughts about choosing a data structure and editor component. I'm also building something in this space (https://thinktool.io/, https://github.com/c2d7fa/thinktool), and I decided on a somewhat different data structure, with pros and cons.
Instead of having pages that contain blocks inside of them (like Roam), I decided to have just one type of item. These items can have other items as children and parents, and they also contain text content which can link (bidirectionally) to other items.
The main advantage of this approach is that items can have multiple parents. So in practice, you never have to think about whether a note should be its own page or just a block inside a different page. This just feels more elegant than Roam's approach to me.
The main disadvantage is that you're now working with a graph rather than a tree. You can end up with funky situations like an item that's its own parent. Since I still want the user to interact with the app through a tree-based UI, I have to add an intermediary data structure representing what the user can actually see. This data structure needs to be built lazily as the user expands and collapses items, so we don't try to represent an infinite loop.
Whenever the user does something (adds an item, edits it, creates a link, etc.), we need to update both the intermediary data structure and the persistent data -- and ensure that these changes are kept in sync. For example, adding a link is a simple change in the graph, but it may require multiple updates to the tree.[1]
I also ended up going with ProseMirror for the editor component (after trying Quill, Slate, an input-based approach and using contenteditable directly). I'm really happy with ProseMirror, even though integrating it wasn't entirely pain-free.[2]
Jonesforth is insanely cool. The linked mirror seems to be missing 'jonesforth.f'. Maybe try checking out this one[1] for the full implementation.
I recently tried porting Jonesforth to UEFI[2], so I could run it directly on my hardware without needing an operating system. I was actually surprised by how easy it turned out to be.
Okay, admittedly I ended up rushing a bit towards the end, and the final result is very bare-bones - it can do "Hello, World!", Fibonacci numbers, and then that's pretty much it. Still, it was a lot of fun, and I would totally recommend a project like this, especially if you don't usually work with "low-level" development.
I also ended up writing a blog post[3] to help people get started writing assembly for UEFI. The best resource is probably the OS Dev wiki, though. It has a ton of great resources.
Is jonesforth single-threaded, or does it utilize all available CPU cores? I only ask because a lot of these little bare-metal languages just do the bare minimum needed to boot to a REPL without exploiting the full capabilities of the hardware.
It's single-threaded because it's a learning tool, not a Forth you'd ever want to use in an environment where performance matters. Single-threaded or not, Forth interpreters [not compilers] have terrible interaction with branch prediction -- they will never perform well on modern CPUs. Virtually every other instruction executed is a jmp that cannot be predicted and thus collapses the pipeline.
"..Many studies go back to when branch predictors were not very aggressive. Folklore has retained that a highly mispredicted indirect jump is one of the main reasons for the inefficiency of switch-based interpreters."
"The accuracy of branch prediction on interpreters has been dramatically improved over the three last Intel processor generations. This..has reached a level where it cannot be considered as an obstacle for performance anymore."
There is an older paper by Anton Ertl that already showed variations in performance for the same implementation technique from one generation of Pentium to another, and of course between AMD and Intel.
Personally, I stopped worrying and used the most convenient implementation for my use case (portable interpreter written in C). Your set of primitives and how you code your Forth programs usually have a much larger improvement potential.
That paper refers to interpreters in the traditional sense. However, threaded code is not interpreted in the same way: after every instruction there is a computed jump. There are benchmarks on this in the jonesforth code that you can actually run, and you will observe the exact problem there.
Note that the paper does compare a switch-based dispatcher to a computed-goto version ('jump threading') - cf figure 2. The latter used to have a significant performance advantage over the former, which is apparently no longer true (cf figure 3 (a)). That of course doesn't invalidate your point.
Any language that arranges data in arrays or large structs does well on modern machines, especially with vector and SIMD extensions. To be fair to Forth, there exist machines that are well-suited to running "threaded code"[1], it's just that they are not machines that are commonly available today.
I don't buy this. I understand the use of your term 'threaded', but these are unconditional jumps and can therefore be incorporated into the instruction pipeline with little or no overhead. Here's a very old SO post; CPUs won't have gotten worse since then:
"But in general, on modern processors, there is minimal cost for an unconditional jump. It's basically pretty much free apart from a very small amount of instruction cache overhead. It will probably get executed in parallel with neighbouring instructions so might not even cost you a clock cycle."
If you do this (or -serial mon:stdio and leave the VGA output), you can do console I/O via com1, and it works pretty well. As a bonus, this is viable on real hardware too, although most consumer-level motherboards don't do serial consoles :(
I like to think about software that helps you think. Knowledge management, wikis, outliners, task managers, mind-mapping tools, and so on. There's a piece of software called TheBrain (https://thebrain.com/) which I really like; it does a great job of letting you make connections between different topics that are connected in your mind.
However, I prefer the usability of an outliner. So I'm making a web app that is basically TheBrain but visualized as an outliner instead of a mind-map.
There's a prototype available here if the concept sounds interesting to anyone else: https://thinktool.io/
I found this article really useful. It's important to note that the author is explicitly talking about what you do in your leisure time, not what you do for work.
I have often found myself trying to start side-projects that I didn't really care about, because I had some abstract, perhaps irrational idea that I "should" be working on this particular thing. Usually it's something that will teach me a new skill.
Recently, I've gotten a lot better about evaluating whether or not something is worth doing in my free time, but I still don't really have a good criterion for when I should pick one thing over another: When should I spend my free time learning about X, and when should I learn about Y instead?
One suggestion may be to pick whatever is the most useful or the most likely to help you earn money in the future, and indeed this is more or less the rule I have been using in the past.
I posted this article because I thought it gave some remarkably actionable advice for picking between different side-projects, namely:
> The basic idea of personal energy management is that you should focus on increasing your personal energy and lifting up your mood in your leisure time, instead of working on things that drain your personal energy. Hobbies should lift your mood, not drain you. This makes you better perform all the other tasks.
This is a different outlook from what you usually hear, but it makes a lot of sense to me when presented in those terms.