It really depends on your Forth environment. The language is a great one for bootstrapping an operating environment since a small blob can define basic routines and building blocks for more complicated operations.
The language is a dynamic synthesis of incremental compiler and Basic Input/Output System, where more generic and cross-implementation concepts can be added or re-defined.
Skimming this, I don't see any actual reference to a standard method for accessing a filesystem. I suppose in cases where a filesystem is relevant, there'd either need to be a driver for that written in native Forth, or a host operating system that could be called via a syscall of some kind.
I expect Forth to be used in different contexts, but mostly in this order: extremely low level (high performance in L1 cache execution), early boot (a slower / read only FS driver is fine), or toy / experiment / prototyping (syscalls might be good enough). Anything else can probably afford to be written in a more easily maintained language.
One benefit of the early boot context is that, in theory, a minimal compatibility standard could be written and one 'interpreted' (hooks on to the definitions of functions already in the context stack) driver could function on vastly different hardware as long as it didn't actually use native assembly.
When I implemented this in a home-rolled Forth on the Apple //e under ProDOS (mid-eighties), I used an Apple sparse file with random I/O as the block-oriented drive. The Starting Forth link suggests that's a common mechanism. I suspect a lot of micro Forths just hit the disk directly.
As much as I love Forth, I am a bit troubled by the list in the article:
Some interesting but by no means complete Forth properties:
no errors
no files
no operating systems
no syntax
Uhm... Forth does have errors, like stack underflow. When I programmed in Forth we did have the notion of files. It most certainly has a syntax, too.
>colorForth does it differently. There is no syntax, no redundancy, no typing. There are no errors that can be detected. Forth uses postfix, there are no parentheses. No indentation. Comments are deferred to the documentation. No hooks, no compatibility. Words are never hyphenated. There's no heirarchy (sic). No files. No operating system.
He probably means "blocks" instead of files and "a condition system or exception system" instead of "stack underflow."
ColorForth is actually pretty much an image-based system. Moore observes that PCs have a lot more memory than he needs (indeed, 1 MB is huge for a Forth system), so the image is loaded into memory and the memory is saved to disk.
It should be mentioned that Forth is one of those languages that do configuration by editing the source code. This is probably the recommended practice, rather than INI files or, worse, XML (why write another parser when you already have one?).
For really big data, a binary format would probably be preferred (I actually have no experience in the area), with blocks (as defined in an extension of the ANS Forth standard) acting like Linux's mmap.
Strings are a bit of a pain to deal with in Forth, so you'd rather avoid them where possible (in addition to the consideration that plain-text data are inefficient). You'd rather use a hex editor, or write visualization and manipulation tools for binary data, than process text strings. The fact that ColorForth uses a custom source format edited by a custom editor (the colors in the source code are not syntax highlighting; the colors are part of the "syntax") is symptomatic of this kind of approach.
A condition/exception system would probably fall into the category of "hooks" according to Moore. Stack underflow/overflow is not even an error condition on some of the chips he made that feature circular stacks. However, it is true that for casual Forth programmers like me, a stack underflow is usually the result of an error in the program. In a way, just as Haskell programs are "most certainly correct when they compile," Forth programs are most certainly correct if they don't stack-underflow (or just segfault, since that's a typical result of messing up the return stack).
Safari online has a lengthy interview with him that is an excerpt from a book with lots of interviews on programming Titans. I read the one from Moore every year and I don't even write Forth. It is simply that good and thought provoking.
Haha, yes, that is a good way to explain it. Moore isn't necessarily wrong, but I don't think he is 100% correct either. Not everybody is a software engineer who can spend all day writing code. There are a lot of folks who write shell scripts and the like who need to talk to computers without getting into machine code and Forth. Does a lawyer need to know the machine? That sounds like an inefficient use of time.
People talk about forth having no syntax the same way people talk about lisp having no syntax: not entirely correctly. Of course, every textual representation of anything has a syntax, but when people talk about something not having a syntax what they really mean is that the syntax is uniform: there is only one global (uniform) syntax.
Not quite. You have access to the reader in Forth, and it's quite typical to write a word that takes over for the reader for some time and then hands back to the usual reader. Julian Noble used this to implement a subset of FORTRAN for computational physics (more convenient than postfix for doing that) and to make state machine transition tables into code. Sadly, he is no longer with us.
Common Lisp also provides access to the reader, so you can do the same trick, but it's not a ubiquitous thing to do the way it is in Forth.
While I agree with the general sentiment that Forth does have a syntax, this statement is not true in Forth. Instead, syntax is extremely local, because words may take over the parser and compiler to implement their own syntax.
For example, "(" (which is used for comments) is typically implemented as a word that immediately executes during the compilation phase, takes over the input loop and just reads and discards characters until it has discarded the matching ")" character before returning control to the regular compiler. It has its own little parsing loop going on with its own syntax, totally overriding any global syntax for as long as it feels like.
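A toy sketch of the idea in Python (this is an illustration only, not how any real Forth is implemented; real Forths parse a raw input buffer rather than a token list): the outer interpreter walks tokens one at a time, but the word bound to "(" grabs the input stream itself and discards everything up to the matching ")".

```python
# Toy outer interpreter where a word can take over the input stream,
# the way Forth's "(" comment word does.

def interpret(source, stack):
    stream = iter(source.split())          # crude "input stream" of tokens
    for token in stream:
        if token == "(":                   # comment word: hijack the reader
            for t in stream:               # consume and discard tokens...
                if t.endswith(")"):        # ...until the matching ")"
                    break
        elif token.isdigit():
            stack.append(int(token))       # number: push it
        elif token == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        else:
            raise ValueError(f"unknown word: {token}")
    return stack

print(interpret("1 ( this text is ignored ) 2 +", []))  # [3]
```

Note that while "(" runs, the global token loop never sees the comment text at all; the comment word has its own little parsing loop with its own rules, exactly as described above.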
I'd argue that Lisp has a syntax in that you do need at least a context-free grammar to describe valid Lisp programs.
With Forth, on the other hand, while there is a rule for deciding when one word ends and another begins, and how to interpret literals, I think that, for all practical purposes, there aren't actually any strings that are syntactically invalid.
As the ANSI Forth standard puts it:
> This Standard does not require a syntax or program-construct checker.
It gets philosophical fast, but something like a stack underflow error is a feature of the interpreter. It is not a part of the language itself. A misbehaving Forth program will happily underflow through all of memory.
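To make that concrete, here is a toy Python model (not any particular Forth) of a data stack living inside flat memory, addressed by a stack pointer with no bounds check; popping past the stack base just reads whatever cells happen to sit below it:

```python
# Hypothetical sketch: a data stack as a region inside flat "memory",
# addressed by a stack pointer. Nothing stops SP from walking past the base.

MEM_SIZE = 16
memory = list(range(100, 100 + MEM_SIZE))  # pretend this is other data
sp = 8                                     # stack grows upward from cell 8

def push(x):
    global sp
    memory[sp] = x
    sp += 1

def pop():
    global sp
    sp -= 1              # no bounds check: underflow is not detected
    return memory[sp]

push(42)
print(pop())   # 42 -- fine
print(pop())   # 107 -- underflow: we just read the cell below the stack base
```

Detecting the second pop as an error is a courtesy of a particular interpreter, not something the language itself demands.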
According to the language spec, it seems to be non-compliant to handle stack overflow that way (even if it's obviously a good feature):
> In all integer arithmetic operations, both overflow and underflow shall be ignored. The value returned when either overflow or underflow occurs is implementation defined.
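On two's-complement hardware, "ignoring" arithmetic overflow typically just means letting the cell value wrap. A small Python model of that behavior (assuming 16-bit cells purely for illustration; the standard leaves the wrapped value implementation defined):

```python
# Model 16-bit cell addition where overflow is silently ignored,
# as the quoted standard text permits.

CELL_BITS = 16
MASK = (1 << CELL_BITS) - 1

def to_signed(x):
    # Reinterpret the low 16 bits as a signed two's-complement value.
    x &= MASK
    return x - (1 << CELL_BITS) if x >= (1 << (CELL_BITS - 1)) else x

def forth_add(a, b):
    return to_signed(a + b)

print(forth_add(30000, 10000))   # -25536: wrapped, no error raised
```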
If you're going to run an interpreter as a unikernel, arguably you should pick a typed assembly language and use that to enforce desirable properties about the assemblies you're running (e.g. protection domains). FORTH doesn't really make these things feasible so I find that choice quite puzzling.
Implementing a Forth is a great incremental process, all you need is a processor with stacks, registers, and RAM. Words are just lists of addresses. Once you have the base execution routines, e.g. DOCOL, NEXT etc., implementing the rest of the interpreter is easy. It can even be done on a calculator or in-game computer![1]
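The "words are just lists of addresses" idea can be sketched in a few lines of Python (Python functions stand in for machine-code addresses; the names DOCOL/EXIT/NEXT follow Forth convention, but this is an illustrative model, not any particular implementation):

```python
# Toy indirect-threaded interpreter. A word is a list whose first element
# is its "code field" (a function); a colon definition's body is a list
# of other words.

data_stack = []
return_stack = []

def docol(word, ip):
    # DOCOL: enter a colon definition -- save caller's IP, run the body.
    return_stack.append(ip)
    return (word[1], 0)            # (body, index) is the new IP

def exit_word(word, ip):
    return return_stack.pop()      # EXIT: resume the caller

def prim(fn):
    # A primitive runs "machine code" (a Python function), then falls
    # through to NEXT by returning the unchanged IP.
    def code(word, ip):
        fn()
        return ip
    return [code]

DUP  = prim(lambda: data_stack.append(data_stack[-1]))
MUL  = prim(lambda: data_stack.append(data_stack.pop() * data_stack.pop()))
EXIT = [exit_word]

SQUARE = [docol, [DUP, MUL, EXIT]]   # : SQUARE DUP * ;

def execute(word):
    ip = (None, None)                # sentinel return address
    ip = word[0](word, ip)
    while ip != (None, None):        # the NEXT loop: fetch and run
        body, i = ip
        w = body[i]
        ip = w[0](w, (body, i + 1))

data_stack.append(7)
execute(SQUARE)
print(data_stack)  # [49]
```

Once DOCOL, EXIT, and the NEXT loop exist, every new word is just another list, which is why bootstrapping the rest of the interpreter goes so quickly.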
It would have been very nice to see comments with stack diagrams in the code examples. FORTH code without them becomes very hard to understand in very short order.
Let’s say you run a database in a Docker image. That image will run at least two processes: a kernel and a database server. Whenever the database server wants to call code in the kernel, a task switch is needed.
In a unikernel, that overhead is removed by having only a single address space. You can see it as taking the full source of the kernel and the database server, linking it into a single executable, and stripping everything you don’t need, including everything the kernel would need if it supported multiple processes.
Is that a good idea? Not if the service you’re running wants to be multi-process, not if you trust your kernel code more than the code of the service you run on top of it, and not if you want the ability to look inside your containers (e.g. by ssh-ing into it)
What you do get in exchange is smaller containers (does anybody know why the image in this example is so enormous? I expected something that easily fits in 64 kB), a smaller attack surface, and faster boot times.
Hold on here, kernels do not run as a separate process. And there is no task switch, just a privilege level switch.
A kernel may or may not exist in a separate address space depending on hardware and software features (e.g. the arm split paging regime or the meltdown mitigations). It's just that the kernel memory is not accessible to userspace code.
> Is that a good idea? Not if the service you’re running wants to be multi-process
You can run multiple VMs. That's probably easier to use too, since your tooling for VMs is based around spinning up more machines, not processes.
> not if you trust your kernel code more than the code of the service you run on top of it
I don't really see how that applies. The application code is still the same. It's just that the OS doesn't exist any more. If it doesn't exist, it can't have bugs.
> and not if you want the ability to look inside your containers (e.g. by ssh-ing into it)
True. But I can't easily ssh inside a process either, which is what a unikernel is more comparable to. The only reason I ssh into my VMs is because the application process is wrapped up in an OS. If that OS did not exist, there would be less reason to peek into it.
To be fair, on a unikernel, you're probably not running on bare hardware, but instead a VM and have the same context switches to access emulated hardware.
Docker containers aren't running a kernel. They're using cgroups/namespaces to isolate themselves from other programs running on the host kernel.
If you run five containers on a host, you're not running five kernels. You're running five instances of a Linux userspace/application, sharing the same kernel, but isolated from each other.
1) FWK (full-weight kernel): a stripped-down version of a full-featured kernel, having only the things needed to run a particular application. Remember, as the upstream kernel progresses you have to somehow find a way to update your modified kernel to keep up with security fixes and all.
2) A LWK (light weight kernel) a small kernel built from scratch that is added as a library to an application (it will likely be incompatible with many existing applications built for other systems) - a pro is that system updates are practically non-existent. You don't have to rebuild your image with Debian patches or anything like that since the kernel is minimal and unique. And exploits have minimal impact in any case.
3) Then there is the hybrid approach, where a LWK runs alongside a FWK, with the light weight kernel handling most of the system calls while the full weight kernel (e.g. Linux) handles system calls for compatibility. A multi-kernel approach.
I have no idea what route Nanos takes since I can't see the code and it's not mentioned anywhere. Keep in mind a unikernel is usually a single-address-space kind of thing, so you shouldn't be able to fork or exec. Since Ops permits arbitrary Go/C/C++/Ruby programs, I wonder what happens if you try to exec something.
In any case these systems are more secure than containers, because an exploit in the code running in a container can affect the host kernel (there is no real isolation). With unikernels, in some cases, the system will actually unplug cores from the host kernel and boot the unikernels on those cores, utilizing available NUMA features of the CPU. This allows for more isolation, and the kernel itself is minimal, so it has a small attack surface and, in some cases, better performance characteristics.
There are good debugging tools these days for libOSes (unikernels), and they date back to MIT's exokernel design and even earlier. They have gained traction recently because:
1) They are easy to make (You are either stripping down a working kernel or building a very small one - think smaller than minix3).
2) Cloud computing (security) and HPC (performance on commodity) make them relevant again.
3) Not to mention they are fundamentally a better technology than containers, IMHO there is no debate, containers will be replaced by libos(s)/unikernels.
"You don't have to rebuild your image with Debian patches or anything like that since the kernel is minimal and unique. And exploits have minimal impact in any case."
This isn't correct. If there is a bug in the unikernel library you are building into your unikernel, it can still be exploited. Imagine a DHCP parsing error in the networking stack that leads to remote code execution, or a deserialization error in a web framework that leads to RCE. In both cases you need to follow upstream and rebuild the unikernel.
Additionally, I think unikernels are less secure, if anything. If you break into the application, you are already in kernel mode and can access any files or privileged resources belonging to the unikernel. Also, there is no hope for defense-in-depth techniques like sandboxing or resource brokers. With a normal application you would need to do some sort of privilege escalation or sandbox escape after hacking the application.
I'm all for attack surface reduction, but little to exploit is not nothing to exploit. Your shell, sshd, and networking stack have probably all been exploited many times.
> Your shell, sshd, and networking stack have probably all been exploited many times.
Yes, I've exploited them myself with script-kiddie tools. It's not hard. Remember that your average libOS application may not have any of those things (if anything, only a simple IP stack).
I really doubt unikernels will ever be more than a fad. Hardware isn't a good abstraction layer, because 99% of developers aren't kernel developers.
And sure, a kernel's containment mechanism is subject to exploits. But so are hypervisors. At least if you exploit an application in a (properly made) container, you're just an unprivileged user inside the container. You're not root, in the container or out. If you exploit a unikernel application, you can directly go and target the hypervisor.
Docker containers all share the same kernel. Unikernels each have their own kernel, but they only talk to virtualized hardware provided by the hypervisor. Actually they don't separate "kernel space" from "user space" because they depend on the hypervisor to keep the applications separated.
You can think of containers as virtual Linux environments, implemented natively in Linux. Contrasted with full VMs, the overhead is pretty minimal. Since Unix systems have some "virtualization" built in natively, being multi-process, it fits nicely, imho.
A unikernel is quite different. It allows you to create a bootable application with built-in operating system.
One really doesn't have much in common with the other.