Until well into the 2000s we deployed all our services as chroots. I even sold a server-side product, quite successfully, which was distributed as a chroot so it would install on almost any Linux distro without having to care about dependencies. It worked really well and felt quite a bit more accessible and lighter than Docker. We just had a script which basically did:
wget https://bla/image.tgz
tar xzf image.tgz
rm image.tgz
chroot image /start
and the product would be installed and up & running. A lot of clients thought it was magic. No Apache install? No MySQL install?? It works on Fedora & Centos & Debian & Ubuntu & ... ?
I use this currently still a lot to run new software on unsupported hardware/kernels. For instance, I run a lot of new (Ubuntu) software on my OpenPandora under the default OS on that device; I have an SD with a chroot with Ubuntu and things like dotnet core. Runs fine. Gives devices like that a second youth.
Docker has mostly succeeded in making this simple concept complicated. A "container image" is just a tape archive of some files. The runtime is just like your chroot, but with namespaces and cgroups. That's about it.
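To make the "chroot plus namespaces and cgroups" point concrete, here's a minimal sketch of roughly what the runtime layer adds (assuming root, util-linux's unshare, a cgroup2 mount at /sys/fs/cgroup with the memory controller enabled, and a rootfs already extracted to ./image; the "demo" name is made up for illustration):
# cap memory for this shell and its children via a throwaway cgroup
mkdir /sys/fs/cgroup/demo
echo 100M > /sys/fs/cgroup/demo/memory.max
echo $$ > /sys/fs/cgroup/demo/cgroup.procs
# chroot, but with its own PID, mount, UTS, network and IPC namespaces
unshare --pid --fork --mount --uts --net --ipc \
  chroot ./image /bin/sh -c 'mount -t proc proc /proc && exec /bin/sh'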
On top of all this, they designed a new programming language / build system for making those tar files. They invented a container registry so that instead of just "downloading a file", you have to use their particular protocol to download a file. They have their own version of systemd so you can see what programs are running. They tacked on some very complicated networking stuff. They tried to build a couple of container orchestration frameworks on top of all this. I think all the "extra stuff" is where people run into problems, and it isn't even the core of the value it provides.
People have realized that it's all a little much and the good bits have been distilled into their own projects. Many build systems can just generate an OCI image directly (remember, it's just a tar file, and you don't need all that 1970s JCL stuff to make a tar file). Various runtimes (containerd/CRI-O) exist for running containers. And there are orchestration layers on top, for actually managing sets of containers, network rules, etc. (ECS, Kubernetes, etc.).
(Worth noting is that Docker itself does use all these spin-offs. It uses containerd as the runtime, it uses buildkit for building containers now, etc. They even ship a Kubernetes distribution, at least on Mac and Windows.)
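It really is just a tar file; a quick way to convince yourself (assuming Docker is installed, with alpine as an arbitrary example image):
docker pull alpine
docker save alpine -o alpine.tar
tar -tf alpine.tar | head
# a bit of JSON metadata (manifest, image config) plus one tarball per filesystem layer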
I never thought about that. Is it really just that? Then how come Docker is this huge innovation that conquered the world and nobody can live without it anymore?
It's also an ecosystem. The biggest innovation for me is being able to browse the docker hub and get official distributions to start from for basically anything out there.
So the explanation for that is in how moderately sized companies work. There are system administrators, and there are developers. The developer writes some software, tells the sysadmin what environment it needs to run, and the sysadmin builds and maintains that environment.
Docker allows the programmer to write more code to build and maintain their environment, cutting the system administrator out of the loop; the sysadmin just has to provide a server able to run generic Docker containers, and doesn’t have to care about what’s in them.
Basically it’s a bit of a power grab, and a bit of developers being upset about slow processes in their companies (“why does it take two weeks to deploy a new version of this dependency”) and finding a workaround. Then the rest of the world started cargo-culting them.
I agree with the initial assertion as to how companies work when they start to grow to medium or larger. The rest, though, feels an awful lot like projection on the parent's part; someone's an insecure admin or programmer reveling in their superiority at a moderately sized company -- not meant to be an ad hominem, but you're really specific there, buddy.
The enterprise is about supportability, scalability, and ease of maintenance. Officially supported software and common tools/platforms/SOPs are a must unless we're building something brand new from scratch. DIY chroot jails may work just fine... until the two admins who really know it go on vacation or quit or get fired for mining BTC on company gear. Now Johnny-New-Hire needs to unravel it, and it may not be documented well, or even work that well without an admin kicking it occasionally.
Were I to put a job advert out for DevOps roles needing Docker I could usually get someone with a base level of competency, and get them reasonably quickly. Likewise if I need to outsource this to contractors or even offshore budget IT support, it's a lot easier to find a Docker guy than it is running someone through whiteboard exercises to figure out how much they know about jails and makefiles. And if we're way out of our league with issues there is Enterprise Docker, which is expensive but gets us official support from Mirantis (or whoever owns them now), which may not be justified in terms of cost, but absolutely makes management sleep easier knowing we can escalate things if we can't fix it internally.
I think that's a pretty unfair assessment. It is likely that the underlying problem was that no Linux distribution had all the packages people needed. You need the latest version of Python, and a bunch of modules, and MySQL, and Nginx, and node.js, and... and that never happened. Debian stable had something from 6 years ago. Debian unstable didn't boot. Every time you got a new machine or new developer, you would have to painstakingly manually install all those packages on top of your OS base image, probably breaking other parts of the OS as you did that.
Docker lets you build all the dependencies once, and then run it anywhere. You can run your container on your Mac in a (managed!) VM. You can send it to Amazon to run on a random EC2 instance that you don't have to manually update. Or you can just run it on your normal server, without affecting anything else on that server.
It isn't about developers wanting to write more code. It's about developers getting annoyed with the difficulty of installing software.
Nowadays, creating a raw chroot is far from simple. Last time I tried, I had to handle the special mount points like /proc and /sys. I can't remember the details, but I believe there were other problems.
In my experience, systemd-nspawn provides a nice alternative. It's mainly a chroot that abstracts the low-level details, and it fixes some security concerns (with cgroups, etc) and adds some nice features: e.g. networking, or `systemctl status` from the host that will display the sub-tree for the processes of the sub-host.
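A rough sketch of that workflow (assuming a Debian-ish host with debootstrap installed and root privileges):
# build a minimal Debian tree, then run it as a lightweight container
debootstrap stable ./rootfs http://deb.debian.org/debian
systemd-nspawn -D ./rootfs          # interactive shell; /proc, /sys etc. are handled for you
systemd-nspawn -D ./rootfs --boot   # or boot its init and inspect it with machinectl from the host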
Last time I tried to recover an Ubuntu system by booting from a live USB, mounting the drive and chrooting into the target device, I was surprised to find out that systemd refuses to work if it detects being run in a chroot.
If you're talking about what happens when you mount the root filesystem, chroot into it, and run systemctl then you're essentially asking to break out of the chroot and communicate to the service manager outside the chroot which almost always doesn't make sense.
The only time this is reasonable is when a container needs to manage services on the host a la what Portainer does for Docker.
You have a lot of options for accomplishing what you want.
* To run a service with a given root filesystem, add RootDirectory=/path to your unit file (see the sketch after this list).
* To run a service from an image of a root filesystem, use systemd portable services.
* To run your service using the init system in your root filesystem (could be systemd but doesn't have to be) use systemd-nspawn --boot.
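For the first option, a minimal sketch of such a unit (paths and names are invented for illustration; note that ExecStart= is resolved inside the new root, so the binary and its libraries must exist there):
# /etc/systemd/system/myapp.service
[Unit]
Description=My app confined to its own root directory

[Service]
RootDirectory=/srv/myapp-rootfs
# this path is resolved relative to RootDirectory at runtime
ExecStart=/usr/bin/myapp
# optionally bind a few host paths into the new root
BindReadOnlyPaths=/etc/resolv.conf

[Install]
WantedBy=multi-user.target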
Right, but what you're describing as running a pre-systemd "service" is really just "run a shell script that sources some environment files and then double forks".
Running a systemd service means talking to the /org/freedesktop/systemd1 object on the system bus typically listening on /run/dbus/system_bus_socket and asking it to start your service. This is all that systemctl really does.
mkdir ./rootfs
curl -fsSL https://cloud-images.ubuntu.com/minimal/releases/bionic/release/ubuntu-18.04-minimal-cloudimg-amd64-root.tar.xz | tar -xJC ./rootfs
mount -o bind /proc ./rootfs/proc
mount -o bind /sys ./rootfs/sys
mkdir ./rootfs/run/dbus
mkdir ./rootfs/run/systemd
mount -o bind /run/dbus ./rootfs/run/dbus
mount -o bind /run/systemd ./rootfs/run/systemd
chroot ./rootfs bash
# See that it works just by sending dbus messages.
dbus-send --system --print-reply --type=method_call --dest=org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager.ListUnits
# Now do the same with systemctl.
export SYSTEMD_IGNORE_CHROOT=true
systemctl list-units
THIS IS ALMOST CERTAINLY NOT WHAT YOU WANT. You're just talking to the host systemd. You won't see any of the services in your chroot since how could systemd know about them? Your chrooted root is also now just root on the host. Just use systemd-nspawn.
# undo all the bind mount junk.
chroot ./rootfs bash
useradd ubuntu -G sudo
passwd ubuntu
exit
cd ./rootfs
systemd-nspawn -x --private-network --boot
# login
sudo systemctl list-units
The scale at which containerised scheduling of this outcome saves more time than it consumes is much higher than many have been led to believe. I measure it in sysadmin FTE. If you have >0.5 FTE dedicated to managing deployments then there may be a payoff, because (amortized over a reasonable period e.g. 12 months) it takes at minimum half an FTE to setup, manage, and maintain a PaaS properly.
I've become accustomed to folks claiming things like, "I spend 5 minutes a week managing my kubernetes cluster". Then it turns out it took them a solid month to get it going properly, two weeks to containerise their application reliably, and then next quarter they're stuck for days on some weird networking or logging issue, and nothing else progresses. Basically, they either don't value their time, or they're misjudging how much they've actually allocated.
It's often the same people who boast about how cheap their bare-metal servers are vs public cloud instances.
I used that for build machines. Had a btrfs filesystem on a loopback device I could just download as needed to seed the agents and then make a subvolume to work in. When all was done, I could just drop it.
A lot of it was automated with "schroot", and it was definitely faster than spawning a new instance of a VM every time!
A similar approach is used by distributions such as Debian/Fedora/Ubuntu/RHEL.
In these distributions, you have tools ("pbuilder/*builder" in deb-based distributions, "mock" in rpm-based distributions) which create a chroot, grab the build dependencies and install them from the repo, build the package, and then throw away the chroot.
It's really convenient for many reasons:
1) tracking and referencing build dependencies is far better as you have to declare them explicitly, no more "but it used to compile on the old machine... what am I missing".
2) you don't add kludge on the host running the build as you start from a fresh chroot every time.
3) you can target many distributions/versions from one building host (I'm regularly building RHEL 7 and 8 rpms and Debian 7, 8, 9, 10 debs from my Debian sid, for example, without too much worry of weird side effects due to my particular setup).
The downside is that builds can take a bit more time, but that's clearly something I'm willing to trade off for more systematic and (semi-)reproducible builds.
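For anyone who hasn't seen these tools, a rough sketch of both workflows (the config and package names here are just examples):
# rpm side: mock builds the package inside a throwaway chroot described by a config
mock -r epel-8-x86_64 --rebuild mypackage-1.0-1.el8.src.rpm
# deb side: pbuilder keeps a base chroot per distribution and builds in a copy of it
sudo pbuilder create --distribution bullseye
sudo pbuilder build mypackage_1.0-1.dsc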
Yeah, it's very nice (and fast) for creating all kinds of build and test environments without messing up the machine. Languages like Python recommend virtual envs like that, but this is more consistent, and you can actually install whatever you need around Python as well without messing up the parent system.
> We should never forget that at the root of containerization is a desire to keep everyone's shit sorted. That's been a Unix thing forever.
That's literally the point of operating systems: To allow many people to share the resources of one machine without being able to interfere with each other. Modern containers and VMs are just reinventing operating systems, and unfortunately they're doing it on top of existing operating systems rather than building -- you know -- a better operating system.
I have been interested in an OS that would implement virtualization and containers better, "closer to the hardware." I know most users aren't terribly interested in that sort of thing, but I think a lot of hacker-types would enjoy easily sorting projects into different areas. How nice would it be to easily spin up a couple of different machines to test a server-client relationship on the same machine, without a handful of slow VMs?
I feel I don't know enough to see the issues with an approach like that, or know why it's not already mainstream.
Yeah, waiting for the Pyra as well. I got to test the Gemini, which was crap (imho), the Pocket 1 (excellent machine imho), and the Pocket 2 (no nub and thus quite shit; worse battery life too). I only bought the Pocket 1 because I liked it so much when testing, but the OpenPandora is still really one of my favorite machines for on-the-go hacking because I can just hold it in my hands instead of having to put it down on a surface.
hehe, yeah .. the Pocket1 is my current favourite handheld, there's just nothing quite like it .. plus, paired with a couple of iControlPads, it's kind of badass.
Been keeping an eye on GPD's other thingies, but I'm just jaded now. I just want a revised version of the Pocket1 that doesn't suck.. totally with you on the track-pointer.
1) There were/are sometimes ways to break out of them.
2) The process table is mixed in with your OS process table - making it hard to tell what is running chrooted and what isn't.
3) The network stack is shared
4) They share an OS
You can make a "spectrum" of environments where you run code. On one side is everything running on a single server in a single OS; on the other, everything having its own machine and OS. In between you have chroot, docker, blade servers, virtual machines, and other isolation techniques. chroot falls somewhere between everything running in one system and everything running in docker containers on one system.
The "chroots doesn't offer any security" statement needs to be a bit more nuanced. A non-privileged process in a dedicated directory can't do much damage, and certainly can't escape the chroot without some sort of privilege escalation.
That does not mean you can slap chroot syscalls everywhere and call it secure, of course. But it is still an important part of dropping privileges, together with seccomp-bpf, control groups and the various ACL systems.
It is an important part of things like the OpenSSH privilege separation, where the early protocol is handled by a dedicated process in a read-only chroot. It has proven both simple and effective in practice, contrary to the idea that chroots are escapable.
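A rough sketch of that "reasonable chroot" shape using coreutils' chroot(1) (a real daemon like sshd makes the chroot() and privilege-dropping calls itself after opening whatever it needs; the busybox copy and the nobody:nogroup names here are assumptions to keep the example self-contained):
# a minimal, root-owned new root that the unprivileged process cannot write to
mkdir -p /srv/jail/bin
cp /bin/busybox /srv/jail/bin/    # statically linked, so no libraries are needed inside
chroot --userspec=nobody:nogroup /srv/jail /bin/busybox sh -c 'id; ls -la /'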
None of the listed methods are novel. Most are elaborate ways of saying "privilege escalation".
If an untrusted process can escalate to root, ptrace to unrelated processes, or access memory outside your process, it was never really contained in the first place.
Do also note that none of the listed methods works in what the presentation itself calls a "reasonable chroot".
> Why not apparmor/selinux instead
It's not either or. Dropping privileges is something you preferably do in more ways than one.
This. If you want the security features, switch to a BSD (jails) or illumos (zones). These started spiritually from that same chroot place but were designed to incarcerate your software in a small subsystem.
The illumos ecosystem in particular got a lot of work from Joyent in this container vein—like how Windows can now run Linux binaries natively because they implemented the Linux system call table, illumos has Linux-branded zones that do the same “our kernel, Linux’s API, no virtualization” approach to Linux containers.
Windows runs Linux binaries by running Linux in a VM. Something newish is that Windows also implements Windows by running NT and a Windows UI in a VM.
Running on a hypervisor originated at IBM, on its 370, and is very mature technology. Arguably, an OS running on bare metal is practically an embedded system these days; there are just so many things that make a hypervisor useful or essential.
The key insight IBM had was that the hypervisor runs under the control of one of its VMs. That means the hypervisor doesn't need to provide a full-featured, comfortable work environment; that is the job of guest OSes. Instead, it manages resources according to policies implemented in an "executive" guest OS not used for, or vulnerable to mistakes or malevolence in, regular user programs.
A modern example of such a system is Qubes, a security-oriented OS that hosts and orchestrates Linux, BSD, and even Windows VMs.
WSL 2 is virtualization-based (and likely Microsoft’s primary path going forward), but WSL 1 was not— it actually did implement the Linux ABI on top of the Windows kernel, allowing Linux processes to coexist alongside Windows processes (with no actual Linux kernel involved at any point).
It’s actually a pretty neat architecture— I’m on my phone right now and can’t track down a link, but it’s worth reading about if you’ve got the time. Kind of a shame that they moved on to the virtualization approach, but understandable— they’re trying to solve the same sort of problem as Wine, where you’ve got to mimic all the quirks of a foreign OS and it’s also a moving target (so you’re never “done”).
File system access is super slow on WSL. This was one of the drivers. If I recall correctly it is because some common Linux syscalls (stat?) are missing/slow on Windows NT kernels.
The filesystem in general is known to be much slower on Windows due to its extreme flexibility, but Linux design decisions assumed a much more performant filesystem. Hence why running Linux on Windows slammed into the performance problem.
> Something newish is that Windows also implements Windows by running NT and a Windows UI in a VM.
That’s not really that new. Windows 3.x, running in 386 Enhanced mode, was based on a 32-bit pre-emptive multitasking hypervisor (the VMM). Windows apps shared VM 0, cooperatively multitasked, and were mostly 16-bit. VM 1 and above were for DOS apps, including 32-bit DOS apps using DPMI.
(In Windows 3.0, it is possible to start a subordinate instance of the Windows UI in VM 1 or above. This possibility was removed in later versions.)
This architecture was introduced in Windows/386 2.1x and maintained through Win 95, 98 and Me. (In Win 95 and later, most of the 16-bit code in VM 0 became 32-bit.)
> The process table is mixed in with your OS process table
Just wanted to point out that this is true for Docker containers as well.
Something that has actually come in handy from time to time trying to diagnose things.
I think from within the container you can only see that container's processes, but outside (as root) you can see all processes, even those inside the container.
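Quick sketch to see both views (the container name is arbitrary; nginx is just a convenient long-running image):
docker run -d --rm --name pidtest nginx
docker exec pidtest sh -c 'ls -d /proc/[0-9]*'   # inside: only the container's own PIDs
docker top pidtest                               # from the host: the same processes
ps aux | grep [n]ginx                            # ...which also appear in the host process table
docker stop pidtest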
I'm curious as to why blade servers are in your mix—aren't they essentially separate physical machines sharing a common backplane for I/O? Seems basically identical (from a software perspective) to servers hosted in the same rack.
They also often share power among some number of blades. When the spectrum is how consolidated software and hardware are, it is a bit apart from separate physical instances, as the fault tolerance isn't quite the same.
Apart from the other comments, same pros/cons as containers: I seem to recall we had a commercial backup solution that ran like this - as suid root - and came with some quite outdated dependencies.
Vulnerabilities in libc (e.g. the DNS resolver), bundled Sendmail, databases... They can be independently patched (good) or they can be independently left on ancient unpatched versions (bad).
If you use the system/distro library services, it can be easier to verify patch levels and know whether any given hole (e.g. this week's sudo hole) affects your system or not.
For me, Docker is important because it lets me use Kubernetes. That's what I really want to use. If Docker used chroot instead of their own thing, I would be fine as well.
Sure; my favorite platform at that time was in fact Solaris and I had (and have) many Sun machines. However zero of my clients had either Solaris or *BSD. So what can you do...
SunOS became Solaris, which was forked into illumos, which is the basis for SmartOS - Joyent's virtualization platform used by many top tech platforms.
FreeBSD is probably used to serve more data over the internet than any other OS distribution. All Netflix streams go through FreeBSD.
So, one got sold at auction and is now part of a second-tier platform, and the other has as its major modern success story that Netflix runs its servers with it?
Second tier platform? What are you comparing it to exactly?
SmartOS is an open source operating system that is widely used commercially as a kubernetes node (among others). It has native ZFS and DTrace, in addition to zones, etc.
So is kubernetes second-tier? I don't understand your point.
FreeBSD's success is invalid because it's Netflix that heavily uses it? Get real...
That Linux has been eating the world (server, mobile, even embedded), whereas SunOS died and Solaris is mostly forgotten.
SmartOS is just yet another OS with limited adoption. Whether it's "used commercially as a kubernetes node" doesn't mean much. It is itself second-tier, not Kubernetes (an OS doesn't magically adopt the success/adoption rate/etc of the tools it's used with; most Kubernetes nodes remain Linux).
As for FreeBSD, which I used to run back in the day for a spell, it has hugely declined in use since the late nineties.
>FreeBSD's success is invalid because it's Netflix that heavily uses it?
You got it backwards. Your argument that "Netflix uses it" as proof of FreeBSD's success in 2020 is what is invalid.
One company (however big) using an OS doesn't mean the OS is a success, or hasn't declined in adoption.
Apple doesn't disclose much about their infrastructure, but they did post jobs looking for FreeBSD admins so I would imagine they don't have a negligible number of them.
WhatsApp was running primarily on FreeBSD (not sure how it is now)
So? As a feat it's even less impressive than the one we just rejected as some major sign of it being a success (Netflix running FreeBSD).
I'm sure it also runs a lot more servers. But it's less prevalent than it used to be, and Linux has eaten almost everything in that space.
Enumerating success stories is the first sign of a project with few of them (Linux doesn't have to enumerate how many companies use it). It's the same for languages. C or Java don't need a "success stories" page. But niche languages will invariably have one, and tout it as some proof that they're widely used...
I provided individual examples. There are many others. Go do your own research instead of shitting on other people's rebuttals to your clearly uninformed position.
Why are you being negative about a particular distro? You clearly have a chip on your shoulder or an ulterior motive. FreeBSD also powers my NAS (FreeNAS) in my house. It also features ZFS as a first-tier filesystem. Use whatever the fsck you want, but stop being a jerk please. People STILL care about these distros and work to improve them in whatever "niche" markets you think they're in, no sense in you being a little beach about it.
Kind of. OS X uses the userspace and maybe some pieces of the kernel (like the network stack) but never used the BSD kernel wholesale. AFAIK, it doesn’t have jails.
Apple's preference for kernel-mediated containment is sandboxing, which is finer-grained and operates on applications as the unit. This is probably better aligned with their general focus on user-facing devices, although the granularity gets messy imo and overall the guarantees are weaker.
AppImage doesn't chroot, or as far as I'm aware even use namespacing. Most of the dependencies are packaged with the application in a squashfs image that is then mounted in tmp when it is run, so that's the only similarity.
I don't even read or write bash shell scripts regularly and I can understand what that line is doing just fine. I would not be able to understand it if the entire thing was one line, so I think there is a significant difference.
Just because something doesn't follow general rules for readable code, doesn't mean it is actually unreadable.
I don't think soygul is so much challenging the readability of the line, but rather pointing out the hazards of focusing on the # of lines of code as a unit of simplicity/complexity.
Do you actually know what it's doing, or are you guessing and inferring from patterns? There's a difference when you actually need to read the source for details rather than a quick skim.
I can definitely guess the pattern and I do write bash regularly, but I see at least 2 things I'd need to double-check the man page for the behaviour.
If you are asking if I have memorized bash syntax fully and know that everything he did was valid, then the answer is no. However the intent of the code and what each component is supposed to do is clear, which is what I'm looking for when reading code.
Heck, by this definition I'm not sure any code is really good enough. In my job I have to write PHP daily, and I still need to regularly look at docs to figure out the order of arguments for various library functions. I wouldn't be able to tell if explode($a, $b) in PHP has the right argument order at a glance without looking it up. But I understand the intent and generally assume the superficial aspects of the code are right unless that part appears not to work.
And furthermore, adjusting the number of newlines isn't going to help with the question of whether that piece of code is using bash syntax correctly.
I don't see anyone claiming that the syntax is incorrect. But it definitely causes me to have to stop and do some mental processing (mainly thinking about order of operations) to make sense of it.
Good code should not only be "readable", but it should be as easy to read as possible. Potentially a lot of people will read this code, and all of them will need to spend extra time processing it because it's hard to read. Long lines are usually harder to read, which is why most coding guidelines recommend limiting line length to 80 chars; this one is 178.
If you pick bash as your programming language, don't you already throw "readable" out of the window?
Perl gets a lot of flak for being unreadable, but I personally have more difficulty understanding bash code than perl (not implying that perl is super readable).
That roughly corresponds to the following C-like pseudocode:
for (i = 0; i < argc && (arg = argv[i]).startswith("--"); ++i)
if (sep = arg.find("=")) opts[arg[..sep]] = arg[sep+1..]; else opts[arg] = "x";
It might be too much for one line, but it doesn't seem to be super complicated either. In fact that line is (as of now, commit 0006330) only the second longest in that file.
Of course, in real projects I don't write such code. But that's not because it is harder to read, but because it is harder to alter; say, if you have another condition for `-v` vs. `--verbose`, then you would have to reformat it to multiple lines. The meaning of the quoted line is not hard to figure out if you know a bit of bash, and that's all I would expect from a showcase.
Of course what you fight for is readability and only readability. Especially in large projects. There will be literally 10 man-hours spent on reading this for each 1 man-hour of reading-followed-by-altering-it.
I love perl because it's fast, but it definitely follows the WORN pattern. Write Once Read Never. No matter how many comments I leave for future me, it's always a voyage of discovery reading my own code, forget reading anyone else's :)
I wonder how you achieve this. The way I see it, Perl got that reputation mostly from a few sources: Heavy use of regexes when that still wasn't as common, the default variables like $_ and of course sigils (@list, $scalar, %hash). Sure, you can golf it to have some outrageous results in your Usenet signature while remaining McQ, but C has an obfuscation contest and never got as teased about that.
Sure, if you're coming from structured Pascal 101, that might be an issue, but in the day where deeply nested functional rat kings tend to replace the humble WHILE loop, is a Schwartzian transform that confusing?
> Heavy use of regexes when that still wasn't as common, the default variables like $_ and of course sigils (@list, $scalar, %hash).
And implicit variables. Also, implicit variables. Composed types didn't make it any easier; and references, that broke all the logic implicit on the sigils.
> and references, that broke all the logic implicit on the sigils.
They don't break the logic, they just follow different logic than many people assume. Either you're indicating a collection (a hash or array), or you're indicating an item. @ is a plurality indicator, while $ is an individual indicator. You reference the single item of the @foo array with $foo[0] for the same reason you say "the first apple of the bunch" instead of "the first apples of the bunch" when you want the first one. Yes, it's probably better not done that way (it's not something you can usefully change in Perl 5 at this point), but it does follow well defined rules, apparently just not the ones you assumed.
Implicit variables can usually be avoided (except for $_, but that's pretty normal these days, and I'm pretty sure topical variables didn't begin with Perl), and often they're lexical, so usually other people's code messing with them is contained. Except for $_ itself which isn't lexical, which is its own story and a pain point. :/
Implicit variables all came from UNIX shells which are in use to this day. What was the exit code of the last program? "$?".
C works similarly. What was the error from the last system call? "errno".
Certainly there are problems with this; we've all seen the programs that output "Error doing something: Success". But it isn't Perl that invented this. It's a UNIX tradition.
Sure, the logic of the sigils is easily broken, but that just brings us back to every other language, with the added noise of the "$" sign (so pretty much to PHP).
As for implicit variables, I rarely see anything else than $?, $|, $_ and @_, of course. And I'd argue that $_ often makes code a bit easier to understand than cluttering it up with a variable declaration. No worse than point-free programming in more modern languages.
Don't get me wrong, I don't think that Perl is that well architected, and I think Perl 5 could have used some more courage in dropping backwards compatibility, but what I don't get is why Perl is singled out here. It's not that radically different like e.g. APL or ScalaZ.
Compared to other languages of its day, this feels a bit like the "Reformed Baptist Church of God, reformation of 1915" joke. Then again, perfectly fitting into similar conflicts about brace styles or 1-based indexes…
Re: implicit variables, I understand there are environments where "use strict" is not an option. As someone working in Perl heavy shops for about 10 years, I hardly see any Perl program that does not use strict.
It fixes a whole lot of more important problems, but it's mostly aimed at problems that will make your code misbehave, not the ones that will make it hard to read.
One of the early foundational principles of Perl was "TMTOWTDI" (http://wiki.c2.com/?ThereIsMoreThanOneWayToDoIt for the unfamiliar). The intention behind this was cool: do "it" whichever way makes the most sense for your particular situation.
But the end result was horrible: everyone did "it" every possible way, and that, IMO, is the underlying reason for Perl's reputation for being unreadable.
I also think it had a great deal of impact on the development of later languages and principles, which tend to focus much more on removing freedom from the programmer and enforcing idiomatic ways of doing "it". Today, you're far more likely to encounter modern code written by different programmers which looks very similar -- at least on a line-by-line basis, anyway. At the architectural level, it's still a big game of Calvinball.
If that's really what you think and you think it's down to the language, look up Modern::Perl (0) and, if you're object-inclined, Moo (1), Mouse (2) or Moose (3). Then there's always perlcritic (4), too.
In my mind, it's down to it having been popular, attracting many amateurs. You'll find equally unmaintainable bash or PowerShell scripts and definitely as much garbage, WORN JavaScript code. Writing maintainable code requires both knowledge and effort.
Here's a small IO::Async + Moo app that I still consider to basically be how I'd write it today - I invite people to browse the source code and tell me what they do/don't find unreadable compared to more "raw" perl.
ALL of the listed above libraries are complete garbage, impossible to use productively among several developers.
I worked in a company that had huge Perl codebase, which made extensive use of the Moose library. After trying to make sense of it, I gave up and used plain Perl, writing it as unidiomatic and simple as possible, so that hundreds of other devs, also new to Perl, would be able to understand the code I wrote. This was the common sentiment - most of the people followed the same path.
The library is just a nightmare - Perl is dynamically typed, there is NO adequate IDE support (compared to the one statically typed languages have), so good luck with working out how the library works underneath. And if I cannot understand that, how on Earth will I understand what even my code is doing? (Never mind the others')
In my mind, the amateurs are those that created the libraries without any idea on how they are going to be abused, thinking everyone should use unreadable incomprehensible syntax coupled with unapproachable internals.
I apologize for the rant, I had no idea this topic moved me so much.
Basically your problem sounds like dynamically typed languages are hard to work with?
I mean, types do make the software engineering craft a little more tolerable, and that's not exactly a new thing to say here.
But how would this situation be any different than using Python or Clojure?
Talking of artificial bolt-ons. We are living in an era where we do 'from typing import *' and core.spec for Clojure all the time these days. How does this change only when it comes to Perl?
I do dislike dynamically typed languages a bit for building large systems, where there are plenty of alternatives available.
However, I still reach for Python and Bash and Perl for some one-off tasks or glue scripts, and I do appreciate the brevity and clarity they bring for this sort of problem.
Except when it comes to building somewhat large systems (I am talking ~5 mil LOC here) - then every kind of "abstraction", like this disaster of a library Moose, only increases the complexity of the project by a large margin, and acts only as job-security for the original authors of the code, making most of the codebase impenetrable for the rest.
I have not worked with similar large systems in other dynamically-typed languages, so I cannot compare other languages to Perl in that regard. I do know, however, that Perl is simply a disaster to use in that scale.
> Except when it comes to building somewhat large systems (I am talking ~5 mil LOC here)
I think most Perl developers would agree that if you expect your code base to reach into the millions of lines of code (at least if it's all one code base and not an ecosystem using an API), Perl (or any dynamic language) may be stretched to the point where its benefits are outweighed by its drawbacks, similar (on the other side of the spectrum) to if you used C/C++ to build a simple web app.
I work on a similarly nightmare-ish Perl codebase, which makes use of Moose and MooseX (and all the other various plugins people have made), along with various Perl and Catalyst hacks only lifelong Perl monks can understand. The only way to figure out anything is `Pry` and `Data::Dumper` everywhere. Perl critic also conflicts with some of the other libraries in the ecosystem, like the one that provides `method` and `func` (not sure which one it is).
Perl is great for text manipulation and one offs, not large, production systems.
DDCWarn provides Dwarn, which will 'warn' out the dumped structure and then return it, so you can instrument code for debugging trivially.
DDCWarn also provides a DwarnT so you can add a tag that gets printed but not returned, i.e.
return DwarnT TAG_NAME => $foo->bar->baz;
There's not really a book on Moose, but the Moose::Manual pages plus Modern Perl plus The Definitive Guide to Catalyst work out pretty well between them for bringing people up to speed.
We use Mojolicious specifically in order to avoid the needless complexity incurred by Catalyst. Perl is ok for small to medium production systems, but library support is quite lacking for 2020. We'd probably use something else if we had to start fresh today.
Moose is pretty much a straight-up implementation of The Art of the Meta-Object Protocol, which is a seminal work on the subject (plus a few extra bits like Roles inspired by Smalltalk traits and CLOS-style method modifiers).
Over-use of meta stuff is something people often get tempted into when they first use Moose and can make things a bit more complicated, but the core syntax is honestly simpler and easier to use than raw perl OO and I've found it much easier to cross-train non-perl devs to Moo(se) style OO.
If you want an opinionated/terse syntax that encourages you to only be as clever as strictly necessary, I'd suggest looking at http://p3rl.org/Mu
Moo is quite nice actually. It's quite minimal, with minimal dependencies. I discard most if not all Perl modules that depend on Moose and always look for alternatives. Not very keen on dependency hell and importing heavyweight modules in order to develop trivial functionality.
> there is NO adequate IDE support ... so good luck with working out how the library works underneath
LOL, WUT? Why do you need IDE support to figure out how the library works underneath? You have the code, you have the docs, what else do you need to figure it out?
Hmmm. Not sure if that is on purpose, but I don't suffer from that; I can read code I wrote in Perl 20 years ago just fine. I'm not even sure how you would write Perl code like you suggest. It looks noisy sure, but once you know what it means, how is it hard to read?
I guess if you do deliberate golfing/obfuscation you can make anything unreadable.
since strictures fatalizes most warnings as well as turning strict on for maximum "telling perl if I made a mistake to barf immediately rather than trying to be helpful and guess"
This isn't really that horrible; it's just condensing the boilerplate that comes with command line argument parsing. Sure it's quick-and-dirty, but it gets it done and moves your focus elsewhere.
Interestingly, I didn't know you could do this without an `eval`:
function bocker_help() { #HELP Display this message:\nBOCKER help
sed -n "s/^.*#HELP\\s//p;" < "$1" | sed "s/\\\\n/\n\t/g;s/$/\n/;s!BOCKER!${1/!/\\!}!g"
}
Initially I thought that bash supports reflection and is able to get the function contents, including comments. But this function scans its own script file ($1), looking for the comments starting with #HELP and formatting them. This way the usage info can live near the functions implementing sub-commands.
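The same trick in isolation, as a tiny sketch (script and command names invented; save as e.g. demo.sh and run "./demo.sh" or "./demo.sh greet"):
#!/usr/bin/env bash
greet() { #HELP Print a greeting:\ndemo.sh greet
	echo "hello"
}
help() { #HELP Display this message:\ndemo.sh help
	sed -n "s/^.*#HELP\\s//p" < "$0" | sed "s/\\\\n/\n\t/g"
}
"${1:-help}" "${@:2}"
Note that the #HELP inside the sed pattern itself is followed by a backslash rather than whitespace, which is why the help function doesn't match and print its own line.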
Do you just mean without the escaped newline characters? Because the fact that they serve as function comments is exactly what I found to be novel about it.
No doubt this is cool and represents good work! Nice job!
Can we really say it’s a reimplementation of Docker in 100 lines, though, when it requires other dependencies that probably total in the hundreds of thousands of lines? That’s still code that has to be audited by people so inclined. Not to mention the other setup specified in the readme and maybe having to build one of the dependencies from source. Usage doesn’t sound necessarily trivial.
Makes me appreciate Docker that much more though. Getting up to speed using it at my job and it certainly abstracts away many things, simplifying our workflow and conceptual models.
I think the idea is to show how much container functionality is actually available out-of-the-box from the operating system. It raises questions about Docker's billion dollar valuation when someone can come along and reproduce a bunch of it with a short shell script. Next someone needs to take on orchestration and show how silly it was for IBM to pay $34 billion for OpenShift. :P
I remember we used to write a kind of document database, using rcs and Perl, with versioning et al. This was to index all kinds of text files of various formats. This was before JSON and XML got so big as data exchange formats.
It's almost like several of these projects existed for a long time until people came around, took them a little more seriously, and built businesses around them.
This also reminds me of using the Unix DBMs to do all kinds of key-value store work, long before things like Redis and Memcached were around.
meh - I mean, I don't know what language you write in today, but I would wager it's a language that has had at least a million spent in some kind of marketing, books, directly or indirectly or through advocacy (that also costs money as it's a company telling employees to put effort into that instead of other things). And if you include that, we're probably talking about $1M to 1/2 billion if it's in the top 10 tiobe index.
The billion dollar figure, at the end of the day, is brand recognition. The actual cost for someone to re-implement all the code, if someone wanted to, is probably $10M? Even then, however, it will be a fly-by-night, never-heard-of-again project like... rkt.
They do offer other things around Docker. Fairly sure those came along later, but they wouldn't be the first company to inflate the value at first and then add additional revenue streams to catch up.
Plus I think that by providing infrastructure services and technology, they get valued much higher, because other tech companies are their customers and they're usually rich as well.
80% of the functionality isn't where the value resides. It's the 20% of really well-thought-out niche features that make it valuable.
Perhaps not 'unicorn' valuable anymore as it went through that phase of adding enterprise shovel features. These days all cloud enterprise software companies must badly incorporate the features of other products because most Enterprise software buying criteria are just boxes to be ticked.
Docker also depends on hundreds of thousands on lines as well. In fact, to run it on Windows, it requires both Windows and Linux as dependencies /s
More seriously - it's not a complete Docker of course, but lines of code in a project are a liability, not an asset. If you can reduce your project size with reasonable dependencies, you should.
>The problem of waiting for a database (for example) to be ready is really just a subset of a much larger problem of distributed systems. In production, your database could become unavailable or move hosts at any time. Your application needs to be resilient to these types of failures.
absolute bullshit. docker compose thinks that it can excuse its bugs by dint of the fact that we're supposed to build "more resilient" applications to accommodate them.
and, their proposed workaround with "wait for" is disgusting. their tool should be able to handle readiness detection. it's so fucking basic.
it's not only this but this is an example of the bullshit in this shitty tool excused with shitty reasons.
> my test of "does my application work given that the database is running" is explicitly not accommodated.
Are you working with tools where this is a problem in practice?
Most web frameworks I've used will keep trying to connect to the DB in a loop until it either connects and things start normally, or it times out after X amount of seconds where X by default is configured to some number that's way higher than it would normally take for your DB to be available, even if it's starting from scratch with no existing volume.
No "wait for it" script needed (and I do agree that type of solution is very hacky). Although to be fair the same page you quoted said the best solution is to handle this at the app level, which is what most web frameworks do.
Yes, I remember it was a problem with Django. It wasn't just the application server, you might need to run some scripts, after the database is up, before kicking off the webserver. Any workflow like this is explicitly unaccommodated.
100% of the solutions I've seen to address this problem have been hacky - either polling bash scripts or explicit waits.
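For reference, the polling variant usually looks something like this (the pg_isready check, the "db" host name, and the Django-style startup are assumptions about the stack, per the comments above):
# entrypoint.sh: block until the database answers, then start the app
until pg_isready -h db -p 5432 -q; do
  echo "waiting for postgres..."
  sleep 1
done
./manage.py migrate
exec gunicorn myapp.wsgi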
It does seem like Docker will be holding back containers from achieving their true promise, as it flounders looking for profitability and chasing k8s. It's sad that we're still moving around tarballs without advancing some of the tooling around it.
One of my biggest gripes as a non-k8s container user is that we are still keeping all this versioned cruft with every container. I would like to see an easier way to just get the latest binaries, not dozens of tar hashes with versions going back to the first iteration of the container. Probably closer to singularity https://singularity.lbl.gov/
Another area is the ability to update containers in place. Hass.io https://www.home-assistant.io/hassio/ does some of this with the ability to manage the container version from within the container, but seems like a little bit of a hack from what I've seen. Watchtower is closer to what I'd like https://github.com/containrrr/watchtower, but why can't we have a more integrated system built into the default tooling?
Docker is a good tool and deserves some resources to continue developing the product and funding a marketing team. I wish Docker Hub were more competitive with its rivals.
Well, isn't that a specific case though? From my experience most containerized apps use a higher port.
In FreeBSD you would also be able to remove such restrictions if needed (not sure if something similar is available on Linux); alternatively you could have your app listen on a higher port and use iptables to forward port 80 there.
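On Linux the usual tricks look roughly like this (run as root; 8080 as the app's unprivileged port is just an example):
# redirect incoming traffic on port 80 to the unprivileged port the app listens on
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-ports 8080
# or, on kernels >= 4.11, lower the privileged-port threshold altogether
sysctl net.ipv4.ip_unprivileged_port_start=80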
> From my experience most containerized apps use higher port
Most public images I see on Docker Hub run on default ports. Sure, a lot of these are configurable, but then you need to reconfigure all the consumer services to use a non-default port. FreeBSD is not an option, unless you are willing to run on your own hardware. As for iptables, does podman provide network isolation where you can define iptable rules per container? I know it wouldn't work with docker.
What's stopping you from running your own registry? Or keeping images on a build machine and moving them around with some file sharing mechanism? You don't need a docker account to pull public images from dockerhub, and you don't _have_ to push your images to dockerhub
Docker stopped publishing builds of their own registry many years ago. So if you want to run the official registry, you need to build it from source. This leads to a fun and exciting bootstrapping process; to launch the registry, you have to pull it from somewhere. Since your registry, which is where you'd like to store it, isn't running, you can't pull it from there. So you have to use some third-party registry to bootstrap. Or do what I did, and give up, and just watch their registry crash randomly when it receives input that confuses it.
People will make fun of me if I go into the great details of the workarounds I have to make a DigitalOcean managed Kubernetes instance pull images from a registry that's hosted in the cluster. But it's fun, so here we go.
I use a DigitalOcean load balancer to get HTTP traffic into my cluster. (This is because the IP addresses of Kubernetes nodes aren't stable on DigitalOcean, so there is really no way to convince the average browser to direct traffic to a node with any predictable SLA.) I configured the load balancer to use the PROXY protocol to inform my HTTP proxy of the user's IP address. (I don't use HTTP load balancing because I want HTTP/2 and I manage my own letsencrypt certificates with cert-manager, which is not possible with their HTTP load balancer. So I have to terminate TLS inside my cluster.) Of course, the load balancer does not apply the PROXY protocol when the request comes from inside the cluster (but the connection does come from the load balancer's IP). Obviously you don't really want internal traffic going out to the load balancer, but registry images contain the DNS name of the registry in their own names. The solution, of course, is split-horizon DNS. (They should seriously consider renaming "split-horizon DNS" to "production outage DNS", FWIW.)
That is all very easy to set up with coredns. It is easy to make registry.jrock.us resolve to A 104.248.110.88 outside of the cluster, and to make it resolve to CNAME registry.docker-registry.svc.cluster.local. inside the cluster. But! Of course Kubernetes does not use cluster DNS for container pulls, it uses node DNS. Since I am on managed Kubernetes, I cannot control what DNS server the node uses. So the DNS lookup for my registry has to go through public DNS. I created an additional load balancer, for $5/month, that is only for the registry. That does not have the PROXY protocol enabled, so when someone DoS's my registry, I have no way of knowing who's doing it. But at least I can "docker push" to that DNS name and my cluster can pull from it.
This is all fine and nice until you make a rookie mistake, like building a custom image of your front proxy and storing it inside your own registry. What happens when DigitalOcean shuts off every node in your cluster simultaneously? Eventually the nodes come back on and want to start containers. But your frontend proxy's image is stored in the registry, and to pull something from your registry, it has to be running. This results in your cluster serving no traffic for several hours until you happen to notice what happened and fix it. (I do have monitoring for this stuff, but I don't look at it often enough.)
And that's why I have 99.375% availability over the lifetime of my personal cluster. And why smart people do not self-host their own docker registry.
> building a custom image of your front proxy and storing it inside your own registry
But do the images have to be co-located with their registry?
Can't the images be somewhere else, and the registry replicated among the nodes, so any node can find its image through the registry and fetch from that location?
For me, the hardest thing about Docker was running my own registries. Or rather, configuring credentials so Docker would use them. Not to mention the hassle of doing that through Tor.
Fun anecdote: Early in the lifespan of CoreOS (early 2014 IIRC) I was meeting with technology leadership at a Fortune 500 customer. I remember the head of developer services asking all kinds of Docker questions. Finally, he was cut off by his boss who said:
"Bob¹*, unless we've solely staffed the organization with folks from coder camps a lot of the folks here should be able to carve out a couple of namespaces in the kernel. That's not interesting and that's not why we asked CoreOS to come in today. Let's talk about the actual problems they're trying to solve."
<3 <3 <3 <3
That being said, it's great that folks continue to dig in to decompose the tooling and show what's actually happening behind the scenes.
¹Bob was definitely not that guy's name.
EDIT: Added note of context appreciating the work by the author.
I was thinking about something like this for build systems. Everything in Docker is regular Linux. I get why Docker is so big for its use case as cloud deployments, but what I actually want from it is such a small piece of it. Hermetic, reproducible builds that produce the same binary on the same kernel release. No dependency hell because the dependencies are all part of the same build. (I know Bazel has a lot of this already.) The Docker solution of pulling in an entire distro is overkill, and it doesn't even solve the problem because dependencies are just downloaded from the package manager.
If that's your concern then Nix might be your thing, because that's what it is targeting. It approaches the problem slightly differently. Instead of generating an image, it is used to describe all dependencies for your project down to glibc. Of course you're not expected to define the full dependencies of your project yourself, so you (and most people) will use nixpkgs. As long as nixpkgs is pinned to a specific version (it's literally a GitHub repo) you can get an identical result every time.
Once you have that, you can deploy it in whatever way you like. Nixpkgs has functions to, for example, generate a Docker image that contains only your app + essential dependencies. You could deploy it using the nix package manager as you would install a typical app in the OS. You could also describe a configuration of NixOS that has your application included, and have it generate an image of that.
> it doesn't even solve the problem because dependencies are just downloaded from the package manager.
The advantage of Docker is that you can verify the container works locally as part of the build process rather than finding out it is broken due to some missing dep after a deployment. If you can verify that the image works then the mechanism for fetching the deps can be as scrappy as you like. Docker moves the dependency challenge from deployment-time to build-time.
Does container mean something different to y’all than it does to me?
I ask because I read your comment as saying “the advantage of Docker is that it uses (explanation of what containers are)” and the parent comment as saying “all I want from Docker is (explanation of what containers are)” and I am confused why (a) y’all are not just saying “containers” but rather “the part of docker that packages up my network of scripts so I can think about it like a statically linked binary” and (b) why you think this is a competitive advantage over other things you might have instead recommended here (Buildah, Makisu, BuildKit, img, Bazel, FTL, Ansible Container, Metaparticle... I am sure there are at least a dozen) to satisfy the parent comment’s needs.
Is there really any container ecosystem which has write-an-image-but-you-can’t-run-it-locally semantics? How do you finally run that image?
Docker is too general, too much of a Swiss army knife for this particular problem. The problem I am talking about is where a C++ program has all of its dependencies vendored into the source tree. When you run Make, everything including the dependencies build at the same time. All you need is a chroot, namespaces, cgroups, btrfs, squashfs--plain old Linux APIs--to make sure the compiler has a consistent view of the system. Assuming the compiler and filesystem are well behaved (e.g., don't insert timestamps), you should be able to take a consistent sha256sum of the build. And maybe even ZIP it up like a JAR and pass around a lightweight, source-only file that can compile and run (without a network connection) on other computers with the same kernel version.
Again, Bazel is basically this already. But it would be nice to have something like OP's tool to integrate in other build systems.
I could just make a Dockerfile and say that's my build system. But then I'm stuck with Docker. The only way to run my program would be through Docker. Docker doesn't have a monopoly on the idea of a fully-realized chroot.
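A crude sketch of the "consistent sha256sum" part (assuming the build itself is deterministic and its output lands in ./out):
# hash every file's contents in a stable order, then hash the resulting list
find ./out -type f -print0 | sort -z | xargs -0 sha256sum | sha256sum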
For some scenarios, most (all?) of them have write-an-image-but-you-can’t-run-it-locally semantics.
My build server is x64, but target output is ARM. Can't exactly just run that locally super easily. Perhaps somebody has created a container runtime that will detect this, and automatically spin up a qemu container, running an arm host image, and communicate my container run request (and image) to that emulated system, but I haven't heard of that feature. (Not that I actually looked for it.)
In my current company we are deploying almost all code as Docker (with the exception of lambda functions). Having talked to multiple developers: no one uses Docker for local development, except maybe using it to spin up another service that might interact with the app, but even that isn't preferred. Mainly because unless you're running Linux, Docker is quite expensive on resources due to running under a VM.
"Everything in Docker is regular Linux" is a bit of a misleading statement IMO. You aren't required to pull an entire big distro like Ubuntu and install deps from a package manager. Are you familiar with scratch builds? You can create an image that is basically just a single executable binary with a few mounts handled for you behind the scenes so that chroot works.
Typically a minimal Alpine distro image will allow you to pull in deps however you want (either via package manager, manual download, or copy), run a build, and then copy only the artifact over to a scratch image for a final image that's only a few MB in size.
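A rough sketch of that pattern (assuming a static Go binary; the Go version tag and names are arbitrary, and the multi-stage Dockerfile is inlined via a heredoc just to keep the example in one place):
cat > Dockerfile <<'EOF'
FROM golang:1.21-alpine AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /app .

FROM scratch
COPY --from=build /app /app
ENTRYPOINT ["/app"]
EOF
docker build -t myapp:scratch .   # the final image is just the one binary, a few MB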
Huh, clever use of btrfs volumes; it does make it a little dependent on the filesystem though. Quite informative overall. It probably implements only a fraction of docker, but it does do most of what you'd typically need.
Containerization makes sense for a tiny, tiny fraction of all products. It's just been overblown and overused, creating another point of potential failure for no great benefit at all.
Just chroot the damn thing if you need to and keep it simple.