The bad side of systemd: two recent systemd failures (utcc.utoronto.ca)
121 points by lreeves on Dec 12, 2014 | 119 comments



I'm no fan of systemd, but do we really need to be nitpicking its every failing? The alternatives are certainly not perfect either. If you don't like it, don't use it. Many distros are still committed to supporting other init systems alongside systemd, and many BSDs have a very good philosophy with things like this and may be worth a look too.

Can we just agree to move on? I'm sick of the endless banter. If you like systemd, use it, if you don't like it, don't. If you think it needs more time to mature, stay away from it for a while and experiment occasionally. That's my plan at least.


Within all the noise about systemd I think posts like this are quite beneficial. Instead of discussing politics or throwing out buzzwords, it presents a specific technical shortcoming. Positive or negative, these are the types of things that people should be reading about in relation to systemd.

The problem is that people are deciding whether or not they like systemd based on "do you like Lennart" or "do you like The UNIX Philosophy", because that's what people are making noise about.


As I said in another comment, I really wish it went into more detail on the actual problem.

The problems with journalctl are usability ones, which doesn't make them any less of a problem, but doesn't point to an underlying technical failure either.

The problems with systemd are basically stated as "a Fedora upgrade causes systemd to go bad." Is this a systemd problem or a Fedora problem?


Given that Fedora and systemd have been developed in lockstep over the last few years, one would think they worked well together...


I'd hope that any distribution would be packaged to work well with its init system, regardless of which init system that is. After all, the whole point of a distribution is to have a core set of software[0] that works together reasonably well.

[0] With the number of different packages in a modern Linux, I don't expect everything in a distribution's repository to work perfectly, but core stuff (like the stuff that gets you from a cold boot to a login prompt, etc) should be rock solid.


Some anti-fragility on the part of both Fedora and systemd would help here. Maybe a statically compiled init would be a start?


systemd does not support static linking due to its use of NSS. See: http://lists.freedesktop.org/archives/systemd-devel/2014-Mar...


> Only the most trivial libraries actually support it.

... what kind of nonsense is that? Almost every library supports static linking to a basic degree at least. Heck, I statically link giant things like Qt and Boost every day at work.


Well, the core of the issue here is that you can't really statically link against a library that depends on dlopen/dlsym (the NSS system is built around this).

I mean, you could, but now your statically linked binary still depends on other dlsym'd code. So what's the point?

(Not trying to make an argument that this is good or bad behaviour)
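To make the runtime dependency concrete, here is a rough sketch. It assumes the usual glibc convention that a source named `foo` in /etc/nsswitch.conf is served by a `libnss_foo.so.2` module loaded at runtime. `nss_sources` is a hypothetical helper name, and the parsing is simplified: it skips bracketed action items and stops at comments.

```shell
# Print the NSS module file names implied by one database line
# of an nsswitch.conf-style file.
# $1: database name (e.g. passwd)
# $2: config file (assumed default: /etc/nsswitch.conf)
nss_sources() {
    awk -v db="$1:" '
        $1 == db {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break        # rest of the line is a comment
                if ($i ~ /^\[/) continue    # skip action items like [NOTFOUND=return]
                print "libnss_" $i ".so.2"
            }
        }
    ' "${2:-/etc/nsswitch.conf}"
}
```

Running `nss_sources passwd` lists the modules that user lookups would still pull in at runtime, even from a binary that was otherwise linked statically.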


I understand, but the question then becomes why your library depends on dlopen and why you don't offer an alternative static method as Qt's "plugins" do. Doesn't seem like good design on the part of the NSS devs to me, and probably a poor choice of library on the part of the systemd devs to me. But maybe I'm missing something.


The library that depends on dlopen is glibc.


You can statically link glibc, so I'm pretty sure that's not right...


I can see systemd as a whole not supporting static linking. But systemd is a collection of many different binaries that each do their thing, so would it be possible for systemd (the pid 1 executable) to be statically linked while the other supporting executables are dynamically linked?


The systemd developers try to avoid using NSS on PID 1 (for very good reasons), so NSS wouldn't be a problem for static linking PID 1.

* See for instance https://bugzilla.redhat.com/show_bug.cgi?id=915912#c18


Sounds like it would just need some changes to their build setup to allow the PID 1 piece to be compiled statically, which doesn't seem too bad.

Of course, this may not even solve the root cause of the problem this article talks about. It would make PID 1 a little more resilient though.


> systemd does not support static linking due to its use of NSS.

Well, that is a surprise to me. And something of a concern.


Anti-fragility isn't the same as robustness. It should rarely be used as a term.


I'm not even sure how any software could be described (accurately) as anti-fragile. Robust, yes. But anti-fragility indicates that performance improves with stressors or catastrophic/unexpected events.


I don't really have a good or bad opinion about systemd; to me the init system and things around it are just pieces that need to work.

The points discussed here are serious, should be filed as bug reports, and should be acted on by the developers. But the article also just sounds like systemd bashing. Software has bugs and poorly thought-out solutions, which can be fixed or improved - it's just code.

I could probably have written a similar article about many, many Fedora (and Debian) upgrades over the course of a decade or so: distro upgrades making the init system crash or fail to start services, glibc upgrades screwing up servers, grub upgrades calling for a 2 hour car drive to fix, and so on. But I didn't; I either filed a bug report, fixed the code, or was just plain too busy getting things back to working.


those are completely valid criteria. What's wrong with wanting your system to follow the closest thing to a known axiom in operational computer science that has ever existed?


> If you like systemd, use it, if you don't like it, don't.

That is so naive. No one chooses to use systemd. People choose a distro. It's being forced on long time users of distros without any of the users having a say.

Yeah, I can switch from one distro to another. That's a choice I have. But it's not so simple as you make it sound.


> It's being forced on long time users of distros without any of the users having a say.

It's being adopted by distributions in accordance with the stated and implied wishes of the developers, maintainers, and users of the distributions.

The reason that (e.g.) Arch dropped support for legacy init so quickly was because they literally failed to find anyone who wanted to go through all the work of maintaining the old init system. If none of the users want to do the work, then nothing's being "forced" - that's the way community-organized FOSS projects work![0]

Same goes for Debian, who made the switch in accordance with their normal development process (including a general resolution in support of systemd!). Debian has promised to maintain systemd-shim (supporting other init systems) through at least Jessie, and possibly longer, depending on whether there is enough demand for alternative init systems, and support from developers for maintaining it in future releases.

Ironically, the biggest threat to systemd-shim in Jessie's successor is the people who claim they want to fork Debian - if all the people who hate systemd leave Debian, then nobody who wants to actually do the work to maintain the alternative init systems will remain with the Debian project[1].

[0] Even then, you can still find an AUR package for it (https://aur.archlinux.org/packages/initscripts-fork/), though as with everything in the AUR, it's subject to much less testing. Note that the package has only 27 votes - this means that only 27 people actually said that they care about moving this package back into the main distro.

[1] This may be a bad or a good thing, depending on who you ask.


>The reason that (e.g.) Arch dropped support for legacy init so quickly was because they literally failed to find anyone who wanted to go through all the work of maintaining the old init system. If none of the users want to do the work, then nothing's being "forced" - that's the way community-organized FOSS projects work![0]

Users != developers

I am not a developer. I'm an administrator. I know enough C to be able to poke around source code and understand what is happening when I get an error, but not enough of it or anything else to maintain an init system. A lot of admins don't know this much. A lot of desktop users don't know this much.

I'm not disagreeing with your overall sentiment here - but it's not so simple as "If the users of the product liked it, they'd maintain it" - sometimes they don't have the skills required to. Not everyone who drives is a mechanic, etc.


> I'm not disagreeing with your overall sentiment here - but it's not so simple as "If the users of the product liked it, they'd maintain it" - sometimes they don't have the skills required to. Not everyone who drives is a mechanic, etc.

What I find quite unfair is that many people say that they don't have the technical skills to effectively contribute to the project, let alone develop or maintain any alternative, yet accuse the systemd team (or the GNOME team, or the NetworkManager team, or the XFCE team, or whatever) of producing sub-standard software. Given the self-asserted lack of skills, on what basis is such a severe judgement made?

Can we please go back to trusting each other, praising those who actually do some useful work, or at least avoiding open attacks on them? Technical disagreement is welcome, but it has to be backed by some skills to be useful.


> Not everyone who drives is a mechanic, etc.

If your car requires more knowledge to maintain than you have, you can get a different car, pay someone to maintain it or get a bike. Those are the only options I see. I don't see how init is any different than a car. You have those same options. I'm sure someone somewhere is paying RH or Canonical or in-house devs/admins to maintain init scripts for their application stacks.


I hesitated to make the analogy because I expected a response like this, stretching the analogy too far.

It isn't a matter of managing init scripts - that's easy. Continually updating a distro to use your init system of choice is a wholly different matter. I can change the oil in my car. I can't replace the transmission.


> Yeah, I can switch from one distro to another. That's a choice I have. But it's not so simple as you make it sound.

Switching distros is like moving to a new city. If you stay in the same state (or at least in the same country) it isn't too disorienting. But switching to something completely different (like on another continent, or going from an RPM-based to deb-based distro) will require some significant effort to adapt to the new situation.

We pick our distro of choice based on a number of factors. Just because it has some downsides doesn't mean we have to move immediately. But that doesn't mean we can't give (constructive!) feedback if there are problems.


> No one chooses to use systemd.

I did. I've been systematically installing systemd on my Debian machines.


Because it's a free software project you are not paying for and use as a gratis gift from the developers and maintainers of the distro.

And those developers and maintainers wanted to use systemd.

If, as a user, you disagree with the developers and maintainers of your choice distribution, you picked the wrong distro, and should either find developers and maintainers that align with your views or fork it and become a developer / maintainer yourself if nobody provides for your use case.

You cannot bitch on a high horse when not paying a cent for any of this about how the developers and maintainers are doing what they want with their distro. Well, you can, but nothing will ever happen - they already made their choice based on their needs and use cases.


I think ongoing constructive discussion about the technical and philosophical issues with software design, particularly of open-source software that affects huge numbers of users, is always valuable. The OP here even likes systemd, so the article isn't nitpicking.

As others have pointed out, when your init process faults, that's a major problem. In the real world, for the core elements of an operating system, whatever features they have are less important than how well they fail. In this case, the OP has identified a very very bad failure case.

More importantly, this report makes concrete the theoretical complaint against systemd that its monolithic design is not just a philosophical issue, but a practical one. The init process needs to be rock solid, but a law of software is that bigger software is less stable. So if the sprawling nature of systemd means it will tend to become less stable over time, then we have a major practical issue that needs to be addressed.

This is not a matter of not moving on. Rather, given that systemd has been adopted by several major Linux distributions, it's actually far more important now to criticize its design failures, to draw attention to get things fixed. If "moving on" means we shouldn't critique the software, then how will it ever improve?


Not sure if /sbin/init failing is a nitpick. It's kind of a catastrophic failure as far as operating systems are concerned.


Never mind that it results in a zombie state where you still have a shell, but no commands actually work.


Given the huge ambitions of the project maintainers and how everyone's rushing to adopt it even amidst sloppy integration work, yes, the criticism is worth it.

"The alternatives are certainly not perfect" is no consolation. At all. Especially considering the vast majority were never fairly considered to begin with.


I had the same sort of annoyances, but instead of whining, I simply ditched it for now. I don't see why people have such a problem doing this.


> I'm no fan of systemd, but do we really need to be nitpicking its every failing?

Yes. When some developers are pushing down your (and mine) throat critical software that sucks _that_ much, that's the least someone can do.


Some developers wrote some code and published it in a software package under a free software license.

Other developers picked up said package and added it to the package collection they are maintaining.

You choose to use the package collection maintained by said developers, which means that you trust their technical decisions otherwise you'd be running software from people you don't trust, which is not a great idea.

Nobody is pushing software down anyone's throat. Please avoid such hyperbolic mischaracterization, it really does not contribute anything to the discussion.


> You choose to use the package collection maintained by said developers, which means that you trust their technical decisions

This does not however mean that I must silently accept their bad choices. If they were about to step out in front of a bus I wouldn't remain silent because I trust their judgement. Sometimes people you respect exercise poor judgement. This is one of those cases.

Feedback is a thing. You're giving it here. Why shouldn't I give it to the distros that certainly are trying to force systemd on me?


> This does not however mean that I must silently accept their bad choices.

But there are proper places to discuss this issue with them, and HN is totally not one of those.

> Sometimes people you respect exercise poor judgement. This is one of those cases.

Sure, but have you considered that the one exercising poor judgement in this case may be you?

I don't know which distro you're using, but Debian has extensively discussed the issue and the first tech CTTE resolution made lots of valid technical points in favour of systemd. For sure it's not an obviously bad choice, and if your maintainers have chosen to ship it they may have valid reasons to do so. Not to mention that it's still entirely possible to avoid systemd for whatever reason; at least if you're using Debian it's trivial.

> Feedback is a thing. You're giving it here. Why shouldn't I give it to the distros that certainly are trying to force systemd on me?

Right, but I'm here trying to give feedback to you for your actions in this thread. You're here giving feedback on HN to people that probably don't read HN, and doing so with a rather inflammatory tone.

If you want to give feedback to your distro maintainers, please get in touch with them in the appropriate channels (IRC, mailing lists) and ask them their reasons, but please first check if they haven't already stated those, one can just get annoyed if every person in the world keeps asking stuff that has been answered countless times.


> Can we just agree to move on?

No, we can't. Sorry but that's just how the world works. Just like every Go language post has "generics or crap" mentioned, every systemd post will have "The Unix Way" mentioned.

A price of freedom of speech is having to deal with self-filtering idiotic viewpoints. Worth it in my opinion, but I'm an anti-censorship for EVERYTHING person.


I'm not suggesting censorship of any sort. I'm certainly the same way. I'm just suggesting level-headedness and a willingness to take or leave it and move on with life.


I understand, and I also agree with your suggestion. I guess my point is really that even though it's a nice ideal, based on my experience it's likely not attainable.


> If you don't like it, don't use it.

Have you tried removing systemd from latest Ubuntu? It didn't go very well for me. I imagine removing it from RHEL 7 won't go over well either.

> I'm sick of the endless banter.

Ok but "use something else" or "fork it" is not exactly solving the problem either.


> Have you tried removing systemd from latest Ubuntu? It didn't go very well for me. I imagine removing it from RHEL 7 won't go over well either.

Ubuntu is not the Universal Operating System, that's Debian. This means that Ubuntu is definitely opinionated in its software selection, and rightly so. You couldn't replace Upstart and now Ubuntu relies on systemd. If you have the skills to fully evaluate the costs and benefits of mixing and matching random pieces of software, a distribution like Debian or Gentoo may be a better fit for you.

Indeed, switching to sysvinit in Debian is trivial and the systemd team worked really really hard to make sure the transition is smooth even if one keeps switching back and forth. They should really be praised for the effort they've put in it.


And that's the thing. If it was "just an init", it would be just as replaceable as sysv, openrc or runit. But it is not, so it is not. As best I can tell, a major argument for going systemd is that something more user facing (Gnome) depends on a subsystem of systemd, and that in turn requires that systemd is used as init. In other words, systemd as a project is invasive and insidious.


The people that develop GNOME chose to depend on the D-Bus interfaces exported by some systemd-provided services (e.g. logind) because they actually solve some problems they were facing with a clean and robust solution. Yet they made sure that the dependency is optional, to accommodate those that for one reason or another do not use systemd.

This means that systemd is not insidious, it's valuable.


>If you like systemd, use it, if you don't like it, don't.

I think people would care less about systemd if that was a realistic choice. I strongly believe that the whole systemd debate would go away if systemd wasn't a dependency of GNOME and other stuff.

Personally I have no strong feelings one way or the other, but I do fear that the BSDs are going to be left behind as they to some extent have been piggybacking on Linux in terms of desktop software.


you're missing the point... SystemD is about RedHat trying to take over the entire Linux ecosystem. It's intentionally designed so if you use any part of it (which there are good parts) then you have to use all of it (which there are many bad parts). Oh and conveniently RedHat is the main project maintainers.


I can't speak to the first problem, but the second is more of "this is strange and I don't care to learn how it works".

journalctl --no-pager

Problem solved. I mean, you have the command, it's right there in the usage text. Which, awesomely, does invoke a pager when you do journalctl --help so if you're stuck on a 80x25 console (as you might be on a broken system) you can easily actually read the help options.


I found that when the output is piped it doesn't invoke the pager. This seems reasonable to me.

What doesn't seem reasonable is the truncation behaviour. When the output isn't piped, it gets truncated to the width of my terminal. When the output is piped, it gets truncated to 80 columns. I'd expect no truncation when the output is not a terminal.
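The expected behaviour can be sketched as a small shell filter. This only illustrates the expectation; it is not how journalctl is implemented. `truncate_for_terminal` is a hypothetical name, and falling back to 80 columns when $COLUMNS is unset is an assumption.

```shell
# Truncate lines to the terminal width only when stdout is a terminal;
# pass everything through untouched when stdout is a pipe or a file.
truncate_for_terminal() {
    if [ -t 1 ]; then
        # COLUMNS is usually only set in interactive shells, hence the fallback.
        cut -c "1-${COLUMNS:-80}"
    else
        cat
    fi
}
```

With this, `some_command | truncate_for_terminal` clips long lines on a terminal, while `some_command | truncate_for_terminal | grep foo` sees the full lines.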

This is an outdated systemd though (I installed it on a Raspberry Pi [Raspbian] in order to play with it), so this may have changed.


>I found that when the output is piped it doesn't invoke the pager. This seems reasonable to me.

This might be less being smart and getting out of the way. From http://en.wikipedia.org/wiki/Less_%28Unix%29#Usage:

>By default, less displays the contents of the file to the standard output (one screen at a time). If the file name argument is omitted, it displays the contents from standard input (usually the output of another command through a pipe). If the output is redirected to anything other than a terminal, for example a pipe to another command, less behaves like cat.


That could be, but the change in wrapping behaviour makes me think it's journalctl doing it. If less acts like cat when the output is a pipe, then it wouldn't be wrapping at 80 columns, so journalctl must be doing the wrapping in that scenario.


yum does the same thing. I actually looked at the code and it always assumes the output width is 80 characters and wraps or center…truncates if output isn't a terminal. It also doesn't properly sort its output, so you're tempted to use `yum search | sort`, which ends up with indented lines at the top of the output.

Of course, the recommended thing to do is use a different program for scripts/pipelines rather than yum, so now there are two different interfaces that need to be remembered. yum looks like a terminal-friendly program, but actually isn't.

https://bugzilla.redhat.com/show_bug.cgi?id=584525

https://bugzilla.redhat.com/show_bug.cgi?id=986740


    journalctl | cat
...would cause less to act like cat and cat to, well, :-), and then the tty driver usually wraps from the last column to the first.


Sorry, when I said "wrap" I should have said "truncate".

journalctl truncates at the width of my terminal or at 80 columns if the output is not to a terminal.


> I can't speak to the first problem, but the second is more of "this is strange and I don't care to learn how it works".

I can't speak for you not speaking to the first problem, but the second is more of "this is strange and I didn't care to read the text"

> In the grand systemd tradition, there is no option to control this; all you can do is force journalctl to not use a pager or work out how to change things inside the pager to not do this.

(emphasis mine)

> Problem solved. I mean, you have the command, it's right there in the usage text. Which, awesomely, does invoke a pager when you do journalctl --help so if you're stuck on a 80x25 console (as you might be on a broken system) you can easily actually read the help options.

From reading the text, I'm under the impression that the issue is the mangling of $LESS. One would expect that $LESS is set up how the user wants it, if the user wants -S he'll set that.


Okay, okay. I understand that the user might be surprised.

man journalctl

    $SYSTEMD_LESS
           Override the default options passed to less ("FRSXMK").
So - wherever the user defines $LESS I'd add a line a la

    export SYSTEMD_LESS="$LESS"
and that should solve the problem nicely, right? Surprising? Not intuitive? Maybe. But easily fixable with a quick peek at the man page.
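One caveat with a bare export: if $LESS is unset, it exports an empty $SYSTEMD_LESS, which presumably replaces the defaults quoted from the man page with nothing. A slightly more careful sketch for a shell profile, where `mirror_less` is a hypothetical helper name:

```shell
# Copy $LESS into $SYSTEMD_LESS only when $LESS is actually set,
# so the journalctl defaults stay in effect otherwise.
mirror_less() {
    if [ -n "${LESS+set}" ]; then
        export SYSTEMD_LESS="$LESS"
    fi
}
```

Calling `mirror_less` after the line that sets $LESS keeps the two in sync without clobbering anything when $LESS is absent.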


So I have no dog in the systemd fight. However. $SYSTEMD_LESS is quite possibly one of the most aggressively stupid ways to force a bad UX on someone.

I mean seriously. If you have $LESS defined then you obviously have an expectation that less will behave in a certain way. An expectation that has in all probability built up over years of tweaking your shell.

For some other application to ignore that preference and in fact force its own preference on you and only allow you to use your own preference for how less should work by reading not the docs for less but the docs for an unrelated application? That breaks so many rules it's just sad.

Yes it's easily fixable with a quick peek at the man page for systemd which might not be the first thing you check given you've been using less for years. Systemd's manpage is the last thing I would check. First would be what is my $LESS set to? Second would be is less still less? At some point it would occur to me that maybe systemd is being aggressively stupid and check their manpage.

But given I have no particular preference I would be inclined to assume it couldn't possibly be systemd's fault. I must have messed something up.

Put this minor annoyance in the middle of attempting to debug a segfaulting init? That's what we call a major annoyance.


We got to agree to disagree here, I guess.

I think it makes sense to use a pager by default.

I am not convinced that your less options that were (granted, you referred to the full environment) 'tweaked over the years' necessarily apply. When you defined those options you were thinking of all the cases where you invoke less, or at least where you knew less was being invoked.

Now journalctl defaults to use a pager if you invoke it. I'm not convinced that

journalctl

and

less /var/log/messages

should necessarily be configured the same. Maybe it would make sense to fall back to $LESS if $SYSTEMD_LESS isn't configured - I'm not sure - but I do think that it makes sense to have this environment variable.

>For some other application to ignore that preference and in fact force its own preference on you and only allow you to use your own preference for how less should work by reading not the docs for less but the docs for an unrelated application? That breaks so many rules it's just sad.

Hmm.. Not in my world. The author was launching journalctl, not journalctl | less (which, as others have pointed out, would probably work as he expects it to). If you launch 'foo' and it doesn't do what you want and aren't an expert in all things 'foo', read man foo.

Which, for foo=journalctl, spells out that

- there is an option to stop truncating very long fields

- there is an option to use no pager

- the default pager is less

- the default less configuration is as quoted above

If you launch journalctl, see output you don't like and _don't_ skim the manual? Now that is what I personally would consider sad.


    If you launch journalctl, see output you don't like and 
    _don't_ skim the manual? Now that is what I personally 
    would consider sad.
Yeah I'm afraid we have to agree to disagree here too. I'll just say that if I launch journalctl and see output that is obviously being printed to the screen by less, my first thought will be to check my less configuration, not journalctl's. These tools are meant to be used in a way that respects the user's configuration for a reason.

Flouting that convention is bound to be rage inducing when encountered during a particularly nasty debugging session.


And this is why Torvalds defaults to "if a change in the kernel breaks user space, the fault is in the kernel no matter how buggy the user space is".

The attitude of many of the core systemd devs seems to be the polar opposite. They decree from on high, and if you don't like it, tough luck.

In a sense this is the same difference in attitude we see between Apple (systemd) and Microsoft (Linux kernel).

Microsoft has in the past bent over backwards to ensure that people could run older software on newer versions of Windows.

Apple in contrast has removed and changed expected product behaviors virtually overnight.


I think you and zaphar both have good points. That is to say, if for some reason you have $LESS set, it would be nice if journalctl respected it. (I'm trying not to take that and the kernel debug flag fiasco and view it as a trend.) However, as you say, if the only command you invoked was journalctl, it makes no sense to consult other documentation.


The need for separate configuration sounds contrived, breaks existing expected behavior, and assumes that this tool is special, unlike all of the other tools that use $PAGER. The comparison isn't "journalctl" vs "less /var/log/messages". journalctl should respect $PAGER (it does), and it should similarly respect any other environment variable the user sets.

If there really was a need to override behavior of the pager for just this one tool, the normal solution should be to locally override these cases:

   alias journalctl="LESS='...opts...' journalctl"
It is worth pointing out that this solution works for any tool that uses the $PAGER, and isn't specific to journalctl. If, for some strange reason that isn't an acceptable solution (such as Lennart's hatred of shell scripting - which he's entitled to, if that's the way he wants HIS environment to work), then a far better solution would be, as you suggest, two variables in priority order. It's a trivial change: [pager.c:92]

         less_opts = getenv("SYSTEMD_LESS");
    +    if (!less_opts)
    +        less_opts = getenv("LESS");
This makes no assumptions about what the user intended and accepts either their general or specific settings, if present. This IS done for the $SYSTEMD_PAGER / $PAGER at [pager.c:59] making it somewhat strange that similar behavior wasn't also used for $LESS.
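The priority order is easy to prototype outside of C. A minimal shell sketch, assuming the built-in default is the "FRSXMK" string quoted from the man page earlier in the thread; `resolve_less_opts` is a hypothetical name and not part of systemd:

```shell
# Resolve pager options in priority order:
#   1. $SYSTEMD_LESS (tool-specific override)
#   2. $LESS         (the user's general preference)
#   3. "FRSXMK"      (assumed built-in default, taken from the quoted man page)
resolve_less_opts() {
    printf '%s\n' "${SYSTEMD_LESS:-${LESS:-FRSXMK}}"
}
```

For example, `LESS="$(resolve_less_opts)" less somefile` would run less with whichever setting wins. Note that `:-` treats an empty variable the same as an unset one, which is a design choice of this sketch.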

TL;DR: Setting defaults when $LESS isn't present is fine. Overriding behavior when a special, tool-specific request is also fine. Requiring everybody to search documentation and set a new environment variable to restore the standard, expected behavior is lazy and rude.

I set $LESS (and $PAGER, and $EDITOR, $VISUAL, $LESSOPEN, ...) for a reason, and the value(s) change depending on what I'm working on[1]. Overriding these should never be done as default behavior, or it is creating extra work for no reason. Assuming you know WHY I override $LESS (or any other environment variable) is always bad behavior, because you cannot know all the possible requirements that everybody will have.

// yet another reason I don't consider systemd to be usable

pager.c:92 http://cgit.freedesktop.org/systemd/systemd-stable/tree/src/...

pager.c:59 http://cgit.freedesktop.org/systemd/systemd-stable/tree/src/...

[1] I use a custom tool that can push/pop environment variable overlays, which set up use/project specific behavior for a lot of stuff... including the pager.


Thanks a lot for the explanation and I appreciate your direct pointers to the source.

Now, given that we both consider the $SYSTEMD_LESS and $LESS cascadation (? You get what I mean) a decent idea and seeing that you found the right spot to add it, even gave a ~diff~ in your comment here...

Have you considered reporting that as a bug or sending that as a patch?

Wouldn't it make sense to defer judgement ("another reason I don't consider systemd to be usable") until your patch is accepted or rejected?


ASK HN: When did using open sores software become so much like joining the Communist Party?


Oh, they'll namespace the pager options environment variable, but wouldn't namespace the kernel command line options so systemd's debug output isn't confused with the kernel's debug output? This is some next level inconsistent shit.

https://bugs.freedesktop.org/show_bug.cgi?id=76935


Linus Torvalds: [1] >> No, we very much expose /proc/cmdline for a reason. System services are supposed to parse it, because it gives a unified way for people to pass in various flags. The kernel doesn't complain about flags it doesn't recognize, exactly because the kernel realizes that "hey, maybe this flag is for something else". <snip> And yes, that does include "quiet" and "debug". Parsing them and doing something sane with them is not a bug, it's a feature. <<

So

- nothing to see here, seems okay

- not related to the subthread at all anyway

1: http://lkml.iu.edu/hypermail/linux/kernel/1404.0/01488.html


Yawn. It's been covered to death that the problem isn't that systemd was parsing the kernel command line, or that the kernel command line is exposed to user space system services; the problem is the overloading of the string "debug", being parsed by two different things (the kernel and systemd), and systemd interpreting that in a way that made kernel debugging more difficult.

Which is, in fact, mentioned generally in the very next paragraph that you failed to quote from that message:

    > But the problem appears when system services seem to think that they
    > *own* those flags, and nothing else matters, and they don't do
    > something "sane" any more.

This is related to this subthread because the rest of the discussion on the mailing list around the kernel command line option systemd was parsing was asking for it to be named so as not to conflict, and there was pushback on adding that namespacing. And yet here we have a convoluted option naming override system (ignore $LESS, give hardcoded pager options to less, or use $SYSTEMD_LESS) which does feature namespacing. And to boot, it's a namespacing that undermines the exact case you define options in the LESS environment variable for.
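
For what it's worth, the namespacing being asked for is cheap to do. Below is a rough Python sketch, not taken from systemd or the kernel: it splits a /proc/cmdline-style string into parameters and lets a service react only to its own prefixed flag, leaving the bare "debug" to the kernel. The `myservice` prefix and both function names are hypothetical, and the whitespace splitting ignores the quoting rules the real kernel command line allows.

```python
def parse_cmdline(text):
    """Split a kernel-command-line-style string into a dict.

    "key=value" tokens map key -> value; bare flags map key -> None.
    Simplification: tokens are split on whitespace, quoting is ignored.
    """
    params = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def service_debug_enabled(params, namespace):
    """True only if the namespaced flag (e.g. "myservice.debug") is present.

    The bare "debug" flag is deliberately not consulted here.
    """
    return f"{namespace}.debug" in params
```

Under this scheme, booting with just `debug` changes nothing for the service, while `myservice.debug` turns on its own verbose output.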


You can yawn all you want. The link I posted says that 'overloading' (your word) the string 'debug' is fine. That's what it was meant for.

I mean.. Go ahead, flame not just Lennart et al, Linus might be wrong too! Reading the mail I linked to leads to the following

- userspace can and should read the kernel command line, very explicitly including the word 'debug'

- userspace shouldn't crap out and cause a failure to boot. Surprise!

The latter was a bug. The former is what you're trying to stuff into this thread for no reason. You might not like the 'kernel debug command line affects systemd' behavior, but that seems to be a) something the systemd guys want to keep (Ready your pitch fork..) and b) Linus himself seems to consider okay - IF that doesn't cause regressions.

That whole mailing list thread is basically just about personal issues between Linus and Kay and a 'not my problem' attitude towards regressions. The general idea of using the 'debug' kernel command line parameter to mean something specific for systemd? That's considered okay and correct.

So no, your whining is still unrelated to the (journalctl/less) topic at hand, you just felt that your pet peeve should be mentioned here as well. And .. it still makes no sense. There was a bug in systemd. It got fixed. Systemd reading/using kernel command lines without their own namespace? Fine. Correct. Good. Accepted.

"debug" can mean a lot of things and everyone's entitled to use "debug" to mean something system specific. No need to invent a systemd.debug here. Straight from the horse's mouth, we're done.

(For reference: You hijacked the thread with "(they) wouldn't namespace the kernel command line options so systemd's debug output isn't confused with the kernel's debug output" - and that's obviously wrong if you read the mail I linked to - and maybe a bit of the discussion around that)


You might not like the 'kernel debug command line affects systemd' behavior

Correct, I don't like that behavior. Just like I, along with a bunch of other people, don't like this pager options handling. Whether I like it or not has nothing to do with what Linus says or what the systemd guys want to keep. I also find this namespacing of options to be inconsistent, either inconsistent with the rest of the systemd ecosystem and/or inconsistent with how environment variables are used, by de facto standard, to do things like this.

I didn't "hijack" this thread. I made one comment, a snarky comment I felt was related. Usually my comments get ignored. But I guess not today. Glad I could keep you busy.


>doing something sane

Dunno if dumping the output of every last damn command performed by systemd, and the daemons it starts, into dmesg can be considered sane...


I cannot follow. I think you meant kmsg - and where else would you stuff messages when the filesystem isn't there yet?

I mean .. you have no / or /var yet, where do you send your log messages?

The whole reason for the bug report this angry GP wanted to share with us is that there was a buggy, wrong assert filling up the logs and causing issues during boot. It was literally just a bug. People can now form a mob and complain that their favorite init system never exposed a bug like that..

That's not related to

- logging to kmsg (that's okay, just don't kill it because you're doing stupid things)

- parsing the 'debug' kernel command line flag, the original complaint of the GP


"In the grand systemd tradition, there is no option to control this; all you can do is force journalctl to not use a pager or work out how to change things inside the pager to not do this."

He mentioned that. So, yes, it looks like he cared to learn how it works.


But apparently not enough to think to alias journalctl --no-pager, like is the advice for literally dozens of other Unix shell commands? Because colorized ls output is certainly not an "option" either in that sense.


Or..

       -a, --all
           Show all fields in full, even if they include unprintable characters
           or are very long.

       -f, --follow
           Show only the most recent journal entries, and continuously print
           new entries as they are appended to the journal.
(I'm talking about -a, copied the _following_ -f because that is what the author is trying to use here. It was right there..?)

Edit: Although I'm not sure if he's complaining about truncated fields or long lines. Hmm..


While I agree with your first paragraph,

> if you're stuck on a 80x25 console

What's wrong with `journalctl | less` in this case? The only command I use regularly which invokes a pager by default is `man`, and that seems like an actual special case.

Edit: FTR, I agree with your assessment in the first paragraph, but not the judgement I infer you passing. IMO, things should avoid being strange.


> The only command I use regularly which invokes a pager by default is `man`

Git also invokes a pager by default. And it also sets $LESS when doing so. In fact, it wouldn't surprise me if systemd's automatic use of a pager was based on git; IIRC, it used to set $LESS to the exact same value git used (I believe systemd has changed it since then).


I super dislike this btw. If I want a pager I'll use a pager. I'm using Git, I can handle `| less`.


It can make sense to invoke a pager by default to avoid blowing the current terminal scrollback to smithereens if the user forgets to do it himself. Especially for diagnostic tools, where the information left in the scrollback buffer could be useful and not saved anywhere.
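
A rough Python sketch of that kind of default: page only when stdout is a terminal and the user has not opted out. This is a generic illustration and not the actual logic of git or journalctl; the function name, the `no_pager` parameter, and the fallback to `less` are assumptions.

```python
import os


def choose_pager(stdout_is_tty, no_pager=False, env=None):
    """Return the pager command to run, or None to write output directly.

    Paging is skipped when output is piped or redirected, when the caller
    passes no_pager=True, or when $PAGER is set to "" or "cat".
    """
    env = os.environ if env is None else env
    if no_pager or not stdout_is_tty:
        return None
    pager = env.get("PAGER", "less")
    if pager in ("", "cat"):
        return None
    return pager
```

A caller would typically pass `sys.stdout.isatty()` as the first argument, so `tool | less` and `tool > file` never get a second pager in the way.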


Fair point, I guess.

I very rarely work with a non-virtual terminal, but when I do I take a photo before typing anything. Perhaps that's only practical for me because it's pretty rare.


I tested this just now: journalctl | less works normally for me on Ubuntu Utopic.


Wait, so the --no-pager does invoke the pager, but with different options? That would be surprising.

It looks to me like you either get a pager with broken settings, or you get no pager at all.


How much of the blame should systemd be taking for these bugs? Unless the same segfault is happening on other distros, this seems like a problem in Fedora rather than systemd.

I suppose this does give some substance to the systemd criticisms that it is too system-critical for a process that is technically outside of the Linux kernel.


If you're saying, a bug should always be investigated downstream before reporting it upstream, then I agree.

If you're saying, there are scenarios where segfaults caused by upstream code are not upstream bugs, then I disagree.


The first problem, PID 1 crashing, is certainly worth complaining about.

The rest seems mostly nitpicking. I wouldn't work myself up about the pager issue as discussed elsewhere here. I do somewhat agree with a rather useless default for the range of things to show (i.e. I'd prefer -b [1] to be implied unless specified otherwise, for example).

1: -b means 'since the last boot' for people not using journalctl. The default is instead to show the full journal from its beginning, as stated in the article.


Regarding the first problem, as someone else said, 'Remember how awesome PulseAudio was? Imagine that for PID 1!'


PulseAudio gives me a better out-of-the-box experience than any other sound system, bar none.


Guess I'm the only one who had a cron job restarting the PulseAudio process regularly.


I had to regularly restart PulseAudio to keep Firefox's context menu working :(

On the one hand, PulseAudio really would have benefitted from a built-in watchdog to restart it in its early days. On the other hand, before PulseAudio, I never really had working sound at all, so I consider PulseAudio to have been an improvement.


  # keep systemd happy
  0 * * * * /bin/shutdown -r now


You know you are old when the first thing that comes to mind is a User Friendly strip...

http://ars.userfriendly.org/cartoons/?id=19990302


That might be true today (I no longer have linux desktops, so I have no idea), but it took years of buggy (or extremely hard to configure) releases to reach that point.


Which was the result of Ubuntu shipping buggy beta builds when everyone told them not to do it. So blame Ubuntu and not PulseAudio.


My biggest problems were on Fedora, but it's not been really easy on Archlinux and Gentoo either...


It was a nightmare for me and several other local users on SuSE/openSUSE as well, so I think we're down to just Debian and Mandriva as major distros of that period that might not have had serious issues (Slackware was very conservative about integrating it). I suspect @adestefan used other distros minimally if at all and got lucky as far as PulseAudio went and I'm a bit envious. Problems with PA were a frequent topic in our LUG for 3-4 years and only a fraction of that was distro devs testing it early. Really they should have just used "PulseAudio: More powerful than OSS and more stable than Flash!" as the slogan.


Guys, the `less -S` problem is bunk, because you can easily change it at runtime. From the OPTIONS section of `man less`:

Most options may be changed while less is running, via the "-" command.

So, just start `journalctl -l` as you would, then type in `-S` and scroll the screen in any direction to force it to re-draw.

Discussing it any further is bikeshedding, and distracts from the much more interesting segfault discussion.


To which I would say that you are completely missing the point of the complaint levelled here: that you can work around a stupid design decision does not negate the stupidity of that design decision. Personally I think the default of showing all logs ever is worse, but both mean that what should be an integral and useful tool is initially off-putting at best and constantly irritating at worst. I say this as an Arch Linux user for the past three years.


It's interesting to see examples of where systemd falls down, though I'd be interested to hear what the actual problem is. The Fedora bug report and linked page basically say "systemd fell over during an upgrade", but is this really a systemd problem or is it a Fedora problem?


Almost any time PID 1 segfaults, it's PID 1's fault. With PID 1 being systemd, that makes it systemd's fault here. Based on gdb stack backtraces in the Fedora bug report and looking at the systemd code, it seems like some sort of memory corruption or overwrite, perhaps a use after free issue. The segfault itself comes from dereferencing a clearly invalid pointer and said pointer was obtained by dereferencing a structure field through another pointer, so you'd get exactly this result if the structure was overwritten with other data at some point.

(In my grumpy sysadmin view, it is PID 1's fault even if the distribution is doing odd things around PID 1. Init processes need to be absolutely rock solid and extremely defensively coded, precisely because the world basically dies if they ever fall over.)

(I am the author of the original post.)


What I find interesting is that it is the traditional sysadmins that find fault after fault with systemd.

On the other hand those that embrace it come from web/cloud development, with an eye towards virtualization and containerization.

I think someone somewhere once compared it to pets vs cattle.

Traditional servers are the pets of sysadmins. Groomed and cared for to make sure they never keel over unexpectedly.

To the cloud "admins" (or maybe I should call them devops?) servers are cattle. They have X of them living in some cloud "farm" somewhere, and can order any number of them to be slaughtered and replaced at the drop of a pin.


At this point in time they seem joined at the hip. I think a Poettering retort about Debian users complaining about the size of systemd was that it was introduced over time on Fedora. So if any distro should make systemd shine, it would be Fedora.


It's failing during an upgrade. My guess would be that it's a systemd problem, but one triggered by mixing systemd versions. Probably the systemctl from Fedora 21 does something that the systemd from Fedora 20 doesn't like.

If that's the case, the bug might have even already been fixed in systemd upstream, but the fix wasn't applied to the older systemd in Fedora 20. The solution would be to backport the fix to Fedora 20's systemd package, so users who do a "yum update" before upgrading to Fedora 21 would have it.

An alternative workaround might be to, immediately after unpacking the new systemd, send a SIGTERM to PID 1. According to http://www.freedesktop.org/software/systemd/man/systemd.html, that signal directly tells PID 1 to re-exec itself, without using systemctl in the process. That would make PID 1 be the new systemd before anything tries to use the new systemctl.
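
In Python terms the workaround amounts to a single signal sent to PID 1. The sketch below assumes the behavior described on the man page linked above (SIGTERM makes the systemd system manager re-execute itself), must run as root, and should not be tried against an init that is not systemd. The function name and the injectable `kill` parameter are made up here so the call can be exercised without touching the real PID 1.

```python
import os
import signal


def request_pid1_reexec(kill=os.kill):
    """Send SIGTERM to PID 1, which (per the systemd man page) asks the
    system manager to re-execute itself.

    `kill` defaults to os.kill; pass a stand-in to try this out safely.
    """
    kill(1, signal.SIGTERM)
```

A package script would call `request_pid1_reexec()` right after the new systemd binary is unpacked and before anything invokes the new systemctl.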


" Oh, the full message is available, all right, but journalctl specifically and deliberately invokes the pager in a mode where you have to scroll sideways to see long lines. Under no circumstance is all of a long line visible on screen at once so that you may, for example, copy it into a bug report."

Well, that's what you get by supporting a system built by obnoxious incompetents that think they can replace the most important process in the machine.

"(Oh, and journalctl goes out of its way to set up this behavior. Not by passing command line arguments to less, because that would be too obvious (you might spot it in a ps listing, for example); instead it mangles $LESS to effectively add the '-S' option, among other things.)"

Yes, of course.


Man, anybody else remember being able to use tail and grep to decode system logs?

And not having the init process crash?

Seriously, this is clownshoes. Their developers and evangelists should get hit by a SIGBUS.


I felt dirty, but this made me laugh out loud:

>While I'm here, let me mention that journalctl's default behavior of 'show all messages since the beginning of time in forward chronological order' is about the most useless default I can imagine.


Yeah, that's an annoying default. At least it also uses a pager by default, so it stops after the first screenful.

On the other hand, "journalctl -b" (show all messages since the machine has been last powered on, in forward chronological order) is quite useful.


Yep... even better is that it takes an index, so you can "journalctl --list-boots" to look up boot by timestamp and "journalctl -b -2" to look at the log of the second-to-last boot to copy and report the kernel oops in there.


With apologies for drifting off topic:

I love less's command line interface for some of this stuff.

The command line switch to chop long lines: -S

The in-app command to toggle chop lines: -S

The reverse is almost true for things like /search, F (follow) or G (go to end). They can be used from the command line with +/search, +F or +G


Looks like there is another bug open already with a similar backtrace, also triggered by an upgrade:

https://bugzilla.redhat.com/show_bug.cgi?id=1130633


All I've been seeing lately are posts very critical of systemd. I really don't know much about it, and I would love to hear the other side of the story. Why is it so popular lately?


[flagged]


Hacker News is a startup hangout. Startups make more and more use of cloud services. Systemd is in large part about cloud...


Could you point me to the large part of systemd that is cloud? Is it the "desktop bus", or maybe mdns and dhcp, or perhaps the core files being stored in the journal?


Drawing a clear line is difficult, and the fact that many people are enraged with systemd for the opposite reason (systemd is too desktop-oriented, it has no place on servers) just proves that systemd does a lot to cater for different use cases (which in some cases may even overlap). But I guess some features that may be useful in cloud settings are:

  * service lifetime tracking, restarting them if they go down
  * lots of resources-tracking knobs thanks to cgroups
  * support for read-only root filesystem
  * strong integration with containers: http://0pointer.net/blog/systemd-for-administrators-part-xxi.html
  * many batteries included (no need to ship ntp on VMs thanks to timesyncd, journald can replace syslogd for basic usage, networkd tries to do the right thing without any manual setup when run in a VM, etc.)

I'm sure there are more; these are just the few I could think of off the top of my head.


>lots of resources-tracking knobs thanks to cgroups

I'm already tracking these at the orchestration level. I don't need my init system to do it.

>support for read-only root filesystem

Nice, I guess, but not sure why it's a particular win for cloud. My stuff is already stateless by design. I don't care if anything gets written to the root volume or not, since it will go away when the stateless vm goes away anyway

>many batteries included (no need to ship ntp on VMs thanks to timesyncd, journald can replace syslogd for basic usage, networkd tries to do the right thing without any manual setup when run in a VM, etc.)

Well, I could strip out NTP thanks to the systemd SNTP client, but I could also just use NTP and not use the systemd SNTP.

The ephemeral nature of cloud makes journald absolutely useless, unless you just don't care about logs. (Why don't you care about logs?)

Restarting services, containers, and networkd are nice, but I think the other stuff is probably reaching in a lot of ways.


I think part of the "desktop orientation" complaint is because of the heavy reliance on dbus as the signaling path between the various sub-daemons.


Probably. That said, other than having "desktop" in the name, D-Bus is just a nice IPC system and inventing yet another IPC system would have been worse (not counting kdbus as a new IPC system but rather an evolution of D-Bus).


wait, you're saying that inventing D-Bus rather than using existing IPC systems would have been better than inventing yet another IPC system?


D-Bus replaced CORBA in GNOME by improving the IPC system originally used by KDE, DCOP. Unless you're a CORBA aficionado, yes, I'm glad D-Bus exists.

If you feel that D-Bus duplicates existing functionality can you please elaborate?


Have you used CORBA?

Replacing CORBA with anything else was a good decision for everyone.


Containerization. CoreOS is pretty much built around systemd, and RH has publicly stated that they will be focusing on cloud going forward.

Systemd even has the capability of supporting systemd within containers that then talk to systemd on the host OS level.



