I remember a discussion on a FreeBSD mailing list, around 2003-2004, where people bragged about the impressive (though in comparison to this headline, puny) uptimes of a few years.
One of the developers remarked that while he was proud the system he worked on could deliver such uptimes, having an uptime of, say, three years, on a server, also meant that a) its hardware was kind of dated and b) it had not received kernel updates (and probably no other updates, either) for as long. (Which might be okay, if your system is well tucked away behind a good firewall, but is kind of insane if it is directly reachable from the Internet.)
You only need to restart when there is a kernel update, and the frequency of kernel updates depend heavily on the distro used. Debian stable, for example, although using ancient versions of the packages, is a great OS for such a use case, as kernel upgrades are really infrequent. Have a look at the changelog frequency of Squeeze [1] or Wheezy [2].
If you update a central library (e.g. openssl), you'll have to restart in order to deal with in-memory copies being used by other programs. If you're running a Debian server one of the packages to include in your base install is debian-goodies or needrestart because the former bundles a very helpful little script called "checkrestart" and the latter is an updated systemd-compatible version, both of which use `lsof` under the hood to determine when and why package updates require a restart for full effect.
But do you? You really only need to restart the processes using those packages. Technically, a kernel update (specifically security update, bug fixes may not be important) would only require a reboot.
However, I've often been in situations where I reboot anyhow, because rebooting means I'm 100% confident the old code is gone, whereas if I try to get clever and avoid the restart, I'm significantly less confident. Depending on how hard it is to validate the security bug, that can be a problem.
Plus, for much of the past 20 years for many computers, if you're going to restart all services, adding in the last step for rebooting doesn't add all that significantly to the downtime. Server-class hardware often have things that make that not true (stupid RAID cards), but for everything else you were often only adding, say, 25% for the actual reboot.
You don't need to be 100% confident the old code is gone - just 100% confident the old code is no longer exposed to the network - check your sockstat/netstat and call it a day.
It gets complicated when central libraries like glibc have to be updated.
I did this once with checkrestart on Debian Wheezy and I had to restart nearly everything except for the init process.
So in this case just restarting the system would have been faster and easier.
For lots of core stuff, you don't technically need to reboot, but you probably do need to go down to single user mode and come back up (consider upgrading glibc or openssl), and at that point you might as well reboot.
Well, the server is there to host some service. If you'd need to restart the service deamon, why not restart the machine for once, and make everything simpler?
Also, boot time bugs are a huge issue. They can creep during the entire time your system is up, and only show up during a reboot. Thus, if your server only has unplanned restarts, you'll only discover those bugs when you have yet another pressing issue to deal with, and also, likely at 3 in the morning on a Sunday.
So, make things better for you, and restart those servers once in a while, when things are quiet.
KSplice helps you avoid the need to reboot even with many kernel changes. KSplice is the delta layer that gets you from security patch to maintenance window for a real reboot.
Not only that but you only reboot when there's a kernel update that you care about. If it's not a security update or it's not a security update that affects you. I don't reboot for remote exploits in kernel services I don't use or single service VMs with local privilege escalation vulns.
Since a lot of updates will require for you to stop or the service anyhow, adding a reboot at the end before bringing everything back up it's not that bad of an idea.
I once had a debian stable desktop and home server reach two years of uptime using that strategy of upgrading everything except the kernel. Some upgrades, like a newer glibc, were quite tricky to accomplish without a reboot as you had to restart nearly every process. It was a fun game so i didn't mind the effort. Eventually a power outage wiped away my uptime.
You only need to restart to install kernel updates that you need. While I'd normally just install all updates, most kernel bugs that I recall seeing the past few years are local exploits. If you're running few/no external services, you might not need to upgrade. And often bugs are in little-used subsystems/protocols -- often those will be off-by-default, or turned off by a diligent administrators (never run code you don't need).
It's rare to see a kernel-bug that can't be worked around in some way other than patching.
> It's rare to see a kernel-bug that can't be worked around in some way other than patching.
While it's true that most critical bugs I've run into with the kernel can be workaround in some way, it's at the detriment to some use-case that some users, somewhere rely upon.
A vulnerability I discovered a bit of a year ago allowed local privilege escalation for any user with access to write to an XFS filesystem. The only workarounds were to modify their SELinux policies or switch to another filesystem. I'm not aware of any users that rewrote their SELinux policies for this. It had been fixed in the kernel and fairly quietly fixed in a RedHat security advisory, but I don't think any other distribution did anything at all.
I'd estimate that critical kernel bugs happen at least other month. Worse, discussion of these bugs happens in the public and takes dozens of months to fix. Again, assuming you even know of these vulnerabilities, while there are often workarounds, they're not always practical.
Did you know that via user namespaces, all non-root users on a machine can elevate to a root user? That root user is supposed to be limited, but it's allowed the mount syscall. Numerous vulnerabilities have been discovered as a result of this. The kernel team usually considers them low-impact and they get a low CVSS score, but when using certain applications this can lead to local privilege escalation. The workarounds are to disable user namespaces or disable mount for user namespaces, both of which will break some set of users.
Did you know that any user capable of creating a socket can load kernel modules? For a long time this allowed loading ANY kernel module! The only workarounds were to compile it out of the kernel or monkeypatch the kernel. Only last year was this was finally fixed upstream so only those modules patching a pattern were loading. Yet, it was also discovered that if you used busybox's modprobe, the filter would still allow non-root users to load arbitrary kernel modules from anywhere on the filesystem!
Point being, this is par for the course. Clearly one needs to understand their threat model, but if the model is at all worried about local privilege escalation, update weekly until you find another OS.
> A vulnerability I discovered a bit of a year ago allowed local privilege escalation for any user with access to write to an XFS filesystem.
So, mount any existing xfs file systems read-only, and move rw systems to ext3? I'm not saying it would make sense - but sounds like a prime example of something for which there was a work around...
(I'll concede that for those that need(ed) xfs, there'd probably not be many alternatives at the time. Possibly JFS?)
Yeah, I had a laptop I used as a home server that had an uptime of nearly 2 years when I finally decided to update the packages. Turns out the hard drive was hanging by a thread and the reboot was enough that it gave out permanently.
Try smartd[1], which can be set to run a SMART self-test at regular intervals. Presumably the hard drive won't last as long, but you'll probably get a warning before it fails.
While some might disagree, I definitely agree. Often there is no need to install updates at all on machines that only perform one or very few functions that have limited/no network connectivity. Things like HVAC and SCADA systems that only talk to hardware and not the internet, and are physically secured well.
I've seen many windows systems with uptimes of several years that have never required any maintenance.
We do regular security audits from a security firm who goes the extra mile to try and social engineer and gain physical access to all of our sites.
Plus we're talking about things like processing fish in a town of 2,000 people. If I was operating a nuclear reactor, I would surely adapt better security measures.. although against government sponsored attacks using undocumented vulnerabilities, windows update isn't really going to do much.
The Target thing you posted has to do with internet access, which is something that goes against what I was saying. I'm talking about closed, physically secure networks, possibly not even using tcp/ip or ethernet.
Your quote omits the critical "that only talk to hardware and not the internet". Your examples 3 and 4 are doing it wrong.
Stuxnet-like attacks can go after non-networked equipment, but they're based on exploiting the computer with the programming suite, not the industrial system itself.
That's fair. My point was that in reality, a ton of people end up doing it wrong in some way or another. You should cover your bases and keep your systems up to date with security patches regardless of how segregated you believe they are.
Under those circumstances, you can definitely get away without updating. But remember that updates do not only fix security issues, but also stability issues.
My gut feeling is that it is kind of like driving a car without wearing the seat belt. So far, if I had never worn a seat belt, nothing bad would have happened, because I did not have any accidents. But when it happens, one goes through the wind shield, so to speak. Also, some stability/performance issues do not manifest until a machine has been running continuously for months or years.
(What is more disturbing, though, that the very-high-uptime systems (~4 to 8 years) I have seen also appeared to never get backed up, and there didn't seem to be any plans for replacement, or at least spare parts. Which is kind of bad if the machine happens to be responsible for getting production data from your SCADA to your ERP system which in turn orders supplies based on that data.)
Totally. A lot of industrial/utility type places don't really have robust IT, and they treat computers like industrial equipment. So you may have a factory foreman or operating engineer who is responsible for equipment, who is 100% reliant on a vendor CE for implementing stuff.
What ends up happening is that they'll bolt on some network connectivity for convenience or to take on some new process and not set it up appropriately, or not understand what it means to expose something to the LAN or directly to the internet.
I helped a friend at a municipal utility with something like this when they wanted to provide telemetry to a city-wide operations center. They had a dedicated LAN/WAN for the SCADA stuff, and the only interface was in this case a web browser running over XWindows that had a dashboard and access to some reports. I think they later replaced it with a Windows RDS box with a similar configuration.
Because of the isolation, and professional IT who understood how to isolate the environment, it was advisable to to not be tinkering with updates, as the consequence of failure is risk to health & safety.
Yes, frequently precisely because one of the two clauses asserted by the previous commenter (a lack of general network connectivity) has become false without changing other things about the workflow.
(I'm not advocating for HVAC/SCADA systems to be running, say, Windows XP Embedded with no updates and default passwords, world-facing, just observing that the preconditions changed.)
Well, if one simply does not install updates, it gets rather easy, as long as the hardware does not act up.
Which is of course, a really bad idea for the general case.
Although it is actually kind of a requirement in some industrial environments, where certifications are involved - once the thing is certified, any change, hardware or software, requires a re-certification, which apparently is expensive and tedious. Which is how many industrial plants, too, end up running on ancient computers, at least by todays's standards.
An evaluation of the advantages of these kind of certifications compared to not having updates would be interesting (do they really add value, except moving around responsibility?).
That would be a highly interesting evaluation. I worked in the Aerospace/Defense industry in the 1980s, and it seemed to me that "we can't change X, X is 'flight certified'" was a huge excuse for not innovating, or maybe a huge roadblock to innovation. So it's big news in 2016, when Boeing is hinting about stopping 747 production, an aircraft that made its first flight in February of 1969, 46 years before. I'm guessing that "flight certification" is the largest factor in keeping airliner technology in the 1960s.
At the same time, we had a good understanding about aerodynamics in the 1960s and were producing more or less optimized designs. There are some additional optimizations that we've figured out like sharklets, but overall the design is similar -- at least until we trusted composites enough to use them in aircraft.
Where we have seen a lot of innovation is in the engines -- fuel economy and noise regulations have pushed GE, RR and P&W to up their game substantially.
The 747 is only one example. Martin Marietta made and launched Titan space launch vehicles from the early 60s to the early 90s, with only very slight changes and improvements. GD did much the same with the Atlas launch vehicle, and the Centaur upper stage. I will grant that NASA and Douglas/McDonnel Douglas made a lot out of the Thor IRBM, but that seems like a function of NASA Administrators having longer tenure than anything else.
*If the system is sufficiently critical, it may be hard to update or migrate it. And if it's still safely working there's no real incentive to do so.
In the very short term, perhaps. But I'd argue that critical systems are the ones most in need of the ability to be frequently updated and migrated. What's going to happen when a serious security problem demands an immediate change, and you're not prepared for it? Or the system catches fire, or floods?
SPOF critical systems are why so many organisations end up in legacy software hell.
You'd be hard pressed to convince me that Windows model for locking files is superior to what Unix offers, at least as far as file deletion goes. Conceptually speaking, it's pretty simple:
* Files are blobs of storage on disk referenced by inode number.
* Each file can have zero or more directory entries referencing the file. (Additional directory entries are created using hard links.)
* Each file can have zero or more open file descriptors.
* Each blob has a reference count, and disk space is able to be reclaimed when the reference count goes to zero.
Interestingly enough, this means that 'rm' doesn't technically remove a file - what it does is unlink a directory entry. The 'removal' of the file is just what happens when there are no more directory entries to the file and nothing has it open.
In addition to letting you delete files without worrying about closing all the accessing processes, this also lets you do some useful things to help manage file lifecycle. ie: I've used it before in a system where files had to be in an 'online' directory and another directory where they were queued up to be replicated to off-site storage. The system had a directory entry for each use of the file, which avoided the need to keep a bunch of copies around, and deferred the problem of reclaiming disk storage to the file system.
> Now replace the contents file being worked on or delete it, in the context of a multi-process application, that passes the name around via IPC. ...
Sure... if you depend on passing filenames around, removing them is liable to cause problems. The system I mentioned before worked as well as it did for us, precisely because the filenames didn't matter that much. (We had enough design flexibility to design the system that way.)
That said, we did run into minor issues with the Windows approach to file deletion. For performance reasons, we mapped many of our larger files into memory. Unfortunately, because we were running on the JVM, we didn't have a safe way to unmap the memory mapped files when we were done with them. (You have to wait for the buffer to be GC'ed, which is, of course, a non-deterministic process.)
On Linux, this was fine because we didn't have to have the file unmapped to delete it. However, on our Windows workstations, this kept us from being able to reliably delete memory mapped files. This was mainly just a problem during development, so it wasn't worth finding a solution, but it was a bit frustrating.
If you rm a file that someone else is reading, they can happily keep reading it for as long as they like. It's only when they close the file that the data becomes unavailable.
I know how inodes work, thank you. My first UNIX was Xenix.
Applications do crash when they make assumptions about those files.
For example, when they are composed by multiple parts and give the filename for further processing via IPC, or have another instance generating a new file with the same name, instead of appending, because the old one is now gone.
Another easy way to corrupt files is just to access them, without making use of flock, given its cooperative nature.
I surely prefer the OSes that lock files properly, even if it means more work.
> For example, when they are composed by multiple parts and give the filename for further processing via IPC
Of course, the proper way to do this in POSIX is to pass the filehandle.
POSIX is actually a pretty cool standard; the sad thing is that one doesn't often see good examples of its capabilities being used to their full extent.
For this, I primarily blame C: it's so verbose and has such limited facilities for abstraction that it's often difficult to see the forest for the trees. Combine that with a generation of folks whose knowledge of C dates back to a college course or four, in which efficient POSIX usage may not have been a concern, and one finds good, easy-to-read examples of POSIX harder to find than they really should be.
Unfortunately POSIX also shares the same implementation defined behaviour of C.
I had my share of headaches when porting across UNIX systems.
Also POSIX is now stagnated around support for CLI and daemon applications. There is hardly any updates for new hardware or server architectures.
Actually, having used C compilers that only knew K&R, I would say POSIX is the part that should have been part of ANSI C, but they didn't want a large runtime tied to the language.
After working as a Windows admin for a few years, I feel that it is one of those things that sound like a great idea at first, but are causing more problems than they prevent.
But years as a Unix user might have made me biased. I am certain people can come up with lots of stories about how a file being opened prevented them from accidentally deleting it or something similar. I am not saying it is a complete misfeature, just a very two-edged sword.
It looks to me like the "can't replace/delete a file that is opened" is one of the factors that causes the malware phenomenon in Windows. That is, you must reboot to affect some software updates. Frequent reboots meant that boot sector viruses were possible.
That policy also means that replacing some critical Windows DLLs means a very special reboot, one that has to complete, otherwise the entire system is hosed.
Sure, boot sector viruses existed since CP/M days - those systems were almost entirely floppy-disk based, and required a lot of reboots. Removable boot media + frequent reboots = fertile environment for boot sector viruses.
To be honest that sounds like a pretty bad set up if you need to manually delete files when you know there's a chance that the system is not only operating on them, but also not stable enough to handle exceptions arising from accessing them.
But arguments about your system aside, you could mitigate user error by installing lsof[1]. eg
$ lsof /bin/bash
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
startkde 1496 lau txt REG 0,18 791304 6626 /usr/bin/bash
bash 1769 lau txt REG 0,18 791304 6626 /usr/bin/bash
Well, on Linux distros it is not uncommon for individual packages to be updates as updates become available. So if there is an update to, say, the web server, it is sufficient to restart the web server.
On BSD systems, kernel and userland are developed in lock step, so it's usually a good idea to reboot to be sure they are in sync.
Usually you'd be running the same userland daemons on FreeBSD that you might on Linux. Web servers, databases, OpenSSHd, file networking protocols (FTP, SMB, NFS, etc), and so on aren't generally tied to a particular kernel version since ABIs are not subject to frequent changes (it would be very bad if they were).
So with that in mind, you can update your userland on FreeBSD without updating your kernel in much the same way as you can do with Linux. Though it is recommended that you update the entire stack on either platform.
Updating the FreeBSD base (to the next major release) without updating the kernel is Not a Good Idea. Backward compatibility is there, down to version 4.x, but no one guarantees forward compatibility!
Applications will usually work, but the base system will likely break in some places.
It's not at all strange. Security updates to publicly-facing services should be applied as fast as possible. Kernel vulnerabilities are a whole different attack surface
Kernel vulnerabilities can be combined with user space vulnerabilities. eg a buggy web script might allow an attacker shell access under a restricted UID. The attacker could then use a kernel vulnerability to elevate their permissions to root.
I have a netbook running 12.04 sitting on my nightstand that I use for listening to Podcasts / audiobooks as I go to sleep. I usually wake up after a few hours, put the netbook to sleep, then turn around and go back to sleep.
This means I only update the netbook on rare occasions, because when I go to bed, I want to... sleep, you know, not update my netbook. So currently, that thing has about 180 days of uptime (although it spent most of that time sleeping, of course). I have been meaning to install updates and reboot it for months, but during the day, I forget about it, and only think of it as I go to bed... A vicious cycle... ;-)
With a kernel update, one doesn't have to reboot, technically, it is merely required for the update to take effect.
The authoritative DNS server for pvv.ntnu.no is still a MicroVAX II from the late 1980s. It runs an (up-to-date, I think) NetBSD. Logging in by SSH takes several minutes, even with SSH v1.
Sounds just like the kind of half-crazy stuff NTNU students do when you give them some time to tinker. I heard a story from Samfundet where they were displeased with the speed of their key/value database for some payment processing stuff, so they replaced it with BIND.
Even though the thing is old, I'm guessing the uptime is not that impressive?
I've just spent five minutes thinking, "what's wrong with this writing style?" I still don't know. As a later poster pointed out, perhaps it is a peculiarly British style, which just doesn't register as abnormal to me.
There are also certain things which grate rather badly with me when reading US "style" too.
For example, "Recall our previous conversation...." Use of recall in this way feels so impolite and demanding. Normal use for me would be, "You may recall our previous conversation...."
Anyway, this is not a bashing session, I just genuinely didn't find anything odd about El Reg's writing style.
It's a very tabloid-y tone. Other commenters have used the word "informal", but that's not quite right. There's a certain over-the-topness that clearly distinguishes the tabloid press (British, American, German, etc) from a casual blog covering the same subject, for example.
Indeed - the loss of precision over may, can, might & could is a pet peeve of mine. Often language evolves, but with these words meaning is being lost - and sometimes inverted. I recently disagreed with someone who said that "you may not do something" (because you were allowed to), but it turned out that she meant might.
a) It's British, which might explain some of the language and humour (with a "u").
b) I keep forgetting about 'El Reg', but whenever I return there, I enjoy my visit. It has a strong, unique tone which I find a breath of fresh air amongst all the dull, neutral news reporting out there.
> It's British, which might explain some of the language and humour
It's better explained by simply describing the site as a tabloid. Tabloids aside, British news reporting isn't typically that informal.
> I keep forgetting about 'El Reg', but whenever I return there, I enjoy my visit. It has a strong, unique tone which I find a breath of fresh air amongst all the dull, neutral news reporting out there.
Personally I find the site appalling. It's up there with the The Daily Mail for rewriting other people's articles. Granted this is basically how 99% of reporting happens these days, but these kinds of tabloids often miss the entire crux of the original piece they're plagiarizing due to their editorialisation. For example, their articles on OpenSSL parallels the Daily Mail's absurd anti-immigration propaganda. And often, again like the Daily Mail, El Reg will completely miss-report a story just for the sake of having a clickbait headline.
Frankly, I'd rather have dull neutral news reporting over biased misinformation.
Yeah, I'm not a huge fan either and find him somewhat objectionable. Orlowski's the only one that used to turn off commenting on his articles which is half the fun of the reg, though he seems to do this less now. But every publication, in print or online, is going to have a writer who you just never get on with intellectually.
There's plenty of other good contributors such as Duncan Campbell [0] and Alastair Daabs [1] that more than make up for the Orlowski deficit.
The stance is one that reasonable people can disagree on but he resorts to childish insults.
Also he's a climate science denier which is just so odd in the IT world since for me I always associate IT as close to science in that you must deal with reality even if you don't like it.
I'm not convinced it's a parody. That would suggest their writing style is done as a joke; referencing British tabloids while distancing themselves as a comedy version. More likely El Reg's informal writing style is a deliberate and serious decision driven by the requirement to gain more readers (read: eyeballs on their ads). ie the same reasons any tabloid is written they way they are.
I am not from the UK and in fact I am not even a native English speaker and yet I had always assumed that the tone of El Reg was a parody. I think it is quite obvious.
The tone of El Reg is very similar to many other tabloids though. Most don't take themselves seriously. I gave a few examples of The Sunday Sport in another post, but here's one from a Scottish tabloid, reporting about when someone caught a failed suicide bomber trying to attack an airport:
I don't tend to read many tabloids, but the Red Tops (predictably named because of their design) I've read on occasions do follow the same tongue-in-cheek writing style.
I think it's fair to say they can't all be parodies as you then have to question when a parody is so commonplace that it's no longer a parody; that it instead just becomes that normal thing.
I think some folks are confusing "parody" and "tabloid style".
The definition of parody is:
"a work created to imitate, make fun of, or comment on an original work, its subject, author, style, or some other target, by means of satiric or ironic imitation." [0]
Websites that would fall into the "parody" category would be The Onion, The Daily Mash, Landover Baptist Church, The Poke etc. Another word to describe these types of sites would be satire.
The Register is not really a "parody" website because it's reporting on actual things that have happened (direct from source or through recycling), but the reporting and writing style is in the same vein or riffs off of the "tabloid journalism" style.
Tabloid journalism can be characterised as sensationalist, ridiculing and hyperbolic, with headlines designed to appeal to your more base instincts.
What The Register does, or attempts to do is inject humour into its reporting by using the tabloid style, it tries not to take itself too seriously. There is as you rightly point out a lot of "tongue in cheek" phrasing in many articles. For example, as I've mentioned elsewhere, I enjoy their headlines and bylines, e.g.:
"Criminal records checks 'unlawful' and 'arbitrary' rules High Court - Disclosing minor silliness no longer required, say judges" [1]
"Rust 1.6 released, complete with a stabilised libcore - A world without buffer overflows is what our children shall inherit" [2]
"Boeing just about gives up on the 747 - Even the cargo market's dried up for the Jumbo Jet. Next stop, elephant's graveyard?" [3]
Anyway, I wouldn't get too bogged down, in over analysing El Reg, you either get their sense of humour or you don't.
There are also an offshoot by ex-Register founder Mike Magee which reports in a similar style:
> Look at it. Big red banner at the top - just like the British tabloids. Constant twisting of headlines to add as much innuendo as possible.
The fact it follows the same design and editorial patterns as a tabloid doesn't mean it's a parody of a tabloid. A simpler conclusion would be that it looks and reads like a tabloid because it is a tabloid.
It's very common for products in any same category to be designed similarly. For example the packaging on washing detergents, breakfast cereals, and English ales often follow similar patterns because people expect their washing detergents to look like washing detergents, their cereals to look like breakfast cereals, and their beer to look like beer. Granted you do get some products that deliberately choose a non-conventional design in the hope to stand out. But generally items look similar because people expect that of them. And equally, The Register is designed like a "Red Top"[1] because people expect tabloids to look like that.
It's probably also worth noting that most "good" tabloids have a comical tone to them. In fact not taking themselves seriously is pretty much the prime directive for any Red Top. Though some take things to a more ridiculous level than others[2][3]
I think the crux of the argument should be "Are the stories real or made up for comical value?" Since El Reg is actually reporting real stories, I couldn't put it in the same camp as News Thump, The Onion, and the lesser known Suffolk Gazette (http://www.suffolkgazette.com/), which most definitely are parody sites.
> I guess if you are not from the UK it might not be so bleeding obvious, but for us it is. Would I be correct in guessing that you are not?
You'd be incorrect with that assumption. I wouldn't say I had a great degree of national pride, but I would say I'm very British. I drink tea, eat angus steak, drink real ales and single malt scotches, and constantly moan about queues. However I never carry an umbrella around with me. :)
El Reg since day one has always had a somewhat less than deferential style of writing. The original founder, Mike Magee[0], intended the reporting style to be tabloid-like right from the get-go.
You either love or hate The Register, personally I've enjoyed reading it since 1997 and get a good chuckle each morning from some of their amusing bylines.
Not inaccurate specifically, but sometimes heavily steeped in editorial bias and dramatisations that an uninformed reader would easily draw inaccurate conclusions from the piece.
A lot of news networks operate this way. Since they all often publish the same stories from the same sources, they end up competing with each other to publish the most dramatised interpretation of the facts in an effort to win rating. The Register just stands out as being one of the worst offenders (or most successful - depending on your view point).
> b) I keep forgetting about 'El Reg', but whenever I return there, I enjoy my visit. It has a strong, unique tone which I find a breath of fresh air amongst all the dull, neutral news reporting out there.
They're basically the tabloid press of the tech news industry. But their writing style is fun, I'll grant them that.
We have various news feeds displayed on TV here at work, The Register is one of them. It's always fun to see some of their headlines/copy pop up, the style of writing is really great (especially when they bash the Government's IT projects with colourful language).
Fairly informal, occasionally jokey writing is a hallmark of the British press. For us, it’s a bit of a culture shock to read the mainstream US press which takes everything very seriously.
In fairness, from the article it's not actually clear whether the server literally had an uptime (as reported by the OS) of 18 years, or whether it had simply been in constant service (modulus power cuts) for 18 years.
Having read and enjoyed this thread and the later follow up thread on The Register, I was struck by the number of commenters who could not clearly remember the dates/machine types or who posted anachronistic descriptions.
People here forging ahead with innovative hardware, why not just record brief details of dates and setups in the back of a diary or something. In 30 years time, you'll be able to start threads like this!
I was sad that we had to shut it down, but we had to shut it down due to migrating our primary colo to another city and were going to retire all of the hardware. I'd been manually backporting bind fixes, building my own version, and had to do some config tweaks when Dan Kaminski released his DNS vulns to the world.
It is always a sad day to retire an old server like that, but 18 years... What a winner!
Edit:
But 1158 days for an old dell 1750 running RHEL4 isn't too bad considering it serviced all kinds of external dns requests for the firm. Its secondary didn't have the uptime due to constant power issues in the backup datacenter and incompetent people managing the UPS.
At my previous job I had a server for big, commercial version control system (an SCM, as people in selling those call them) that was running RHEL4 and had similar uptime. I remember my team celebrating round 1024 days of uptime.
Yes, I thought about that too, but I guess that UPS will not survive 18 years - the battery for sure, the device probably neither. Can you replace one without shutting down the server?
> Twenty eight seconds later, an electrical blackout had cascaded across Europe extending from Poland in the north-east, to the Benelux countries and France in the west, through to Portugal, Spain and Morocco in the south-west, and across to Greece and the Balkans in the south-east.
Parts of several countries were affected. Other parts had no outage. Depending on the view point, the blackouts a few years earlier in Italy and US were more widespread.
I used to run the FreeBSD box for sendmail.org. When I left that job in 2001 it had already been running for 2+ years.
Considering that the datacenter it was in is now the Dropbox office, I'm guessing it had to be shut down and moved at some point, but 2+ years seemed like a really long time even then!
I don't get the "even then" part of your statement. If anything, big uptimes are a thing of an older, more monolithic era when patching was less common. It was extraordinarily common to have multi-year uptimes on important servers, whereas these days they seem far less common
In my experience, the past decade has been a time of trying to be more rigorous than ever about regular patching, routine reboots, and respinning VMs to ensure that your provisioning systems work as intended, so you never again end up with these monolithic irreplaceable systems.
These days, the only time I find servers with big uptimes is when they've been neglected -- they're some old bastard child of some former employee or the ancient rickety crap some department is too afraid to touch... And even then, it doesn't raise an eyebrow until it's 1000+ days.
Because even 15 years ago it was rare for a server to have such a high uptime, since you usually had to unplug and move it every once in a while, and hardware still didn't last years and needed replacement.
I always had many Unix machines with high uptimes around. My home PC (Linux) typically reboots 2 or 3 times a year. My office DNS server has currently 411 days of uptime and is the best of my bunch ATM.
In 2002 I had installed on the machines under my guard some program that reported uptime to some website. One of my machines, an SGI Indy workstation, had a high uptime, about 2 years. Then a new intern came, and we installed him next to the Indy. Unfortunately, his feet under the desk pulled some cables and unplugged the Indy and broke my hopes of records :)
oh man, I have them so beat! I have a Slackware Linux box with similar specs. 200 MHz Pentium, 32MB of RAM, and I think I have an old 10GB barracuda 80pin SCSI drive in it connected to an Adeptec PCI SCSI card. Every so often the hard disk starts making a high pitch noise but throws no errors and the noise goes away after a few minutes. It sits on a UPS and I probably have an uptime of a few years on it now. It has been running nearly 24/7 since 1996! Only powered off when I needed to move the box from a home-office and a few rented offices over the years.
When it was in the basement of my home/office, I would sometimes hear it's disks wine as I was working out (lifting weights and such). It was even in my basement through parties in my early bootstrap years.
I originally bought it to run WinNT 4.0 for a new company a friend of mine and I bootstrapped. I would guess a couple years later is when I put Slackware on it. It's running a 2.0 linux kernel. It's not exposed to the public Internet.
It use to be a local Samba, DHCP, and DNS server for the company. I eventually upgraded to new hardware and left this server around for redundant backups. I develop software so copies of my git repositories find their way onto this box each night. It is in no way relied upon other than to call upon it out of convenience if another server is down or being upgraded, etc...
At one point the box was in the basement of my home when a small amount of water got to the basement floor and because the box sat just high enough on rubber feet, no damage. Occasionally I go back there and pull the cob webs off it.
There is no SSL on it. We still telnet into it or access the SMB shares for nostalgia. It's sort of a joke in the office these days to see how long it will last or if it will simply out last us.
Where do you all live where the power doesn't go out a couple times a year? I'm from South Florida and I'm jealous. Tropical climates are no fun for always-on computers. I guess I don't need to worry about static discharge when I build my boxes, so that's a plus.
Michigan - and yes, power is awful in certain parts. Especially in subdivisions where much of the power cables are above ground and strung through trees... So bad that I was considering a whole house generator at one point but eventually moved to an industrial park for expanding business.... I also run equally old Trippline UPSes. They run off 220v. Every couple years we have to replace the batteries in them and other than the batteries, they still work.
The UPSes pretty much keep things running for a couple hours - most of my mission critical hosting servers are are in data centers these days and for the office, our servers sit on the UPSes where at most we lose power maybe an hour every couple years in the office park. Much more often in a residential area.
I'm always curious why the US has so many power outages and brownouts. Here in the Netherlands, and as far as I know (but I could be wrong) in most of western Europe as long as you don't live in a very rural area, we have an outage maybe once every five years.
Great run for all-original equipment. I worked at Shell's Westhollow Research Center in the mid-90s. We handled the nightmare of standardizing the desktop space (for the first time ever).
A lab was decommisioning an instrument controller that had been running non-stop since they had first spun it up, fresh out of tge paking box, a decade previous.
And they had never backed up any of the data. Sure, the solution was the pretty straight forward use of a stack of floppies. It was still pretty nerve-wracking having a bunch of high-powered research scientists watching over my shoulder, "making sure" I got all their research data off the machine they were too smart to ever back up themselves. Good Times.
Anyone running old machinery that had DOS drivers would likely have older computers. I remember working on base seeing 386/486s in an aircraft hanger area that were so covered in grime I was astounded they were still used.
I set up a FreeBSD box at a computer shop I worked at in high school. We had a T1 and static IP, so I set up some routes to make it internet accessible (my boss wanted to use it to host pictures for his eBay transactions).
I set it up, threw it under a table in the corner with nothing but a power and ethernet cable, and moved on.
I was surprised when 5 years later he called to ask why it had stopped working. I told him where it was, he rebooted it, and it came back.
(My memory was a little fuzzy but I probably set it up in 2001-2002 and it ran until at least 2007-2008)
Is there a way to track uptime across kexec[1] restarts? That way you could differentiate between a hard reboot and "soft" one (ex: automated kernel upgrade). Having a system like that working for a 18 years would be insane!
Reminds me of the old Netware servers we used to have running file services and print queues for a few computer labs at a University I worked at. Netware was really stable and we only restarted them when some of the hard disks in the raid array were dying.
This is beautiful to me; it's ROI is off the charts from any kind of reasonable expectation. Keeping it cool certainly helped, and having it serve a role that could even exist for 18 years is another important factor.
18 years is a really good run. I had some white-box Cisco networking equipment that had 10 year uptime. I shut it down when we closed the office they were in.
Actually it's kind of odd, but I had a situation where a desktop computer would have a very specific boot issue (as in, if you restart it, it hangs, but if you power down, wait 5s then power up, it boots fine). Over time as I upgraded it, I replaced the hard drive, the motherboard, the power supply, the video card, the cpu, the memory, and the case. The problem remained.
I know it's anecdotal, but the machine seemed to retain some core unwillingness to restart even though every piece of it had been replaced separately. I believe it was the same "broom" even though every component was different from the original.
One of the developers remarked that while he was proud the system he worked on could deliver such uptimes, having an uptime of, say, three years, on a server, also meant that a) its hardware was kind of dated and b) it had not received kernel updates (and probably no other updates, either) for as long. (Which might be okay, if your system is well tucked away behind a good firewall, but is kind of insane if it is directly reachable from the Internet.)
Still, that is really impressive.