Dissecting QNX [pdf]

insulanian · on Sept 19, 2018

I still have the QNX demo disk image (including GUI!) which fits on a floppy disk.

It's a shame there's no comparable open source OS :(

wolfgke · on Sept 19, 2018

For those who want to play around with it:

> http://toastytech.com/guis/qnxdemo.html

> https://winworldpc.com/product/qnx/144mb-demo

Fnoord · on Sept 19, 2018

A word of caution: the first URL doesn't have TLS (HTTPS). None of the links to the demos have either. The links on Win World PC don't have any checksums.
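If a known-good digest were available from a trusted source, checking a downloaded image against it could look roughly like the sketch below. The function names are made up for illustration, and no digest is given here because, as noted above, the download pages publish none.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large images need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with one obtained from a trusted source (hypothetical)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A checksum only helps if it arrives over a channel the attacker cannot also modify, which is why the missing TLS matters as well.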

I actually remember using this demo floppy though, back in those days (the other impressive Unix-like OS I used back then was BeOS). It was the first time I stumbled upon Towers of Hanoi. Sadly the main developer of QNX passed away.

Koshkin · on Sept 19, 2018

I think the 32-bit MenuetOS still fits on a floppy (and is open source).

agumonkey · on Sept 19, 2018

graphical os smaller than your average favicon

Fnoord · on Sept 19, 2018

Related 34C3 talk [1]. Not sure if there's any news in this PDF. Skimming through, the CVEs it mentions are from 2017, with none from 2018 (34C3 was in December 2017).

[1] https://media.ccc.de/v/34c3-8730-taking_a_scalpel_to_qnx

dfox · on Sept 19, 2018

The article seems to be a more academic rephrasing of said talk.

ahati · on Sept 19, 2018

Anyone working on QNX based system here?

exDM69 · on Sept 19, 2018

Yes. Working for a HW vendor catering to automotive and embedded customers. QNX is one of the five or so OSes we work with.

insulanian · on Sept 19, 2018

What are the other ones?

exDM69 · on Sept 20, 2018

Linux (several flavors), Windows, QNX, GH Integrity, FreeRTOS and a handful of in-house and clients' proprietary special-purpose OSes.

wolfgke · on Sept 20, 2018

Interesting. I would have expected that also

- VxWorks (https://en.wikipedia.org/w/index.php?title=VxWorks&oldid=857...)

- Nucleus (https://en.wikipedia.org/w/index.php?title=Nucleus_RTOS&oldi...)

- perhaps ThreadX (the "spiritual successor of Nucleus") (https://en.wikipedia.org/w/index.php?title=ThreadX&oldid=852...)

are on this list.

Cyph0n · on Sept 19, 2018

Cisco's high-end router operating system, IOS-XR, has two flavors: an older one built on top of a QNX kernel and the current one built on top of Linux. The QNX version is still supported but is not in active development.

wolfgke · on Sept 19, 2018

Do you know any details on the reason why Cisco switched from QNX to Linux?

tyingq · on Sept 19, 2018

The official line...

"The 64-bit Linux infrastructure is the de facto standard that is being used across the industry today...So it gives us more development tools, more tool chains and also more access into the third party development ecosystem."

http://mobile.enterprisenetworkingplanet.com/netos/cisco-evo...

Cyph0n · on Sept 19, 2018

That seems to agree with what I've heard internally. Granted, I'm new, so I didn't go through the migration phase.

XR itself provides abstractions on top of QNX and Linux, so devs working on platform independent code are not really affected.

tyingq · on Sept 19, 2018

Maybe also because things like DPDK (or similar) made it possible for Linux to play at that scale?

Cyph0n · on Sept 19, 2018

The routers I am talking about cannot do packet processing on the CPU. We are talking about 3.2 Tbps per linecard with 10 linecards per router.

All data plane activity is done on custom in-house ASICs called network processors (NPs). The CPU only handles control plane traffic and general administration.
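As back-of-the-envelope arithmetic on the figures above: the sketch below multiplies out the aggregate bandwidth and estimates a worst-case packet rate. The minimum-frame assumption (64-byte Ethernet frames plus 8 bytes of preamble and a 12-byte inter-frame gap) is mine, not from the comment, and real traffic mixes differ.

```python
LINECARD_BPS = 3_200_000_000_000  # 3.2 Tbps per linecard (from the comment)
LINECARDS_PER_ROUTER = 10         # from the comment

# Assumption: minimum-size Ethernet frame on the wire,
# 64-byte frame + 8-byte preamble + 12-byte inter-frame gap.
WIRE_BITS_PER_MIN_FRAME = (64 + 8 + 12) * 8


def aggregate_bps(linecard_bps=LINECARD_BPS, linecards=LINECARDS_PER_ROUTER):
    """Total data-plane bandwidth of one fully populated router."""
    return linecard_bps * linecards


def worst_case_packets_per_second(total_bps):
    """Packet rate if every frame were minimum size (assumed worst case)."""
    return total_bps // WIRE_BITS_PER_MIN_FRAME


total = aggregate_bps()
print(f"aggregate: {total / 1e12:.0f} Tbps")
print(f"worst-case rate: {worst_case_packets_per_second(total):,} packets/s")
```

Under these assumptions the rate is in the tens of billions of packets per second, which is the scale that pushes the data plane onto dedicated hardware.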

jxramos · on Sept 19, 2018

One of my former employers was developing in QNX, as of 2017 they were still using some 2005 compiler for 32-bit.

exikyut · on Sept 20, 2018

What general space were they working in? Networking, for example?

jxramos · on Sept 20, 2018

Medical devices

lsiebert · on Sept 19, 2018

I don't myself, but my brother, who designs and builds stages for U2 and other musicians, uses it for automation (lights etc.)

ahati · on Sept 19, 2018

Does he use the AMX systems? That is Harman; Harman used to own QNX.

mmorriso · on Sept 19, 2018

Yes, I maintain a few legacy systems that still operate partially on QNX 4.25.

It did a few really cool things for its time, considering that TCP/IP networking was an optional add-on when we originally deployed.

bregma · on Sept 19, 2018

Yes. I work for QNX. I can answer any technical questions.

vmsp · on Sept 19, 2018

What's your goto intro/reference to QNX's internals?

bregma · on Sept 19, 2018

I have access to the code. There is no more reliable reference. The source code is remarkably well organized and clearly coded and has massive amounts of unit tests.

When the code is not enough, I go to http://www.qnx.com/developers/docs/7.0.0/#com.qnx.doc.qnxsdp...

ahati · on Sept 19, 2018

How big is the difference between QNX 7 and 6.4 at the kernel source level? The 6.4 source is available on GitHub from the time QNX was made open source.

bregma · on Sept 19, 2018

The biggest difference between 6.6 and 7.0 in the kernel is mostly that SMP is always enabled and 64-bit targets are first-class citizens. There are plenty of minor changes, but the kernel itself is very, very small.

There are plenty of changes to the userspace runtime, but the question was about the kernel.

Oh, also, 7.0 has an ISO 26262 certified variant. Not a technical difference, but an important one.

exikyut · on Sept 20, 2018

Hi! I have a couple questions I've been wondering about for years. Appreciate any insight/answers you can provide.

--

QNX was momentarily open for a little while until the license change with BlackBerry et al. My question: did this license change only apply to future QNX versions, or did it also retroactively apply to the "open"-sourced code as well?

My primary current interest in QNX stems from being fascinated with old and/or unusual operating systems. I'd love to be able to go fishing for 6.4 et al, maybe even compile what I find from source (or maybe not), and basically just play with the system to see how it works. If it's fine for me to go and find 6.4 and poke at it for noncommercial purposes - well, that'd be awesome to know. Obviously such a usage model would not incorporate any official agreement or warranty, and I understand that.

In a somewhat related vein, at some point I may find it useful to observe how QNX handles certain technical minutiae as part of my own (hopeful) OS development work. Obviously sourcing the latest versions of the QNX source for this purpose would offer support options, not to mention a more relevant codebase; but it would be great to know that I'd be able to safely make do with the older releases as long as I don't seek/expect any form of support.

--

I was unfortunately out of the loop with the QNX scene during the period it was open so I never got a chance to grab any of the repos. (And a quick search turned up what appears to be some QNX 6.4 bits and pieces on GitHub (as the previous comment hinted at) but it doesn't look very official, so I don't want to sift through it in case I waste my time.) Of course official repo access has since been closed, so I can't check that. So: I figure why not ask, what can it hurt.

When the repo was opened, did it include full commit history, or a large portion of it?

QNX 4.x is really cool. I managed to get an old copy working in a VM some time ago (took a bit of thinking; there's so little documentation out there). 6.x is nicer, but 4.x feels faster (which kind of makes sense).

--

I've been fascinated with QNX for years, and to be honest I want to say I find the "closed > yay open!! > closed" timeline incredibly frustrating; but besides wistfulness, this has also generated a fair bit of confusion regarding the current status quo.

Kenji · on Sept 19, 2018

I can tell you some of that. We ported all our applications from QNX6.5 to QNX6.6 and then QNX7.0.

They deprecated some POSIX calls like posix_spawn_file_actions_addopen (at least in QNX6.6 documentation they mention it's not fully implemented and it is indeed broken).

They made the QNX7.0 kernel instrumented-only (there are no longer non-instrumented versions).

They changed lots of low level stuff that is outside the kernel, like throwing out photon and replacing it with screen library, completely replacing the PCI server, making changes to the console, security patches, and so on. But since it's a microkernel, this is mostly in userland.
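For readers who have not met the call mentioned above: `posix_spawn_file_actions_addopen` asks for a file to be opened on a given descriptor in the child before the new program runs. Python exposes the same file action through `os.posix_spawn` as `os.POSIX_SPAWN_OPEN` (Python 3.8+). The sketch below runs on an ordinary POSIX host, not on QNX, and says nothing about how QNX 6.6 behaves; the helper name is made up for illustration.

```python
import os
import sys
import tempfile


def spawn_with_stdout_to(path, argv):
    """Spawn argv with the child's stdout (fd 1) opened on `path`.

    Returns the child's exit code, or -1 if it did not exit normally.
    """
    file_actions = [
        # Equivalent of posix_spawn_file_actions_addopen(fd=1, path, flags, mode).
        (os.POSIX_SPAWN_OPEN, 1, path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644),
    ]
    pid = os.posix_spawn(argv[0], argv, dict(os.environ), file_actions=file_actions)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        out_path = os.path.join(tmp, "child-stdout.txt")
        code = spawn_with_stdout_to(out_path, [sys.executable, "-c", "print('hello')"])
        with open(out_path) as handle:
            print(code, repr(handle.read()))
```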

jayalpha · on Sept 19, 2018

I found this impressive https://membarrier.wordpress.com/2017/04/12/qnx-7-desktop/

TheSoftwareGuy · on Sept 19, 2018

Yup, my company's product is actually mentioned on the first page of the link

cornellwright · on Sept 19, 2018

I've used it on medical devices.

Kenji · on Sept 19, 2018

Yes. I work with QNX6.6 daily. I also worked with QNX6.5 and QNX7.0. It's a real blast to work with the microkernel. It's a really cool system, and the kernel stability is excellent. Feels good that when a low level driver crashes, your system just goes on as if nothing happened (except, of course, the applications that need this driver). Speaking of drivers, the driver situation on QNX is a bit sad, simply because there are few drivers and the ones that exist aren't as high quality and well tested because QNX is niche. Still, if there were a QNX environment as progressive as Ubuntu Linux (concerning GUI, drivers, etc.) and open source, I'd definitely use QNX as my main OS.

ahati · on Sept 19, 2018

Cannot agree more. Once you know QNX, nothing matches up anymore. Linux feels like a dinosaur.

_ofdw · on Sept 19, 2018

>QNX is niche

What is the most mainstream or perhaps "least niche" of the real time embedded OSes?

anta40 · on Sept 19, 2018

Hmm... BlackBerry 10, perhaps? https://help.blackberry.com/id/blackberry-security-overview/...

BlackBerry nowadays is... well let's say the mobile world is basically divided into 2 sides these days: Android and iOS :p

jjoonathan · on Sept 19, 2018

What? BlackBerry is realtime? Why?

monocasa · on Sept 19, 2018

So you have the option of running the baseband on the application processor to save costs if need be, among other reasons.

ahati · on Sept 19, 2018

You need a hypervisor for that. QNX added that later.

dfox · on Sept 19, 2018

You don't. Symbian EKA2 platforms (e.g. S60 3rd edition and later) used to run the baseband as userspace threads on the same core as user applications. I would not be that surprised if that was the main enabler for the Nokia E51/52, i.e. a full-featured smartphone in an executive-phone form factor (small and, above all, thin candy-bar).

gnulinux · on Sept 19, 2018

There is linux-rt, a fork of linux but I don't think it's competitive in the industry.

roymurdock · on Sept 19, 2018

VxWorks/FreeRTOS

There are also more mainstream real time Linux distros e.g. Yocto Linux

nickpsecurity · on Sept 19, 2018

Check this out:

https://membarrier.wordpress.com/2017/04/12/qnx-7-desktop/

sidkshatriya · on Sept 19, 2018

Opening the link in Google Chrome 68 shows the PDF front page momentarily and then the browser shows a blank screen.

Is anybody else encountering this problem?

namdnay · on Sept 19, 2018

https://bugs.chromium.org/p/chromium/issues/detail?id=870404

sidkshatriya · on Sept 19, 2018

Interestingly when I relaunched Chrome and it auto-updated to Chrome 69 I am able to see the PDF without a problem. Not sure where the issue lies now.

als0 · on Sept 19, 2018

I have this problem too with Chrome on macOS. Refreshing the page makes it appear, but this happens frequently enough to be annoying.

ahati · on Sept 19, 2018

Very insightful. QNX 7 disables ASLR, maybe because of ASIL support.

saagarjha · on Sept 19, 2018

Are you talking about Automotive Safety Integrity Level? If so, how does this have anything to do with ASLR?

kristoffer · on Sept 19, 2018

I don't know why they disabled ASLR, but safety critical systems (and functional safety people) tend to avoid randomization...

rurban · on Sept 19, 2018

Because such small embedded systems tend to avoid a stack and recursion and, moreover, tend to disable malloc altogether.

Variables and places are predefined. ASLR is a problem there, not a solution.

saagarjha · on Sept 19, 2018

Having code that behaves differently if it's loaded at different addresses seems like a bug. So by not doing that, aren't you just masking it?

JdeBP · on Sept 19, 2018

Presume that you are a software engineer. Your career and other people's lives depend on your producing systems that operate safely. You also have to make risk analyses and meet performance goals.

Your operating system executes different program images for every successive execution of your program, picked in an unpredictable manner.

How do you prove that every possibility passes the safety tests? How do you measure the risk of this random selection? How do you know when you have done enough simulation?

How do you match up software randomization with the ISO 26262 concept that all software faults are systematic and not random as (some) hardware faults are?

How do you prove that memory allocation and execution always meet performance goals? How do you construct and perform reproducible performance tests? How do you demonstrate that your measurements are meaningful?

Software engineering in this case involves thinking about all of these questions and more besides.

* https://hal.archives-ouvertes.fr/hal-01375451/document

* https://www.usenix.org/sites/default/files/conference/protec...

It appears (to me, at least) that the current state of the literature on ASLR is that it is treated as a succession of theoretical arms races, which new defence militates against which new attack, and almost no attention is paid to the concerns of actually deploying it in a larger system; and the current state of the literature on functional safety is simply "we will assume that there are no randomization processes in the software" (from an actual paper presented at ESREL 2016).
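The "different program images for every successive execution" point can be observed on a desktop system. The sketch below starts a fresh interpreter several times and records where the C library's `printf` ended up in each process. It assumes a Linux-like host where `ctypes.CDLL(None)` works; the helper name is made up, and on a system with ASLR disabled the addresses would simply repeat.

```python
import subprocess
import sys

# Code run in each child: look up printf and print its address as an integer.
CHILD_SNIPPET = (
    "import ctypes;"
    "libc = ctypes.CDLL(None);"
    "print(ctypes.cast(libc.printf, ctypes.c_void_p).value)"
)


def sample_printf_addresses(runs=3):
    """Return the address of printf observed in `runs` separate processes."""
    addresses = []
    for _ in range(runs):
        result = subprocess.run(
            [sys.executable, "-c", CHILD_SNIPPET],
            check=True,
            capture_output=True,
            text=True,
        )
        addresses.append(int(result.stdout.strip()))
    return addresses


if __name__ == "__main__":
    for address in sample_printf_addresses():
        print(hex(address))
```

Each distinct address corresponds to a distinct memory layout that a safety case would, in principle, have to cover.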

wolfgke · on Sept 19, 2018

> It appears (to me, at least) that the current state of the literature on ASLR is that it is treated as a succession of theoretical arms races, which new defence militates against which new attack, and almost no attention is paid to the concerns of actually deploying it in a larger system; and the current state of the literature on functional safety is simply "we will assume that there are no randomization processes in the software" (from an actual paper presented at ESREL 2016).

Thanks for your explanation. To give a slightly different perspective on the quoted paragraph: mitigations such as ASLR etc. do not protect against security bugs, they just make them more "inconvenient" to exploit. So the "average script kiddie" will probably not be able to write an exploit for them. On the other hand, for well-funded agencies (think 3-letter agencies), these are no serious hurdles. In this sense, mitigations do not improve security in the sense of "fewer security holes". Instead their (probably unintended, though not undesired) consequence is that mostly well-funded agencies are able to exploit security holes. Whether this new situation is good or bad for software security is up to the reader to think about.

0xcde4c3db · on Sept 19, 2018

To be more specific about "more inconvenient": I believe part of the intended effect of ASLR is to make ROP exploit attempts typically crash the process instead of successfully gaining control. This (ideally) brings admin attention to the system, which attackers generally want to avoid.

wolfgke · on Sept 19, 2018

> To be more specific about "more inconvenient": I believe part of the intended effect of ASLR is to make ROP exploit attempts typically crash the process instead of successfully gaining control.

Keep in mind that before ASLR came, there was (and still is) DEP and its claims that lots of classes of attack were now impossible. The end of this story was that ROP was invented and hardly anything has changed, except that ROP code is much more tedious to write (i.e. no problem for well-funded attackers).

Now we have ASLR and you are probably right that now ROP exploits lead to process crashes instead. But attackers have already invented new techniques for circumventing ASLR, such as return-to-plt, GOT overwrite or GOT dereferencing. Again making it more inconvenient for script kiddies to write exploits, but again no problem for an attacker who can throw lots of money and people at the problem.

reitanqild · on Sept 19, 2018

> But attackers have already invented new techniques for circumventing ASLR, such as return-to-plt, GOT overwrite or GOT dereferencing. Again making it more inconvenient for script kiddies to write exploits, but again no problem for an attacker who can throw lots of money and people at the problem.

Helmets and bulletproof vests are no match for powerful rifles.

I'm a bit tired of this reasoning here on HN: If it isn't perfect it is worthless.

I think I can see reasons why a vendor might want to avoid ASLR in safety critical systems.

But we shouldn't talk down decent protection techniques that will often save us.

wolfgke · on Sept 19, 2018

> I'm a bit tired of this reasoning here on HN: If it isn't perfect it is worthless.

This argument (that "If it isn't perfect it is worthless" does not hold) is suitable for many topics in life, but in my opinion not for IT security. I can conceive that this might be one reason, why so many people (explicitly including politicians) make such bad decisions about IT security.

I might be somewhat paranoid regarding this topic (which is not a bad trait if you want to work in this area), but let me give my arguments:

First: the fight for secure systems is deeply asymmetric. The attacker side just needs one working exploit, while the defender side has to ensure that there exists no security hole. This strong asymmetry really makes it necessary that the security is as perfect as possible.

Second: if the device is connected to the internet, everyone/every device that exists in the world can be an attacker. So what you are fighting against is the whole world. Or in other words: the security of the system that you use has to withstand the smartness of some of the smartest people in the world.

Let it be stated clearly that this fight is not as hopeless as it looks based on these arguments: for designing the security of your system, you can resort to the knowledge of many really, really smart people, too: this is what the various standards (e.g. for cryptography) are about. What you cannot afford is to tolerate the slightest bit of imperfection in the security architecture of the system.

TLDR: In security, at least "If it isn't at least nearly perfect, it is worthless" does indeed hold.

saagarjha · on Sept 19, 2018

Cryptography isn't perfect; someone could always guess your private key. But that doesn't make it useless, since you're hoping that it's just sufficiently improbable that nobody in their right mind will even try doing it.
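To put a number on "sufficiently improbable", here is a rough sketch. It treats the key as a uniformly random secret of a given bit length, which is a simplification for real private keys, and the guess rate of one trillion per second is an arbitrary assumption.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def chance_of_single_guess(key_bits):
    """Probability that one uniformly random guess hits a key_bits-bit secret."""
    return 1 / 2 ** key_bits


def expected_years_to_guess(key_bits, guesses_per_second):
    """Expected brute-force time: on average half the keyspace must be tried."""
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


# Assumed attacker speed: 10**12 guesses per second against a 128-bit secret.
years = expected_years_to_guess(128, 10 ** 12)
print(f"about {years:.1e} years on average")
```

Under these assumptions the expected time is on the order of 10^18 years, so guessing is possible in principle and irrelevant in practice.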

wolfgke · on Sept 19, 2018

> Cryptography isn't perfect; someone could always guess your private key.

For the accepted standards, even the smartest people working in this area have not yet found a method to find the private key sufficiently fast (at least such a method has not been published). So to the best of our current knowledge, those methods are at least very near to the perfection that is possible with our current technology.

cdcfa78156ae5 · on Sept 19, 2018

> Cryptography isn't perfect; someone could always guess your private key.

Cryptography is a branch of mathematics, and cryptographic systems can be formally proved to have certain properties, such as being unable to derive the private key from the content of the encrypted message. That the private key can be guessed is a trivial observation, and a bad argument for dismissing formal proofs. ASLR is a hack on a hack that does not tell you anything about the formal properties of the system.

wolfgke · on Sept 19, 2018

> Cryptography is a branch of mathematics, and cryptographic systems can be formally proved to have certain properties, such as being unable to derive the private key from the content of the encrypted message.

A small correction: All those proofs (if they exist) are relative to complexity-theoretical conjectures that are (ideally) widely believed to be true, but open. The only system that I am aware of where an "absolute" security proof exists is OTP, but this is hardly suitable to use in practice.
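A minimal sketch of a one-time pad shows why it is awkward in practice: the key must be truly random, as long as the message, and never reused. The function names are made up, and `secrets.token_bytes` stands in for a true random source here as an assumption.

```python
import secrets


def otp_encrypt(message: bytes):
    """Encrypt with a fresh random key exactly as long as the message."""
    key = secrets.token_bytes(len(message))
    ciphertext = bytes(m ^ k for m, k in zip(message, key))
    return key, ciphertext


def otp_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key recovers the message."""
    if len(key) != len(ciphertext):
        raise ValueError("key and ciphertext must have the same length")
    return bytes(c ^ k for c, k in zip(ciphertext, key))


key, ciphertext = otp_encrypt(b"QNX")
assert otp_decrypt(key, ciphertext) == b"QNX"
```

Every message needs its own key of equal length delivered over a secure channel, which largely moves the problem rather than solving it.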

wolfgke · on Sept 19, 2018

> Having code that behaves differently if it's loaded at different addresses seems like a bug.

Why? This only sounds like a bug to me if it is intended to be position-independent code (PIC).

A reason why in safety-critical code ASLR is avoided is that it introduces another source of non-determinacy and potential bugs, which you want to avoid.

UPDATE: So you really want to keep the system as simple and small as possible and avoid adding anything to it that can introduce new bugs.

nwmcsween · on Sept 19, 2018

At what point do you want general purpose code to be position dependent?

dfox · on Sept 19, 2018

When you program for a platform where writing PIC involves ugly hacks with measurable performance impact, for example i386.

nwmcsween · on Sept 20, 2018

It's not i386 that's the issue, it's the ABI

ahartmetz · on Sept 19, 2018

Would you rather die to expose a masked error or live and leave in the masked error? That's what rules for critical systems are about. The time to fail fast is before production.

cpeterso · on Sept 19, 2018

What is ASIL? Automotive Safety Integrity Level? How does ASIL relate to ASLR?

ahati · on Sept 19, 2018

Yes, with ASLR, qnx loses the predictability. BTW, qnx7 is ASIL-D on SEooC.