Clear Linux Project (clearlinux.org)
164 points by Merkur on May 19, 2015 | 56 comments



Just what we need: a Linux distro whose main goal is apparently to promote Intel products. The language used to describe it makes this quite clear: "The goal of Clear Linux OS, is to showcase the best of Intel Architecture technology...". This is a blatant attempt to exclude ARM, which is gaining Linux market share. Whatever innovation they might bring to the table, I will avoid it purely on the basis that its aim is to benefit Intel rather than the user. Dot org my ass.


The site is weak, but you should check the LWN link given in this thread. They have actually done some cool stuff.


I don't think they really expect you to want to use it directly. As it says, it's a showcase. But a lot of the technology might make it into other distros in more generic forms.


Not the first time Intel has done this.

Moblin was started because MS balked at making Windows for an Intel chip that didn't offer PCI enumeration.


One issue where "pure" containers have an advantage over VMs is IO.

For network-intensive workloads, there is a choice between the efficiency of SR-IOV and the control & manageability of a virtual NIC like virtio-net. In order to get efficiency, you need to use SR-IOV, which (the last time I checked) still made lots of admins nervous when running untrusted guests. Sure, the guest could be isolated from internal resources via a VLAN, but it could still be launching malicious code onto the internet, and it may be difficult to track its traffic for billing purposes, especially if you want to differentiate between external & internal traffic. SR-IOV NICs also have a limited number of queues and VFs, so it is hard to over-commit servers. So in order to maintain control of guests, you end up doubling the kernel overhead by using a virtual NIC (e.g., virtio-net) in the VM and a physical NIC in the hypervisor. Now you have twice the overhead, twice the packet pushing, more memory copies, VM exits, etc.
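For a rough idea of what the SR-IOV side looks like (a sketch only; the interface name and VF count are made up), VFs are typically carved out through sysfs and can then be handed to guests, optionally pinned to a VLAN:

    # create 4 virtual functions on a hypothetical SR-IOV-capable NIC
    echo 4 > /sys/class/net/eth2/device/sriov_numvfs
    # confine VF 0 to VLAN 10 before passing it through to a guest
    ip link set eth2 vf 0 vlan 10
    # the PF now lists its VFs and their settings
    ip link show eth2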

The nice thing about containers is that there is no need to choose. You get the efficiency of running just a single kernel, along with all the accounting and firewalling rules to maintain control & be able to bill the guest.


SR-IOV should not really make you nervous; it uses the IOMMU. Billing might have some issues, I guess.

There are higher-performance virtual network setups; e.g. see http://www.virtualopensystems.com/en/solutions/guides/snabbs...

Container networking has overheads too: the virtual ethernet pairs and the NATing are not costless at all, and most people with network-intensive applications are allocating physical interfaces to containers anyway.
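For reference, the veth-pair-plus-NAT setup in question looks roughly like this (a sketch; the names and addresses are arbitrary, and it assumes a named network namespace already exists):

    # create a veth pair and move one end into the container's network namespace
    ip link add veth-host type veth peer name veth-ctr
    ip link set veth-ctr netns mycontainer
    ip addr add 10.0.0.1/24 dev veth-host
    ip link set veth-host up
    # typical NAT rule so the container's traffic can leave the host
    iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -j MASQUERADE

Every packet crosses the veth pair and the NAT table, which is where that overhead comes from.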


LWN has an article about it: https://lwn.net/Articles/644675/



Is it bad to send an LWN Subscriber link to a large number of people? Will they get upset? They mention that they'll remove the feature if it gets abused.


From https://lwn.net/op/FAQ.lwn#slinks:

"Where is it appropriate to post a subscriber link? Almost anywhere. Private mail, messages to project mailing lists, and blog entries are all appropriate. As long as people do not use subscriber links as a way to defeat our attempts to gain subscribers, we are happy to see them shared."



thx!


How they purport to do packaging is interesting, but I'm not sure it will work well in the end. Having "bundles" that contain immutable sets of packages sounds good from a stability point of view, but unless they are entirely self contained, you'll undoubtedly run into a library that you need to update for one bundle that then forces you to update another entire bundle. If each bundle is entirely self contained (allowing it to have its own set of libraries), you're essentially recreating a static binary through package semantics. This comes with the usual downsides of static binaries.

I'm interested in seeing it tried though. The learning is in the doing.


Self contained packages are not a new idea. For example, PC-BSD has been doing this for years, via their PBI package format. See the description of PBI here: http://www.pcbsd.org/en/package-management

I think PBI does de-duplication at the package manager level by manipulating hard-links to common files, rather than installing multiple copies.
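I don't know PBI's exact internals, but the general idea of checksum-then-hardlink dedup can be sketched with nothing more than md5 and ln (illustrative only, with made-up paths; not PBI's actual implementation):

    # compare two copies of the same library shipped by two PBIs
    md5 /usr/pbi/app1/lib/libfoo.so.1 /usr/pbi/app2/lib/libfoo.so.1
    # if the checksums match, replace one copy with a hard link to the other
    ln -f /usr/pbi/app1/lib/libfoo.so.1 /usr/pbi/app2/lib/libfoo.so.1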


> I think PBI does de-duplication at the package manager level by manipulating hard-links to common files, rather than installing multiple copies.

Which is, itself, a bad reinvention of Plan 9's Venti storage system. Having one, or two, or a million files on disk containing the same data should take up as much space as having just one. "Hard links" are a policy-level way to express shared mutability; deduplication of backing storage, meanwhile, should be a mechanism-level implementation detail.


ZFS has support for block-level deduplication and it comes with heavy memory and performance requirements. File-level deduplication with hard links is lightweight and requires no special support (besides a filesystem which supports hard links, obviously).
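For comparison, turning on ZFS dedup is a one-liner, but the dedup table wants to live in RAM, which is where the cost comes from (a sketch; the pool and dataset names are made up):

    # enable block-level dedup on a dataset
    zfs set dedup=on tank/packages
    # see how much it is actually saving
    zpool get dedupratio tank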


Well, there is something to be said for the 'worse is better' approach.


Self contained packages are not the problem, individualized libraries shipped along with those packages are.

How does PBI handle minor library version differences then? If one package provides and uses mylib-1.3.1 and another provides and uses mylib-1.3.5, how is that distinguished at the core library level (the plain .so file)? My understanding of what Clear Linux is attempting allows this level of granularity, to ensure a package (really an amalgamation of individual packages in the sense of most current unixy distros) is functional and updated as a whole.


I believe that PBI works the same way. It allows different packages to independently use multiple versions of the same lib simultaneously. There is deduplication via hardlinks ONLY when the checksums match.

That's what made it attractive to me: I've painted myself into a corner several times when trying to install Ubuntu PPAs that want conflicting versions of shared libs.


I'm almost of the mindset that just having userspace apps as lxc containers with everything an app needs may be better for the most-used applications...

Given how relatively cheap even fairly big SSDs are, is it really worth the storage savings for your browser to share a couple .so files with your mp3 player?

I actually like how PC-BSD PBI packages work... given the number of times solutions have been devised to work around the issue and reduce space, I'm not sure it's always worth it. At least not in the desktop space.


Imagine the problem of tracking down all the different versions of a library when an exploit is found. If you have 20, or even 50 different apps that bundled openssl, imagine the hassle of making sure each one was vetted and updated as needed, not to mention the delay in getting all the different packages rebuilt and pushed (which may be a small delay, or may not, depending on the vendor).


You regularly use 20 or 50 end-user applications that use openssl?

I'm not talking the low-level OS applications here... I'm talking end-user applications and major exposed services.

For that matter, each of those applications needs to be updated, vetted and packaged... it's a matter of the level and completeness of packages.


What's considered an end user application? Installed languages (Perl, Python, Ruby, etc)? Would you consider all of regular userspace one "app", or split it into multiple chunks (dev tools, web tools, etc)? Wget, curl and chrome all use openssl.

smtpd? httpd? sqld? sshd? ntpd?

This may be illuminating; it's the list of RPMs that have a requirement containing the string ssl:

    # for RPM in `rpm -aq --qf '%{NAME}\n'`; do rpm -qR $RPM | grep -iq ssl && echo -n "$RPM "; done

    python-ldap libssh2 mailx Percona-Server-server-55 abrt-addon-ccpp libfprint
    qpid-cpp-client-ssl perl-Crypt-SSLeay ntp httpd-tools openssh openssl-devel
    pam_mysql redhat-lsb-core Percona-Server-client-55 sssd-common perl-IO-Socket-SSL
    qt squid openssl098e elinks ipa-python python-nss compat-openldap
    Percona-Server-shared-55 systemtap-runtime perl-Net-SSLeay python-libs
    Percona-Server-shared-compat openssl systemtap-client qpid-cpp-server-ssl
    certmonger python-urllib3 openssh-server nss_compat_ossl openldap percona-toolkit
    pyOpenSSL git sssd-common-pac wget ntpdate openssh-clients openssl systemtap-devel
    postfix nss-tools perl-DBD-MySQL libcurl curl sssd-proxy libesmtp

That's just for SSL, which, while it's used in many applications and services, is generally limited to items that communicate externally. What about when it's a core library that everything uses? tzdata updates often... we want correct time representations, right? Gzip is used by a lot of applications. What about glibc?

It's not an easy problem, but that's why I'm interested in how it turns out.


If you're talking about a server, then the exposed application servers (those listening on ports) would be better served as an abstraction from the core OS (LXC is a decent collection of solutions for that, as are BSD jails, or even Solaris containers)...

As to end-user applications, if you are developing using Perl, Python, etc., then developing against the host is your choice... that said, deploying the result separately might be a better option, and a container is a valid way to do it.

You bring up a great example... OpenSSL has a minor change, you upgrade that package, and everything runs fine, except a ruby gem you rely on is broken, and your system is effectively down, even if that ruby app isn't publicly facing, and only used internally. If they were in isolated containers, you could upgrade and test each app separately without affecting the system as a whole. Thank you, you've helped me demonstrate my point.

That said, in general there aren't dozens of applications run on a single system that matter to people... and most of those that are could well be isolated in separate containers and upgraded separately. Via docker-lxc, your python apps can even use the same base container and not take as much space... same for ruby, etc. And when you upgrade one, you don't affect the rest.

I've seen plenty of monolithic apps deployed to a single server, where in practice upgrading an underlying language/system breaks part of it.

Myself, for node development, I use NVM via my user account (there are alternatives for perl, python, and ruby), which allows me to work locally with portions of the app/testing, but deployment is via a docker container, with environment variables set to be able to run the application. It's been much smoother in practice than I'd anticipated.
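Concretely, the deployment end of that looks something like the following (a sketch; the image name, port, and variables are placeholders):

    # run the app image with its runtime configuration injected as environment variables
    docker run -d \
      -e NODE_ENV=production \
      -e DB_HOST=10.0.0.5 \
      -p 3000:3000 \
      mynodeapp:latest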


I'm not arguing that there's no advantage to containerized packaging, just that it also comes with its own set of problems. I'm not sure what weighs more at this point: the advantages of good encapsulation, or the problems caused by the system itself being harder to query. I'm not sure any level of encapsulation is worth it if it leads to a service being exploited when it otherwise wouldn't be, due to missing just one instance where some underlying library upgrade was missed. But this is a solvable problem; it's just a matter of tooling.

I'm not arguing against more containers, I'm just making the point it's not all rainbows and kittens. There are problems, but if we address them and solve them, we come out well ahead of where we were before.


If they are micro-VMs, container-style, I don't think they will have such a need to share any libraries (in theory, at least).

I mean, it is possible to completely isolate them all.

It may end up very heavy, though. But (and I could be wrong on this) with the constant growth of storage capacity, network bandwidth, and RAM, and the progress made to lighten "containers", I don't think this "heavy" downside I see in immutable infrastructures will be a real issue in the future.


No, but identifying which of your 20 micro-VMs are susceptible to the next OpenSSL exploit, and rolling out the fixes, may be. It's both simpler in some aspects and more complex in others to have local library versions for every app/service. Managing service prerequisites becomes easy and managing service feature updates becomes easier than it was, but managing service security updates becomes more complex. Juggling these different needs and capabilities is where it gets interesting.
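As a rough illustration of the auditing chore (assuming Docker-style containers, each shipping its own copy of openssl):

    # report the bundled openssl version in every running container
    for c in $(docker ps -q); do
      echo -n "$c: "
      docker exec "$c" openssl version 2>/dev/null || echo "no openssl in PATH"
    done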


I get your point.

I guess it just leads to a turning point where end users won't have to worry about security updates for x or y library, but more about updating the application they're using. In the case where you use containers/micro-VMs, if there is a security update to apply, the container "maintainer" would be in charge of pushing the security update; then you just need to update your container.

I'm not sure which one is the most constraining: dealing with conflicts, or being careful to rely on well-maintained "containers".

I guess, for production environments, the second option looks like a wise choice.


I just tried it. It is fast.

It's a VM really, but packaged like a container. On my laptop, it starts about as fast as a Docker container, i.e. in less than a second.

This is quite impressive.


I'm not so sure that running a container is directly analogous to using it in a VM.


When using something like LXD I can barely tell the difference.


While namespacing allows you to do various things, most container setups are VM replacements; whether they run a full OS or not, they're used for the same purpose in the end (i.e. resource separation).


> most container setups are VM replacements; whether they run a full OS or not, they're used for the same purpose in the end (i.e. resource separation)

No they are not. A VM is a completely different system, while a container is a packaged application.

VMs provide an awful lot more than just resource separation... security and isolation being at the top of the list.

The problem we see here is an awful lot of people think a container is a drop-in replacement for a VM, when it is usually not.


> No they are not. A VM is a completely different system, while a container is a packaged application.

I think you have a misunderstanding of terms here, possibly confused by all the fuss around Docker. A container is nothing more than a virtualization technology at the OS level[1]. What you are talking about is something like rkt, which is how to run an app inside a container[2]. From the point of view of your app there is no difference between a VM, OpenVZ, or LXC.

[1] https://en.wikipedia.org/wiki/Operating-system-level_virtual...

[2] https://github.com/coreos/rkt/blob/master/Documentation/app-...


> What you are talking about is something like rkt, which is how to run an app inside a container

Rocket, and Docker, yes.

> I think you have a misunderstanding of terms here, possibly confused by all the fuss around Docker

I agree Docker has spread a lot of FUD, causing great confusion about what Docker can do, but also what containers are.

> A container is nothing more than a virtualization technology at the OS level

Not quite. A container was intended to be the first (for Linux) truly portable application. You create an application, "containerize" it, then you can run that application on any system with minimal effort (an Ubuntu app running on CentOS, etc.).

Containers are not virtualizing anything, and that is the entire point. They remove the virtualization/emulation overhead of a hypervisor and instead run your application at native speed on the native system.

Docker has tried to make a do-all application which then provides process isolation and other things to add "Security", but at the end of the day, an app running in a container on your system can still negatively impact other containers and/or the host OS (if your container needs to read/write to /etc for example).

In a VM, everything is isolated because it's literally its own OS running on (what it thinks is) its own hardware. An app can destroy the VM, or the VM can be exploited, but nothing outside the VM can be affected.

Xen/KVM have zero comparison to things like Rocket, and Docker.


That's an incorrect understanding on many levels, unfortunately.

About VMs:

A lot of VMs use direct host communication mechanisms such as paio, virtio, etc. These are close to running on the host. Then, KVM and Xen both have bugs, and these can be exploited to reach the host OS as root. It's just that they have a much smaller attack surface.

About namespacing (if we're into nitpicking, let's use the actual tech name, shall we?):

Most namespaces are used to create so-called containers which run entire OS images (LXC, systemd-nspawn, Docker, etc. are all used for that by default), in combination with chroot() and other technologies; even though you can also just call the namespacing functions within a process, or run a limited set of programs in a chroot.

It does not matter that the kernel is shared; if you bring in your entire userland, then from a user perspective containers == VMs. Sure, from a technical point of view it's different, but that's EXACTLY my point: most users use both the same way.


> Containers are not virtualizing anything, and that is the entire point.

What is it when a process sees a different process tree, different filesystem tree, and different network than the host?

http://man7.org/linux/man-pages/man7/namespaces.7.html
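You can see it directly with util-linux's unshare; a shell started in new PID and mount namespaces with a private /proc only sees its own processes:

    # start a shell in new PID and mount namespaces with its own /proc
    sudo unshare --pid --fork --mount-proc /bin/bash
    # inside the new namespace, ps shows only this shell and ps itself
    ps -ef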


As the link you provided states, it's called Namespacing.

> A namespace wraps a global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of the global resource. Changes to the global resource are visible to other processes that are members of the namespace, but are invisible to other processes. One use of namespaces is to implement containers.

Virtualization via hypervisor does a lot more than just namespace isolation.[1]

> The basic idea behind a hypervisor based virtualization is to emulate the underlying physical hardware and create virtual hardware (with your desired resources like processor and memory). And on top of these newly created virtual hardware an operating system is installed. So this type of virtualization is basically operating system agnostic. In other words, you can have a hypervisor running on a windows system create a virtual hardware and can have Linux installed on that virtual hardware, and vice versa.

> So the main basic thing to understand about hypervisor based virtualization is that, everything is done based on a hardware level. Which means if the base operating system (the operating system on the physical server, which has hypervisor running), has to modify anything in the guest operating system(which is running on the virtual hardware created by the hypervisor), it can only modify the hardware resources, and nothing else.

[1] http://www.slashroot.in/difference-between-hypervisor-virtua...


After reading the overview and features, I'm left wondering:

* what tangible benefits would I get from using Clear Linux over my own heavily customized/handrolled linux server?

* how does the update system handle breakage/conflicts?

* are any of Intel's changes likely to make it into other existing distros or kernels?


This was linked elsewhere in the comments. I think it answers some of your questions:

http://lwn.net/SubscriberLink/644675/5be656c24083e53b/


I just tried it on my desktop; woot, it's super fast!

    [ 0.000000] KERNEL supported cpus:
    [ 0.000000]   Intel GenuineIntel
    [ 0.000000] e820: BIOS-provided physical RAM map:
    ...
    [ 1.245851] calling fuse_init+0x0/0x1b6 [fuse] @ 1
    [ 1.245853] fuse init (API version 7.23)
    [ 1.246299] initcall fuse_init+0x0/0x1b6 [fuse] returned 0 after 431 usecs


Read the hypervisor part of the LWN article: https://lwn.net/SubscriberLink/644675/5be656c24083e53b/

quote: "With kvmtool, we no longer need a BIOS or UEFI; instead we can jump directly into the Linux kernel. Kvmtool is not cost-free, of course; starting kvmtool and creating the CPU contexts takes approximately 30 milliseconds."


Anyone know what they might be doing for the speed increase?


Seeing as it's Intel, I might guess that they either have extra instructions in the instruction set that they know about, or other optimizations that they know to look for. Seeing as they only support 4th generation and E5 v3... it wouldn't surprise me.


I wonder if it builds with ICC? Seems like a matter of pride that they should get that working.


That was my first guess at "how'd they make it faster?" ICC is sometimes a shockingly better compiler (read: the compiled code is faster).


I don't quite understand what this is: is it a Linux distribution that can have a graphical interface like GNOME 3? My question is essentially: is it more like Ubuntu or more like Docker?


More like CoreOS


Would very much like to see a comparison of Clear Containers and LXD. Would also like to know why Intel decided to do their own thing and not just help with the LXD project.


Unless I am not getting something, are the developers expected to manually compile everything that isn't in a bundle?

And then recompile again whenever a bundle gets updated?


Correct me if I'm wrong, but shouldn't 'Cloud' have a lowercase C if it's not a product?


I didn't find very much information about it... yet. :( Anyone played with it?


The download link didn't work for me in Firefox for some reason; I had to paste the link:

https://download.clearlinux.org/


I am surprised they didn't go down the container route for OS updates like CoreOS. I think I like that approach.


This does use containers, and in fact they have made some interesting modifications to the rkt container runtime to use KVM isolation instead of just namespaces and cgroups. See the link in 4ad's comment for an LWN article.

Those modifications are exciting for me as one of the developers of rkt. We built rkt with this concept of "stages"[1]; here the rkt stage1 is being swapped out from the default, which uses "Linux containers", to instead execute lkvm. In this case the Clear Containers team was able to swap out the stage1 with some fairly minimal code changes to rkt, which are going upstream. Cool stuff!

[1] https://github.com/coreos/rkt/blob/master/Documentation/deve...
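If I remember the CLI correctly (the exact flag has changed between rkt releases, so treat this strictly as a sketch with placeholder paths), selecting an alternative stage1 looks something like:

    # run an app container, but with a KVM-based stage1 instead of the default
    rkt run --stage1-image=/usr/share/rkt/stage1-kvm.aci myapp.aci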


Would be cool if it built with ICC, like the old Linux DNA project.



