I saw that! And it was helpful to me getting started, thanks.
I wanted to generate types from the protocol file rather than generate them at run-time, so that was the first thing I did. I think it helps debuggability/autocompletability/readability to have the types reified; check it out if you're interested, all the capital-letter-named files are generated.
It's a few-line change to output the AST to code via astor (it's actually part of how I got everything working in the first place). You get full executable code with docstrings and everything, though no support for generating inline comments (I think that changed in 3.6, IIRC).
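Roughly, the generator comes down to something like this (a loose sketch, not ChromeController's actual code; the "protocol.json" path and the `self.transport.send()` call are placeholders I made up):

```python
# Rough sketch of the idea, not ChromeController's actual generator; the
# "protocol.json" path and the self.transport.send() call are placeholders.
import ast
import json
import textwrap

import astor  # pip install astor


def method_stub(command):
    """Build the AST for one protocol command by parsing a tiny source template."""
    desc = " ".join(command.get("description", "").replace('"', "'").split())
    src = textwrap.dedent('''\
        def {name}(self, **kwargs):
            """{desc}"""
            return self.transport.send("{name}", **kwargs)
    ''').format(name=command["name"], desc=desc)
    return ast.parse(src).body[0]


def domain_class(domain):
    """Build an ast.ClassDef holding a stub for every command in the domain."""
    cls = ast.parse("class {0}(object):\n    pass\n".format(domain["domain"])).body[0]
    stubs = [method_stub(c) for c in domain.get("commands", [])]
    if stubs:
        cls.body = stubs
    return cls


with open("protocol.json") as fp:  # the protocol JSON; path is illustrative
    protocol = json.load(fp)

module = ast.parse("")  # empty module to hang the generated classes on
module.body = [domain_class(d) for d in protocol["domains"]]
print(astor.to_source(module))  # real source with docstrings, ready to check in
```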
Would you be interested in a patch that does something like that? It should be a heck of a lot more robust than what you're currently doing (which is manual code generation?).
It's an issue about how there's no good way to determine the remote debug interface version, since the version information that's present never changes, and they keep moving functions.
I put together a caching system so the generated wrapper is made available for overview/manual-patching, etc.... It currently validates the generated file against a generated-from-the-json file at import, and falls back to blindly importing the json file if the tooling to generate the class definition is not available.
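The check itself is nothing fancy; something along these lines (a loose sketch with made-up names like `generated_protocol`, `PROTOCOL_HASH`, and `build_classes_from_json`, not the actual implementation):

```python
# Very rough sketch of the import-time check, not the real code; the module
# name "generated_protocol", its PROTOCOL_HASH attribute, and the
# build_classes_from_json() fallback are all placeholders.
import hashlib
import importlib
import json


def load_wrapper(json_path="protocol.json", module_name="generated_protocol"):
    """Use the checked-in generated wrapper only if it still matches the JSON."""
    with open(json_path, "rb") as fp:
        raw = fp.read()
    current_hash = hashlib.sha256(raw).hexdigest()
    try:
        generated = importlib.import_module(module_name)
        if getattr(generated, "PROTOCOL_HASH", None) == current_hash:
            # Cached wrapper is current: readable, debuggable, manually patchable.
            return generated
    except ImportError:
        pass
    # Stale or missing: fall back to building the classes straight from the
    # JSON at import time (placeholder for the dynamic-generation path).
    return build_classes_from_json(json.loads(raw))
```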
Ideally, any patching for the class file should really be done as a subclass, as well as any more complex logic. That way, if the protocol json file is changed, it can just be updated without breaking anything.
Oh, as an aside, you're actually missing a significant chunk of the protocol json, mainly because there are actually two files: a `browser_protocol.json` AND a `js_protocol.json`. They're both produced as build artifacts, so I can't point to them in the source, but the total json size should be > 800KB (mirror in ChromeController: https://github.com/fake-name/ChromeController/tree/master/Ch... ).
As it is, this should kind of just splat into your source tree, but I don't want to move everything around without at least asking you how you want things structured.
I recently wrote a guide to using headless Chrome with Selenium and Chrome WebDriver [1]. I thought that some people in this thread might find it useful considering that the submitted article doesn't mention Selenium at all.
Your guide mentions setting window size and taking screenshots, which I thought were not currently working via chromedriver. Do you know if that was fixed or is there something else going on?
Setting window size via the ChromeDriver API didn't work for me but I was able to set it using a ChromeOptions command-line argument as I do in the guide. Screenshots seem to work fine on Linux but I haven't tried on other platforms.
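For reference, the setup comes down to something like this (a minimal sketch, not the guide's exact code; the URL and file names are placeholders):

```python
# Minimal sketch, not the guide's exact code; URL and file name are placeholders.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless")
options.add_argument("--window-size=1280,1024")  # size set at launch, bypassing the window API

driver = webdriver.Chrome(chrome_options=options)  # newer Selenium calls this kwarg "options"
try:
    driver.get("https://example.com")
    driver.save_screenshot("example.png")  # worked fine for me on Linux
finally:
    driver.quit()
```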
Off-topic, but I'm just going to start bringing this up in every thread like this ...
We need a recipe / toolchain for running a browser instance in a chroot jail. With a head. GUI. On the desktop.
I want to fire up a "banking browser" (or "danger browser" or "social media browser") that runs on its own IP, in its own chroot, and I can revert to virgin snapshot state anytime I want ... and I don't want to fire up a full-blown VM for each of these ...
What is standing in the way of making this happen ?
Who is working on things related to this that I can donate funds to or establish a bounty ?
The idea here is that I set up a chroot jail for firefox or chrome and configure it with things like local filesystem for cookies and cache and certs, etc.
It would also get its own unique IP, this jail.
Then I fire up firefox inside that chroot jail and use it to visit some websites ... and then I can wipe the whole thing out and redeploy again later, starting from scratch.
I don't need to trust incognito mode, I don't need to trust wiping cache or tabs talking to each other (or not) and I can worry a lot less about browser level exploits.
I can even put in firewall rules so that my "banking" instance can only talk to boa.com and scotttrade.com (or whatever).
It's totally workable (and I have done it) with vmware. Make a "bank browsing" VM and revert to pristine snapshot every day. The problem is that this is terribly heavyweight and overkill when I don't need a full blown VM.
It's not even really a browser issue - the real issue is, how do you jail a GUI application in X such that that window is in a chroot jail, distinct from the rest of your desktop?
>> this is terribly heavyweight and overkill when I don't need a full blown VM
The entire concept you're aiming to set up is terribly heavyweight and overkill. If you're knowledgeable enough to be discussing VMs and chroots, you must realize that what you are proposing is being careful to the point of paranoia à la tinfoil hat. Those of us who know how to stay as safe as possible via "basic" methods of security should be sleeping soundly knowing we're already in the top 5-10% of consumers. Install OS security updates, use a virus scanner and firewall, don't install pirated software (more likely to contain malware), and you're better off than most people by a significant margin.
You're talking about barely making a dent in the chances of your credentials or sessions being compromised. Private browsing, a separate browser instance, a VM, or chroot makes no difference if you have malware with a keylogger on the host system. Give yourself a break, realize that there is no such thing as "perfect security", and stop worrying so much. The amount of energy you're pouring into "banking safely" is not a sane endeavor. It serves no useful purpose. You could be investing this time and energy into something far more likely to improve your quality of life (eg: family, friends, health, etc.).
I've seen so many people stress about getting their credit card stolen or bank accounts hacked. It's rather ridiculous considering you don't bear the liability of a hack. If you didn't access or approve a usage of your accounts, the banks just give it back. I have had money stolen more than once from skimmers and I have never had any trouble getting it all back.
My experience with it, though, wasn't great. X11 forwarding through SSH was quite laggy (even after performing some optimizations on the connection). Good luck if you want to set up audio/mic support. It's a nice solution for a one-time banking login, not for day-to-day use.
Or you can have a banking computer, danger computer and social media computer with everything truly separate. These days, it's quite cheap, as you can get an ARM board with HDMI output for $18 incl shipping and no management engine blobs.
You can create arbitrary local networks with it, and isolate concerns in separate hardware.
If the "social desktop" get's compromised or even rooted, the attacker would still need to find a way through the physical router/firewall, etc. It would not be just a question of finding/using VM/container/chroot escape vulnerabilities.
You can also air gap certain endpoints. Physical security? Use FDE and pop the SD card out of the SBC and take it with you when you're away from home.
Security against rubber-hose cryptanalysis or 'contempt of court' cryptanalysis? Devise some way to quickly destroy the SD card when necessary. Then there's nothing to rubber-hose you for.
/s? Maybe. But it's all possible today, rather easy to do, and cheap. :)
I always thought http://www.zerovm.org/ had a lot of potential. I recall an HN story where someone used this to spin up a ZeroVM instance for each request that came in over an API and tear it down once the request was finished. That sort of speed and isolation would work really well for this use case.
It's pretty dated, but the state of containers has only improved since 2015, and this was usable for Google Hangouts (video) and YouTube all the way back then.
^ This is the Dockerfile, in a repo full of Dockerfiles for other things you might have had trouble putting into a container.
I tried this myself and had problems because (I think) I was on a very old LTS/stable distro that forced me to use a pretty old Docker release. This person is a maintainer on the Docker core team, so is most definitely tracking the newest releases.
I use Chrome in a headless setup with Kubernetes and Docker (Jenkins with kubernetes-plugin) but it's not Headless Chrome, it's Xvfb that makes my setup headless. Chrome itself works great in a container. It's one excellent way to keep it from eating up gobs of RAM like it does: just give it less RAM (cgroups).
If you said "chroot jail" on purpose because you don't want Docker, I don't know why exactly, maybe because you don't think you can run graphical apps in a Docker container, but it's not true. You can!
You could also cobble something together and do it without Docker, but I'm not recommending that, I just saw this and thought it would be exactly what you're looking for. Hope this helps!
It might sound unpopular, but I've always thought these things are a bit of shenanigans, honestly. It sounds like it increases security a lot, but it seems to just be a little extra paper on top. It's good for resource control (stop Chrome from eating all RAM) or installing a browser quickly in a container/docker/jail/whatever, but security-wise I think it's not the right solution.
The thing is, a chroot jail doesn't really protect my browser in the way I want it to (speaking personally, I guess). It's not the right level of granularity.
If an exploit compromises my browser, it would, essentially, have vast amounts of access to my personal information already, simply due to the nature of a browser being stateful and trusted by the user. Getting root or whatever beyond that is nice I guess, but that's game over for most people. This is true for the vast majority of computer users. I don't clear my browser history literally every day and 'wipe the slate clean', I like having search history and trained autocomplete, and it's kind of weird to expect people suddenly not to. It seems like a lateral move, in a direction that only really satiates some very technically aware users. Even then, I'd say this approach is still fundamentally dangerous for competent users -- a flaw in the application you can't anticipate, or a simple mistake of your own, could expose you easily.
A more instructive example is an email client. If I use Thunderbird and put it in a chroot jail, sync part of my mail spool, and then someone sends me an HTML email that exploits a Gecko rendering flaw and owns me 'in the jail' -- then they now have access to my spool! And my credentials. They can just disguise access to the container and do a password reset, for example, and I'm screwed. Depending on how the application is interacted with, things like Stagefright were auto-triggered solely by sending an SMS, for example. It's a very dangerous game to play at that point, when applications are trying to be so in-our-face today (still waiting for a browser exploit that can be triggered even out-of-focus, through desktop notifications...)
The attack surface for a browser, and most stateful, trusted apps, basically starts and ends there. For all intents and purposes, an individual's browser or email client is just as relatively valuable as any company's SQL database. Think: if you put a PostgreSQL instance inside a jail, and someone exploits your SQL database... is your data safe? Or do they just exfiltrate your DB dump and walk away? Does a company wipe their database every week to keep hackers from taking it?
Meaningful mitigation has to come, I think, in the way Chrome does it: by doing extensive application level sandboxing. Making it part of the design, in a real, coherent way. That requires a lot of work. It's why Firefox has taken years to do it -- and is pissing so many people off to get there by breaking the extension model, so they can meaningfully sandbox.
Aside from just attack surface concerns though, jails and things like containers still have some real technical limitations that stand in the way of users. Even things like drag-and-drop from desktop into container are a bit convoluted (maybe Wayland makes this easier? I don't know how Qubes does it), and I use 1Password, so the interaction with my key database means we're back at square one: where browser compromise 'in the sandbox' still means you get owned in all the ways that matter.
Other meaningful mitigations exist beyond 'total redesign' but they're more technical in nature... Things like more robust anti-exploit mechanisms, for example, in our toolchains and processes. That's also very hard work, but I think it's also a lot more fruitful than "containerize/jail it all, and hope for the best".
I have a feeling you misunderstood the parent's idea. The jail there is not to prevent someone from breaking out from the browser into the system. It's to contain simple attacks on your data, exactly because the browser is a stateful system with lots of stored secrets.
If you have a full sandbox breakout exploit, both cases are broken. But if you have just a stupid JS issue that breaks same-origin, or causes a trivial arbitrary file read, jails will protect you from them just fine. It's pretty much to stop a post you open from Facebook from being able to get your PayPal session cookie. Not many exploits in the wild are advanced.
> If you have a full sandbox breakout exploit, both cases are broken. But if you have just a stupid JS issue that breaks same-origin, or causes a trivial arbitrary file read, jails will protect you from them just fine.
If you can read an arbitrary file, what is stopping you from reading, e.g., the browser's password database files inside the container, or any of the potentially sensitive cached files? Those files are there -- the browser writes them, whether or not it is in a sandboxed directory.
Or do you assume that there is no password database that the user stores in any 'sandboxed' browser instance, ever, and they copy/paste or retype passwords every time or something? This is basically treating every single domain and browser instance as stateless. This is what I mean -- users are never going to behave this way, only people on places like Hacker News will. They aren't going to use 14 different instances of a browser, each one perfectly isolated without shared search, or have to re-log into each browser instance to get consistent search results and autocomplete. It's just an awful UX.
Of course, maybe you don't map files in there, inside the container. That's too dangerous, because if any part of the browser can just read a file, it's game over. Perhaps you could have multiple processes communicate over RPC, each one in its own container, with crafted policies that would e.g. only allow processes for certain SOP domains to request certain passwords or sensitive information from a process that manages the database. Essentially, you add policy and authorization. There is only one process that can read exactly one file, the database file. The process for rendering and dealing with the logic of a particular domain does not even have filesystem access, ever, to any on disk file, it is forbidden. It must instead ask the broker process for access to the sensitive information for a particular domain. You could even do this so that each tab is transparently its own process, as well as enforcing process-level SOP separation...
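To make that concrete, here's a toy illustration of the broker idea (the names and the policy are invented for illustration; this is not how Chrome actually implements it):

```python
# Toy illustration only; names and policy are invented, and this is not how
# Chrome actually implements its broker/renderer split.
import json
import sqlite3


class CredentialBroker:
    """The one process allowed to touch the password database file."""

    def __init__(self, db_path="passwords.db"):
        self.db = sqlite3.connect(db_path)

    def handle(self, requesting_origin, request):
        # Policy: a renderer may only ever ask about its own origin.
        if request["origin"] != requesting_origin:
            return json.dumps({"error": "denied"})
        row = self.db.execute(
            "SELECT username, secret FROM creds WHERE origin = ?",
            (request["origin"],),
        ).fetchone()
        return json.dumps({"credential": row})


# A renderer process (with no filesystem access at all) would send something
# like {"origin": "https://example.com"} over an RPC channel and get back only
# the credential for the origin it is actually rendering.
```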
The thing is... That's basically exactly what Chrome does, by design. Recent versions of Chrome can actually separate and sandbox processes based on SOP. But it can only do that through its design. It cannot be tacked on.
Think about it. Firefox does not have true sandboxing or process isolation. Wrapping it in a container is not sufficient, and having 40,000 separate Firefox containers, each with its own little "island" of state, each for a single domain, is simply unusable from a user POV for any average human being. It is also incredibly dangerous (oops, I accidentally opened my bank website inside my gmail container, now they're contaminated. Now if my bank website serves me bad JS, it can possibly get some content related to my gmail, if it can bypass browser policies. In Chrome's new architecture, this can't happen, from what I understand, even if you don't run two separate, isolated instances of Chrome. SOP is now process-level, and it is truly baked into the design.)
How do you make this not garbage from a user POV? By rearchitecting Firefox around multiple processes, where each domain is properly sandboxed and requires specific access and authorization to request certain data from another process. And where processes that need access are literally denied filesystem access. That requires taking control of the containers themselves, the same way Chrome does. Chrome goes to extreme lengths for this.
The only way to truly enforce these things is at the application level. Just taking Firefox, slapping it inside Docker or a jail, and doing that 40,000 times for each domain isn't even close to the same thing, if that's what you're suggesting.
You're right about a lot of things, but there are still missing pieces. Whatever sandboxing is used in Chrome (and you're right, Chrome is the gold standard now), a simple issue can still bring it all down. There are RCEs in Chrome published almost every month. Some will be limited by the sandbox and that's great. But I disagree with:
> It cannot be tacked on.
Security, as in prevention of the exploit, cannot be tacked on. But separation of data can be. And there's a whole big scale of how it works, starting from another profile, to containers and data brokers, to VMs like Qubes, to separate physical machines.
Chrome still uses a single file for cookies of different domains. And because you may have elements of different domains rendered at the same time, it needs that access. But that's exactly where either profiles or a stronger separation like containers can enforce more separation.
Yes, it does involve some interaction from the user, but it's not that bad. The UI can help as well: "This looks like a bank website. Did you mean to open it in a Private profile?", "You're trying to access Facebook, would you like to use your Social profile instead?" Realistically, people only need 3-4 of them (social, shopping, secure/banking, work).
We practically solved spam classification already, and that's in a hostile environment. Detecting social sites should be simple in comparison.
At least 1 VM is strongly recommended but you could containerize within that.
Edit: "toolchain for running a browser instance in a chroot jail. With a head. GUI. On the desktop" - it's here today (with commercial support), just run Windows+Sandboxie in a VM. Yowch!
We do have a bug on the website; it doesn't really have to do with the design system itself but with the way the website itself is built. It should be fixed soon. Thanks for the feedback :)
In case anyone checks this out, an update to the Clarity site has already gone out that improves the performance of the website drastically. Let me know if you're still seeing any issues and thanks again for the feedback :)
hi @arwineap, you are right in saying that Clarity is using chrome webdriver + xvfb. That's for the css regression testing, which we haven't looked into moving to use headless Chrome. We've switched our unit tests to use the headless Chrome: https://github.com/vmware/clarity/blob/master/build/karma.co...
Hrm, but those also run in travis, right? I noticed at the bottom of the karma.conf that it checks to see if it's in travis.
The travis image is installing chrome stable, which is not 59; AFAIK the headless feature is only available in the beta channel right now, which is version 59.
I admit I'm not sure if your unit tests are running in travis, but if they are, I'd wonder if the --headless flag is just getting dropped as an unknown flag.
Curious either way, actually, so let me know; I'm looking at starting to move our testing from phantomjs to headless, which is why I was sleuthing out Clarity's move.
I contributed a small patch to get this working on Windows. We were a PhantomJS shop, but it was just so unstable, thought we'd give this a shot. Have been running on it for over 2 months now and it's dropped test failures due to intermittent issues to near 0.
No mention of WebDriver, only Chrome's devtools protocol :( There's probably a proxy or something, but would be nice to see completely integrated native support.
That proxy already exists. It is called ChromeDriver[1][2] and was developed in collaboration between the Chromium and Selenium teams. It is the same way that Selenium/Webdriver controls regular Chrome now.
It would have been nice if they at least mentioned it in the article since Selenium is such a popular browser automation tool. They do in fact mention it on the README page for headless chromium.
The problem with ChromeDriver is that it implements Selenium's idea of what the remote control interface should look like, and frankly speaking, Selenium's idea has some enormous and idiotic oversights[1].
It's called ChromeDriver and I kinda mentioned it originally — "There's probably a proxy or something" — and said that I want native fully integrated support :)
I haven't looked at Headless closely enough yet, but the biggest pros I can see are:
- Potentially less overhead (system resources)
- Much simpler setup (compared to something like Xvfb)
- Better support for actual automation tasks, e.g. screenshots, separate sessions, etc.
The last point is especially relevant if you run a tool that is visiting many sites in parallel. If you run multiple tabs per process to keep memory usage and Xvfb instances limited then you won't be able to have separate browsing sessions, e.g. two concurrent navigations to the same origin could interfere with each other (cookies, local storage, etc). Another obstacle I have discovered is that you can only take screenshots for the tab that is currently active. For my site (https://urlscan.io) I work around that by manually activating the tab when the load event fires to take the screenshot. Works reasonably well, but can sometimes fail under load.
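For anyone curious, the workaround boils down to something like this over the raw DevTools protocol (a sketch, not urlscan.io's actual code; it assumes Chrome was started with --remote-debugging-port=9222 and has at least one page-type tab open):

```python
# Sketch only, not urlscan.io's actual code. Assumes Chrome is already running
# with --remote-debugging-port=9222 and has at least one page-type tab open.
import base64
import json

import requests
import websocket  # pip install websocket-client

tabs = requests.get("http://localhost:9222/json").json()
tab = next(t for t in tabs if t["type"] == "page")
ws = websocket.create_connection(tab["webSocketDebuggerUrl"])
msg_id = 0


def send(method, params=None):
    """Send one protocol command and wait for its reply (events are skipped)."""
    global msg_id
    msg_id += 1
    ws.send(json.dumps({"id": msg_id, "method": method, "params": params or {}}))
    while True:
        reply = json.loads(ws.recv())
        if reply.get("id") == msg_id:
            return reply.get("result", {})


send("Page.enable")
send("Page.navigate", {"url": "https://example.com"})

# Wait for the load event before touching the tab.
while json.loads(ws.recv()).get("method") != "Page.loadEventFired":
    pass

send("Target.activateTarget", {"targetId": tab["id"]})  # bring the tab to the front
shot = send("Page.captureScreenshot")                    # only works on the active tab
with open("screenshot.png", "wb") as fp:
    fp.write(base64.b64decode(shot["data"]))
```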
>If you run multiple tabs per process to keep memory usage and Xvfb instances limited then you won't be able to have separate browsing sessions, e.g. two concurrent navigations to the same origin could interfere with each other (cookies, local storage, etc). Another obstacle I have discovered is that you can only take screenshots for the tab that is currently active.
This is a huge flaw in NightmareJS, which is disappointing because of how beautifully simple its API is. A Nightmare fork rebuilt over headless Chrome would be the best of all worlds for browser automation.
> If you run multiple tabs per process to keep memory usage and Xvfb instances limited then you won't be able to have separate browsing sessions, e.g. two concurrent navigations to the same origin could interfere with each other (cookies, local storage, etc).
Assuming you're running Selenium, this is handled. If you need to, you can launch each session with its own profile.
Will the "--disable-gpu" flag no longer be needed in the future because headless mode will automatically disable the GPU, or because GPU support will be added to headless mode? I really hope it's the latter.
Huh. So, if you can express your PNG / PDF rendering in terms of HTML, SVG, Canvas, and WebGL... How much easier, faster, more reliable would it be to use Headless Chrome and --screenshot, rather than other means?
HTML & CSS are not really suitable for "pixel perfect" text. Even text in SVG works just like text in an HTML document in the sense there is not a way to control precisely where text will wrap. You'd have to disable the browser's word wrapping & come up with your own logic for where to insert <br /> tags, even then you have issues like browsers ignoring CSS rules because of 3rd party browser extensions, missing fonts & such.
As you resize your browser, you see the browser re-layout the page & the text jumps around. The text wraps at different positions as you resize your browser around. You'd have to essentially render text to an image server-side, then scale the resulting image in the browser. And if you already have an image, there's no need for tools like wkhtmltoX or phantom for creating an image (you already have an image)... So headless browsers are not suitable for rasterizing documents that contain text [in my experience].
Converting HTML to PNG/PDF is like comparing apples & oranges. The conversion will be imperfect.
> HTML & CSS are not really suitable for "pixel perfect" text.
...and a lot of people don't care.
> You'd have to disable the browser's word wrapping & come up with your own logic for where to insert <br /> tags
...unless you don't care.
> even then you have issues like browsers
"BrowserS"? Plural? No. This is one implementation - headless Chrome. Which you could use to do your own rendering. There's no plural. You would use one browser to do your headless rendering.
> You'd have to essentially render text to an image server-side
That is. What this is.
> then scale the resulting image in the browser
What? It depends entirely on what your use case is. If you want to email a PDF, then you could use headless Chrome to turn HTML into a PDF, and then email it.
> So headless browsers are not suitable for rasterizing documents that contain text
I feel like you're off in this weird space, very different from what I've experienced. And I've experienced it again, and again, at different companies.
> Converting HTML to PNG/PDF is like comparing apples & oranges. The conversion will be imperfect.
...but potentially faster, lower effort, and "good enough" for many use cases.
That's a small trade-off I'm willing to take. It beats having to have a completely separate engine for the PDFs (compared to just using what you're already displaying on the web) and it's way more complete than using something like DOMPDF which doesn't even have support for some of the most basic CSS features.
I imagine Headless Chrome will make this obsolete shortly but just in case anyone wants to play with it, here is Dullahan - A headless browser SDK that uses the Chromium Embedded Framework (CEF). It is designed to make it easier to write applications that render modern web content directly to a memory buffer, inject synthesized mouse and keyboard events as well as interact with features like JavaScript or cookies.
Has anyone tried to use headless chrome to detect insecure content on what you hope is a secure page? I've only looked into it a bit, and so far failed. I'd like to know if a visitor to https://example.com/foo.html in chrome will get a warning about insecure content. I'm looking for a browser because in some cases the insecure content is loaded by javascript from a 3rd party. I don't know a way to do that besides fire it up in a browser and let it go.
Is there a way to get the secure/insecure status from headless chrome?
Some things that I found hard: there's no documentation on how to get it running in a CI yet (some devs, like Justin Ribeiro - https://hub.docker.com/r/justinribeiro/chrome-headless/ - got a docker file all set up, which helped a lot), and the devtools protocol, whilst it has docs, wasn't quite as simple as I had hoped, so I had to guess at how to call the node library.
[0501/120318.494691:ERROR:gl_implementation.cc(245)] Failed to load /opt/google/chrome/libosmesa.so: /opt/google/chrome/libosmesa.so: cannot open shared object file: No such file or directory
Selenium opens a real chrome window. So it's automated, but not headless. You will also be able to use the new headless support in Chrome to run it without the window via selenium.
I don't think we've heard of any performance/latency issues before that concern issuing script evaluation over the protocol. Were you sending hundreds of scripts to evaluate? Or payloads far over 1MB?
Why is Google doing this? They're helping everyone else build a spider that could easily serve as the front end to a new search engine. Is this their way of fighting back against the internet closing off? Give everyone a spider so perfect that walling in your content becomes impractical? I like the idea of Google lighting the match that burns down the walled gardens, but if not this, then why?
It could be anything, but I feel like there's definitely another motive to Google unleashing something that could bite their core business.
You're looking for strategy everywhere. People at big companies do a lot of things and not every feature is high-level strategic.
I don't know their motivations, but the first use that comes to mind is to make automatically testing your own web app using Chrome easier. Most teams working on web apps could use better testing, whether they work for Google or not.
$ google-chrome --disable-gpu --headless --print-to-pdf https://www.chromestatus.com/ && ls -la *.pdf
ls: cannot access '*.pdf': No such file or directory
I hate to be that person, but why is it so exciting to exchange our privacy and freedom for a shiny piece of software? What's so wrong with Firefox + PhantomJS?
I agree with the sentiment. However, an issue I had with PhantomJS is that it doesn't support modern JS-rendered sites (it doesn't seem to support much ES5 at all). I needed to create a PDF from such a site and couldn't get PhantomJS to do it. Headless Chrome saved the day.
Then I started on a python library that would handle communication with chrome in a real way using asyncio: https://twitter.com/llimllib/status/855433309375082496
If that's a thing that you're interested in, let me know. https://github.com/llimllib/chrome-control/tree/wstest