Selenium tests are inherently slow, unreliable and flappy. They have been the bane of developers for every employer I've had. Do yourself a favor and write React and test your components without a browser driver in good ol' JS with the occasional JSDom shim. It removes almost the entire need for Selenium, which should be reserved for only the faintest of smoke tests. And please, if you have to use Selenium, use headless Firefox, because PhantomJS is very bad software.
I had a Rails consultancy (Makandra) recently work on a JS-heavy application that I happen to own, and they got Selenium singing on it, which had been beyond my capabilities for years. One of their tricks, which you can inspect the implementation of in their (public) utilities library [+], is using basically a vendored Firefox per project and VNCing into that Firefox to drive things around. It is thus off-screen and out of the way when you're using it, but apparently is more true-to-reality than headless.
The test suite they wrote has about 600 tests, and while they're slower than I'd like (2-3 minutes) they've been bulletproof since we got my dev environment configured properly. It includes some fairly complicated interactions, most relevantly around our calendar interface.
I had been using Firefox driver in Xvfb but wasn't happy with the performance/stability. So I built a Selenium driver out of Java only (using JavaFX's embedded WebKit) and used a headless JRE windowing toolkit (Monocle). My project is still a pre-release but the headless capability, Java-only system requirement, and its ajax handling might make it useful to some people currently: https://github.com/MachinePublishers/jBrowserDriver
1. Not quite sure. I've only used PhantomJS via Selenium Ghost Driver. From that usage they're similar. The main difference is that my driver uses only Java so under the hood the JRE is launching WebKit through JNI and everything runs in the same JRE process.
2. Current WebKit version depends on the JRE used. Oracle Java 1.8.0_45 has WebKit version 537.44.
3. Java maintainers will update WebKit periodically, including within a major version. E.g., here they update WebKit for the 1.8.0_60 JRE: http://openjdk.java.net/jeps/239 ... Other than that I'm not sure.
Honestly, if it was useful, I would probably use this in my side project, which is commercial but in no way competes with what you're doing (load testing). I might make changes or improvements, and I would generally contribute those back. Affero doesn't "mix well" with that, so it is pretty much a non-starter for me.
I've found this is true for a lot of projects and it seems like restrictive licenses prevent projects from going mainstream.
I have trouble with Selenium's failure rate too, so I ended up writing a test engine in JavaScript. It handles async JS calls with a callback when each one finishes, which gets rid of all the sleep-wait-retry logic in Selenium.
It works very well. I can run the 100+ test cases in IE/FF/Chrome/Safari and iOS/Android browsers without changing a line of JS/test code. It even runs fine from a desktop wired to a phone browser on a cell connection.
It also tests all the app's backend DB logic. The time/pass/fail info is submitted back to the test backend DB.
Can you give an example of how it works? Say, navigate to a page, fill in a form, click submit, and verify that some text is present after submitting the form.
Does it really? Does it test for whether a button you thought was present isn't actually clickable?
If you're going to write tests, I think it makes an insane amount of sense to emulate real world conditions as much as feasibly possible (making judgement calls on things that don't matter like speed of the mouse).
Most Selenium tests don't test that a button is actually clickable, though; they find things through the DOM, and if the button is off-screen or hidden they won't realise it.
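For what it's worth, you can ask Selenium to check at least part of this explicitly. A minimal sketch in Python (the URL and selector are made up for illustration); note that `is_displayed` catches hidden elements but not necessarily off-screen ones:

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    driver.get("https://example.com/login")  # illustrative URL

    button = driver.find_element(By.CSS_SELECTOR, ".submit-button")

    # Finding the element only proves it exists in the DOM. These checks
    # assert it is visible and enabled before we try to click it.
    assert button.is_displayed(), "Button is in the DOM but not visible"
    assert button.is_enabled(), "Button is visible but disabled"
    button.click()

    driver.quit()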
This has been our experience as well. We invested a lot of time and money in making sure Selenium tests run reliably for our clients. Despite this, the best reliability we managed to achieve was 90% with tests that run for 40 minutes, which is obviously not acceptable.
Mostly these would be cases where the browser would, seemingly at random, end up in an unpredictable state and all subsequent test scenarios would fail because of this. (The page is white, or a completely unrelated website gets opened. We have seen lots of weird situations so far.)
This might be exacerbated by the fact that we use the remote Browserstack Selenium hosting service so that the tests can be executed automatically as a part of our deployment process.
This is pretty good actually. It sucks if you're relying on Selenium testing for verifying your code as you're writing it, but before and after deploys to staging and production? This isn't bad at all.
40 minutes from clicking a button to deploy is actually abysmal, especially when you need to worry about things like rollbacks, or deploying at the end of the day, or releasing quick hotfixes to users. In modern build processes, even 10 minutes seems too long.
For testing Mithril.js, I wrote a mock window object, which allows you to do things like simulate requestAnimationFrame, clicks, JSON-P calls and browser quirks from non-browser environments (e.g. from a Node.js script). So to test, you simply swap `window` with the mock and you can drive your fake browser however you wish.
You can cover a lot of ground with that approach and make an extremely fast test suite that is suitable for a save-refresh-test workflow and then you can put trickier tests in a secondary test suite that you only run once in a while (e.g. before a commit)
The extent of the testing I'm currently interested in is "load a page; does the JS on that page run without error?" It won't execute from a CLI, and everyone I talked to pointed me at Selenium.
Testing for a page load without JS error is a fine use case, and is an example of what I meant by the "faintest of smoke tests." It's a test that has very little chance to flap, fail, or force you to write hacky commands around Selenium's unreliable API.
I currently manage a rather large test suite (around 700 different tests) using Selenium, which is all written in Ruby and Rspec (although I've also used Cucumber), and uses the gems Capybara (an abstraction layer for querying and manipulating the web browser via the Selenium driver) and SitePrism (for managing page objects and organizing re-usable sections).
The entire suite runs in around 10 minutes on CircleCI, using 8 parallel threads (each running an instance of the Firefox Selenium driver), and it is rock solid stable.
It took us a while to get to this point, though.
The hard part is handling timing due to Javascript race conditions on the front-end. I had to write my own helper methods like "wait_for_ajax" that I sprinkle in various page object methods to wait for any jQuery AJAX requests to complete. I also use a "wait_until_true" method that can evaluate a block of code over and over until a time limit has been reached before throwing an exception. Once you figure out ways to solve those types of issues, testing things with Selenium becomes a lot more stable and easy.
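The same idea translates directly to other bindings; here is a rough Python analogue of those two helpers (the names mirror the Ruby ones purely for illustration, and the jQuery check only makes sense if the app actually uses jQuery):

    import time

    from selenium.webdriver.support.ui import WebDriverWait


    def wait_for_ajax(driver, timeout=10):
        # Poll until jQuery reports no in-flight AJAX requests.
        WebDriverWait(driver, timeout).until(
            lambda d: d.execute_script("return jQuery.active == 0")
        )


    def wait_until_true(block, timeout=10, interval=0.5):
        # Re-evaluate `block` until it returns a truthy value or the time
        # limit is reached, then give up with an exception.
        deadline = time.time() + timeout
        while time.time() < deadline:
            if block():
                return True
            time.sleep(interval)
        raise TimeoutError("Condition not met within %s seconds" % timeout)

Usage from inside a page object method is then something like `wait_until_true(lambda: "Saved" in driver.page_source)`.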
I have also used the exact same techniques (page objects, custom waiter methods for race conditions, etc) to test mobile apps on iOS and Android with Selenium.
It can be a challenge, but once you have a system down and you know what you are doing, it's not so bad.
The most annoying thing I found with Selenium was that it wouldn't wait for the browser to respond to click events and rerender.
The approach in the blog post (and I think elsewhere ... not sure) is to poll the DOM with a timeout.
Is there a better solution to be had with something like `executeScript`? You could run `requestAnimationFrame`, and then poll for an indicator that the click (etc.) handler has indeed finished. That way if it fails, you know about it pretty soon, without the need for long timeouts. This is all just a guess though.
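Something along those lines does work via `execute_script`; here is a minimal sketch of the flag-polling idea in Python (the `__afterClickRendered` name is purely illustrative, and `driver`/`button` are assumed to exist already). For real use you would ideally have the click handler itself set the indicator:

    from selenium.webdriver.support.ui import WebDriverWait

    button.click()

    # Flip a flag on the next animation frame after the click, so the test
    # can poll for "the browser has rendered at least once since the click"
    # instead of sleeping for an arbitrary amount of time.
    driver.execute_script(
        "window.__afterClickRendered = false;"
        "requestAnimationFrame(function () { window.__afterClickRendered = true; });"
    )

    WebDriverWait(driver, 5).until(
        lambda d: d.execute_script("return window.__afterClickRendered === true;")
    )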
Ruby's Capybara encapsulates Selenium and waits until elements appear on the page (the default timeout is 2 seconds). So you can write simple sequential code like clicking on a `bar` element and then immediately finding a `baz` element,
and it will work even if the baz element is injected into the page by an Ajax request to the server triggered by clicking on bar. I've been using it for many years, but I haven't checked how they implement it. Maybe a callback from a MutationObserver? https://developer.mozilla.org/en-US/docs/Web/API/MutationObs...
I have had some good results using the F# canopy library (http://lefthandedgoat.github.io/canopy/) for working with Selenium. It handles (almost) all the waits for you, so you don't have to scatter a bunch of sleeps in your tests, and it's pretty easy to work with.
> One developer designed a way to take a screenshot of our main drawing canvas and store it in Amazon’s S3 service. This was then integrated with a screenshot comparison tool to do image comparison tests.
I would also take a look at Applitools https://applitools.com/ — they have Selenium webdriver-compatible libraries that do this screenshot taking/upload and offer a nice interface for comparing screenshot differences (and for adding ignore areas). Way fewer false failures than typical pdiff/imagemagick comparisons.
(where `driver` is your WebDriver object, e.g. `webdriver.Chrome()` in the Python bindings).
Then to match that frame against a previously-captured "template" image,
you can use stb-tester's[1] "match" function[2] which allows you to
specify things like the region to ignore and tweak the matching
sensitivity.
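Roughly, the glue between the two could look like this in Python (treat the stb-tester keyword arguments as approximate, since they vary between versions; the template path, URL and region are placeholders):

    import cv2
    import numpy
    import stbt
    from selenium import webdriver

    driver = webdriver.Chrome()
    driver.get("https://example.com/app")  # illustrative URL

    # Grab a screenshot from Selenium and decode it into the OpenCV-style
    # BGR array that stb-tester's image-processing functions expect.
    png = driver.get_screenshot_as_png()
    frame = cv2.imdecode(numpy.frombuffer(png, dtype=numpy.uint8), cv2.IMREAD_COLOR)

    # Compare against a previously captured template, restricting the search
    # to a region of interest and tweaking the match threshold.
    result = stbt.match(
        "templates/drawing-canvas.png",
        frame=frame,
        region=stbt.Region(x=0, y=100, width=800, height=600),
        match_parameters=stbt.MatchParameters(match_threshold=0.9),
    )
    assert result, "Canvas no longer matches the reference screenshot"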
Everyone in the blogosphere (and at my own company) writing non-app-specific layers on top of Selenium suggests that there is scope for a higher-level framework that can be used on top of Selenium. Or that the Selenium API is too thin a layer over WebDriver.
I absolutely hated Robot Framework. The DSL was just horrible to use. It had weird, unnecessary syntax quirks and gave you the minimal amount of information if something failed (it wouldn't tell you which line number it failed on, for instance).
The tests were also flaky as hell, but that was more to do with poor environment management. That, admittedly, was also easier to fix in Python.
It provides a "page object model" implementation on top of Capybara, so you can define a model for each page you want to test, which stores the page's relative URL, and has references to all the elements on the page you care about, and methods for all the interactions you want to do with that page.
So for example, you might have a "LoginPage" model, which contains the following:
    class LoginPage < SitePrism::Page
      set_url "/login"

      element :username_input, '.username-input'
      element :password_input, '.password-input'
      element :submit, '.submit-button'

      def login(username, password)
        load                          # Load the page URL in the Selenium instance
        username_input.set(username)  # Fill in username
        password_input.set(password)  # Fill in password
        submit.click                  # Click submit
      end
    end
Then whenever you want to log in from one of your steps, you can just do something like `LoginPage.new.login(username, password)`.
I think it's a nice abstraction as it allows more experienced test automation developers to build the page model while less experienced ones can write steps just calling the methods. You still have to pay a lot of attention to things like appropriate use of "wait for element to appear" rather than "sleep", and ensuring tests use isolated data, to get it working reliably, but we've got it working pretty well at my current place.
I should write up how we have it set up at some point as we have our own app-specific framework on top of SitePrism which provides some useful abstractions to make it quicker to develop tests.
I'm just getting into Play Framework development, and they ship with FluentLenium, which seems to add a more friendly API and convenience functions. Nothing too fancy, but just looking at the pure-Selenium coffee examples people have posted here shows how dramatic the effect can be.
The one downside is that the developers only seem to tag official releases once in a blue moon; despite the GitHub repo being actively updated, the last push to Maven was more than half a year ago, and so it depends on a rather old version of Selenium.
I just write my own layers on top of Selenium (with Python).
This one is a rough test-automation script, mostly used for filling in forms etc. during development: http://kopy.io/LMBKt (an old one, but the one I have to hand). It's handy to be able to open a page, log in and fill in a form in a few seconds when doing it by hand would take minutes.
I find that way works as the abstraction is only one level removed and I can just throw in methods that relate to that project.
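For reference, the kind of layer I mean is nothing more than a handful of project-specific methods over the raw driver; a stripped-down sketch (selectors, URL and credentials are placeholders):

    from selenium import webdriver
    from selenium.webdriver.common.by import By


    class AdminSite(object):
        """A thin, project-specific wrapper: one level above raw Selenium."""

        def __init__(self, base_url):
            self.base_url = base_url
            self.driver = webdriver.Firefox()

        def login(self, username, password):
            self.driver.get(self.base_url + "/login")
            self.driver.find_element(By.NAME, "username").send_keys(username)
            self.driver.find_element(By.NAME, "password").send_keys(password)
            self.driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()

        def fill_contact_form(self, name, email, message):
            self.driver.get(self.base_url + "/contact")
            self.driver.find_element(By.ID, "name").send_keys(name)
            self.driver.find_element(By.ID, "email").send_keys(email)
            self.driver.find_element(By.ID, "message").send_keys(message)
            self.driver.find_element(By.ID, "send").click()


    # During development: open, log in and fill a form in seconds.
    site = AdminSite("http://localhost:3000")
    site.login("dev", "dev-password")
    site.fill_contact_form("Test User", "test@example.com", "Hello from Selenium")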
The PageObjects tip is a really good one. Previously, using Selenium we would end up with a complete maintainability nightmare.
I used Geb on a recent project, and I actually felt that the tests I built demonstrated a passable level of engineering discipline. However, Geb was really hard to learn (partly because the error messages were confusing or missing), and you're still on top of Selenium, so you still get wacky exceptions and edge cases.
Improve them how though? Speed? Reliability? If it's just a nicer API, that's all well and good, but until the key problems I face with Selenium are solved (slow and non-deterministic tests) then a nicer API to it is just rearranging deck-chairs on the Titanic.
It seems that you're trying to use Selenium 2, i.e. WebDriver, to run your unit tests.
Selenium is for browser tests, and by its nature it cannot run in milliseconds; its execution time is measured in seconds.
Even when using the PhantomJS webdriver.
It is an integration-testing approach, because it combines the execution of several JavaScript modules that run in a real browser.
Selenium has its purposes, but fast test execution is not one of them.
> It seems that you're trying to use Selenium 2, i.e. WebDriver, to run your unit tests.
Nope. Integration tests. But integration tests that start a Firefox instance from scratch and have to be rerun multiple times to pass due to non-determinism are slow.
Using Capybara alone one gets most of the stuff they had to implement (page, with, retries, ...) but I'll look into those gems you suggest.
Maybe the Scala ecosystem is still immature on the side of integration testing. They could implement them in Ruby if they are familiar with the language. I don't feel OK about using two languages but at least it could enforce strict separation between integration testing and the application.
Some very good information in this article.
It is true that Selenium has its quirks; retrying a failed test can sometimes result in a passing test.
Disclaimer: I work for https://testingbot.com : at my work we offer our customers automatic retries when a test fails.
Writing a Selenium test does take time, but once you run it in parallel across hundreds of browser and OS combinations, it's worth it.
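The mechanics of fanning one test out across combinations are straightforward with the remote driver; a hedged sketch (the hub URL and capability list are placeholders for whatever grid or hosted service you point it at):

    from multiprocessing.pool import ThreadPool

    from selenium import webdriver

    HUB_URL = "http://localhost:4444/wd/hub"  # placeholder grid/hosted endpoint

    CAPABILITIES = [
        {"browserName": "firefox", "platform": "WINDOWS"},
        {"browserName": "chrome", "platform": "WINDOWS"},
        {"browserName": "safari", "platform": "MAC"},
    ]


    def run_smoke_test(caps):
        # Each worker gets its own remote browser session.
        driver = webdriver.Remote(command_executor=HUB_URL, desired_capabilities=caps)
        try:
            driver.get("https://example.com")  # illustrative URL
            return (caps["browserName"], "pass" if "Example" in driver.title else "fail")
        finally:
            driver.quit()


    # Run the same smoke test against every browser/OS combination in parallel.
    print(ThreadPool(len(CAPABILITIES)).map(run_smoke_test, CAPABILITIES))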
BrowserMob, that was a sweet service (based on selenium). Does anyone know what happened to those guys after they sold? I've always wanted to learn more about their story.
I'm using it right now for my latest project, and it is a nightmare. I have 1100 tests that have to run every night, and I'm using PhantomJS.
It is such a mess!!!
> getWithRetry takes a function with a return value
>
>     def numberOfChildren(implicit user: LucidUser): Int = {
>       getWithRetry() {
>         user.driver.getCssElement(visibleCss).children.size
>       }
>     }
>
> predicateWithRetry takes function that returns a boolean and will retry on any false values
>
>     def onPage(implicit user: LucidUser): Boolean = {
>       predicateWithRetry() {
>         user.driver.getCurrentUrl.contains(pageUrl)
>       }
>     }
At first I didn't get the difference between `getWithRetry` and
`predicateWithRetry`, but then I noticed that the former throws an
exception whereas the latter returns false. I infer that `getWithRetry`
will handle exceptions thrown by the retried function.
In stb-tester[1] (a UI tool/framework targeted more at consumer
electronics devices where the only access you have to the
system-under-test is an HDMI output) after a few years we've settled on
a `wait_until` function, which waits until the retried function returns
a "truthy" value. `wait_until` returns whatever the retried function
returns:
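(A minimal sketch of that usage, borrowing `press`, `Key` and `miniguide_is_up` from the example further down; the timeout handling is elided:)

    press(Key.INFO)

    # `wait_until` polls `miniguide_is_up` until it returns something truthy
    # (or an internal timeout expires) and hands that value back, so the
    # test still asserts on the result explicitly.
    assert wait_until(miniguide_is_up), "Miniguide didn't appear after pressing INFO"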
Since we use `assert` instead of throwing exceptions in our retried
function, `wait_until` seems to fill both the roles of `getWithRetry`
and `predicateWithRetry`. I suppose that you've chosen to go with 2
separate functions because so many of the APIs provided by Selenium
throw exceptions instead of returning true/false.
> doWithRetry takes a function with no return type
>
>     def clickFillColorWell(implicit user: LucidUser) {
>       doWithRetry() {
>         user.clickElementByCss("#fill-colorwell-color-well-wrapper")
>       }
>     }
Unlike Selenium, when testing the UI of an external device we have no
way of noticing whether an action failed, other than by checking the
device's video output. For example we have `press` to send an infrared
signal ("press a button on the remote control"), but that will never
throw unless you've forgotten to plug in your infrared emitter. I
haven't come up with a really natural way of specifying the retry of
actions. We have `press_until_match`, but that's not very general. The
best I have come up with is `do_until`, which takes two functions: The
action to do, and the predicate to say whether the action succeeded.
It's not ideal, given the limitations around Python's lambdas (anonymous
functions). Using Python's normal looping constructs is also not ideal:
    # Could get into an infinite loop if the system-under-test fails
    while not miniguide_is_up():
        press(Key.INFO)

    # This is very verbose, and it uses an obscure Python feature: `for...else`[2]
    for _ in range(10):
        press(Key.INFO)
        if miniguide_is_up():
            break
    else:
        assert False, "Miniguide didn't appear after pressing INFO 10 times"
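The shape I have in mind for `do_until` is roughly the following (the signature and defaults here are illustrative, not a settled API):

    def do_until(action, predicate, max_attempts=10):
        # Perform `action`, then check `predicate`; repeat until the
        # predicate passes or we run out of attempts.
        for _ in range(max_attempts):
            action()
            if predicate():
                return True
        return False

    assert do_until(lambda: press(Key.INFO), miniguide_is_up), \
        "Miniguide didn't appear after pressing INFO"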
Thanks for the article, I enjoyed it and it has reminded me to write up
more of my experiences with UI testing. I take it that the article's
sample code is Scala? I like its syntax for anonymous functions.
Thanks for the comment. We actually originally had a waitUntil function that was used for basically all three of the cases I mentioned above. In some sections of the code it was just there to eat errors, in other sections it was used to get some text, and in yet others it was wrapped in an assert and needed to return a boolean. This led to chronic misuse around the code (I found 4-5 tests that simply forgot to wrap it in an assert effectively rendering the test completely worthless). The main benefit we got from splitting the methods out was making it clear to developers what each one did. Catching all the exceptions thrown by Selenium instead of returning booleans was just an added benefit.
And you are correct, we are using Scala. There are some really cool things about the language: case classes, pattern matching, first-class functions, and traits, just to name a few.
> This led to chronic misuse around the code (I found 4-5 tests that simply forgot to wrap it in an assert effectively rendering the test completely worthless).
Yes, I've been bitten by that too -- it's too easy to forget the "assert". This morning it occurred to me that I could write a pylint (static analysis) checker to catch that, so I've done just that: https://github.com/stb-tester/stb-tester/commit/5e5bdbb
I'm working for a startup that addresses this by means of a simple wrapper API: http://heliumhq.com. Human-readable tests with no more HTML IDs, CSS selectors, XPaths or other implementation details.