No, they're not. Unless you're OK with the system working for only exactly the test cases you specify (which, btw, is trivially implementable as a look-up table of the test cases!), tests do not specify a system.
EDIT: Importantly, tests give no way to distinguish, for a missing test case, whether that case's behavior should be inferred from the other cases or is actually unspecified.
The QuickCheck style of testing is my favorite (see paper [1]). Using a simple embedded DSL, you write specifications that declare the desired properties of your code. The computer then checks that the properties hold by randomly sampling the space of possible test cases, using one of several sampling strategies. Even though the checks are not proofs, they are very effective in practice.
For example, here's a quick Haskell session in which I claim that the function f has the property of returning its smaller argument:
Prelude> import Test.QuickCheck
Prelude> let f = min :: Int -> Int -> Int  -- assumed definition of f, which wasn't shown in the original session
Prelude> let prop_minWins x y = if x < y then f x y == x else f x y == y
Prelude> quickCheck prop_minWins
+++ OK, passed 100 tests.
The best part is that the property specifications are not only machine testable but human readable. They allow you to tell other humans what your code should do, using a clear, formal language that the machine can check to make sure you didn't tell any lies. For example, the prop_minWins line in the code sample above says, "For all x and y (which are values of some type on which there is a less-than relation), if x < y then f(x, y) = x; otherwise, f(x, y) = y." See [2] for another great example.
> nothing that says any non trivial specification in any language is either complete or correct.
Absolutely correct. In fact, the larger the specification, the more likely it is to contain a bug.
> Tests _can be_ used as a _kind of_ specification.
Also true. But in program verification (proving that a program does what it is specified to do), specifications are frequently not tests.
Why are specifications not written as tests? It comes down to the program verification technique being used. There are several categories:
- Testing. "Optimistically Inaccurate." It can't find all problems, but every problem it finds is a real problem (assuming, of course, that the test is correct). The downside is that you may give an OK to a bad program; the upside is that when you find a problem, it is a real problem (see the sketch after this list).
- Dataflow Analysis. "Pessimistically Inaccurate." It can prove fairly general properties about a program. However, it cannot prove all properties and may fail to prove a property that is actually true. For example, say you were proving that a program has no SQL injection. Dataflow analysis would OK only programs for which it could prove the absence of SQL injections (as per the specification you give it! there may be a SQL injection your spec doesn't cover). However, there may be programs that are free of SQL injections which dataflow analysis is unable to prove safe.
- Model Checking. "Pessimistically Inaccurate, Simplified Properties." In model checking you transform the program by mapping it into a new domain, such as a finite state automaton. You do the mapping in such a way that properties proved about the model must hold in the actual program. Unfortunately, you can only check simplified properties, and there is still some pessimistic inaccuracy (although it is reduced compared to dataflow analysis, since the new domain is easier to prove properties on).
- Syntax Analysis. "Simplified Properties." E.g., grammar checking. You can only prove properties that can be found by a parser, and the checks can have false positives: a check may fail even though the program is fine.
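To make the "optimistically inaccurate" point concrete, here is a small Scala sketch (mine, not from the thread): a lookup-table implementation of max that passes every example-based test it was written against, yet is wrong for most inputs.

object OptimisticTesting {
  // Deliberately broken: it only memorises the sampled cases.
  def max(a: Int, b: Int): Int = (a, b) match {
    case (1, 3) => 3
    case (7, 4) => 7
    case (0, 0) => 0
    case _      => a // wrong whenever b > a
  }

  def main(args: Array[String]): Unit = {
    // Example-based tests: all pass, so testing reports no problem.
    assert(max(1, 3) == 3)
    assert(max(7, 4) == 7)
    assert(max(0, 0) == 0)
    println(s"All tests passed, but max(2, 5) = ${max(2, 5)}") // prints 2, not 5
  }
}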
Type Checking is another method, as is Theorem Proving. I am sure there are even more methods beyond the ones listed here.
The point is that, except in the case of testing, the specifications are not given as tests. Each verification technique requires its own formal specification language. However, there are some systems in which you can specify a program in a general language that is then compiled to the verification system you actually use.
I don't know; I'd be wary of working with any architect who handed me a list of test cases as a specification. What if he/she left out important edge cases, which I extrapolate incorrectly? How am I to infer that the absence of a test case means "not specified" vs. "specified to be the extrapolation of other test cases"?
That's kind of like saying that a coder who can only write pseudocode can code a system, but not to the extent that I personally want him/her to.
An interesting alternative to this is Spock [1], a specification framework for Java and Groovy with a nice Groovy DSL. Check out the new reference documentation [2] or try out some specifications using the web console [3].
It has a great syntax for data-driven and interaction-based testing.
Data-driven example:

class MathSpec extends Specification {
  def "maximum of two numbers"(int a, int b, int c) {
    expect:
    Math.max(a, b) == c

    where:
    a | b | c
    1 | 3 | 3
    7 | 4 | 7
    0 | 0 | 0
  }
}
> It has a great syntax for data-driven and interaction-based testing.
With DSLs in Groovy it always looks like the provided syntax is trying to code around Groovy's syntactic limitations, e.g. having to use the bar | in the example.
There is this odd trend amongst testing framework developers to make their frameworks more verbose, as if this somehow delivers value. Spock seems to be in this tradition. Almost every other area of programming aims at getting as much work done with as little verbiage as possible. Why do testing framework people feel that programs should read like written prose? This idea has failed a great many times over the years.
The ScalaCheck [0] library is actually innovating in the domain of testing. It's derived from QuickCheck, a Haskell library, and I believe an Erlang derivative is making $$s as a commercial product.
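For the curious, the ScalaCheck equivalent of the QuickCheck property above looks roughly like this (a minimal sketch in the style of the library's README; the object and property names are mine):

import org.scalacheck.Prop.forAll
import org.scalacheck.Properties

object MinSpec extends Properties("min") {
  // QuickCheck-style property: for all x and y, math.min returns its smaller argument.
  property("returns the smaller argument") = forAll { (x: Int, y: Int) =>
    if (x < y) math.min(x, y) == x else math.min(x, y) == y
  }
}

Run under sbt or the ScalaCheck runner, a passing property is reported with a line along the lines of "+ min.returns the smaller argument: OK, passed 100 tests."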
It does deliver value because it allows you to document the context of the assertion code in a way that's more enduring than a comment, and in a way that's visible at failure time.
I actually use a homebrewed assertion context specifier, where I wrap blocks of code in a "that" context, which then decorates any thrown exceptions.
When your tests grow past verifying that string concatenation in your string class works (short, sweet, tight unit tests) and you end up testing business logic, you end up with tests that are just not that short and with assertions that are not single-minded. Oftentimes you end up writing multiple asserts for one logical "assertion result", e.g. assert that the returned map is not null, that the map contains the expected key, that the value for the key has the correct detail, etc. Providing a single stated intent for a group of assertions is very helpful (a rough sketch follows below).
Sometimes when time is tight, you might end up extending an existing test to assert one more concept instead of breaking it up into two separate tests. Now you can provide separate context for that assertion, and having the code be in a separate closure/block helps isolate it too.
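I don't know what that homebrew looks like, but the idea might be sketched roughly like this in Scala (the names and details are mine): a block wrapper that re-throws any failure with the stated context attached, so several low-level asserts read as one logical assertion.

object AssertionContext {
  // Wrap a group of asserts in a named context; if anything inside fails,
  // re-throw with the context prepended so it is visible at failure time.
  def that[A](context: String)(block: => A): A =
    try block
    catch {
      case e: Throwable =>
        throw new AssertionError(s"asserting that $context: ${e.getMessage}", e)
    }

  def main(args: Array[String]): Unit = {
    val result = Map("total" -> 42)
    // Several asserts, one logical assertion with a single stated intent:
    that("the returned map carries the expected total") {
      assert(result != null)
      assert(result.contains("total"))
      assert(result("total") == 42)
    }
  }
}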
Note that this is altogether different from the concept of tests written by non-programmers (e.g. Cucumber).
I think there needs to be a distinction between verbosity and clarity. For me, code that is clear, with precise naming of methods and functions, is important, even if it needs to be more verbose. The amount of boilerplate code, on the other hand, is undesirable.
In terms of testing, I think defining the requirements inline with the tests that are meant to implement them provides value, removes ambiguity, and makes things easier for those coming to a new code base (or coming back to one).
There's this idea that tests can be written by non-programmers (QA, business SMEs, etc.) which may be the motivator for this. Just a guess.
I used a framework called FitNesse which was awful but which was built around this idea that tests could all be specified in spreadsheets and managed by non-technical folks.
I'd hope not. I think developers should write tests. And I think things like BDD are more about communication than anything else.
The difference, however, is in using a language that is ubiquitous and understood by all parties, leaving ambiguity out. It is about contrasting what you've understood of the requirement with what is expressed in the code.
Why do you believe it has failed many times? It seems a more literate programming style is becoming more and more common. I find that feature files written in the Gherkin format serve other developers very well in conveying what is expected of the system. I often use them as documentation.
I am not sure if something like Jasmine or Chai exists in the JVM world but this seems pretty similar. You can do much the same thing with BDD-style describe() and it() specs.
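On the JVM side, ScalaTest's FunSpec style is pretty close to Jasmine's describe/it (a minimal sketch, assuming ScalaTest 3.x):

import org.scalatest.funspec.AnyFunSpec

class MaxSpec extends AnyFunSpec {
  describe("Math.max") {
    it("returns the larger of its two arguments") {
      assert(Math.max(7, 4) == 7)
    }
    it("is symmetric in its arguments") {
      assert(Math.max(4, 7) == Math.max(7, 4))
    }
  }
}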
Disagree, there's a lot of ambiguity in English that makes for bad specifications.
Example: take even the word "tomorrow". Does tomorrow mean at midnight tonight, or tomorrow morning? Does it mean 24 hours from now, or does it mean a far future time? Does "tomorrow" include weekends or holidays? In what timezone is this happening?
The Beach Boys' song "Will You Love Me Tomorrow?": is that asking whether the singer's love will still have the same feelings about them tomorrow (less than 24 hours from now), or whether their love will still adore them many years from now (when they're 64, for example, to pick another oldie)?
Which is of course why civilians think developers are pedantic, and of course we are, because to implement specifications on Turing machines we need to answer these slightly outrageous questions. "Turn on the sprinkling system tomorrow" is a trivial specification for a human, but not so for computers.
Tomorrow is just the example that comes to mind; there are thousands of other words whose meanings are vague. This is why creating specifications for computers from "civilian English" is hard.
The fact that some people use the language in a vague or unspecific manner does not invalidate English's suitability as a language for writing specs.
I doubt I've ever used "tomorrow" in a spec. I'd use "next business day" or "T+1" (with the calendar defined).
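"T+1 (with the calendar defined)" can be pinned down precisely. Here's a tiny Scala sketch of what "defined" might look like (the holiday set here is made up):

import java.time.{DayOfWeek, LocalDate}

object BusinessCalendar {
  // Hypothetical holiday calendar; a real spec would enumerate this explicitly.
  val holidays: Set[LocalDate] = Set(LocalDate.of(2025, 12, 25))

  // "T+1": the first date strictly after t that is neither a weekend day nor a holiday.
  def nextBusinessDay(t: LocalDate): LocalDate = {
    val next = t.plusDays(1)
    val weekend = next.getDayOfWeek == DayOfWeek.SATURDAY ||
      next.getDayOfWeek == DayOfWeek.SUNDAY
    if (weekend || holidays.contains(next)) nextBusinessDay(next) else next
  }

  def main(args: Array[String]): Unit =
    // 2025-12-24 is a Wednesday; the 25th is a holiday, so T+1 is 2025-12-26.
    println(nextBusinessDay(LocalDate.of(2025, 12, 24)))
}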
In my experience, writing a spec is an iterative process, where the developers review a draft of the spec and make comments, ask questions, point out ambiguity, etc., which provides the feedback for the next version of the spec. That process loops until everyone is happy, at which point the spec (or at least a portion of it) is finalised and the developers start work.
See http://en.wikipedia.org/wiki/Z_notation for an example of an actual serious specification language.