Testing techniques in Go (segment.com)
124 points by astdb on Jan 1, 2018 | 9 comments



I have to share this testing code I came across yesterday: https://github.com/dgryski/go-tinymap/blob/master/tinymap_te...

It uses two neat strategies:

- testing/quick is used to generate random test cases that try to produce a failure; and

- when a failure is found, another algorithm kicks in that minimizes the input that caused the failure, removing any superfluous operations.

The former uses a helper function that converts a `[]byte` (supplied by testing/quick) to a list of operations. This has a nice secondary benefit, which is that the same helper function can be used with fuzzing libraries.

The latter is really handy because testing/quick generates rather unusual test cases, and it can be hard to piece together which part of a large input caused the failure.
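To make the first strategy concrete, here is a minimal sketch of the `[]byte`-to-operations idea. It is not taken from the linked test file: `op`, `decodeOps`, `smallMap`, and `agrees` are names invented for illustration, and `smallMap` is just a stand-in for whatever type is under test. Only `quick.Check` is a real library call.

```go
package main

import (
	"fmt"
	"testing/quick"
)

// op is one step applied to both the map under test and a reference model.
// (Hypothetical type, not from the linked repository.)
type op struct {
	kind  byte // 0 = set, 1 = delete, 2 = get
	key   byte
	value byte
}

// decodeOps turns raw bytes into operations, three bytes per op.
// Trailing bytes that do not fill a whole op are ignored.
func decodeOps(data []byte) []op {
	var ops []op
	for i := 0; i+3 <= len(data); i += 3 {
		ops = append(ops, op{kind: data[i] % 3, key: data[i+1], value: data[i+2]})
	}
	return ops
}

// smallMap is a hypothetical stand-in for the type under test:
// a linear-scan map from byte to byte.
type smallMap struct {
	keys, vals []byte
}

func (m *smallMap) set(k, v byte) {
	for i := range m.keys {
		if m.keys[i] == k {
			m.vals[i] = v
			return
		}
	}
	m.keys = append(m.keys, k)
	m.vals = append(m.vals, v)
}

func (m *smallMap) get(k byte) (byte, bool) {
	for i := range m.keys {
		if m.keys[i] == k {
			return m.vals[i], true
		}
	}
	return 0, false
}

func (m *smallMap) del(k byte) {
	for i := range m.keys {
		if m.keys[i] == k {
			last := len(m.keys) - 1
			m.keys[i], m.vals[i] = m.keys[last], m.vals[last]
			m.keys, m.vals = m.keys[:last], m.vals[:last]
			return
		}
	}
}

// agrees replays the decoded operations against smallMap and a built-in map
// and reports whether every lookup returned the same result.
func agrees(data []byte) bool {
	var m smallMap
	ref := map[byte]byte{}
	for _, o := range decodeOps(data) {
		switch o.kind {
		case 0:
			m.set(o.key, o.value)
			ref[o.key] = o.value
		case 1:
			m.del(o.key)
			delete(ref, o.key)
		case 2:
			got, ok := m.get(o.key)
			want, wantOK := ref[o.key]
			if ok != wantOK || got != want {
				return false
			}
		}
	}
	return true
}

func main() {
	// quick.Check calls agrees with randomly generated []byte values.
	if err := quick.Check(agrees, nil); err != nil {
		fmt.Println("found a failing input:", err)
		return
	}
	fmt.Println("no disagreement found")
}
```

Because `agrees` takes a plain `[]byte`, the same function could be handed to a fuzzer unchanged, which is the secondary benefit mentioned above.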

Very cool!


Isn't that just quickcheck?


No. I mean, conceptually, sure, it's very similar. But QuickCheck is data type driven. The underlying library here is ddmin, and it seems to operate on []byte directly: https://github.com/dgryski/go-ddmin/blob/master/ddmin.go

The corresponding paper is here: https://www.st.cs.uni-saarland.de/papers/tse2002/tse2002.pdf
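For a feel of what minimizing a `[]byte` looks like, here is a rough sketch. It is a simplified variant of delta debugging that only tries removing chunks, not the full ddmin algorithm from the paper, and it does not reflect the go-ddmin API. `minimize` and `hasAZ` are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
)

// minimize shrinks a failing input by repeatedly removing chunks and keeping
// any smaller input for which fails still returns true. It stops when no
// single byte can be removed (or when only one byte is left).
func minimize(input []byte, fails func([]byte) bool) []byte {
	n := 2 // number of chunks the input is split into
	for len(input) >= 2 {
		chunk := (len(input) + n - 1) / n // chunk size, rounded up
		reduced := false
		for start := 0; start < len(input); start += chunk {
			end := start + chunk
			if end > len(input) {
				end = len(input)
			}
			// Candidate: a copy of the input with input[start:end] removed.
			candidate := append(append([]byte{}, input[:start]...), input[end:]...)
			if fails(candidate) {
				input = candidate
				if n > 2 {
					n-- // one chunk is gone, so coarsen slightly
				}
				reduced = true
				break
			}
		}
		if !reduced {
			if n >= len(input) {
				break // already tried every single-byte removal
			}
			n *= 2 // try finer-grained chunks
			if n > len(input) {
				n = len(input)
			}
		}
	}
	return input
}

// hasAZ is a toy "failure" predicate: the bug triggers whenever the input
// contains both an 'a' and a 'z'.
func hasAZ(b []byte) bool {
	return bytes.IndexByte(b, 'a') >= 0 && bytes.IndexByte(b, 'z') >= 0
}

func main() {
	got := minimize([]byte("xxaxxxzxx"), hasAZ)
	fmt.Printf("%q\n", got) // prints "az"
}
```

The point is that the minimizer knows nothing about the structure of the input. It only needs a predicate that says "this still fails", which is why it pairs well with a bytes-to-operations decoder.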

(I am somewhat surprised that the paper doesn't mention QuickCheck at all. Although, I believe QuickCheck was published in 2000, so it had only been out for two years by then. So maybe I shouldn't be surprised actually.)

See also: http://hypothesis.works/


Cross posting my comment from the crusty news aggregator site:

I liked this blog post and agree with most of it. But I think its section on interfaces is a bit muddled. It links to this article[1] for more advice, and while the article sounds nice in principle from the perspective of someone publishing a library for others to use, I’ve found other techniques to be valuable as well.

One technique I’m particularly fond of is writing a package whose main goal is to provide an abstraction layer between a storage system (e.g., PostgreSQL) and one’s model of data. The utility of the package permits callers to access data without caring too much about the underlying details of the storage itself.

This on its own doesn’t require an interface. It makes sense to define a type that implements various routines that interact with the storage layer and be done with it. But something that’s really useful is to create an in-memory version using the exact same API with the same contracts. It’s useful to use the in-memory version not necessarily just in testing, but potentially also in command line debugging tools. In particular, it lets one debug or test other components of the system without necessarily depending on a running storage service. This doesn’t replace integration tests, of course, but it does permit unit testing in a more fine grained way.

The only real issue here is that you want everyone else that uses this storage system to be able to use the in-memory implementation as well. The most natural way to do that is to define an interface. Now, you could take the route that is being advocated here and define the interface in each package that wants to use this storage system. I tried that. It got old, real old, fast. Re-defining that interface N times was just silly. And doing it in a granular way was just annoying. As soon as one part of the code wanted to use another aspect of the storage, you needed to update any intervening interfaces with an additional method. It was just tedious work that provided little to no value from my perspective.

So I switched over to something more like this:

    type Store interface {
        FooA() (*Bar, error)
        FooB() (*Baz, error)
        FooC() (*Quux, error)
        // Potentially many more methods
    }

    type postgresStore struct { ... }
    func NewPostgres(db *sql.DB) Store { ... }

    type memoryStore struct { ... }
    func NewMemory() Store { ... }

And all of a sudden, things became much nicer, even though the package is now doing what everyone seems to recommend against: exporting an interface.

The key thing I realized is that Store isn’t meant to be some super modular interface. (In principle, I might implement this using a sum type in a different language.) Other people don’t ever need to implement it. For the most part, the two implementations—one “real” and one in-memory only—are all you ever need. That is, the interface isn’t really being used as a mechanism for extensibility, but rather, as a pragmatic tool to abstract over a small number of implementations of the same API. Namely: if one didn’t care about granular unit tests on this storage API and only ever used integration tests, then there would be no reason for the in-memory implementation and therefore no reason for the interface.

Rob Pike said, “The bigger the interface, the weaker the abstraction.” I’d probably agree with that. And I’m OK with it. The Store interface isn’t meant to be a strong abstraction. It is, in fact, quite weak!

[1] - https://rakyll.org/interface-pollution/


If "the bigger the interface, the weaker the abstraction", then would that mean the opposite is also true, i.e. the smaller the interface, the greater the abstraction? If so, then interface{} would be the greatest abstraction of all, and pure dynamic typing Python-style throughout the program the best design choice!


> If so, then interface{} would be the greatest abstraction of all, and pure dynamic typing Python-style throughout the program the best design choice!

`interface{}` and dynamic typing aren't more abstract; they still have an interface, it is just hidden from the type system. And in my experience as a Python developer, absent a type checker, many developers lack the discipline to keep their interfaces small. A type checker makes it somewhat harder for the author to make design decisions that the maintainer will regret. (Of course, when I point this out, dynamic typing devotees invariably respond, "You can write bad code in any language!", failing to distinguish between my claim and "It's impossible to write bad code in a statically typed language").
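A tiny sketch of what "hidden from the type system" means in Go terms. `Namer`, `person`, `describe`, and `describeAny` are hypothetical names for illustration. Both functions need a `Name` method; only one of them says so in its signature.

```go
package main

import "fmt"

// Namer is the small, explicit interface.
type Namer interface {
	Name() string
}

type person struct{ name string }

func (p person) Name() string { return p.name }

// describe states its requirement in the signature, so the compiler
// rejects callers that pass something without a Name method.
func describe(n Namer) string {
	return n.Name()
}

// describeAny accepts anything, but still needs a Name method.
// The requirement has moved from the signature to a runtime check.
func describeAny(v interface{}) (string, error) {
	n, ok := v.(Namer)
	if !ok {
		return "", fmt.Errorf("value of type %T has no Name method", v)
	}
	return n.Name(), nil
}

func main() {
	fmt.Println(describe(person{name: "ada"}))
	fmt.Println(describeAny(person{name: "ada"}))
	fmt.Println(describeAny(42))
}
```

With `describeAny`, passing `42` compiles fine and only fails when the code runs, which is the trade-off being described.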


Anyone interested in this should also check out the hashicorp talk at gophercon on advanced testing in Go: https://www.youtube.com/watch?v=8hQG7QlcLBk

It has several awesome ideas.


Why does Segment have a large Go code base? From my outsider perspective it would seem to be mostly client side JavaScript plus server side glue code.

And big data glue is best written in Python.

Now, of course, Segment may be big enough to have many large codebases and Go is just one of them.


also check out https://github.com/golang/mock for mocking



