It's the work of Michal Zalewski, known to his friends (and to those of us who fear him) as "lcamtuf".
It's a command-line Unix tool. It builds with almost no exotic deps other than GNU IDN.
It's a pure-async wordlist-driven crawler/fuzzer. It is screaming fast on the network I'm testing it on. Because it's async, it isn't bottlenecked on spawning a thread for every request.
It generates pretty HTML reports. Well, pretty for a C program.
It's Apache licensed. There are things in here I'd probably steal, like the URL parser; it's much tighter than mine.
How useful is this going to be for YC-style apps? Meh? You should definitely run it on your QA instance. Make sure you give it a login cookie to run with. It will find things. But where it looks like Skipfish is really going to do some damage is on the enterprisey J2EE- and .NET-stack apps.
My (uneducated-about-security) guess is that most bugs live in code a programmer wrote in a language, not in the runtime or in the language itself. And when it comes to languages, how easy it is to shoot yourself in the proverbial foot could (should?) be treated as just as important as (or, depending on your POV, more important than) how secure the runtime is.
Wow, this actually found a moderate security vulnerability in my website!
Fetching http://mysite.com/static/ returned the plaintext template of my index.html file, which I must have accidentally copied in some manual hackery during a broken push (none of my scripts copy it normally).
However, I almost missed the warning: Skipfish complained because the page lacked a content type, and it was buried in several similar warnings. I'd like it to recognize potentially templated files, which is a much more serious vulnerability than missing a 'text/plain' content type. Years of staring at unimportant compiler warnings might cause people to miss gems like this.
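If you wanted Skipfish to flag that class of thing, the check could be as simple as scanning response bodies for raw template delimiters. A sketch of the idea (the function name and delimiter list here are mine, not Skipfish's):

    /* Hypothetical check: a page served with raw template
       delimiters in it usually means server-side source leaked. */
    #include <string.h>

    static int looks_templated(const char *body) {
      static const char *markers[] = { "{{", "{%", "<%", "${", 0 };
      int i;
      for (i = 0; markers[i]; i++)
        if (strstr(body, markers[i])) return 1;
      return 0;
    }

It would false-positive on pages that legitimately discuss template syntax, but as a warning heuristic that's probably acceptable.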
Yes. Start with "analysis.c" --- the code in there works with completed request/response pairs. Look for "scrape_response()". Note how the strcspn's and pointer math approximate simple, unrolled regular expressions.
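To give a feel for the idiom (this is my sketch, not a paste from analysis.c), here's how you'd pull every href value out of a buffer that way, with no regex library involved:

    /* strcspn-as-unrolled-regex: find each href="...", then run
       the pointer forward until the closing quote. */
    #include <stdio.h>
    #include <string.h>

    static void scrape_hrefs(const char *p) {
      while ((p = strstr(p, "href=\""))) {
        p += 6;                        /* skip past href=" */
        size_t len = strcspn(p, "\""); /* span until closing quote */
        printf("%.*s\n", (int)len, p);
        p += len;
      }
    }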
The database code (the "pivot tree") is a bit tangly as a data structure.
http_client.c has really tight URL parsing code.
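The shape of that kind of parser, heavily simplified (this sketch is mine; the real code also handles ports, credentials, fragments, and relative URLs):

    /* Pointer-math URL splitting into scheme/host/path views
       over the input buffer; nothing is copied. */
    #include <string.h>

    struct url { const char *scheme, *host, *path; size_t slen, hlen; };

    static int split_url(const char *in, struct url *u) {
      const char *p = strstr(in, "://");
      if (!p) return -1;
      u->scheme = in;  u->slen = p - in;
      u->host   = p + 3;
      u->hlen   = strcspn(u->host, "/?#");  /* stop at path/query/frag */
      u->path   = u->host + u->hlen;        /* "" if URL ends at host */
      return 0;
    }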
This isn't written in a very modern style (I like that about it, though). For instance, look at "check_for_stuff()", which implements basic content sniffing. In a modern C program, you'd probably see an array of structs containing function pointers and names, each pointing to a tiny function looking for a different bit of content. Here it's one big unrolled function. Likewise, modern C code would probably just regex the HTML responses instead of hand-coding HTML parsing. But on the other hand this actually exists and works, and that's a good goal to have too.
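For contrast, the table-driven style I mean would look roughly like this (hypothetical; these detectors are mine, not Skipfish's):

    /* One tiny detector per content type, dispatched from a table;
       adding a sniffer is a one-line edit to the array. */
    #include <string.h>
    #include <strings.h>

    typedef int (*sniffer_fn)(const char *body, size_t len);

    static int is_html(const char *b, size_t n) {
      return n >= 5 && !strncasecmp(b, "<html", 5);
    }
    static int is_pdf(const char *b, size_t n) {
      return n >= 5 && !memcmp(b, "%PDF-", 5);
    }

    static const struct { const char *name; sniffer_fn fn; } sniffers[] = {
      { "html", is_html },
      { "pdf",  is_pdf  },
    };

    static const char *sniff(const char *body, size_t len) {
      size_t i;
      for (i = 0; i < sizeof(sniffers) / sizeof(sniffers[0]); i++)
        if (sniffers[i].fn(body, len)) return sniffers[i].name;
      return "unknown";
    }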
The I/O loop in Skipfish is definitely the right way to do network programming in C. This program is simpler and faster because it doesn't waste time with threads. But if you do something similar, use libevent.
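For reference, a minimal single-request skeleton on libevent 2's evhttp (error handling elided; a scanner would keep hundreds of these in flight on one event_base):

    #include <stdio.h>
    #include <event2/event.h>
    #include <event2/http.h>
    #include <event2/buffer.h>

    static void on_response(struct evhttp_request *req, void *arg) {
      if (req)  /* NULL on connection failure */
        printf("HTTP %d, %zu body bytes\n",
               evhttp_request_get_response_code(req),
               evbuffer_get_length(evhttp_request_get_input_buffer(req)));
      event_base_loopbreak((struct event_base *)arg);
    }

    int main(void) {
      struct event_base *base = event_base_new();
      struct evhttp_connection *conn =
          evhttp_connection_base_new(base, NULL, "example.com", 80);
      struct evhttp_request *req = evhttp_request_new(on_response, base);
      evhttp_add_header(evhttp_request_get_output_headers(req),
                        "Host", "example.com");
      evhttp_make_request(conn, req, EVHTTP_REQ_GET, "/");
      event_base_dispatch(base);  /* everything happens here */
      evhttp_connection_free(conn);
      event_base_free(base);
      return 0;
    }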
It asynchronously launches thousands of requests based on a very large wordlist.
It scrapes the responses and spiders them.
As it identifies actual pages, it fuzzes them with strings that tickle web app flaws. It analyzes the responses. For instance, it tries to inject "skipfish://whatever" URLs into fields, parameters, and links; then it looks to see whether those URLs appear in "hot" places in the response, like headers or link tags.
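The detection side of that amounts to cheap string scans. My illustration (not the actual analysis.c logic):

    /* Did our injected marker land somewhere "hot", i.e. inside
       an attribute a browser would actually follow? */
    #include <string.h>

    static int marker_is_live(const char *body) {
      static const char *hot[] = { "href=\"skipfish://",
                                   "src=\"skipfish://", 0 };
      int i;
      for (i = 0; hot[i]; i++)
        if (strstr(body, hot[i])) return 1;
      return 0;
    }

If the marker survives into one of those positions unescaped, you've almost certainly got content injection.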
It's looking, primarily, for:
* Cross-site scripting and content injection
* Best practices problems (like failing to declare charsets properly)
* Forms without XSRF tokens
* SQL injection
It's better than anything I've written, but I'll hazard a guess that it finds those things well in roughly that order. It's bound to get better over time.