Google releases Skipfish, an open-source web security scanner (code.google.com)
193 points by tptacek on March 19, 2010 | 26 comments



It's Michal Zalewski's (known to his friends and those, like me, who fear him as "lcamtuf") code.

It's a command-line Unix tool. It builds with almost no exotic deps other than GNU IDN.

It's a pure-async wordlist-driven crawler/fuzzer. It is screaming fast on the network I'm testing it on. Because it's async, it's not bottlenecked on demand-threading for each request.

It generates pretty HTML reports. Well, pretty for a C program.

It's Apache licensed. There are things in here I'd probably steal, like the URL parser; it's much tighter than mine.

How useful is this going to be for YC-style apps? Meh? You should definitely run it on your QA instance. Make sure you give it a login cookie to run with. It will find things. But where it looks like Skipfish is really going to do some damage is on the enterprisey J2EE- and .NET-stack apps.


Why do you say J2EE and .NET-stack apps?

I would have guessed most vulnerabilities are in PHP code.


Uh, very no.


back it up with some data?

I'd say WordPress vulnerabilities alone account for quite a bit.

PHP has such a low barrier to entry, plus there's just so much PHP code about.


For now, you should just take my word for this.

PHP is the least of WordPress' problems. Or. Well. Not the worst of WordPress' problems.


I love the sentiment as a php connoisseur, but my lack of knowledge about enterprise language has me asking why...?


My (uneducated about security) guess is that it is because most bugs are in code written in a language by a programmer, not runtime bugs or bugs in the language itself. And when it comes to languages, how easy it is to shoot yourself in the proverbial foot could (should?) be seen as being as important as, or (depending on your POV) more important than, how secure the runtime is.


Screenshot of pretty html report would be awesome. I don't have access to a local *nix system at the moment so can't try it out.



Ah! Thanks!


actual results are stored as a hierarchy of JSON files, suitable for machine processing if needs be

That's a quality touch. I can't count the number of times I've had to write "quick" tools to parse results into a form we prefer/need :P


Wow, this actually found a moderate security vulnerability in my website!

Fetching http://mysite.com/static/ returned the plaintext template of my index.html file, which I must have accidentally copied in some manual hackery during a broken push (none of my scripts copy it normally)

However, I almost missed the warning: Skipfish complained because the page lacked a content type, and it was buried in several similar warnings. I'd like it to recognize potentially templated files, which is a much more serious vulnerability than missing a 'text/plain' content type. Years of staring at unimportant compiler warnings might cause people to miss gems like this.
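To make the "recognize potentially templated files" idea concrete, here is a minimal sketch in C. It is not something Skipfish does as far as this thread says; the function name looks_like_template() and the list of delimiters are my own assumptions, and real template syntaxes vary a lot.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical check, not part of Skipfish: return 1 if the body contains
 * an opening template delimiter followed later by its closing delimiter.
 * The delimiter list is an assumption covering a few common syntaxes. */
static int looks_like_template(const char *body)
{
    static const char *const pairs[][2] = {
        { "{{", "}}" },   /* variable interpolation in several engines */
        { "{%", "%}" },   /* block tags in several engines             */
        { "<%", "%>" },   /* ERB/ASP/JSP-style tags                    */
    };

    for (size_t i = 0; i < sizeof(pairs) / sizeof(pairs[0]); i++) {
        const char *open = strstr(body, pairs[i][0]);
        if (open != NULL &&
            strstr(open + strlen(pairs[i][0]), pairs[i][1]) != NULL)
            return 1;
    }
    return 0;
}
```

A scanner could raise the severity of a "missing content type" warning when a check like this also fires, so the serious case does not drown among the routine ones.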


And it even builds on Cygwin out of the box. All C plus minimal dependencies: is it the next big fad? (I sure hope so.)


Redis was the same way. I very much like this style of packaging.


For those of us who kind of know C but want to learn it really in depth, I like it as well. It makes compiling open source programs really easy.

I took a cursory glance at the code and it seems good, but is it good code to learn from?


Yes. Start with "analysis.c" --- the code in there works with completed request/response pairs. Look for "scrape_response()". Note how the strcspn's and pointer math approximate simple, unrolled regular expressions.

The database code (the "pivot tree") is a bit tangley as a data structure.

http_client.c has really tight URL parsing code.

This isn't written in a very modern style (I like that about it, though). For instance, look at "check_for_stuff()", which implements basic content sniffing. In a modern C program, you'd probably see an array of structs containing function pointers and names, each pointing to a tiny function looking for a different bit of content. Here it's one big unrolled function. Likewise, modern C code would probably just regex the HTML responses instead of hand-coding HTML parsing. But on the other hand this actually exists and works, and that's a good goal to have too.
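A rough sketch of the table-driven style described above, for comparison. None of these names come from Skipfish; the sniffers and their match strings are invented for illustration and are far cruder than real content sniffing.

```c
#include <stddef.h>
#include <string.h>

/* Each sniffer is a tiny function looking for one kind of content. */
typedef int (*sniff_fn)(const char *body);

static int looks_like_xml(const char *b)  { return strncmp(b, "<?xml", 5) == 0; }
static int looks_like_html(const char *b) { return strstr(b, "<html") != NULL; }
static int looks_like_sql_error(const char *b) { return strstr(b, "SQL syntax") != NULL; }

/* The table: a name plus a function pointer per content type. */
struct sniffer {
    const char *name;
    sniff_fn    match;
};

static const struct sniffer sniffers[] = {
    { "xml",       looks_like_xml },
    { "html",      looks_like_html },
    { "sql-error", looks_like_sql_error },
};

/* Return the name of the first sniffer that matches, or NULL if none do.
 * Matching is case-sensitive, which a real sniffer would not want. */
static const char *sniff(const char *body)
{
    for (size_t i = 0; i < sizeof(sniffers) / sizeof(sniffers[0]); i++)
        if (sniffers[i].match(body))
            return sniffers[i].name;
    return NULL;
}
```

Adding a new content type means adding one small function and one table row. The unrolled version trades that tidiness for having everything visible in one place.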

The I/O loop in Skipfish is definitely the right way to do network programming in C. This program is simpler and faster because it doesn't waste time with threads. But if you do something similar, use libevent.
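To show what "no threads" looks like in practice, here is a minimal single-threaded loop built on POSIX poll(). It is a sketch of the general pattern, not Skipfish's loop and not libevent; drain_all() and MAX_FDS are names I made up, and a real client would also track writes, timeouts, and per-connection state.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_FDS 16   /* arbitrary cap for this sketch */

/* Read every descriptor in fds[] to EOF from one thread, servicing
 * whichever is ready. totals[i] receives the byte count for fds[i].
 * Returns 0 on success, -1 on error or if nfds exceeds MAX_FDS. */
static int drain_all(const int *fds, size_t nfds, size_t *totals)
{
    struct pollfd pfd[MAX_FDS];
    size_t open_count = nfds;

    if (nfds > MAX_FDS)
        return -1;
    for (size_t i = 0; i < nfds; i++) {
        pfd[i].fd = fds[i];
        pfd[i].events = POLLIN;
        pfd[i].revents = 0;
        totals[i] = 0;
    }

    while (open_count > 0) {
        /* Block until at least one descriptor has something to report. */
        if (poll(pfd, (nfds_t)nfds, -1) < 0)
            return -1;

        for (size_t i = 0; i < nfds; i++) {
            char buf[4096];
            ssize_t n;

            if (pfd[i].fd < 0 ||
                !(pfd[i].revents & (POLLIN | POLLHUP | POLLERR | POLLNVAL)))
                continue;

            n = read(pfd[i].fd, buf, sizeof(buf));
            if (n > 0) {
                totals[i] += (size_t)n;
                continue;
            }
            /* EOF or error: poll() ignores negative descriptors. */
            pfd[i].fd = -1;
            open_count--;
        }
    }
    return 0;
}
```

One thread handles every connection, so there is no locking and no per-request thread startup cost. libevent wraps the same idea behind callbacks and picks a faster readiness mechanism where the platform offers one.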


I'm not really understanding this tool. What does it find exactly?


You aim it at a URL.

It asynchronously launches thousands of requests based on a very large wordlist.
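As a rough picture of what "based on a wordlist" means, the sketch below turns a base URL, a word, and an optional extension into one candidate URL to request. The exact combination scheme is my assumption, not a description of Skipfish's internals, and build_candidate() is a made-up name.

```c
#include <stdio.h>

/* Hypothetical helper: write "base/word" or "base/word.ext" into out.
 * Returns 0 on success, -1 if the result would not fit in cap bytes. */
static int build_candidate(char *out, size_t cap, const char *base,
                           const char *word, const char *ext)
{
    int n;

    if (ext != NULL && *ext != '\0')
        n = snprintf(out, cap, "%s/%s.%s", base, word, ext);
    else
        n = snprintf(out, cap, "%s/%s", base, word);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n >= 0 && (size_t)n < cap) ? 0 : -1;
}
```

Looping this over every word and every extension is where the "thousands of requests" come from: a few thousand words times a handful of extensions adds up quickly.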

It scrapes the responses and spiders them.

As it identifies actual pages, it fuzzes them with strings that tickle web app flaws. It analyzes the responses. For instance, it tries to inject "skipfish://whatever" URLs into fields, parameters, and links; then it looks to see whether those URLs appear in "hot" places in the response, like headers or link tags.
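The "does my marker show up somewhere hot" check can be sketched like this. It is a simplification under my own assumptions, not Skipfish's analysis code: classify_reflection() and the list of "hot" prefixes are invented, and the response is treated as one flat string of headers plus body.

```c
#include <stddef.h>
#include <string.h>

enum reflect { REFLECT_NONE, REFLECT_TEXT, REFLECT_HOT };

/* Hypothetical classifier: REFLECT_HOT if the injected marker appears
 * directly after a Location header or at the start of an href/src value,
 * REFLECT_TEXT if it is echoed anywhere else, REFLECT_NONE otherwise. */
static enum reflect classify_reflection(const char *resp, const char *marker)
{
    static const char *const hot_prefixes[] = {
        "Location: ", "href=\"", "src=\"",
    };
    enum reflect best = REFLECT_NONE;
    size_t mlen = strlen(marker);

    if (mlen == 0)
        return REFLECT_NONE;

    for (const char *hit = strstr(resp, marker); hit != NULL;
         hit = strstr(hit + mlen, marker)) {
        best = REFLECT_TEXT;
        for (size_t i = 0; i < sizeof(hot_prefixes) / sizeof(hot_prefixes[0]); i++) {
            size_t plen = strlen(hot_prefixes[i]);
            /* Compare the bytes immediately before the hit to the prefix. */
            if ((size_t)(hit - resp) >= plen &&
                memcmp(hit - plen, hot_prefixes[i], plen) == 0)
                return REFLECT_HOT;
        }
    }
    return best;
}
```

A marker echoed in plain page text is mildly interesting; the same marker landing in a redirect header or a link target means the attacker controls where the browser goes, which is why those spots count as hot.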

It's looking for --- primarily --- :

* Cross-site scripting and content injection

* Best practices problems (like failing to declare charsets properly)

* Forms without XSRF tokens

* SQL injection

It's better than anything that I've written but I will hazard a guess that it finds things well in that order. It's bound to get better over time.


Simple to build, just make sure you grab libidn:

http://ftp.gnu.org/gnu/libidn/libidn-1.18.tar.gz

It also defaults to white text, so set your terminal background to black.


And coincidentally, leiningen is having problems downloading an unrelated project from google code. Surely HN didn't overload google code?


No, we had some unexpected infrastructure problems today.


For windows users:

I've installed it under Cygwin with no problems. Make sure you have all the "dev" packages installed, then type "make" and you are ready to go.


I'm having trouble compiling it, and libidn is installed.

See:

http://gist.github.com/338360

for make error output.

It appears some files are missing. I tried redownloading, but no dice.

Has anyone else encountered this?


You're building it on a system without zlib or OpenSSL. Install those too.


Both were installed, it looks like I needed the dev packages.


Install the libssl-dev package.



