The sorry state of Avira anti-virus heuristics

thaumaturgy · on March 18, 2010

Avira is well known in the A/V world for producing more false positives than most competing products.

That said, it also (as of late last year) had the highest actual detection rate (at 74% of 20,000+ tested bits of malware).

It also won AV-Comparatives' "gold" award for 2008, there's still a free version available for personal use, and it has one of the lowest impacts on system performance of any antivirus product.

So, although I agree with the author that the heuristics in this case aren't optimal, Avira is still a very good product overall.

tsally · on March 17, 2010

Great example of why signature based detection for anti-virus is just a stopgap measure. You can be sure the anti-virus of the future (if there is one) will make limited to no use of malware signatures. I suppose it probably feels weird to most people to find out the AV product they pay a lot of money for is just a sophisticated version of grep.
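To make the "sophisticated grep" point concrete, here is a minimal sketch of signature matching. The signature names and byte patterns are invented for illustration and are not taken from any real product's database; the XOR "packer" is a toy stand-in for a real polymorphic engine.

```python
# Hypothetical signature database: detection name -> byte pattern.
# Both entries are made up for this example.
SIGNATURES = {
    "Example.Dropper.A": bytes.fromhex("deadbeef4d5a"),
    "Example.Script.B": b"eval(unescape(",
}


def scan(data: bytes) -> list[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in data)


def mutate(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (toy stand-in for a packer)."""
    return bytes(b ^ key for b in data)


sample = b"MZ-stub" + SIGNATURES["Example.Dropper.A"]
print(scan(sample))                # the unmodified sample matches
print(scan(mutate(sample, 0x5A)))  # the re-encoded sample no longer matches
print(scan(b"<p>docs: eval(unescape(x))</p>"))  # a benign page can match too
```

The same substring test that catches the original sample misses a trivially re-encoded copy and flags a harmless page that happens to contain the pattern.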

bad_user · on March 17, 2010

I'm pretty sure the future of anti-viruses will involve signature matching one way or another, because there's no way around that.

Viruses are pretty similar to their biological counterparts ... they are pretty useless by themselves, but once they infect a host, they can use its mechanisms to multiply.

Antibiotics target bacterial infections successfully because bacterial cells can be recognized (they have their own specific shell and don't rely on a host for survival). That's why antibiotics work: while different, all bacterial cells are alike. And antibiotics only attack bacteria, not normal cells.

With viruses it's different ... you can't attack viruses with a generic mechanism.

The only hope you have to treat a virus / stop its reproduction is to identify a signature in its proteins, and target that. Vaccines are the most efficient because they give your immune system a heads-up, but once you've caught it the only hope you have is for your immune system to fight back.

Computer viruses are much the same ... you can't have a generic method for fighting back. You can only develop a specific treatment for a specific virus, otherwise you'll be in the same league as cancer treatment ... targeting cancerous cells efficiently, but doing enough damage to normal cells that your internal organs start failing (with a direct analogy being the loss of data).

Personally I never install anti-viruses. I'm just cautious about the sources of the materials I have to work with, and whenever I suspect my computer got infected (happens once in 2 years) I just save my data somewhere and do a fresh reinstall.

tsally · on March 17, 2010

Analogies between computer viruses and biological viruses are strained, overused, and abused. I'll just point out that our innate immune system doesn't use signature based detection, it uses a whitelist and behavioral/heuristic based detection. Where do you think transplant rejection comes from? It's not because the immune system has a virus signature that matches the foreign organ.

"I'm pretty sure the future of anti-viruses will involve signature matching one way or another, because there's no way around that."

There are numerous ways around it. A whitelist is the simplest way. We have the technology to systematically verify the integrity of each piece of software on a computer (barring physical access) [1]. The current implementations of TPM have the potential for massive privacy violations and abuse, but I'm hopeful that projects like OpenCore will eventually lead to open source TPM designs. In any case, in such a system anti-virus is not even necessary.
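A rough sketch of the whitelist idea, assuming software is identified by a SHA-256 hash of its bytes. The function names are hypothetical, and a real TPM does its measurements in hardware during boot; `extend` only imitates the general shape of a measurement chain (new register value = hash of old value plus the new measurement).

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest used as the identity of a piece of software."""
    return hashlib.sha256(data).hexdigest()


def build_whitelist(trusted_binaries: list[bytes]) -> set[str]:
    """Record the hashes of binaries we have decided to trust."""
    return {sha256_hex(binary) for binary in trusted_binaries}


def is_allowed(binary: bytes, whitelist: set[str]) -> bool:
    """Allow execution only if the exact bytes were whitelisted."""
    return sha256_hex(binary) in whitelist


def extend(register: bytes, measurement: bytes) -> bytes:
    """Fold one measurement into a running register (simplified chain)."""
    return hashlib.sha256(register + hashlib.sha256(measurement).digest()).digest()


whitelist = build_whitelist([b"trusted editor v1", b"trusted browser v7"])
print(is_allowed(b"trusted editor v1", whitelist))   # exact match
print(is_allowed(b"trusted editor v1!", whitelist))  # one byte changed

# The final register value depends on every component and on their order.
register = bytes(32)
for component in (b"bootloader", b"kernel", b"init"):
    register = extend(register, component)
print(register.hex())
```

Nothing here needs a malware signature: anything not explicitly trusted is refused, which is also why the approach is restrictive.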

More complex but less restrictive methods include:

(1) Building machine learning algorithms around a set of data that measures the normal operation of the machine. Because 99.9999999999% of the time Grandma's computer shouldn't be acting as a server.

(2) A feedback system built into your antivirus where users report applications that don't work or have malicious behavior. Then Grandma can look up the greeting card creator she wants to install and see that 100 users have given the application 5/5 stars over a period of 6 months. Bonus points for open source applications where people will take the time to audit them for free.
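As a very small illustration of item (1), here is a sketch that learns a baseline from past measurements and flags values far outside it. It assumes a single metric (inbound connections per hour on a desktop) and a plain z-score test rather than a real machine learning model; the function names and the threshold are hypothetical.

```python
from statistics import mean, pstdev


def build_baseline(samples: list[float]) -> tuple[float, float]:
    """Summarize normal behavior as (mean, population standard deviation)."""
    return mean(samples), pstdev(samples)


def is_anomalous(value: float, baseline: tuple[float, float],
                 threshold: float = 3.0) -> bool:
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    if sigma == 0:
        # The metric never varied, so any different value is unusual.
        return value != mu
    return abs(value - mu) / sigma > threshold


# Inbound connections per hour observed on a machine that is not a server.
history = [0, 1, 0, 0, 2, 1, 0, 0]
baseline = build_baseline(history)

print(is_anomalous(1, baseline))    # an ordinary hour
print(is_anomalous(250, baseline))  # the machine is suddenly serving traffic
```

A usable system would track many metrics and handle legitimate changes in behavior, but the structure (learn normal, alert on deviation) is the same.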

There's tons of research in this area and I don't have time to reproduce a complete summary here. The point is that signature based detection is a method that is not sustainable in the long term because the number of unique malware variants is something that is growing exponentially. Most people don't even have anti-virus installed, and those that do have to deal with the fact that the AV slows the machine to a crawl.

Signature based AV is just a stopgap measure until we can properly implement our operating systems and detection methods. In the future we will have operating systems with trusted computing bases and detection methods based on behavior and crowd sourcing.

[1] http://en.wikipedia.org/wiki/Trusted_Platform_Module

Retric · on March 18, 2010

Your immune system uses a wide range of methods to find and fight infection. However, while identification is based on finding things outside of the "Normal" range, fighting it is all about signatures.

Vaccination works by preparing the body to fight something that looks like X by dumping millions of copies of something that looks like X into your body. At which point your immune system feels it needs to really fight X with everything it has, so it develops a specific response. In other words, adaptive immunity is triggered in vertebrates when a pathogen evades the innate immune system and generates a threshold level of antigen. SEE: http://en.wikipedia.org/wiki/Adaptive_immune_system

This is also why you can have autoimmune diseases: if you trip the "fight" response to something that should be left alone, you get a world of problems. Your immune system is more than willing to kill cancer, or even normal cells, and inflammation can be very harmful.

tsally · on March 18, 2010

It's not all about the signatures. Vaccination would be just as ineffective as anti-virus if biological viruses were anything like computer viruses. Imagine billions of biological viruses that are instantly transmitted upon infection to anywhere in the world, have an immediate impact on the host, and can evolve and adapt with human-like intelligence. Good luck fighting that with vaccines.

Because computers are not human beings, if we can do detection well, then the problem of fighting an infection is already solved. Just run your server on a virtual machine verified by a trusted boot module. When an infection is detected, just roll back to a trusted snapshot or roll back to a clean install.

Retric · on March 18, 2010

Remember, less than 1/2 the cells in your body actually contain your DNA. Yet signatures still work. Biological viruses are light years ahead of computer viruses. Picture a world where there were more computer virus strains than computers, they mutated millions of times a second, ignored firewalls and could attack computers through the electrical system. Further, 1/4th of every nation's GDP went to anti-virus systems and had for the last billion years.

PS: Virus writers have become far more tame, but old school viruses would rewrite your BIOS.

theblackbox · on March 17, 2010

Any idea where I could learn more about current implementations of (1)?

tsally · on March 17, 2010

Find a video of this Blackhat talk from 2005. Pretty sure it's in the iTunes store in the podcast section. The video covers pretty much everything you need to know about the theory of such systems.

http://www.blackhat.com/html/bh-usa-05/bh-usa-05-speakers.ht...

As far as actual implementation, I'm not aware of a system that does this. Although honestly some researcher has probably implemented such a system in a paper somewhere. There's probably a commercial network IDS that does it too. I haven't been doing this long enough to be an expert in the field, so I can't confidently say what the true state of the art is in behavioral detection. I'm pretty sure there are no open source projects though if that's what you're asking.

If you're thinking about implementing your own, I'd say start at the network level and write a simple program that sets an alarm off when a specific host starts responding to HTTP requests. If you want to get complex, use Wireshark to log all your network traffic for a month. You can then use Bayesian learning to determine whether new traffic is out of the ordinary. SSH traffic is problematic, so if you need to allow such traffic you'll have some additional challenges to overcome.
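A minimal sketch of that idea, assuming the capture has already been reduced to (responding host, service port) pairs; parsing an actual Wireshark capture is left out. The class name, host names, and threshold are all hypothetical. The "Bayesian" part here is only additive (Laplace) smoothing of observed frequencies, with one extra bucket reserved for endpoints never seen in training.

```python
from collections import Counter


class TrafficModel:
    """Smoothed frequency model of which host answers on which port."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha        # pseudo-count given to every bucket
        self.counts = Counter()   # (host, port) -> number of observations
        self.total = 0

    def observe(self, host: str, port: int) -> None:
        """Add one observed response to the training data."""
        self.counts[(host, port)] += 1
        self.total += 1

    def probability(self, host: str, port: int) -> float:
        """Smoothed estimate of seeing this host respond on this port."""
        buckets = len(self.counts) + 1  # seen endpoints plus one "unseen" bucket
        seen = self.counts[(host, port)]  # Counter returns 0 for unseen keys
        return (seen + self.alpha) / (self.total + self.alpha * buckets)

    def is_unusual(self, host: str, port: int, threshold: float = 0.01) -> bool:
        """Raise the alarm when the estimated probability is below threshold."""
        return self.probability(host, port) < threshold


model = TrafficModel()
for _ in range(900):
    model.observe("webserver", 80)   # the month's log: the real web server
for _ in range(100):
    model.observe("mailserver", 25)  # and the mail server

print(model.is_unusual("webserver", 80))    # expected traffic
print(model.is_unusual("grandma-pc", 80))   # a desktop answering HTTP requests
```

With only a handful of training observations every endpoint looks unusual, so the threshold has to be chosen with the amount of logged traffic in mind.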

Starting an open source project related to this is on my TODO list, but I'm focusing on demonstrating the hopelessness of signature based detection first.

theblackbox · on March 18, 2010

Good luck. And thanks for the links, I'll be looking into it. You mind a PM if it gets interesting (just to bounce ideas off someone)? I'm looking into machine learning stuff myself and can imagine some interesting integrations with this subject.

tsally · on March 19, 2010

Definitely don't mind.

Email: tss AT timsally DOT com

Public key if you need: http://www.timsally.com/static/tss.asc

I'm presenting a metamorphic engine at a hacking conference in April (http://thotcon.org/). It'll be completely open source. Initial tests against modern anti-virus look very good. ;)

Periodic · on March 17, 2010

When it comes to recognizing current infections we're always going to have to deal with signatures, though we have the ability to match on behavior as well as code. We might look for things that are run out of the users' directories that modify system-wide settings, for example.

I think there's a fundamental difference, and that is that we can have a central security mechanism in a computer. It's not too hard to link a program to its privileges. For example, your browser should not be writing system-wide settings after it is installed, it shouldn't be forking processes that aren't duplicates of itself, etc. We shouldn't allow a process to add code to another process unless the user deliberately gives the process a higher level of privilege.
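One way to picture linking a program to its privileges is a default-deny policy table consulted by a central check. This is only a sketch: the program names and action strings such as "fs.write.system" are invented for the example and do not correspond to any real operating system API.

```python
# Hypothetical per-program policy: program name -> actions it may perform.
POLICIES = {
    "browser": frozenset({"net.connect", "fs.write.user", "proc.fork.self"}),
    "installer": frozenset({"fs.write.user", "fs.write.system"}),
}

# Actions never granted by policy alone; the user must approve them explicitly.
NEEDS_USER_APPROVAL = frozenset({"proc.inject"})


def is_permitted(program: str, action: str, user_approved: bool = False) -> bool:
    """Central check: deny by default, allow only what the policy lists."""
    if action in NEEDS_USER_APPROVAL:
        return user_approved
    return action in POLICIES.get(program, frozenset())


print(is_permitted("browser", "net.connect"))      # normal browsing
print(is_permitted("browser", "fs.write.system"))  # system-wide settings
print(is_permitted("browser", "proc.fork.other"))  # forking something else
print(is_permitted("browser", "proc.inject"))      # code injection, no approval
print(is_permitted("browser", "proc.inject", user_approved=True))
```

An unknown program gets an empty policy, so it can do nothing until someone decides what it should be allowed to do.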

To draw another parallel with biology, we have the ability to surround every organ with its own barrier mechanism, like the blood-brain barrier and placental barrier (I don't actually know what I'm talking about, so it is a very loose parallel). Basically, we have the power to wrap each process in its own protective sheath which doesn't let things in or out unless we okay it ahead of time.

Virus prevention is the future. Newer operating systems are already making progress in this direction.

Blasa · on March 18, 2010

For those that haven't come across it before the object-capability model deals with this sort of problem.

http://en.wikipedia.org/wiki/Object-capability_model

Edit: That of restricting each process to only be allowed to do certain things.
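A toy illustration of the capability idea: code can only act through the object references it is handed, so passing a narrower object grants narrower authority. The class names are hypothetical, and Python itself does not enforce this (a determined caller can reach through the wrapper by introspection), so treat it as a picture of the pattern rather than a security boundary.

```python
class Store:
    """Full-authority object: can both read and write settings."""

    def __init__(self) -> None:
        self._data = {}

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value) -> None:
        self._data[key] = value


class ReadOnly:
    """Attenuated capability: exposes read but offers no write method."""

    def __init__(self, store: Store) -> None:
        self._read = store.read

    def read(self, key):
        return self._read(key)


def untrusted_plugin(settings) -> str:
    """Tries to change a setting using whatever reference it was given."""
    try:
        settings.write("homepage", "http://evil.example")
    except AttributeError:
        return "write denied"
    return "write succeeded"


store = Store()
store.write("homepage", "http://news.ycombinator.com")

print(untrusted_plugin(ReadOnly(store)))  # plugin holds only the read capability
print(store.read("homepage"))             # setting is unchanged
```

The plugin is never asked who it is; what it can do is decided entirely by which reference the caller chose to pass in.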

jfarmer · on March 17, 2010

I once had a totally benign website that triggered an anti-virus alert in ClamAV. There was a specific sequence of characters in the HTML that caused it.

One of the weirder bugs I've encountered.

on March 17, 2010

[dead]

mixmax · on March 17, 2010

Could a moderator please disable this account? It's spam.

http://news.ycombinator.com/threads?id=therealist