Taking a peek inside the package, it seems to be mostly the libraries: cuFFT alone is about 350MB, for example, and that's twice over for the debug and release versions. I'm guessing those are fat binaries carrying machine code pre-compiled for every generation of NVIDIA hardware, rather than just the PTX intermediate code. That would save the driver from having to JIT-compile the PTX the first time it runs on a given GPU, at the expense of being huge.
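To make the fat-binary vs. PTX-only distinction a bit more concrete, here's a rough sketch of the `nvcc` `-gencode` flags each approach would use. The helper name `gencode_flags` and the architecture list are made up for illustration; I don't know which architectures NVIDIA actually builds cuFFT for.

```python
def gencode_flags(sm_archs, ptx_only=False):
    """Build nvcc -gencode arguments for compute capabilities such as "70".

    ptx_only=False: one pre-compiled machine-code image per architecture
                    (the "fat binary" case, larger output).
    ptx_only=True:  a single PTX image for the oldest architecture, which
                    the driver can JIT-compile for that GPU or any newer one.
    """
    if ptx_only:
        oldest = min(sm_archs, key=int)
        return ["-gencode", f"arch=compute_{oldest},code=compute_{oldest}"]

    flags = []
    for arch in sm_archs:
        flags += ["-gencode", f"arch=compute_{arch},code=sm_{arch}"]
    return flags


# Hypothetical architecture list, purely for illustration.
ARCHS = ["50", "60", "70", "80"]

print("fat binary:", " ".join(["nvcc"] + gencode_flags(ARCHS) + ["kernel.cu"]))
print("PTX only:  ", " ".join(["nvcc"] + gencode_flags(ARCHS, ptx_only=True) + ["kernel.cu"]))
```

Each extra `code=sm_XX` entry adds another complete copy of the compiled kernels to the output, which is how a library with a lot of kernels could balloon in size if it's built this way.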