
You can read sequentially through a zip file and dynamically build up a central directory yourself and do whatever desired operations.

One caveat: the zipfile itself may contain data that isn't listed in its own central directory.




> You can read sequentially through a zip file and dynamically build up a central directory yourself and do whatever desired operations.

First, zip files already have a central directory, so why would you do that?

Second, you seem to be missing the subject of this subthread entirely: the point is being able to selectively access S3 content without downloading the whole archive. If you read sequentially through the entire zip file, you are in fact downloading the whole archive.


Sorry, I wasn't clear before. You don't need the central directory to process a zipfile. You don't need random access to a zipfile to process it.

A zipfile can be treated as a stream of data and processed as each individual zip entry is seen in the download/read. NO random access is required.

You just need enough memory for each entry's local file header, plus buffers for the zipped data and the unzipped data. The first two items should be covered by the downloaded zipfile buffer.

If you want to process the zipfile ASAP, or don't have the resources to download the entire zipfile before processing it, then this is a valid way to handle it. If the data you want occurs before the entire zipfile has been downloaded, you can stop the download.
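
To make that concrete, here's a rough Python sketch of the streaming approach. It only does forward reads and never touches the central directory. The names `iter_zip_entries` and `find_entry` are made up for illustration, not from any library. It assumes `stream.read(n)` returns exactly n bytes unless the input has ended, and it only handles stored and deflated entries whose sizes are in the local file header (no data descriptors, no zip64, no encryption).

```python
import struct
import zlib

# Fixed 30-byte part of a ZIP local file header (little-endian).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")
LOCAL_SIG = b"PK\x03\x04"


def iter_zip_entries(stream):
    """Yield (name, data) per entry using only forward reads on `stream`."""
    while True:
        raw = stream.read(LOCAL_HEADER.size)
        if len(raw) < LOCAL_HEADER.size or raw[:4] != LOCAL_SIG:
            return  # central directory or end of input: no more entries
        (_sig, _ver, flags, method, _time, _date, _crc,
         comp_size, _size, name_len, extra_len) = LOCAL_HEADER.unpack(raw)
        if flags & 0x08:
            # Sizes live in a trailing data descriptor; not handled in this sketch.
            raise NotImplementedError("entries with data descriptors")
        encoding = "utf-8" if flags & 0x800 else "cp437"
        name = stream.read(name_len).decode(encoding)
        stream.read(extra_len)  # skip the extra field
        payload = stream.read(comp_size)
        if method == 0:    # stored
            data = payload
        elif method == 8:  # deflate (raw stream, hence negative wbits)
            data = zlib.decompress(payload, -15)
        else:
            raise NotImplementedError(f"compression method {method}")
        yield name, data


def find_entry(stream, wanted):
    """Return the data for `wanted`, reading no further once it is seen."""
    for name, data in iter_zip_entries(stream):
        if name == wanted:
            return data
    return None
```

With a real download you would pass the response body as `stream` and close the connection once `find_entry` returns. Archives written to a non-seekable output usually set the data-descriptor flag, so a fuller version would have to detect the end of each deflate stream instead of trusting the header sizes.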

A zipfile can also be treated as a randomly accessed file, as you mentioned. Some operations are faster that way, like browsing each zip entry's metadata.
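
For comparison, a minimal sketch of the random-access route using Python's standard `zipfile` module. `list_entries` is a hypothetical helper name. As far as I know, opening the archive this way seeks to the end and reads the central directory without decompressing any entry data, which is why metadata browsing is cheap when the file is seekable.

```python
import zipfile


def list_entries(seekable_file):
    """Return (name, uncompressed size, compressed size, header offset) per entry."""
    with zipfile.ZipFile(seekable_file) as zf:
        return [
            (info.filename, info.file_size, info.compress_size, info.header_offset)
            for info in zf.infolist()
        ]
```

The `header_offset` values are what you would seek to (or request as a byte range) to fetch a single entry.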



