The size limit (16 MB or whatever) does not require your archives to be as close to it as possible.
Assuming that you are allowed to create archives of smaller size, here is the "first iteration" solution, dead simple but meeting your requirements: just zip every file into a separate archive.
myFile1 -> archive1.zip
myFile2 -> archive2.zip
- etc.
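A minimal sketch of the one-archive-per-file approach, assuming Java's java.util.zip; the class and method names are illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// One input file -> one zip archive. Nothing clever: no size checks needed,
// since a single file's compressed size can only be smaller than the original
// (modulo zip overhead on incompressible data).
public class OnePerArchive {
    static void zipSingle(Path file, Path zip) throws IOException {
        try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zip))) {
            zos.putNextEntry(new ZipEntry(file.getFileName().toString()));
            Files.copy(file, zos); // stream the file's bytes into the entry
            zos.closeEntry();
        }
    }
}
```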
Now, if you want it a bit less dumb, use the sum of the current archive size (Deflater.getBytesWritten()) and the next file's uncompressed size to decide whether it's time to start a new archive.
myFile1 -> archive1.zip
- size of archive1.zip plus myFile2 within limit -> add myFile2 to archive1
- size of archive1.zip plus myFile3 exceeds limit -> add myFile3 to a new zip, archive2
- etc.

Yes, there is a chance that myFile3 in compressed form would still have fit into archive1, but why bother?
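A sketch of that switching logic with java.util.zip. Here the next file's uncompressed size is used as a pessimistic upper bound on what the entry will add; all file and method names are illustrative, and the input directory is assumed to be non-empty:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class SplitZipper {

    // Pessimistic check: assume the next file will not compress at all.
    static boolean needsNewArchive(long bytesWritten, long nextFileSize, long limit) {
        return bytesWritten + nextFileSize > limit;
    }

    // Packs every file in inputDir into archive1.zip, archive2.zip, ... in
    // outputDir, rolling over when the limit would be exceeded.
    // Returns the number of archives created.
    static int pack(Path inputDir, Path outputDir, long limit) throws IOException {
        int archiveIndex = 1;
        long written = 0;
        ZipOutputStream zos = open(outputDir, archiveIndex);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inputDir)) {
            for (Path file : files) {
                long size = Files.size(file);
                // only roll over once the current archive holds something
                if (written > 0 && needsNewArchive(written, size, limit)) {
                    zos.close();
                    zos = open(outputDir, ++archiveIndex);
                    written = 0;
                }
                zos.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zos);
                zos.closeEntry();
                written += size; // upper bound; Deflater.getBytesWritten() is tighter
            }
        } finally {
            zos.close();
        }
        return archiveIndex;
    }

    static ZipOutputStream open(Path dir, int index) throws IOException {
        return new ZipOutputStream(
                Files.newOutputStream(dir.resolve("archive" + index + ".zip")));
    }
}
```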
I know this is ages old, but in case someone runs into it: IMHO the way to go about it is this:
1) Read the original file (e.g. original.txt) using $contents = file_get_contents('original.txt').
2) Make your changes/edits.
3) Write the result to a temp file with file_put_contents('original.txt.tmp', $contents).
4) Then move the temp file over the original, replacing it: rename('original.txt.tmp', 'original.txt').
Advantages: While the new content is being processed and written, the original file is not locked, so others can still read the old content. At least on Linux/Unix boxes, rename() is an atomic operation, so any interruption during the write leaves the original file untouched; only once the temp file has been fully written to disk is it moved into place. More interesting reading on this is in the comments to http://php.net/manual/en/function.rename.php
Edit to address comments (too long for a comment):
https://stackoverflow.com/questions/7054844/is-rename-atomic has further references to what you might need to do if you are operating across filesystems.
On the shared lock for reading, I am not sure why that would be needed, since in this implementation there is no writing to the file directly. PHP's flock() (which would be used to get the lock) is a little unreliable and can be ignored by other processes. That's why I am suggesting using rename() instead.
The temp file should ideally be named uniquely per process doing the renaming, to make sure no two processes clobber each other's temp file. This of course does not prevent two people editing the same file at the same time, but at least the file will be left intact (last edit wins).
Steps 3) and 4) would then become this:
$tempFile = uniqid(microtime(true), true); // make sure we have a unique name
file_put_contents($tempFile, $contents);   // write temp file ($contents holds the edited content)
rename($tempFile, 'original.txt');         // ideally on the same filesystem
Best Answer
The only thing I see that is inelegant about your implementation is the intertwining of the file-listing work, the zip-file opening/closing work, and the counting work. (Do you have another issue with it?)
Solution 1: Use Java's try-with-resources block to automate the file closing work thanks to ZipFile being AutoCloseable:
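A minimal sketch of this idiom, assuming the task is counting the entries in a zip (the method and class names are illustrative):

```java
import java.io.File;
import java.io.IOException;
import java.util.zip.ZipFile;

// ZipFile implements AutoCloseable, so try-with-resources guarantees the
// archive is closed even if an exception is thrown while reading it.
public class ZipCount {
    static int countEntries(File zip) throws IOException {
        try (ZipFile zf = new ZipFile(zip)) {
            return zf.size(); // number of entries; zf is closed automatically
        }
    }
}
```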
Solution 2: In Groovy, separate out the file listing work (and add filename filtering):
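The original snippet here was Groovy; a plain-Java analogue of separating out the listing-with-filter step might look like this sketch (names are illustrative):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// The file-listing work (with a filename filter) lives in its own method,
// separate from any zip opening/closing or counting logic.
public class ZipLister {
    static List<Path> listZips(Path dir) throws IOException {
        List<Path> result = new ArrayList<>();
        // glob filter: only *.zip entries are returned
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*.zip")) {
            for (Path p : stream) {
                result.add(p);
            }
        }
        return result;
    }
}
```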
Solution 3: Also separate out the ZipFile open/close work using Groovy's "with" idiom, adding a zipFileWith(closure) method to the File class:
Solution 4: Add an eachZipFile(closure) method to the File class: