I haven't decided on a technology yet, but I wanted to ask. BMP images are large; I compressed a 3 MB image to 50 KB. Browsers can decode gzipped text, so would it be possible to gzip the BMP on my server, have the user request mysite.com/images/test.bmp, and have the browser decompress the gzipped BMP on the other side so the user doesn't notice a difference?
Gzip BMP images on the fly
gzip images
Related Solutions
I'd recommend using a regular file system instead of a database. Using a file system is easier than a database: you can use normal tools to access files, file systems are designed for this kind of usage, and so on. NTFS should work just fine as a storage system.
Do not store the actual path in the database. It is better to store the image's sequence number in the database and have a function that generates the path from the sequence number, e.g.:
File path = generatePathFromSequenceNumber(sequenceNumber);
This is easier to handle if you need to change the directory structure somehow. Maybe you need to move the images to a different location, or maybe you run out of space and start storing some of the images on disk A and some on disk B. It is easier to change one function than to change paths in the database.
I would use this kind of algorithm for generating the directory structure:
- First, pad your sequence number with leading zeroes until you have a string of at least 12 digits. This is the name of your file. You may want to add a suffix: 12345 -> 000000012345.jpg
- Then split the string into 2- or 3-character blocks, where each block denotes a directory level. Use a fixed number of directory levels (for example 3): 000000012345 -> 000/000/012
- Store the file under the generated directory:
  - Thus the full path and filename for the file with sequence id 12345 is 000/000/012/000000012345.jpg
  - For the file with sequence id 12345678901234 the path would be 123/456/789/12345678901234.jpg
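The padding-and-splitting steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name mirrors the generatePathFromSequenceNumber placeholder used earlier, and the parameter names, the default of three 3-character levels, and the .jpg suffix are illustrative choices rather than part of the original answer.

```python
def generate_path_from_sequence_number(sequence_number, levels=3, block_size=3,
                                       min_digits=12, suffix=".jpg"):
    """Build a relative path such as 000/000/012/000000012345.jpg."""
    if sequence_number < 0:
        raise ValueError("sequence number must be non-negative")
    # Pad with leading zeroes to at least min_digits; longer numbers are kept whole.
    name = str(sequence_number).zfill(min_digits)
    # Each leading block of block_size characters becomes one directory level.
    directories = [name[i * block_size:(i + 1) * block_size] for i in range(levels)]
    return "/".join(directories + [name + suffix])


print(generate_path_from_sequence_number(12345))
print(generate_path_from_sequence_number(12345678901234))
```

Because the path is derived entirely from the sequence number, moving to a different layout only requires changing this one function.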
Some things to consider about directory structures and file storage:
- The above algorithm gives you a system where every leaf directory holds a maximum of 1000 files (as long as you have fewer than 1 000 000 000 000 files in total).
- There may be limits on how many files and subdirectories a directory can contain; for example, the ext3 file system on Linux has a limit of 31998 subdirectories per directory.
- Normal tools (WinZip, Windows Explorer, command line, bash shell, etc.) may not work very well if you have a large number of files per directory (> 1000).
- The directory structure itself takes some disk space, so you do not want too many directories.
- With the above structure you can always find the correct path for an image file just by looking at the filename, if you happen to mess up your directory structure.
- If you need to access files from several machines, consider sharing the files via a network file system.
- The above directory structure will not work if you delete a lot of files. It leaves "holes" in the directory structure. But since you are not deleting any files, it should be OK.
Two questions: 1) What did I do wrong with the syntax?
Gzip only compresses individual files; it's not an archiving tool. It's usually used in combination with something like tar. In fact, some versions of tar will use gzip to automatically create a compressed archive if given appropriate flags. For example:
tar -cvz -f public_html.tar.gz /home/site/public_html/
This creates (-c) a gzip-compressed (-z) archive called public_html.tar.gz containing the contents of the specified public_html directory.
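For comparison, roughly the same result can be produced from Python with the standard-library tarfile module. This is a minimal sketch, not a replacement for the tar command above; the helper name create_gzip_archive is made up for illustration.

```python
import tarfile
from pathlib import Path


def create_gzip_archive(source_dir, archive_path):
    """Pack source_dir into a single gzip-compressed tar archive."""
    source = Path(source_dir)
    # Mode "w:gz" writes a tar archive and compresses it with gzip in one step.
    with tarfile.open(str(archive_path), "w:gz") as archive:
        # arcname keeps only the directory name inside the archive,
        # not the full path leading to it.
        archive.add(str(source), arcname=source.name)
    return Path(archive_path)
```

Calling create_gzip_archive("/home/site/public_html/", "public_html.tar.gz") would leave the original files untouched and write one archive file.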
2) How can I revert all of the filenames back to what they were before (exactly what they are now, minus the .gz extension).
Just run gunzip on all the files. E.g.:
gunzip -r /home/site/public_html/
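If gunzip is not available, the same recursive restore can be approximated with Python's standard library. This is a sketch under assumptions: gunzip_tree is a hypothetical helper name, and unlike gunzip it silently overwrites an existing file with the restored name.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_tree(root):
    """Decompress every *.gz file under root and drop the .gz extension."""
    restored = []
    # sorted() materializes the file list before the tree is modified.
    for gz_path in sorted(Path(root).rglob("*.gz")):
        if not gz_path.is_file():
            continue
        target = gz_path.with_suffix("")  # index.html.gz -> index.html
        with gzip.open(gz_path, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()  # remove the compressed copy, as gunzip does
        restored.append(target)
    return restored
```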
Note that you can also simply use the zip command, which will create a single compressed archive.
Best Answer
If possible, use PNG as the image format. It's the de facto standard for lossless image compression and will probably surpass the compression level achieved by merely gzipping bitmaps.
If you need to use BMP because of dependencies on legacy systems, you can try the output compression methods available in your web server; for example, see mod_deflate for Apache. Wireshark can be especially helpful for testing and debugging such setups.
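To illustrate what such output compression does on the wire, here is a minimal Python sketch of the negotiation: the server gzips the body only when the request's Accept-Encoding header lists gzip, and marks the response with Content-Encoding: gzip so the browser decompresses it transparently. The function name encode_bmp_response is hypothetical, and the header parsing is deliberately simplified (it ignores q-values such as gzip;q=0); a real web server module handles these details for you.

```python
import gzip


def encode_bmp_response(bmp_bytes, accept_encoding):
    """Return (headers, body) for a BMP response, gzipped if the client allows it."""
    headers = {"Content-Type": "image/bmp", "Vary": "Accept-Encoding"}
    # Simplified parsing: split on commas and drop any ";q=..." parameters.
    offered = [token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")]
    if "gzip" in offered:
        body = gzip.compress(bmp_bytes)
        headers["Content-Encoding"] = "gzip"
    else:
        body = bmp_bytes  # fall back to the uncompressed image
    headers["Content-Length"] = str(len(body))
    return headers, body
```

The URL the user requests stays the same (for example mysite.com/images/test.bmp); only the transfer encoding of the response changes.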