Google Takeout – Why Are Files Poorly Compressed?

google-takeout

Every once in a while I use Google Takeout (now called just "Download your Data" or something) to get a copy of all my stuff that Google holds.

Since the last time I did this (about six months ago), the .tbz (bzip2) compression method no longer seems to be available (just .tgz). Also, my email came down in an uncompressed mbox file, which surprised me. That plain text email is massively compressible.

Here is a screenshot from the download page:

[Screenshot of the download page]

Has there been any announcement about the compression methods available for Google Download your Data? I couldn't find anything with a reasonable search.

Best Answer

Sorry about the confusion in Takeout's behavior. A couple of answers for you (disclosure: I work on Takeout):

  1. Why is tbz gone? The usage numbers were extremely low. Zip and tgz had orders of magnitude more usage than tbz. We received a lot of feedback that tbz/tgz was a confusing choice, so we removed the tbz choice.

  2. Why is your mbox uncompressed? This comes from the maximum archive size setting. In your screenshot it is set to 2GB, so any file larger than 2GB is exported as a raw file. To work around this, choose a larger archive size, up to 50GB.
    (This behavior is most useful for zip on older clients, where the default Windows client doesn't support zip64 and larger zip files would therefore be unreadable.)

  3. The Help Center article is wrong. Sorry about that; I'll see about getting it updated.
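
To make the rule in point 2 concrete, here is a rough sketch in Python. It is not Takeout's actual code. The function name `plan_export`, the file names, and the choice of 1024**3 bytes per GB are all assumptions made for illustration. The only thing taken from the answer is the rule itself: a file larger than the maximum archive size is exported raw instead of being placed in an archive.

```python
GB = 1024 ** 3  # assumption: the answer does not say whether GB means 10**9 or 2**30


def plan_export(files, max_archive_bytes):
    """Split files into (archived, raw) using the rule from point 2.

    files: iterable of (name, size_in_bytes) pairs.
    Hypothetical helper, not part of any Google API.
    """
    archived, raw = [], []
    for name, size in files:
        if size > max_archive_bytes:
            # Larger than the archive limit: exported as-is, uncompressed.
            raw.append(name)
        else:
            # Small enough to go into a zip/tgz archive.
            archived.append(name)
    return archived, raw


# Hypothetical export: a 5GB mailbox and a small contacts file.
files = [("All mail.mbox", 5 * GB), ("contacts.vcf", 2 * 1024 ** 2)]

print(plan_export(files, 2 * GB))   # (['contacts.vcf'], ['All mail.mbox'])
print(plan_export(files, 50 * GB))  # (['All mail.mbox', 'contacts.vcf'], [])
```

With the 2GB setting the mailbox falls out of the archive and arrives uncompressed. Raising the limit to 50GB puts it back inside the compressed archive, which is the workaround described above.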