Red Green Repeat Adventures of a Spec Driven Junkie

Ruby: Zlib and GZip

I wanted to compress a string before writing its contents out to a file. Uncompressed, the string is about 4MB; compressed, 600KB.

The file was then going to be e-mailed to another system. Compressing it would make the e-mail faster to send and less likely to get stuck or bounce back because of its size.

Compressing using Zlib

I naively thought this would work:

require 'zlib'

compressed_contents = Zlib::Deflate.deflate(contents)

File.open("<filename>.gz", "wb") do |file|
  file.write(compressed_contents)
end

Easy, peasy!

Decompressing using gunzip

When the file arrived on the other side, I wanted to decompress it, so I ran:

$ gunzip "<filename>.gz"
gunzip: "<filename>.gz: not in gzip format"

HUH?? Hmm… maybe Zlib is regular zip format?

$ unzip "<filename>".gz
Archive:  "<filename>".gz
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.
unzip:  cannot find zipfile directory in one of "<filename>".gz or
        "<filename>".gz.zip, and cannot find "<filename>".gz.ZIP, period.

Quick - just do something!

In the end, I just wrote another Ruby script to decompress it:

require 'zlib'

# binread reads raw bytes, with no text-mode or encoding handling
compressed_file = File.binread("<filename>.gz")

decompressed_file = Zlib::Inflate.inflate(compressed_file)

File.open("<filename>", "wb") do |file|
  file.write(decompressed_file)
end

I solved my problem, but what went wrong??

Zlib vs GZip - IO Stream vs File Stream

My assumption about the relationship between Zlib and GZip was right. Zlib IS the compression engine for GZip.

But writing Zlib output to a file does NOT produce a GZip file.

The GZip file format is a separate standard from the compressed data itself. Zlib::Deflate.deflate wraps the compressed data in a zlib header and checksum, while gunzip expects a GZip header and trailer around it.

This is the difference between the IO stream and the file stream. I should know this, I’ve served on a compression committee! 🤦‍♂️
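One way to see the difference is to look at the first bytes each call produces. This is a minimal sketch, not something from my original debugging session: the sample string and the inflate_any helper are made up for illustration, and it assumes Ruby 2.4 or later for Zlib.gzip.

```ruby
require 'zlib'

# Sample data, invented for this illustration.
contents = "red green repeat " * 1_000

zlib_bytes = Zlib::Deflate.deflate(contents) # zlib wrapper (what I wrote to disk)
gzip_bytes = Zlib.gzip(contents)             # gzip wrapper (what gunzip expects)

# A gzip file always starts with the magic bytes 0x1f 0x8b.
# The zlib stream from Zlib::Deflate.deflate starts with 0x78.
zlib_magic = zlib_bytes.bytes.first(2)
gzip_magic = gzip_bytes.bytes.first(2)

# Hypothetical helper: adding 32 to the window bits asks zlib to detect
# a zlib or gzip header automatically while inflating.
def inflate_any(data)
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 32)
  inflater.inflate(data)
ensure
  inflater&.close
end

puts format("zlib starts with: %02x %02x", *zlib_magic)
puts format("gzip starts with: %02x %02x", *gzip_magic)
puts inflate_any(zlib_bytes) == inflate_any(gzip_bytes)
```

Same compressed payload, different wrappers, which is why gunzip rejected a file that Zlib::Inflate read without complaint.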

If you want to compress stuff in memory to GZip, the easiest way is to use the GzipWriter class in the Zlib module and write to a file:

require 'zlib'
Zlib::GzipWriter.open("filename.gz") do |gzip_file|
  gzip_file.write(contents) # contents: the string held in memory
end

Using this, standard gunzip will be able to decompress it.
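If the compressed bytes need to stay in memory (say, to attach them to an e-mail without a temporary file), GzipWriter can also wrap any IO-like object. A minimal sketch, assuming a StringIO buffer and made-up sample data; the variable names are mine:

```ruby
require 'zlib'
require 'stringio'

# Sample data, invented for this illustration.
contents = "spec driven " * 1_000

# GzipWriter.wrap writes gzip-format data into any IO-like object
# and finishes the gzip stream when the block ends.
buffer = StringIO.new(String.new) # String.new gives an empty binary string
Zlib::GzipWriter.wrap(buffer) do |gzip|
  gzip.write(contents)
end
gzip_bytes = buffer.string

# Read it back with GzipReader to confirm the round trip.
round_trip = Zlib::GzipReader.wrap(StringIO.new(gzip_bytes)) { |gzip| gzip.read }

puts gzip_bytes.bytesize < contents.bytesize
puts round_trip == contents
```

Writing gzip_bytes out with File.binwrite should give a file that standard gunzip accepts, same as the GzipWriter.open version.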

Reference

Thanks to Phil Nash’s post for highlighting this.