Image Optimization with libjpeg-turbo and Perl

This article was published in April of 2021 and last updated February 10th, 2022.

I had a number of JPEG images (with more coming in all the time) that I wanted to optimize. I had looked at the source for jpegoptim before and seemed to recall that it mostly used libjpeg to re-compress the image with reasonable settings. I saw that libjpeg-turbo was already in the OpenBSD ports tree and wondered whether I could do something similar by calling its djpeg and cjpeg utilities from Perl.

Installation

I went ahead and installed libjpeg-turbo. I was happy to see that it didn't have any dependencies. It doesn't sit well with me to pull in a world of dependencies to perform a seemingly ancillary function.

lucy# pkg_add jpeg
quirks-3.187 signed on 2020-05-19T14:41:48Z
jpeg-2.0.3v0: ok

My scripts run in a chroot jail, so I needed to copy djpeg and cjpeg into it. First, I used ldd to check which shared libraries they need. I was happy to see that (other than libc and ld.so) they depend only on their own libjpeg. I copied that shared library into the jail along with djpeg and cjpeg, then ran ldconfig inside the jail to rebuild the hints file for ld.so.

lucy# mkdir /var/www/usr/local/lib
lucy# cp /usr/local/lib/libjpeg.so.70.0 /var/www/usr/local/lib
lucy# mkdir /var/www/usr/local/bin
lucy# cp /usr/local/bin/?jpeg /var/www/usr/local/bin
lucy# chroot /var/www ldconfig /usr/local/lib
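
Since the scripts see the jail as their root, a quick check from Perl can confirm that the copied tools are where the code expects them. The following is a minimal sketch; the helper name missing_tools is hypothetical, and it only tests that each path is an executable file, not that its shared libraries resolve.

```perl
use strict;
use warnings;

# Hypothetical helper: return the paths that are not executable files.
# Paths are relative to the jail root when run inside the chroot.
sub missing_tools {
	my @paths = @_;
	return grep { !(-f $_ && -x _) } @paths;
}

my @missing = missing_tools('/usr/local/bin/djpeg', '/usr/local/bin/cjpeg');
warn "Missing in jail: @missing\n" if @missing;
```

Running this once at startup would turn a confusing failure deep inside a pipe open into an early, readable warning.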

Experimenting

I started out by doing some experiments on a small set of images I had which I thought might be representative. I tried compressing with the default options, with entropy encoding parameter optimization, and with progressive mode enabled. I timed the execution and checked the size of the output for each. I also used the -maxmemory option since I planned to use it in production and was curious to see if there would be any trouble with my test images.

Using default settings (including the default quality of 75) picked up an easy savings of about 48%.

lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m > default/$bn
> done
    0m23.36s real     0m00.11s user     0m01.21s system
lucy$ du -sh original default
33.6M	original
17.6M	default

Optimizing the entropy-encoding parameters saved only about another 5% over the defaults, but it also added very little time.

lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m -optimize > optimized/$bn
> done
    0m24.71s real     0m00.13s user     0m01.14s system
lucy$ du -sh default optimized
17.6M	default
16.8M	optimized

Using progressive mode saved only about 7% over the defaults and took a few seconds longer, but it should give a better experience for folks using a slow link.

lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m -progressive > progressive/$bn
> done
    0m30.94s real     0m00.17s user     0m01.03s system
lucy$ du -sh default progressive
17.6M	default
16.4M	progressive

Based on these results, I figured I may as well go all-out with both progressive mode and optimized parameters so I tested that too.

lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m -optimize -progressive > all-out/$bn
> done
    0m30.84s real     0m00.18s user     0m01.21s system
lucy$ du -sh default all-out
17.6M	default
16.4M	all-out
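
The savings in this section can be read straight off the du figures. As a worked check, this small sketch (the function name savings_percent is hypothetical) computes the relative savings for each step from the sizes reported above.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage saved going from $before to $after.
sub savings_percent {
	my ($before, $after) = @_;
	return sprintf('%.1f', ($before - $after) / $before * 100);
}

# Sizes in megabytes, as reported by du above.
printf "default vs original:    %s%%\n", savings_percent(33.6, 17.6);
printf "optimized vs default:   %s%%\n", savings_percent(17.6, 16.8);
printf "progressive vs default: %s%%\n", savings_percent(17.6, 16.4);
```

That works out to roughly 47.6%, 4.5%, and 6.8% respectively, so nearly all of the gain comes from simply re-compressing at the default quality.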

I performed an exhaustive qualitative perceptual study of the resulting images and found that they looked good.

Coding

I could have used system with standard shell metacharacters to build the pipeline, but that requires /bin/sh, which I didn't have in my chroot jail. It wouldn't be a big deal to bring it in, but I figured I'd instead open one pipe from djpeg and another to cjpeg and use a read/print loop to copy the data between them. This may provide desirable flexibility later anyway.

sub djpeg {
	my ($path) = @_;
	my @args = ('-maxmemory', '100m', '-pnm', $path);
	open(my $fh, '-|', '/usr/local/bin/djpeg', @args)
		or die "Could not open djpeg for $path: $!";
	return $fh;
}
sub cjpeg {
	my ($path) = @_;
	my @args = ('-maxmemory', '100m', '-optimize', '-progressive',
		'-outfile', $path);
	open(my $fh, '|-', '/usr/local/bin/cjpeg', @args)
		or die "Could not open cjpeg for $path: $!";
	return $fh;
}
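
The list form of open used above, with the command and its arguments given as separate items, hands each argument straight to the program without involving a shell. That is why it works without /bin/sh in the jail, and why odd characters in a path are harmless. A minimal sketch to illustrate, assuming /bin/echo exists on the system (it stands in for djpeg here) and using a hypothetical helper named read_from:

```perl
use strict;
use warnings;

# Run a command via list-form pipe open and return its output.
# With separate arguments no shell is involved, so metacharacters
# are passed to the program literally.
sub read_from {
	my ($cmd, @args) = @_;
	open(my $fh, '-|', $cmd, @args)
		or die "Could not run $cmd: $!";
	local $/;  # slurp all output at once
	my $output = <$fh>;
	close $fh or die "$cmd failed: status $?";
	return $output;
}

# A shell would treat ';', '|' and '$' specially; here they are just text.
print read_from('/bin/echo', 'a; b | c $HOME.jpeg');
```

The argument comes back exactly as written, followed by a newline, rather than being split into three commands or having $HOME expanded.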

There were a couple of things I wanted to think about when playing with pipes like this in Perl. One was that if cjpeg were to exit before I'd finished writing to it, I'd get SIGPIPE, which by default would kill my process. I made a local change to the signal handler to just emit a warning. Another was the exit status of djpeg and cjpeg. Happily, Perl's close waits for the child process and checks its exit status when closing a pipe like this, though it's necessary to distinguish whether close failed because of an error from the underlying system call (in which case $! is set) or because of a nonzero exit status (in which case $! is 0 and the status is in $?). For my application, I can fall back to the unoptimized image, so I just emit a warning and unlink the optimized output in the case of a nonzero exit status.

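sub optimize_jpeg {
	my ($in, $out) = @_;
	my $djpeg = djpeg($in);
	my $cjpeg = cjpeg($out);
	# Warn instead of dying if cjpeg exits before we finish writing.
	local $SIG{PIPE} = sub { warn "optimize caught sigpipe" };
	# Copy the decoded image from djpeg to cjpeg in 1 MiB chunks.
	while(read($djpeg, my $buffer, 1048576)) {
		print $cjpeg $buffer;
	}
	unless(close $djpeg) {
		# $! is 0 when the only problem was the child's exit status.
		die "Could not close djpeg: $!" if $!;
		# $? is a wait status: exit code in the high byte, signal in the low bits.
		warn "djpeg $in failed (exit code " . ($? >> 8) . ", signal " . ($? & 127) . ")";
		unlink $out;
	}
	unless(close $cjpeg) {
		die "Could not close cjpeg: $!" if $!;
		warn "cjpeg $out failed (exit code " . ($? >> 8) . ", signal " . ($? & 127) . ")";
		unlink $out;
	}
}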
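
When close fails on a pipe and $! is 0, the details are in $?, which holds the full wait status rather than a bare exit code: the exit code is in the high byte and the number of any terminating signal is in the low seven bits. Here is a minimal sketch of decoding it, using a hypothetical helper named describe_status and a throwaway child process (the running Perl itself, via $^X) in place of djpeg:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (as found in $?) into text.
sub describe_status {
	my ($status) = @_;
	my $signal = $status & 127;
	return "killed by signal $signal" if $signal;
	return "exited with code " . ($status >> 8);
}

# Stand-in for djpeg: a child that prints nothing and exits with code 3.
open(my $fh, '-|', $^X, '-e', 'exit 3')
	or die "Could not start child: $!";
my @lines = <$fh>;
unless (close $fh) {
	die "Could not close pipe: $!" if $!;
	print "child ", describe_status($?), "\n";  # raw $? is 768 here
}
```

This prints "child exited with code 3". Printing $? directly would have shown 768, which is easy to misread in a log.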

I put together a little convenience function that will take the name of a media file and return the name of an optimized media file if it can be found or made, using optimize_jpeg to optimize... jpegs.

sub optimized_media {
	my ($file) = @_;
	return $file unless $file =~ /\.jpeg$/;
	my $opt_file = "opt$file";
	return $opt_file if -e "$site_dir/$opt_file";
	optimize_jpeg("$site_dir/$file", "$site_dir/$opt_file");
	return $opt_file if -e "$site_dir/$opt_file";
	return $file;
}
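
The lookup logic has three outcomes: non-JPEG names pass through, an existing optimized file is reused, and a failed optimization falls back to the original. To exercise those paths without djpeg and cjpeg, here is a sketch of the same logic with the directory and the optimizer passed in as parameters. The name optimized_name and the code-reference parameter are hypothetical, introduced only so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Same logic as optimized_media, but with the site directory and the
# optimizer passed in (hypothetical parameters) so it can run standalone.
sub optimized_name {
	my ($dir, $file, $optimize) = @_;
	return $file unless $file =~ /\.jpeg$/;
	my $opt_file = "opt$file";
	return $opt_file if -e "$dir/$opt_file";
	$optimize->("$dir/$file", "$dir/$opt_file");
	return $opt_file if -e "$dir/$opt_file";
	return $file;
}

my $dir = tempdir(CLEANUP => 1);

# Fake optimizers standing in for optimize_jpeg.
my $works = sub { open(my $fh, '>', $_[1]) or die $!; close $fh };
my $fails = sub { return };

print optimized_name($dir, 'notes.txt', $works), "\n";  # not a JPEG
print optimized_name($dir, 'a.jpeg', $fails), "\n";     # falls back
print optimized_name($dir, 'b.jpeg', $works), "\n";     # newly made
print optimized_name($dir, 'b.jpeg', $fails), "\n";     # reused
```

The four calls print notes.txt, a.jpeg, optb.jpeg, and optb.jpeg. The last call never invokes its optimizer, because the file created by the third call is found first.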

Conclusion

I hope that you enjoyed this article and that it gave you some ideas about how you could do some image optimization in Perl without too much fuss. If this article is the kind of thing you're into, you may enjoy my other articles.

Aaron D. Parks
aparks@aftermath.net