I had a number of JPEG images (and more coming in all the time) that I wanted to optimize. I had happened to look at the source for jpegoptim before and I seemed to recall that it mostly used libjpeg to re-compress the image with reasonable settings. I saw that libjpeg-turbo was already in the OpenBSD ports tree and I wondered if I might be able to use its djpeg and cjpeg utilities to do something similar by calling them from Perl.
I went ahead and installed libjpeg-turbo. I was happy to see that it didn't have any dependencies. It doesn't sit well with me to pull in a world of dependencies to perform a seemingly ancillary function.
lucy# pkg_add jpeg
quirks-3.187 signed on 2020-05-19T14:41:48Z
jpeg-2.0.3v0: ok
My scripts run in a chroot jail, so I'd need to copy djpeg and cjpeg into it to use them. First, I used ldd to check which shared libraries they need. Other than libc and ld.so, they depend only on their own libjpeg. I copied that shared library into the jail along with djpeg and cjpeg, then ran ldconfig inside the jail to rebuild the hints file for ld.so.
lucy# mkdir /var/www/usr/local/lib
lucy# cp /usr/local/lib/libjpeg.so.70.0 /var/www/usr/local/lib
lucy# mkdir /var/www/usr/local/bin
lucy# cp /usr/local/bin/?jpeg /var/www/usr/local/bin
lucy# chroot /var/www ldconfig /usr/local/lib
I started out by doing some experiments on a small set of images I had which I thought might be representative. I tried compressing with the default options, with entropy encoding parameter optimization, and with progressive mode enabled. I timed the execution and checked the size of the output for each. I also used the -maxmemory option since I planned to use it in production and was curious to see if there would be any trouble with my test images.
Using default settings (including the default quality of 75) picked up an easy savings of about 48%.
lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m > default/$bn
> done
    0m23.36s real     0m00.11s user     0m01.21s system
lucy$ du -sh original default
33.6M   original
17.6M   default
Optimizing the entropy-encoding parameters saved only a little more (16.8M versus 17.6M), but it also took barely any longer.
lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m -optimize > optimized/$bn
> done
    0m24.71s real     0m00.13s user     0m01.14s system
lucy$ du -sh default optimized
17.6M   default
16.8M   optimized
Using progressive mode didn't save much either (16.4M versus 17.6M) and took a few seconds longer, but it should give a better experience for folks using a slow link.
lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m -progressive > progressive/$bn
> done
    0m30.94s real     0m00.17s user     0m01.03s system
lucy$ du -sh default progressive
17.6M   default
16.4M   progressive
Based on these results, I figured I may as well go all-out with both progressive mode and optimized parameters so I tested that too.
lucy$ time for f in original/*.jpeg ; do
> bn=`basename $f`
> djpeg -pnm -maxmemory 100m < $f |
> cjpeg -maxmemory 100m -optimize -progressive > all-out/$bn
> done
    0m30.84s real     0m00.18s user     0m01.21s system
lucy$ du -sh default all-out
17.6M   default
16.4M   all-out
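To put the four runs side by side, here is a small sketch that turns the du totals above into percentages saved relative to the original images. The sizes are copied from the transcripts; the percent_saved helper is a name made up for this illustration.

```perl
use strict;
use warnings;

# Percentage saved going from $before to $after (any consistent unit).
# Hypothetical helper, not part of the production script.
sub percent_saved {
    my ($before, $after) = @_;
    return 100 * ($before - $after) / $before;
}

# Directory totals in megabytes, copied from the du output above.
my $original = 33.6;
my @runs = (
    [ 'default',     17.6 ],
    [ 'optimized',   16.8 ],
    [ 'progressive', 16.4 ],
    [ 'all-out',     16.4 ],
);

for my $run (@runs) {
    my ($name, $size) = @$run;
    printf "%-12s %5.1f%% smaller than original\n",
        $name, percent_saved($original, $size);
}
```

Since du rounds its totals, the percentages are only approximate, but they show that most of the gain comes from simply re-compressing at the default quality.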
I performed an exhaustive qualitative perceptual study of the resulting images and found that they looked good.
I could have used system
with standard shell metacharacters
to build the pipeline, but this requires /bin/sh which I didn't have in my
chroot jail. It wouldn't be a big deal to bring it in, but I figured I'd
open separate pipes from djpeg and to cjpeg instead and use a read/print loop
to copy the data. This may provide desirable flexibility later anyway.
sub djpeg {
    my ($path) = @_;
    my @args = ('-maxmemory', '100m', '-pnm', $path);
    open(my $fh, '-|', '/usr/local/bin/djpeg', @args)
        or die "Could not open djpeg for $path: $!";
    return $fh;
}

sub cjpeg {
    my ($path) = @_;
    my @args = ('-maxmemory', '100m', '-optimize', '-progressive',
                '-outfile', $path);
    open(my $fh, '|-', '/usr/local/bin/cjpeg', @args)
        or die "Could not open cjpeg for $path: $!";
    return $fh;
}
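The reason these opens work without /bin/sh is that the list form of open (a mode, a program, then its arguments) runs the program directly instead of handing a command line to the shell. Here is a minimal sketch that shows this: it uses $^X (the running perl binary) as a stand-in for djpeg so that it can run anywhere, and slurp_command is a hypothetical helper name.

```perl
use strict;
use warnings;

# Run a command with the list form of open and return everything it prints.
# No shell is involved, so each argument reaches the program exactly as given.
sub slurp_command {
    my ($program, @args) = @_;
    open(my $fh, '-|', $program, @args)
        or die "Could not start $program: $!";
    local $/;                      # slurp mode: read all output at once
    my $output = <$fh>;
    close $fh or die "$program failed with wait status $?";
    return $output;
}

# An argument full of shell metacharacters. A shell would treat this as a
# pipeline with a redirect; here it is passed through as one plain string.
my $arg = 'in.jpeg | cjpeg > out.jpeg';

# The child perl just prints its first argument back to us.
my $echoed = slurp_command($^X, '-e', 'print $ARGV[0]', $arg);
print "child received: $echoed\n";
```

A side benefit is that file names containing spaces or metacharacters need no quoting, since nothing ever parses them as shell syntax.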
There were a couple of things I wanted to think about when playing with pipes
like this in Perl. One was that if cjpeg were to exit before I'd finished
writing to it, I'd get SIGPIPE which would cause my process to exit. I made a
local change to the signal handler to just emit a warning. Another thing I
wanted to pay attention to was the exit codes from djpeg and cjpeg. Happily,
Perl's close
will check the exit code on the child process when
closing a pipe like this, though it's necessary to check if close
failed because of an error from the underlying system call or due to a nonzero
exit code. For my application, I can fall back to the unoptimized image so I
just emit a warning and unlink the optimized output in the case of a nonzero
exit code.
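sub optimize_jpeg {
    my ($in, $out) = @_;
    my $djpeg = djpeg($in);
    my $cjpeg = cjpeg($out);
    # Warn instead of dying if cjpeg exits before we finish writing.
    local $SIG{PIPE} = sub { warn "optimize caught sigpipe" };
    # Copy the decoded image from djpeg to cjpeg, up to 1 MiB at a time.
    while (read($djpeg, my $buffer, 1048576)) {
        print $cjpeg $buffer;
    }
    # When close fails, $! is set for a system error; otherwise the child
    # exited nonzero and $? holds its wait status (exit code is $? >> 8).
    unless (close $djpeg) {
        die "Could not close djpeg: $!" if $!;
        warn "djpeg $in exited with code " . ($? >> 8);
        unlink $out;
    }
    unless (close $cjpeg) {
        die "Could not close cjpeg: $!" if $!;
        warn "cjpeg $out exited with code " . ($? >> 8);
        unlink $out;
    }
}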
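Both behaviours can be tried in isolation. The sketch below is an illustration rather than part of the production script: close_pipe and write_to_quitter are hypothetical names, and $^X (the running perl binary) stands in for djpeg and cjpeg. The first helper separates a failed close call from a child that exited nonzero or was killed by a signal; the second writes to a child that exits without reading and counts the SIGPIPEs that arrive.

```perl
use strict;
use warnings;

# Close a pipe handle and describe the outcome:
#   'ok'        the child exited with status 0
#   'exit N'    the child exited with nonzero code N
#   'signal N'  the child was killed by signal N
# A failure of the close call itself (nonzero $!) is fatal.
sub close_pipe {
    my ($fh) = @_;
    return 'ok' if close $fh;
    die "close failed: $!" if $!;
    return 'signal ' . ($? & 127) if $? & 127;
    return 'exit ' . ($? >> 8);
}

# Write to a child that exits without reading its input and return how
# many times SIGPIPE was caught. Without the local handler, the default
# action for SIGPIPE would terminate this process.
sub write_to_quitter {
    my $caught = 0;
    local $SIG{PIPE} = sub { $caught++ };
    open(my $writer, '|-', $^X, '-e', 'exit 0')
        or die "Could not start child: $!";
    my $chunk = 'x' x 65536;
    # Keep writing until a write fails; bounded so the loop always ends.
    for (1 .. 1024) {
        last unless print {$writer} $chunk;
    }
    close $writer;    # result ignored: the write failure is already known
    return $caught;
}

open(my $reader, '-|', $^X, '-e', 'exit 3')
    or die "Could not start child: $!";
print "reader: ", close_pipe($reader), "\n";
print "SIGPIPE caught: ", (write_to_quitter() ? "yes" : "no"), "\n";
```

The second helper stops at the first print that returns false. The copy loop in optimize_jpeg does not check print, so after a broken pipe it would keep reading from djpeg until its output is exhausted, which is harmless here but worth knowing.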
I put together a little convenience function that will take the name of a media file and return the name of an optimized media file if it can be found or made, using optimize_jpeg to optimize... jpegs.
sub optimized_media {
    my ($file) = @_;
    return $file unless $file =~ /\.jpeg$/;
    my $opt_file = "opt$file";
    return $opt_file if -e "$site_dir/$opt_file";
    optimize_jpeg("$site_dir/$file", "$site_dir/$opt_file");
    return $opt_file if -e "$site_dir/$opt_file";
    return $file;
}
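To see the three possible outcomes (not a JPEG, optimization failed, optimized copy available) without real images, here is a self-contained sketch of the same lookup logic. It is an illustration under assumptions: pick_media, fake_success and fake_failure are hypothetical names, and the site directory and the optimizer are passed in as parameters rather than taken from $site_dir and optimize_jpeg.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Same decision logic as optimized_media, with the directory and the
# optimizer supplied by the caller so the example can run on its own.
sub pick_media {
    my ($site_dir, $file, $optimize) = @_;
    return $file unless $file =~ /\.jpeg$/;
    my $opt_file = "opt$file";
    return $opt_file if -e "$site_dir/$opt_file";
    $optimize->("$site_dir/$file", "$site_dir/$opt_file");
    return -e "$site_dir/$opt_file" ? $opt_file : $file;
}

# Stand-in optimizers: one creates the output file, one leaves nothing.
sub fake_success {
    my ($in, $out) = @_;
    open(my $fh, '>', $out) or die "Could not create $out: $!";
    close $fh or die "Could not close $out: $!";
}
sub fake_failure { return }

my $dir = tempdir(CLEANUP => 1);
print pick_media($dir, 'notes.txt', \&fake_failure), "\n"; # not a JPEG
print pick_media($dir, 'cat.jpeg',  \&fake_failure), "\n"; # falls back
print pick_media($dir, 'cat.jpeg',  \&fake_success), "\n"; # newly optimized
print pick_media($dir, 'cat.jpeg',  \&fake_failure), "\n"; # existing copy reused
```

The last call never runs its optimizer, because the optimized file from the previous call already exists.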
I hope that you enjoyed this article and that it gave you some ideas about how you could do some image optimization in Perl without too much fuss. If this article is the kind of thing you're into, you may enjoy my other articles.
Aaron D. Parks