Simple JPEG image I/O with libjpeg
20 Mar 2020 - tsp
Last update 23 Dec 2021
11 mins
Back when I started experimenting with computer vision algorithms I've
always had the major problem of having to use either proprietary toolboxes (back
then when I started playing around with computer vision at university we
used MatLab with the highly flexible image processing toolbox) or some
ugly hacks (like a half baked JPEG decoder implementation written
by myself that unfortunately was never really finished). After some
time I came to the decision that using libjpeg
or libjpeg-turbo would be a good idea
since it provides a solid implementation of JPEG and a rather
simple programming interface. But the documentation was hard to read
for me and I felt that I just missed an easy example of how to use libjpeg
for accessing JPEG images. The same problem arose later when I tried
to process images captured via a Video4Linux device (i.e. a webcam) and
the RaspberryPi camera. So I decided to write this really short introduction - and
provide a basic method that just reads a JPEG file into a bitmap buffer that
one can simply copy and paste into existing projects without having any
other dependencies than libjpeg.
Update 2021: I've also written a short summary on how to capture frames
using webcams from C when using the Video4Linux
API on FreeBSD or Linux, which is also rather simple.
Data structure used in this example
To do experimentation in the field of computer vision it's often simple
and feasible to keep a whole uncompressed bitmap of the source images
in main memory. This of course assumes that either enough memory is present
or one wants to rely on swapping and/or memory mapping large regions
of data into memory. Note that this approach is really nice when doing
experimentation and is also feasible for some real world tasks (like
image classification) but might be problematic when one has to deal with
high resolution images without downsampling or wants to keep a huge number
of images in main memory (for example when doing reconstruction in radio
astronomy, etc.). In this case one should start thinking about resource
management before implementing anything though.
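To get a feeling for the numbers (a back of the envelope sketch, not part of the library code - the helper name is made up for illustration), a single uncompressed 4K RGB888 frame already occupies about 24 MB:

```c
/* Bytes needed for an uncompressed RGB888 bitmap (3 bytes per pixel) */
static unsigned long int rgbFootprintBytes(
	unsigned long int width,
	unsigned long int height
) {
	return width * height * 3UL;
}
```

For example rgbFootprintBytes(3840, 2160) yields 24883200 bytes, i.e. roughly 23.7 MiB per frame - keeping thousands of such frames resident clearly calls for resource management.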
Keeping the image inside main memory as a continuous block also allows
easy transfer to OpenCL or CUDA processing pipelines.
To store the image for easy accessing the following datastructure will be
defined:
struct imgRawImage {
	unsigned int numComponents;
	unsigned long int width, height;

	unsigned char* lpData;
};
- numComponents specifies the number (but not the type) of components.
Usually this is 3 components for RGB and YCbCr, 4 for ARGB
and AYCbCr and 1 for greyscale color space. Note that these functions
do not store any information about the used color space - this is of
course a drawback in real world scenarios but the extension is pretty simple.
- width and height specify the number of pixels per scanline
and the number of scanlines (i.e. width and height of the image).
- lpData points to the raw data. For this library it's assumed to point to
RGB888, YCbCr888 or greyscale8 data (i.e. 3 bytes for 3 components, 1 byte
for 1 component). Data is normalized into the range 0-255.
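Since the buffer is tightly packed, addressing a single pixel just requires computing its byte offset. As a minimal sketch (the helper name imgPixelOffset is made up for illustration; the structure definition is repeated so the snippet stands alone):

```c
struct imgRawImage {
	unsigned int numComponents;
	unsigned long int width, height;

	unsigned char* lpData;
};

/* Byte offset of pixel (x, y): scanlines are stored one after
   another, each pixel occupying numComponents bytes */
static unsigned long int imgPixelOffset(
	struct imgRawImage* lpImage,
	unsigned long int x,
	unsigned long int y
) {
	return (y * lpImage->width + x) * lpImage->numComponents;
}
```

For an RGB image, lpData[imgPixelOffset(lpImage, x, y) + 1] would then be the green channel of pixel (x, y).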
How to load a JPEG using libjpeg
The basic process is rather simple as one can also see from the code
given below:
- Create a JPEG decompressor using standard error handling methods
- Set a libc FILE reference as source when reading from disk
- Read the image header
- Start the decompressor
- Allocate the required buffer
- Read the JPEG file scanline by scanline into the target buffer
- Release associated resources
Note that the following code does not perform proper error handling. This
has been left out for readability of the code. Error handling has to be
implemented in any real life scenario that's going to be used for more than
a simple experiment. Crashing a program is no error handling (except when
developing with a framework like Erlang/OTP of course)!
The code
#include <jpeglib.h>
#include <jerror.h>

struct imgRawImage* loadJpegImageFile(char* lpFilename) {
	struct jpeg_decompress_struct info;
	struct jpeg_error_mgr err;

	struct imgRawImage* lpNewImage;

	unsigned long int imgWidth, imgHeight;
	int numComponents;

	unsigned long int dwBufferBytes;
	unsigned char* lpData;

	unsigned char* lpRowBuffer[1];

	FILE* fHandle;

	fHandle = fopen(lpFilename, "rb");
	if(fHandle == NULL) {
		#ifdef DEBUG
			fprintf(stderr, "%s:%u: Failed to read file %s\n", __FILE__, __LINE__, lpFilename);
		#endif
		return NULL; /* ToDo */
	}

	info.err = jpeg_std_error(&err);
	jpeg_create_decompress(&info);

	jpeg_stdio_src(&info, fHandle);
	jpeg_read_header(&info, TRUE);

	jpeg_start_decompress(&info);
	imgWidth = info.output_width;
	imgHeight = info.output_height;
	numComponents = info.num_components;

	#ifdef DEBUG
		fprintf(
			stderr,
			"%s:%u: Reading JPEG with dimensions %lu x %lu and %d components\n",
			__FILE__, __LINE__,
			imgWidth, imgHeight, numComponents
		);
	#endif

	dwBufferBytes = imgWidth * imgHeight * 3; /* We only read RGB, not A */
	lpData = (unsigned char*)malloc(sizeof(unsigned char)*dwBufferBytes);

	lpNewImage = (struct imgRawImage*)malloc(sizeof(struct imgRawImage));
	lpNewImage->numComponents = numComponents;
	lpNewImage->width = imgWidth;
	lpNewImage->height = imgHeight;
	lpNewImage->lpData = lpData;

	/* Read scanline by scanline */
	while(info.output_scanline < info.output_height) {
		lpRowBuffer[0] = (unsigned char *)(&lpData[3*info.output_width*info.output_scanline]);
		jpeg_read_scanlines(&info, lpRowBuffer, 1);
	}

	jpeg_finish_decompress(&info);
	jpeg_destroy_decompress(&info);
	fclose(fHandle);

	return lpNewImage;
}
Walkthrough and explanation
First the decompressor is created using the jpeg_create_decompress
function. This function requires a set of error handling routines. In the
most simple case one can use the default ones provided by libjpeg. The
error manager state structure struct jpeg_error_mgr can be initialized
by jpeg_std_error. This function also returns a reference to the newly
initialized structure. (Since a student of mine made that mistake a number of
times: note that this error manager is declared as a local variable in the
example above - when modularizing further one should take care that this
error manager stays valid till the decoder is released!)
After that the decompressor has its data supplied from one of the sources. In
the example above the source is set to a libc FILE reference to
read out of a file located on disk. This is done using jpeg_stdio_src.
Another way would be reading from a memory location using jpeg_mem_src
after a JPEG has been received via any other means (camera device for MJPEG
camera streams, network without caching, using memory mapping for file access,
etc.)
Then the decompressor is initialized and the header is read (jpeg_read_header
followed by jpeg_start_decompress). Note that both functions might fail
so proper error handling is required.
Then the buffer is allocated. In this example two different data buffers are
used - one might also use a flexible array member for that. I've implemented
it that way to allow easy handling (including releasing, re-allocating, etc.)
of a raw data array independent of any metadata. This sample code of course
also lacks error handling (malloc returns NULL in hard out of memory
conditions if no out-of-memory killer is configured or in case resource
limits are reached).
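A minimal example of the kind of checking that belongs there - including a guard against integer overflow when multiplying the dimensions (the helper name allocRgbBuffer is made up for illustration):

```c
#include <limits.h>
#include <stdlib.h>

/* Allocate a 3 byte per pixel buffer, failing cleanly instead of
   silently overflowing the size computation for huge dimensions */
static unsigned char* allocRgbBuffer(
	unsigned long int width,
	unsigned long int height
) {
	if((width == 0) || (height == 0)) { return NULL; }
	if(width > (ULONG_MAX / 3UL) / height) { return NULL; } /* width*height*3 would overflow */
	return (unsigned char*)malloc(width * height * 3UL);
}
```

The caller then only has to handle a single NULL result instead of undefined behavior from a wrapped-around allocation size.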
After the buffer has been allocated the code reads the image scanline
by scanline using jpeg_read_scanlines. One could also read multiple
scanlines at a time but since it might be desired to process them while
streaming this example has been implemented that way. One could of course
substitute the whole loop

	while(info.output_scanline < info.output_height) {
		lpRowBuffer[0] = (unsigned char *)(&lpData[3*info.output_width*info.output_scanline]);
		jpeg_read_scanlines(&info, lpRowBuffer, 1);
	}

by a bulk read. Note that jpeg_read_scanlines expects one row pointer per
scanline to be read and may deliver fewer lines than requested, so an array
of row pointers is prepared and the call still runs in a loop:

	JSAMPROW* lpRows = (JSAMPROW*)malloc(sizeof(JSAMPROW) * info.output_height);
	unsigned long int row;
	for(row = 0; row < info.output_height; row = row + 1) {
		lpRows[row] = &lpData[3*info.output_width*row];
	}
	while(info.output_scanline < info.output_height) {
		jpeg_read_scanlines(&info, &lpRows[info.output_scanline], info.output_height - info.output_scanline);
	}
	free(lpRows);

Note that these functions might also fail - add error handling again.
At the end the decompressor is finalized using jpeg_finish_decompress
and then released using jpeg_destroy_decompress. Again please
take care that jpeg_finish_decompress might indicate an error in which
case one might not want to use the already read data.
The other way: Storing raw images into JPEG
This works similar to the example above:
- Create a compressor
- Set its target either to a file or memory sink
- Provide it metadata (image width and height, number of components
and used colorspace - RGB in this example)
- Set compression parameters (to default in this case) and set a quality
setting for the compressor
- Initialize the compressor's internal state machine
- Write one scanline after the other (or a number of them in bulk)
- Finish the compressor
- Release resources
In this section no walkthrough will be provided since the idea is the same
as for the decompressor described above (except the direction of data
transfer).
The code
#include <jpeglib.h>
#include <jerror.h>

int storeJpegImageFile(struct imgRawImage* lpImage, char* lpFilename) {
	struct jpeg_compress_struct info;
	struct jpeg_error_mgr err;

	unsigned char* lpRowBuffer[1];

	FILE* fHandle;

	fHandle = fopen(lpFilename, "wb");
	if(fHandle == NULL) {
		#ifdef DEBUG
			fprintf(stderr, "%s:%u Failed to open output file %s\n", __FILE__, __LINE__, lpFilename);
		#endif
		return 1;
	}

	info.err = jpeg_std_error(&err);
	jpeg_create_compress(&info);

	jpeg_stdio_dest(&info, fHandle);

	info.image_width = lpImage->width;
	info.image_height = lpImage->height;
	info.input_components = 3;
	info.in_color_space = JCS_RGB;

	jpeg_set_defaults(&info);
	jpeg_set_quality(&info, 100, TRUE);

	jpeg_start_compress(&info, TRUE);

	/* Write every scanline ... */
	while(info.next_scanline < info.image_height) {
		lpRowBuffer[0] = &(lpImage->lpData[info.next_scanline * (lpImage->width * 3)]);
		jpeg_write_scanlines(&info, lpRowBuffer, 1);
	}

	jpeg_finish_compress(&info);
	fclose(fHandle);

	jpeg_destroy_compress(&info);
	return 0;
}
The color space provided has to match the number of components and the
supplied data. Normally this is one of:
- JCS_RGB with 3 components (red, green and blue channels; each 1 byte
in size)
- JCS_GRAYSCALE with 1 component (only the illumination channel) with 1
byte per pixel.
A simple experiment (implementing a grayscale filter)
Just as an example of how one might implement a simple grayscale filter. In this
case it's assumed that the image should keep 3 color channels, all representing
the same intensity value as the greyscale channel. This has the advantage that
other processing functions do not have to discriminate by the number of
components they are using. The disadvantage is three times as much memory
usage.
So how does greyscale conversion work? Basically the intensity value of
each channel is mapped to a given contribution to the overall intensity value.
This is done to reflect the sensitivity of the eye for different colors.
There is a huge number of different greyscale schemes, all differing
somewhat in detail (the magnitudes of the coefficients are similar).
In the most naive way one might:
- Allocate an output buffer in case one doesn't want to replace the input
data. For a grayscale filter one could also replace each pixel with its
greyscale value in place, since each output pixel only depends on the
corresponding input pixel. This would not be possible when calculating an
integral image or applying a kernel function like a Gaussian blur.
- Simply use the factors 0.299, 0.587 and 0.114
that are supplied in the models used by PAL and NTSC (other values
are nicely summarized at Wikipedia).
Note that the values supplied are not perceptually luminance preserving.
enum imageLibraryError filterGrayscale(
	struct imgRawImage* lpInput,
	struct imgRawImage** lpOutput
) {
	unsigned long int i;
	struct imgRawImage* lpTarget;

	if(lpOutput == NULL) {
		lpTarget = lpInput; /* We will replace our input data in place ... */
	} else {
		(*lpOutput) = (struct imgRawImage*)malloc(sizeof(struct imgRawImage));
		(*lpOutput)->width = lpInput->width;
		(*lpOutput)->height = lpInput->height;
		(*lpOutput)->numComponents = lpInput->numComponents;
		(*lpOutput)->lpData = (unsigned char*)malloc(sizeof(unsigned char) * lpInput->width*lpInput->height*3);
		lpTarget = (*lpOutput);
	}

	for(i = 0; i < lpInput->width*lpInput->height; i=i+1) {
		/* Do a grayscale transformation */
		unsigned char luma = (unsigned char)(
			0.299f * (float)lpInput->lpData[i * 3 + 0]
			+ 0.587f * (float)lpInput->lpData[i * 3 + 1]
			+ 0.114f * (float)lpInput->lpData[i * 3 + 2]
		);
		lpTarget->lpData[i * 3 + 0] = luma;
		lpTarget->lpData[i * 3 + 1] = luma;
		lpTarget->lpData[i * 3 + 2] = luma;
	}

	return imageLibE_Ok;
}
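To see the coefficients in action, here is a stripped down, self-contained version of the conversion loop (the structure definition is repeated and error handling omitted so the snippet compiles on its own):

```c
#include <stdlib.h>

struct imgRawImage {
	unsigned int numComponents;
	unsigned long int width, height;

	unsigned char* lpData;
};

/* Out-of-place grayscale conversion, reduced to the essential loop */
static struct imgRawImage* grayscaleCopy(struct imgRawImage* lpInput) {
	unsigned long int i;
	struct imgRawImage* lpOut;

	lpOut = (struct imgRawImage*)malloc(sizeof(struct imgRawImage));
	lpOut->width = lpInput->width;
	lpOut->height = lpInput->height;
	lpOut->numComponents = lpInput->numComponents;
	lpOut->lpData = (unsigned char*)malloc(lpInput->width * lpInput->height * 3);

	for(i = 0; i < lpInput->width * lpInput->height; i = i + 1) {
		unsigned char luma = (unsigned char)(
			0.299f * (float)lpInput->lpData[i * 3 + 0]
			+ 0.587f * (float)lpInput->lpData[i * 3 + 1]
			+ 0.114f * (float)lpInput->lpData[i * 3 + 2]
		);
		lpOut->lpData[i * 3 + 0] = luma;
		lpOut->lpData[i * 3 + 1] = luma;
		lpOut->lpData[i * 3 + 2] = luma;
	}
	return lpOut;
}
```

A pure red input pixel (255, 0, 0) for example maps to intensity 76 on all three channels (0.299 * 255, truncated).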
Some other things to implement when doing first experiments
Some of the most useful filters and modules I've implemented during my
early experiments with computer vision have been:
- Integral image computation. Integral images are a great tool for simple
weak classifier cascades like the famous Viola-Jones object detection framework
that allows pretty easy object detection - like fast face detectors.
- Gaussian blur. This is a filter that's pretty useful during downsampling
and downsampling is a requirement to build ...
- ... Difference of Gaussian (DoG) pyramids. These are great pyramids to
detect local blobs or corners.
- The SIFT keypoint detector and descriptor. Although being patented it's one
of the most powerful ones.
- The BRISK keypoint detector and descriptor. This is a really good alternative
to SIFT (but it includes some training to build a simple ID3 tree to fit
the detector to the desired application domain)
- Global image descriptors like HOG might be nice when dealing with larger
data sets.
This article is tagged: Programming, Data Mining, Computer Vision