Digital images are represented as rectangular arrays of square
pixels.
Digital images use a left-hand coordinate system, with the origin in
the upper left corner, the x-axis running to the right, and the y-axis
running down. Some learners may prefer to think in terms of counting
down rows for the y-axis and across columns for the x-axis. Thus, we
will make an effort to allow for both approaches in our lesson
presentation.
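
As a minimal sketch of the two ways of thinking about coordinates, the NumPy snippet below builds a small synthetic image (the size and values are made up for illustration) and shows that a pixel at position (x, y) is addressed as image[y, x], that is, row first, then column.

```python
import numpy as np

# A hypothetical 4-row by 6-column grayscale image, all black.
image = np.zeros((4, 6), dtype=np.uint8)

# Pixel at x = 5 (column) and y = 1 (row): NumPy indexes as [row, column].
x, y = 5, 1
image[y, x] = 255

# The origin (x = 0, y = 0) is the upper left corner of the image.
origin_value = image[0, 0]

# The shape lists the number of rows (height) first, then columns (width).
rows, columns = image.shape
```
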
Most frequently, digital images use an additive RGB model, with
eight bits for each of the red, green, and blue channels.
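
The arithmetic behind this model can be sketched in a few lines of plain Python. The variable names are illustrative only; the snippet assumes eight bits per channel and three channels.

```python
bits_per_channel = 8

# Each channel can take 2**8 = 256 values, from 0 to 255.
levels_per_channel = 2 ** bits_per_channel

# Three independent channels give 256**3 possible colours.
total_colours = levels_per_channel ** 3

# Additive mixing: full red plus full green gives yellow.
red = (255, 0, 0)
green = (0, 255, 0)
yellow = tuple(r + g for r, g in zip(red, green))
```
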
scikit-image images are stored as multi-dimensional NumPy
arrays.
In scikit-image images, the red channel is specified first, then the
green, then the blue, i.e., RGB.
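
A small sketch of what this array layout looks like, using a synthetic array in place of a loaded image (the pixel values are made up): a colour image has shape (rows, columns, channels), and index 0, 1, and 2 along the last axis select red, green, and blue.

```python
import numpy as np

# Hypothetical 2 x 3 colour image: shape is (rows, columns, channels).
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0, :] = [255, 0, 0]   # a pure red pixel
image[0, 1, :] = [0, 0, 255]   # a pure blue pixel

# Channel order along the last axis is red, green, blue.
red_channel = image[:, :, 0]
green_channel = image[:, :, 1]
blue_channel = image[:, :, 2]
```
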
Lossless compression retains all the details in an image, but lossy
compression results in loss of some of the original image detail.
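
The difference can be illustrated with a rough sketch. It is not how any particular image format works: zlib stands in for a lossless compressor, and discarding the low four bits of each pixel stands in for a lossy step. Both choices are assumptions made only for this illustration.

```python
import zlib

import numpy as np

# Synthetic 8-bit gradient image: 16 rows, each running from 0 to 255.
pixels = np.tile(np.arange(256, dtype=np.uint8), (16, 1))
raw = pixels.tobytes()

# Lossless: decompressing returns exactly the original pixel values.
compressed = zlib.compress(raw)
restored = np.frombuffer(zlib.decompress(compressed), dtype=np.uint8)
restored = restored.reshape(pixels.shape)
lossless_ok = np.array_equal(restored, pixels)

# Lossy stand-in: round every pixel down to a multiple of 16.
# The discarded detail cannot be recovered afterwards.
quantised = (pixels // 16) * 16
max_error = int(np.abs(pixels.astype(int) - quantised.astype(int)).max())
lossy_compressed = zlib.compress(quantised.tobytes())
```

In this sketch the quantised data compresses to fewer bytes, at the cost of an error of up to 15 intensity levels per pixel.
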
BMP images are uncompressed, meaning they have high quality but also
that their file sizes are large.
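
To get a feel for the sizes involved, the following back-of-the-envelope calculation estimates the pixel data of an uncompressed 24-bit image. The dimensions are a hypothetical example, and the estimate ignores the file header and any row padding.

```python
# Hypothetical image dimensions in pixels.
width, height = 1920, 1080

# One byte each for red, green, and blue.
bytes_per_pixel = 3

pixel_data_bytes = width * height * bytes_per_pixel
pixel_data_mib = pixel_data_bytes / 2 ** 20   # a little under 6 MiB
```
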
JPEG images use lossy compression, meaning that their file sizes are
smaller, but image quality may suffer.
TIFF images can be uncompressed or compressed with lossy or lossless
compression.
Depending on the camera or sensor, various useful pieces of
information may be stored in an image file, in the image metadata.
Images are read from disk with the iio.imread()
function.
We display an image in a window that scales it automatically by
creating a figure with Matplotlib and calling imshow() on
it.
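
One way this can look in practice is sketched below, using a synthetic image and the plt.subplots() style of creating a figure. That style is an assumption; calling plt.imshow() directly works in a similar way.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic image: left half black, right half blue.
image = np.zeros((40, 60, 3), dtype=np.uint8)
image[:, 30:, 2] = 255

# Create a figure with one set of axes and draw the image on it.
fig, ax = plt.subplots()
ax.imshow(image)

# In a script, call plt.show() afterwards to open the window;
# notebooks usually display the figure automatically.
```
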
Colour images can be transformed to grayscale using
ski.color.rgb2gray() or, in many cases, be read as
grayscale directly by passing the argument mode="L" to
iio.imread().
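
A short sketch of the conversion with ski.color.rgb2gray(), using a tiny synthetic image with made-up values. Note that the result is a two-dimensional array of floating point values between 0 and 1, rather than 8-bit integers.

```python
import numpy as np
import skimage as ski
import skimage.color  # explicit submodule import, in case lazy loading is unavailable

# Synthetic 2 x 2 colour image: white, red, and two black pixels.
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[0, 0] = [255, 255, 255]
image[0, 1] = [255, 0, 0]

# Convert to a single-channel grayscale image.
gray = ski.color.rgb2gray(image)
```
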
We can resize images with the ski.transform.resize()
function.
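
As a sketch, the snippet below halves the height and width of a synthetic image. The target shape is an arbitrary choice for illustration. With default settings the resized result holds floating point values between 0 and 1, so it is converted back to 8-bit integers at the end.

```python
import numpy as np
import skimage as ski
import skimage.transform  # explicit submodule imports, in case lazy
import skimage.util       # loading is unavailable

# Synthetic image: left half black, right half white.
image = np.zeros((40, 60, 3), dtype=np.uint8)
image[:, 30:, :] = 255

# Keep the number of channels, halve the rows and columns.
new_shape = (image.shape[0] // 2, image.shape[1] // 2, image.shape[2])
small = ski.transform.resize(image, output_shape=new_shape)

# Convert the float result back to the 0-255 integer range.
small_uint8 = ski.util.img_as_ubyte(small)
```
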
NumPy array commands, such as
image[image < 128] = 0, can be used to manipulate the
pixels of an image.
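
The effect of such a command is easiest to see on a tiny array. The values below are made up; the expression image < 128 produces a boolean array, and the assignment sets every pixel where it is True to zero.

```python
import numpy as np

# Hypothetical 2 x 3 grayscale image.
image = np.array([[0, 100, 127],
                  [128, 200, 255]], dtype=np.uint8)

# Set every pixel darker than 128 to black; brighter pixels are unchanged.
image[image < 128] = 0
```
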
Array slicing can be used to extract sub-images or modify areas of
images, e.g., clip = image[60:150, 135:480, :].
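
A brief sketch of both uses, on a synthetic black image large enough for the slice above. The first index range selects rows (y), the second selects columns (x), and the final colon keeps all colour channels. The call to copy() is an addition here, so that the extracted sub-image is not affected by the later modification.

```python
import numpy as np

# Synthetic black image, 200 rows by 500 columns.
image = np.zeros((200, 500, 3), dtype=np.uint8)

# Extract rows 60-149 and columns 135-479 as an independent sub-image.
clip = image[60:150, 135:480, :].copy()

# Modify the same area in the original image: paint it green.
image[60:150, 135:480, :] = [0, 255, 0]
```
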
Metadata is not retained when images are loaded as NumPy arrays
using iio.imread().
Thresholding produces a binary image, where all pixels with
intensities above (or below) a threshold value are turned on, while all
other pixels are turned off.
The binary images produced by thresholding are held in
two-dimensional NumPy arrays, since they have only one colour value
channel. They are boolean, so they contain the values False (off) and
True (on), which behave as 0 and 1 in arithmetic.
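
A minimal sketch of thresholding with a NumPy comparison, on a tiny grayscale array with made-up values between 0 and 1. The threshold value t is chosen arbitrarily for illustration.

```python
import numpy as np

# Hypothetical 2 x 2 grayscale image with float intensities.
gray = np.array([[0.1, 0.4],
                 [0.6, 0.9]])

t = 0.5  # arbitrary threshold for this example

# Turn on pixels brighter than the threshold ...
binary_above = gray > t
# ... or, alternatively, pixels darker than the threshold.
binary_below = gray < t

# Counting the True values gives the number of pixels turned on.
pixels_on = int(binary_above.sum())
```
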
Thresholding can be used to create masks that select only the
interesting parts of an image, or as the first step before edge
detection or finding contours.
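
Using a thresholded image as a mask can be sketched as follows. The image is synthetic, and a plain average of the three channels stands in for a proper grayscale conversion; both are assumptions made to keep the example short.

```python
import numpy as np

# Synthetic 2 x 2 colour image with two bright and two dark pixels.
image = np.array([[[200, 200, 200], [10, 10, 10]],
                  [[30, 30, 30], [250, 250, 250]]], dtype=np.uint8)

# Simple channel average as a grayscale stand-in.
gray = image.mean(axis=2)

# Threshold: the mask is True for the bright (interesting) pixels.
mask = gray > 128

# Keep the masked pixels and set everything else to black.
selection = image.copy()
selection[~mask] = 0
```
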