- OpenCV By Example
- Prateek Joshi David Millán Escrivá Vinícius Godoy
- 2025-02-20 19:20:49
Images and matrices
The most important structure in Computer Vision is, without any doubt, the image. An image in Computer Vision is a representation of the physical world captured with a digital device. This picture is only a sequence of numbers stored in a matrix format, as shown in the following image. Each number is a measurement of the light intensity for the considered wavelength (for example, red, green, or blue in color images) or for a wavelength range (for panchromatic devices). Each point in an image is called a pixel (short for picture element), and each pixel can store one or more values depending on the type of image: a black-and-white image (also called a binary image) stores only one value, such as 0 or 1; a grayscale image also stores only one value; and a color image stores three values. These values are usually integers between 0 and 255, but other ranges can be used, for example, floating-point numbers from 0 to 1, as in HDRI (High Dynamic Range Imaging) or thermal images.
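As a small illustration of the two value ranges mentioned above, the following sketch converts 8-bit intensities (0 to 255) into floating-point values between 0 and 1. It uses only the C++ standard library; `normalizeToUnitRange` is a hypothetical helper name chosen for this example and is not an OpenCV function.

```cpp
#include <cstdint>
#include <vector>

// Convert 8-bit intensities (0..255) to floating-point values in [0, 1].
// normalizeToUnitRange is a hypothetical helper name, not an OpenCV function.
std::vector<float> normalizeToUnitRange(const std::vector<std::uint8_t>& pixels) {
    std::vector<float> out;
    out.reserve(pixels.size());
    for (std::uint8_t p : pixels) {
        out.push_back(static_cast<float>(p) / 255.0f);
    }
    return out;
}
```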

The image is stored in a matrix format, where each pixel has a position and can be referenced by its row and column numbers. OpenCV uses the Mat class for this purpose. In the case of a grayscale image, a single matrix is used, as shown in the following figure:
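To make row and column addressing concrete, here is a minimal sketch of a single-channel image held in one buffer. The `GrayImage` type is a hypothetical stand-in written for this example; it is not the Mat class and only mimics the idea of referencing a pixel by its row and column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a single-channel image; this is not cv::Mat.
struct GrayImage {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::uint8_t> data;  // rows * cols intensities, all zero at start

    GrayImage(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(r * c, 0) {}

    // Reference a pixel by its row and column, as in a matrix.
    std::uint8_t& at(std::size_t row, std::size_t col) {
        assert(row < rows && col < cols);
        return data[row * cols + col];
    }
};
```

For example, after `GrayImage img(2, 3); img.at(1, 2) = 200;` the pixel in the last row and last column holds the intensity 200.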

In the case of a color image, as shown in the following image, we use a matrix of size width x height x number of colors:

The Mat class is used not only to store images but also to store arbitrarily sized matrices of different types. You can use it as an algebraic matrix and perform operations with it. In the next section, we are going to describe the most important matrix operations, such as addition, matrix multiplication, creating a diagonal matrix, and so on.
However, before that, it's important to know how the matrix is stored internally in the computer memory, because accessing the memory directly is often more efficient than accessing each pixel through the OpenCV functions.
In memory, the matrix is saved as an array or sequence of values ordered by rows and columns: the rows are stored one after another, and within each row the channel values of each pixel are stored together. The following table shows the sequence of pixels in the BGR image format:

With this order, we can compute the position of any value in the array, as shown in the following formula:
Value = Row_i*num_cols*num_channels + Col_i*num_channels + channel_i
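The index computation can be sketched in a few lines. This example assumes a densely packed buffer (no padding between rows) in which each column advances by num_channels values, so the offset is row * num_cols * num_channels + col * num_channels + channel. The names `pixelOffset` and `readChannel` are hypothetical helpers for illustration, and a plain `std::vector` stands in for the image data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Offset of one channel value inside a densely packed, row-major,
// channel-interleaved buffer (assumption: no padding between rows).
std::size_t pixelOffset(std::size_t row, std::size_t col, std::size_t channel,
                        std::size_t numCols, std::size_t numChannels) {
    assert(col < numCols && channel < numChannels);
    return row * numCols * numChannels + col * numChannels + channel;
}

// Read the B, G, or R value (channel 0, 1, or 2) of a pixel in a BGR buffer.
// std::vector::at throws if the computed offset is outside the buffer.
std::uint8_t readChannel(const std::vector<std::uint8_t>& bgr,
                         std::size_t row, std::size_t col, std::size_t channel,
                         std::size_t numCols) {
    return bgr.at(pixelOffset(row, col, channel, numCols, 3));
}
```

For an image that is 4 columns wide with 3 channels, the green value (channel 1) of the pixel at row 1, column 2 sits at offset 1*4*3 + 2*3 + 1 = 19.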
Note
OpenCV functions are quite optimized for random access, but sometimes direct access to the memory (working with pointer arithmetic) is more efficient, for example, when we have to access all the pixels in a loop.
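The following sketch shows what such a loop can look like with pointer arithmetic. It walks a plain `std::vector` buffer, used here as a stand-in for the image data, and inverts every value in one pass. `invertInPlace` is a hypothetical helper, not part of OpenCV, and the sketch assumes 8-bit values stored contiguously.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Invert every value by walking the buffer with a raw pointer.
// invertInPlace is a hypothetical helper, not part of OpenCV.
void invertInPlace(std::vector<std::uint8_t>& pixels) {
    std::uint8_t* p = pixels.data();
    const std::uint8_t* end = p + pixels.size();
    for (; p != end; ++p) {
        *p = static_cast<std::uint8_t>(255 - *p);
    }
}
```

Because the pointer simply advances one value at a time, no per-pixel offset has to be computed inside the loop. Whether this is measurably faster than per-pixel accessor calls depends on the compiler and the data, so it is worth timing both in a real application.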