
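# 3. Image Color Science

Image color processing is image processing based on color and color gamut. In OpenCV, every image object is abstracted as an n-dimensional matrix (described in detail below). An image object can be a still picture, a single frame of a video, one piece of the data stream read from a camera, or a data matrix filled in by hand. opencv-python relies on NumPy for its computations; in C++, Eigen can be used to speed them up.

| ![Image coordinate system](https://img.kancloud.cn/a3/fe/a3fef0969037f00ef79d200bb01760e4_323x240.png) |
| :-----------------------------------------------------------------------: |
| *Image coordinate system* |

Divided into its smallest units, an image can be viewed as lines * columns pixels, that is, height * width pixels, and $image[u] [v]$ is the smallest element. At this point the image object is a 2-D array. Taking the RGB color space as an example, the value of each element is a triple:

```[tex]
image[u] [v] = (R,G,B)
```

On top of this, add a C axis (Channel), whose length is the number of channels.

```[tex]
image[u] [v] [c] = X
```

The image is now a three-dimensional array, which is convenient for computation and can be written as a three-dimensional matrix whose third dimension is the channel; here it holds only the R, G, and B channels. RGB was used above only as an example: OpenCV can work with many other color spaces and channel layouts.

## 1. RGB ↔ GRAY

Transformations within RGB space like adding/removing the alpha channel, reversing the channel order, conversion to/from 16-bit RGB color (R5:G6:B5 or R5:G5:B5), as well as conversion to/from grayscale using:

```[tex]
RGB[A] \to Gray: Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B \\
Gray \to RGB[A]: R \leftarrow Y, G \leftarrow Y, B \leftarrow Y, A \leftarrow \max(ChannelRange)
```

## 2. RGB ↔ HSV

HSV (Hue, Saturation, Value) is a color space created by A. R. Smith in 1978 from the intuitive properties of color. It is also known as the Hexcone Model. The color parameters of this model are hue (H), saturation (S), and value (V).

:-: ![hsv](https://bkimg.cdn.bcebos.com/pic/b151f8198618367ab5bcd4792e738bd4b31ce559?x-bce-process=image/watermark,image_d2F0ZXIvYmFpa2U3Mg==,g_7,xp_5,yp_5/format,f_auto)

In case of 8-bit and 16-bit images, R, G, and B are converted to the floating-point format and scaled to fit the 0 to 1 range.

```[tex]
V \leftarrow \max(R,G,B) \\\\
S \leftarrow \left\{ \begin{matrix} \frac{V-\min(R,G,B)}{V} & if\ V \neq 0,\\ 0 & otherwise \end{matrix} \right. \\\\
H \leftarrow \left\{ \begin{matrix} 60(G-B)/(V-\min(R,G,B)) & if\ V=R\\ 120+60(B-R)/(V-\min(R,G,B)) & if\ V=G\\ 240+60(R-G)/(V-\min(R,G,B)) & if\ V=B\\ \end{matrix} \right. \\\\
If\ H<0\ then\ H \leftarrow H+360.\ On\ output\ 0 \le V \le 1,\ 0 \le S \le 1,\ 0 \le H \le 360.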
```

The values are then converted to the destination data type:

- 8-bit images: `$ V \leftarrow 255V, S \leftarrow 255S, H \leftarrow H/2\ (to\ fit\ to\ 0\ to\ 255) $`
- 16-bit images: (currently not supported) `$ V \leftarrow 65535V, S \leftarrow 65535S, H \leftarrow H $`
- 32-bit images: H, S, and V are left as is

## 3. Bayer → RGB

The Bayer pattern is widely used in CCD and CMOS cameras. It enables you to get color pictures from a single plane where R, G, and B pixels (sensors of a particular component) are interleaved as follows:

| ![bayer.png](https://img.kancloud.cn/49/40/49400f69db68919b33eadf5e6201bb2d_256x170.png) |
| :------------------------------: |
| *Bayer pattern* |

The output RGB components of a pixel are interpolated from 1, 2, or 4 neighbors of the pixel having the same color. There are several modifications of the above pattern that can be achieved by shifting the pattern one pixel left and/or one pixel up. The two letters *C1* and *C2* in the conversion constants CV_Bayer *C1C2* 2BGR and CV_Bayer *C1C2* 2RGB indicate the particular pattern type. These are components from the second row, second and third columns, respectively. For example, the above pattern has a very popular "BG" type.

## Common Functions

### Thresholding Function

```C++
double cv::threshold ( InputArray src, OutputArray dst, double thresh, double maxval, int type )
/** @brief Applies a fixed-level threshold to each array element.

The function applies fixed-level thresholding to a multiple-channel array. The function is
typically used to get a bi-level (binary) image out of a grayscale image ( #compare could be
also used for this purpose) or for removing noise, that is, filtering out pixels with too
small or too large values. There are several types of thresholding supported by the function.
They are determined by the type parameter.

Also, the special values #THRESH_OTSU or #THRESH_TRIANGLE may be combined with one of the above values.
In these cases, the function determines the optimal threshold value using the Otsu's or
Triangle algorithm and uses it instead of the specified thresh.

@note Currently, the Otsu's and Triangle methods are implemented only for 8-bit single-channel images.

@param src input array (multiple-channel, 8-bit or 32-bit floating point).
@param dst output array of the same size and type and the same number of channels as src.
@param thresh threshold value.
@param maxval maximum value to use with the #THRESH_BINARY and #THRESH_BINARY_INV thresholding types.
@param type thresholding type (see #ThresholdTypes).
@return the computed threshold value if Otsu's or Triangle methods used.

@sa adaptiveThreshold, findContours, compare, min, max
*/
```

Types of the threshold operation:

| type name | method | image ![origin](https://img.kancloud.cn/cc/f2/ccf22eaf33e7cd887f96821bebb82f7e_327x112.png) |
| ---------------------- | ---------------------------------------------- | ---------------------------------------------- |
| THRESH_BINARY = 0 | `$ \texttt{dst} (x,y) =\left\{\begin{matrix}\texttt{maxval} & if\ (\texttt{src}(x,y) > \texttt{thresh})\\0 & otherwise\end{matrix}\right. $` | ![binary](https://img.kancloud.cn/9b/d1/9bd17ff532a4c22cc293bcccdb6df2ce_330x109.png) |
| THRESH_BINARY_INV = 1 | `$ \texttt{dst} (x,y) =\left\{\begin{matrix}0 & if\ (\texttt{src}(x,y) > \texttt{thresh})\\\texttt{maxval} & otherwise\end{matrix}\right. $` | ![binary_inv](https://img.kancloud.cn/17/f0/17f0c4552b0cdd69d07f6a161df6c865_329x113.png) |
| THRESH_TRUNC = 2 | `$ \texttt{dst} (x,y) =\left\{\begin{matrix}\texttt{thresh} & if\ (\texttt{src}(x,y) > \texttt{thresh})\\\texttt{src}(x,y) & otherwise\end{matrix}\right. $` | ![trunc](https://img.kancloud.cn/4f/fd/4ffd1e1e0eb256f9c3391ebcf036547b_328x130.png) |
| THRESH_TOZERO = 3 | `$ \texttt{dst} (x,y) =\left\{\begin{matrix}\texttt{src}(x,y) & if\ (\texttt{src}(x,y) > \texttt{thresh})\\0 & otherwise\end{matrix}\right. $` | ![to_zero](../image//image/tozero.png) |
| THRESH_TOZERO_INV = 4 | `$ \texttt{dst} (x,y) =\left\{\begin{matrix}0 & if\ (\texttt{src}(x,y) > \texttt{thresh})\\\texttt{src}(x,y) & otherwise\end{matrix}\right. $` | ![to_zero_inv](../image//image/tozero_inv.png) |
| THRESH_MASK = 7 | | |
| THRESH_OTSU = 8 | flag, use Otsu algorithm to choose the optimal threshold value | |
| THRESH_TRIANGLE = 16 | flag, use Triangle algorithm to choose the optimal threshold value | |

### Color Space Conversion Function

```C++
void cv::cvtColor ( InputArray src, OutputArray dst, int code, int dstCn = 0 )
/** @brief Converts an image from one color space to another.

The function converts an input image from one color space to another. In case of a
transformation to-from RGB color space, the order of the channels should be specified
explicitly (RGB or BGR). Note that the default color format in OpenCV is often referred to as
RGB but it is actually BGR (the bytes are reversed). So the first byte in a standard (24-bit)
color image will be an 8-bit Blue component, the second byte will be Green, and the third byte
will be Red. The fourth, fifth, and sixth bytes would then be the second pixel (Blue, then
Green, then Red), and so on.

The conventional ranges for R, G, and B channel values are:
- 0 to 255 for CV_8U images
- 0 to 65535 for CV_16U images
- 0 to 1 for CV_32F images

In case of linear transformations, the range does not matter. But in case of a non-linear
transformation, an input RGB image should be normalized to the proper value range to get the
correct results, for example, for RGB → L*u*v* transformation. For example, if you have a
32-bit floating-point image directly converted from an 8-bit image without any scaling, then
it will have the 0..255 value range instead of 0..1 assumed by the function. So, before
calling cvtColor, you need first to scale the image down:
*/
img *= 1./255;
cvtColor(img, img, COLOR_BGR2Luv);
/*
If you use #cvtColor with 8-bit images, the conversion will have some information lost. For
many applications, this will not be noticeable but it is recommended to use 32-bit images in
applications that need the full range of colors or that convert an image before an operation
and then convert back.

If conversion adds the alpha channel, its value will set to the maximum of corresponding
channel range: 255 for CV_8U, 65535 for CV_16U, 1 for CV_32F.

@param src input image: 8-bit unsigned, 16-bit unsigned ( CV_16UC... ), or single-precision floating-point.
@param dst output image of the same size and depth as src.
@param code color space conversion code (see #ColorConversionCodes).
@param dstCn number of channels in the destination image; if the parameter is 0, the number of
the channels is derived automatically from src and code.

@see @ref imgproc_color_conversions
*/
```

Commonly used conversion codes include `COLOR_BGR2RGB`, `COLOR_BGR2GRAY`, `COLOR_BGR2BGRA`, and `COLOR_BGR2HSV`.

### Image Channel Splitting

```C++
void cv::split ( const Mat & src, Mat * mvbegin )
/** @brief Divides a multi-channel array into several single-channel arrays.

If you need to extract a single channel or do some other sophisticated channel permutation,
use mixChannels.

The following example demonstrates how to split a 3-channel matrix into 3 single channel matrices.
*/
char d[] = {1,2,3,4,5,6,7,8,9,10,11,12};
Mat m(2, 2, CV_8UC3, d);
Mat channels[3];
split(m, channels);
/*
channels[0] = [ 1, 4; 7, 10]
channels[1] = [ 2, 5; 8, 11]
channels[2] = [ 3, 6; 9, 12]
*/
/* @snippet snippets/core_split.cpp example
@param src input multi-channel array.
@param mvbegin output array; the number of arrays must match src.channels(); the arrays
themselves are reallocated, if needed.
@sa merge, mixChannels, cvtColor
*/
void cv::split ( InputArray m, OutputArrayOfArrays mv )
/** @overload
@param m input multi-channel array.
@param mv output vector of arrays; the arrays themselves are reallocated, if needed.
*/
```

Formula computed by split():

`$ \texttt{mv} [c] (I) = \texttt{src} (I)_c $`

### Channel Merging

```C++
void cv::merge ( const Mat* mv, size_t count, OutputArray dst )
/** @brief Creates one multi-channel array out of several single-channel ones.

The function cv::merge merges several arrays to make a single multi-channel array. That is,
each element of the output array will be a concatenation of the elements of the input arrays,
where elements of i-th input array are treated as mv[i].channels()-element vectors.

The function cv::split does the reverse operation.
If you need to shuffle channels in some other advanced way, use cv::mixChannels.

The following example shows how to merge 3 single channel matrices into a single 3-channel matrix.
*/
Mat m1 = (Mat_<uchar>(2,2) << 1,4,7,10);
Mat m2 = (Mat_<uchar>(2,2) << 2,5,8,11);
Mat m3 = (Mat_<uchar>(2,2) << 3,6,9,12);
Mat channels[3] = {m1, m2, m3};
Mat m;
merge(channels, 3, m);
/*
m =
[ 1, 2, 3, 4, 5, 6;
  7, 8, 9, 10, 11, 12]
m.channels() = 3
*/
/** @snippet snippets/core_merge.cpp example
@param mv input array of matrices to be merged; all the matrices in mv must have the same
size and the same depth.
@param count number of input matrices when mv is a plain C array; it must be greater than zero.
@param dst output array of the same size and the same depth as mv[0]; The number of channels will
be equal to the parameter count.
@sa mixChannels, split, Mat::reshape
*/
void cv::merge ( InputArrayOfArrays mv, OutputArray dst )
/** @overload
@param mv input vector of matrices to be merged; all the matrices in mv must have the same
size and the same depth.
@param dst output array of the same size and the same depth as mv[0]; The number of channels will
be the total number of channels in the matrix array.
*/
```

### Channel Mixing

```C++
void cv::mixChannels ( const Mat * src, size_t nsrcs, Mat * dst, size_t ndsts, const int * fromTo, size_t npairs )
/** @brief Copies specified channels from input arrays to the specified channels of output arrays.

The function cv::mixChannels provides an advanced mechanism for shuffling image channels.

cv::split, cv::merge, cv::extractChannel, cv::insertChannel and some forms of cv::cvtColor are
partial cases of cv::mixChannels.

In the example below, the code splits a 4-channel BGRA image into a 3-channel BGR (with B and R
channels swapped) and a separate alpha-channel image:
*/
Mat bgra( 100, 100, CV_8UC4, Scalar(255,0,0,255) );
Mat bgr( bgra.rows, bgra.cols, CV_8UC3 );
Mat alpha( bgra.rows, bgra.cols, CV_8UC1 );

// forming an array of matrices is a quite efficient operation,
// because the matrix data is not copied, only the headers
Mat out[] = { bgr, alpha };
// bgra[0] -> bgr[2], bgra[1] -> bgr[1],
// bgra[2] -> bgr[0], bgra[3] -> alpha[0]
int from_to[] = { 0,2, 1,1, 2,0, 3,3 };
mixChannels( &bgra, 1, out, 2, from_to, 4 );
/** @note Unlike many other new-style C++ functions in OpenCV (see the introduction section and
Mat::create ), cv::mixChannels requires the output arrays to be pre-allocated before calling the
function.
@param src input array or vector of matrices; all of the matrices must have the same size and the
same depth.
@param nsrcs number of matrices in `src`.
@param dst output array or vector of matrices; all the matrices must be allocated; their size and
depth must be the same as in `src[0]`.
@param ndsts number of matrices in `dst`.
@param fromTo array of index pairs specifying which channels are copied and where; fromTo[k*2] is
a 0-based index of the input channel in src, fromTo[k*2+1] is an index of the output channel in
dst; the continuous channel numbering is used: the first input image channels are indexed from 0 to
src[0].channels()-1, the second input image channels are indexed from src[0].channels() to
src[0].channels() + src[1].channels()-1, and so on, the same scheme is used for the output image
channels; as a special case, when fromTo[k*2] is negative, the corresponding output channel is
filled with zero.
@param npairs number of index pairs in `fromTo`.
@sa split, merge, extractChannel, insertChannel, cvtColor
*/
void cv::mixChannels ( InputArrayOfArrays src, InputOutputArrayOfArrays dst, const int * fromTo, size_t npairs )
/** @overload
@param src input array or vector of matrices; all of the matrices must have the same size and the
same depth.
@param dst output array or vector of matrices; all the matrices must be allocated; their size and
depth must be the same as in src[0].
@param fromTo array of index pairs specifying which channels are copied and where; fromTo[k*2] is
a 0-based index of the input channel in src, fromTo[k*2+1] is an index of the output channel in
dst; the continuous channel numbering is used: the first input image channels are indexed from 0 to
src[0].channels()-1, the second input image channels are indexed from src[0].channels() to
src[0].channels() + src[1].channels()-1, and so on, the same scheme is used for the output image
channels; as a special case, when fromTo[k*2] is negative, the corresponding output channel is
filled with zero.
@param npairs number of index pairs in fromTo.
*/
void cv::mixChannels ( InputArrayOfArrays src, InputOutputArrayOfArrays dst, const std::vector< int > & fromTo )
/** @overload
@param src input array or vector of matrices; all of the matrices must have the same size and the
same depth.
@param dst output array or vector of matrices; all the matrices must be allocated; their size and
depth must be the same as in src[0].
@param fromTo array of index pairs specifying which channels are copied and where; fromTo[k*2] is
a 0-based index of the input channel in src, fromTo[k*2+1] is an index of the output channel in
dst; the continuous channel numbering is used: the first input image channels are indexed from 0 to
src[0].channels()-1, the second input image channels are indexed from src[0].channels() to
src[0].channels() + src[1].channels()-1, and so on, the same scheme is used for the output image
channels; as a special case, when fromTo[k*2] is negative, the corresponding output channel is
filled with zero.
*/
```
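
The continuous channel numbering used by `fromTo` may be easier to follow in a small standalone model. The sketch below does not call OpenCV: it is a simplified re-implementation on interleaved `std::vector<unsigned char>` buffers, and the names `Image`, `locate`, `mixChannelsSketch`, and `demoBgraSplit` are hypothetical, introduced only for illustration. It assumes every image has the same number of pixels and silently skips index pairs that fall out of range, which is a choice of this sketch and not a statement about how `cv::mixChannels` handles errors.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical stand-in for an 8-bit image with interleaved channels.
// This is NOT cv::Mat; it exists only to illustrate the indexing scheme.
struct Image {
    std::size_t pixels;               // number of pixels
    int channels;                     // channels per pixel
    std::vector<unsigned char> data;  // pixels * channels interleaved values

    Image(std::size_t p, int c)
        : pixels(p), channels(c), data(p * static_cast<std::size_t>(c), 0) {}

    unsigned char& at(std::size_t p, int c) {
        return data[p * static_cast<std::size_t>(channels) + static_cast<std::size_t>(c)];
    }
    unsigned char at(std::size_t p, int c) const {
        return data[p * static_cast<std::size_t>(channels) + static_cast<std::size_t>(c)];
    }
};

// Translates a continuous (global) channel index into an image index and a
// channel index local to that image. Returns false if the index is out of range.
bool locate(const std::vector<Image>& images, int global,
            std::size_t& imageIndex, int& localChannel) {
    if (global < 0) return false;
    for (std::size_t i = 0; i < images.size(); ++i) {
        if (global < images[i].channels) {
            imageIndex = i;
            localChannel = global;
            return true;
        }
        global -= images[i].channels;  // move on to the next image's channel range
    }
    return false;
}

// Copies channels according to a flat list of (from, to) pairs.
// A negative "from" index fills the destination channel with zero.
void mixChannelsSketch(const std::vector<Image>& src, std::vector<Image>& dst,
                       const std::vector<int>& fromTo) {
    assert(fromTo.size() % 2 == 0);  // pairs only
    for (std::size_t k = 0; k + 1 < fromTo.size(); k += 2) {
        std::size_t di = 0, si = 0;
        int dc = 0, sc = 0;
        if (!locate(dst, fromTo[k + 1], di, dc)) continue;  // sketch: skip bad output index
        const bool zeroFill = fromTo[k] < 0;
        if (!zeroFill && !locate(src, fromTo[k], si, sc)) continue;
        if (!zeroFill) assert(src[si].pixels == dst[di].pixels);
        for (std::size_t p = 0; p < dst[di].pixels; ++p) {
            dst[di].at(p, dc) = zeroFill ? static_cast<unsigned char>(0) : src[si].at(p, sc);
        }
    }
}

// Mirrors the BGRA example above on a two-pixel image.
void demoBgraSplit() {
    std::vector<Image> src{Image(2, 4)};
    src[0].data = {10, 20, 30, 40, 50, 60, 70, 80};    // two BGRA pixels
    std::vector<Image> dst{Image(2, 3), Image(2, 1)};  // 3-channel output, alpha output
    mixChannelsSketch(src, dst, {0, 2, 1, 1, 2, 0, 3, 3});
    for (unsigned char v : dst[0].data) std::cout << static_cast<int>(v) << ' ';
    std::cout << "| ";
    for (unsigned char v : dst[1].data) std::cout << static_cast<int>(v) << ' ';
    std::cout << '\n';
}
```

Calling `demoBgraSplit()` from a `main` function should print `30 20 10 70 60 50 | 40 80`: output index 3 lies past the three channels of the first destination image, so it resolves to channel 0 of the second one, which is the same rule the `fromTo` description gives for `dst`.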