Color Transforms

Most image and video compressors exploit the statistical correlation (and also the perceptual redundancy¹) among the \(\text {RGB}\) color components² of the pixels by means of a color transform.

Color transforms are pixel-wise operators. As a result, each pixel is represented in a different (color) domain where (usually) three new coefficients³ express the same⁴ information.

2 Luma and chromas

Most color transforms are designed to split the color information of a pixel into luminance (or simply luma) and chrominance (chroma). The luma is basically the low frequency⁵ information of the color of the pixel, and the chroma (logically) the high frequency information.

For example, in JPEG and H.264/AVC the color information of each pixel is transformed from the \(\text {RGB}\) color space to the \(\text {YCbCr}\) color space, and in JPEG XR the destination color space is \(\text {YCoCg}\). In these luma/chroma-based color spaces, (the symbol) \(\text {Y}\) represents the luma (coefficient) of the pixel. The other two coefficients form the chroma. Note that the chrominance of a pixel is determined by two chromas.
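As an illustration of a luma/chroma transform, the following minimal sketch implements the reversible, integer lifting-based variant usually called YCoCg-R. It is an assumption made for this example and is not necessarily the exact transform used by JPEG XR. The function names are hypothetical.

```python
def rgb_to_ycocg_r(r, g, b):
    """Forward YCoCg-R transform of one pixel (integer lifting steps)."""
    co = r - b               # orange chroma
    tmp = b + (co >> 1)
    cg = g - tmp             # green chroma
    y = tmp + (cg >> 1)      # luma
    return y, co, cg


def ycocg_r_to_rgb(y, co, cg):
    """Inverse YCoCg-R transform: undo the lifting steps in reverse order."""
    tmp = y - (cg >> 1)
    g = cg + tmp
    b = tmp - (co >> 1)
    r = b + co
    return r, g, b


# A gray pixel has no chroma: both Co and Cg are zero.
print(rgb_to_ycocg_r(128, 128, 128))

# The round trip recovers the original pixel exactly.
for pixel in [(255, 0, 0), (12, 200, 77), (0, 0, 255)]:
    coefficients = rgb_to_ycocg_r(*pixel)
    print(pixel, "->", coefficients, "->", ycocg_r_to_rgb(*coefficients))
```

Because every lifting step only adds or subtracts an integer that the inverse can recompute, the transform is exactly invertible in integer arithmetic.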

2.1 Components, channels, coefficients and subbands

In image and video coding, most color transforms map 3 (color) components (\(\text {RGB}\)) into 3 coefficients. The set formed by the same component(-index) of all the pixels of an image is commonly called a channel (for example, the \(\text {R}\)-channel). In most of the literature, the set formed by the same coefficient(-index) of all the pixels of a transformed image is called a subband (for example, the \(\text {Y}\)-subband).
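The difference between a channel and a subband can be sketched with a tiny image stored as nested lists of pixels. The floating-point \(\text {YCoCg}\) definition used here is the commonly quoted one, and all function names are hypothetical, chosen only for this example.

```python
def rgb_to_ycocg(pixel):
    """Pixel-wise color transform: 3 RGB components -> 3 coefficients."""
    r, g, b = pixel
    y = r / 4 + g / 2 + b / 4
    co = r / 2 - b / 2
    cg = -r / 4 + g / 2 - b / 4
    return (y, co, cg)


def transform_image(image, pixel_transform):
    """Apply the same transform independently to every pixel."""
    return [[pixel_transform(p) for p in row] for row in image]


def extract_plane(image, index):
    """Collect the value with the given index from every pixel."""
    return [[p[index] for p in row] for row in image]


# A 2x2 RGB image: red, green, blue and gray pixels.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (128, 128, 128)],
]

r_channel = extract_plane(image, 0)                                   # pixel domain
y_subband = extract_plane(transform_image(image, rgb_to_ycocg), 0)   # transform domain

print("R-channel:", r_channel)
print("Y-subband:", y_subband)
```

The same indexing operation yields a channel when applied to the original image and a subband when applied to the transformed image.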

3 Benefits of color transforms

Color transforms applied to natural visual information generally have two key advantages:

4 Scalar quantization in the color transform domain

If the color transform is orthogonal (that is, the luma and the chromas are independent features of the signal⁸), the quantization noise generated in the subbands is additive with respect to the reconstructed signal [1]. Therefore, from a pure RD point of view, the quantization step sizes of the subbands should be selected so that the RD slope is the same in all subbands (see the notebook Scalar Quantization of RGB images). Notice that this requires computing the RD curves.
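The additivity of the quantization noise can be checked numerically. The sketch below uses an orthonormal 3-point DCT as a stand-in for an orthogonal color transform. This choice is an assumption made only for illustration (it is not one of the color transforms named in this text), and the pixel value and step size are arbitrary.

```python
import math

N = 3
# Orthonormal DCT-II matrix: rows are basis vectors, A times its transpose is the identity.
A = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
      for n in range(N)] for k in range(N)]


def forward(x):
    """Analysis: y = A x."""
    return [sum(A[k][n] * x[n] for n in range(N)) for k in range(N)]


def inverse(y):
    """Synthesis: x = A^T y (valid because A is orthonormal)."""
    return [sum(A[k][n] * y[k] for k in range(N)) for n in range(N)]


def quantize(y, step):
    """Uniform scalar quantization followed by dequantization."""
    return [step * math.floor(c / step + 0.5) for c in y]


def squared_error(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


pixel = [200.0, 120.0, 40.0]      # arbitrary example values
coefficients = forward(pixel)
quantized = quantize(coefficients, step=16)
reconstructed = inverse(quantized)

error_in_subbands = squared_error(coefficients, quantized)
error_in_pixels = squared_error(pixel, reconstructed)
print(error_in_subbands, error_in_pixels)
```

Up to floating-point rounding, both errors coincide: the distortion of the reconstructed pixel is the sum of the distortions introduced in each coefficient.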

Therefore, taking a generic luma-chroma transform \(\text {YUV}\), we expect that \begin {equation} \lambda _{\text {Y}} \approx \lambda _{\text {U}} \approx \lambda _{\text {V}} \label {eq:optimal_lambda} \end {equation} holds for the quantization step sizes \begin {equation} \Delta _{\text {Y}} = \Delta _{\text {U}} = \Delta _{\text {V}}, \label {eq:optimal_delta} \end {equation} and therefore the Rate-Distortion Optimization (RDO) [2] can be skipped. In the notebook Scalar Quantization of RGB images we can explore (at least visually) the degree of compliance with Eq. \eqref{eq:optimal_lambda}.
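The RD slopes compared above can be estimated numerically from a few RD points of a subband. The following sketch assumes that the rate is measured as the entropy of the quantization indexes (in bits/sample), that the distortion is the MSE, and that \(\lambda = -\Delta D/\Delta R\) between consecutive points. The input is a synthetic ramp used purely for illustration, not real image data, and all names are hypothetical.

```python
import math
from collections import Counter


def quantize_indexes(samples, step):
    """Uniform scalar quantizer: map each sample to an integer index."""
    return [math.floor(s / step + 0.5) for s in samples]


def entropy(indexes):
    """Empirical entropy of the indexes, in bits/sample."""
    counts = Counter(indexes)
    total = len(indexes)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def rd_point(samples, step):
    """Return (rate, distortion) for one quantization step size."""
    indexes = quantize_indexes(samples, step)
    rate = entropy(indexes)
    distortion = sum((s - step * k) ** 2
                     for s, k in zip(samples, indexes)) / len(samples)
    return rate, distortion


def rd_slopes(points):
    """Finite-difference estimate of lambda = -dD/dR between consecutive points."""
    return [-(d2 - d1) / (r2 - r1)
            for (r1, d1), (r2, d2) in zip(points, points[1:])]


subband = list(range(256))        # synthetic ramp, not real image data
steps = [1, 2, 4, 8, 16]
points = [rd_point(subband, step) for step in steps]
slopes = rd_slopes(points)

for step, (rate, distortion) in zip(steps, points):
    print(f"step={step:2d}  rate={rate:.3f} bits/sample  MSE={distortion:.3f}")
print("slopes:", slopes)
```

Repeating this for the \(\text {Y}\), \(\text {U}\) and \(\text {V}\) subbands of a real image, with the same step size in all of them, would show how close their slopes actually are.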

Resources

5 References

[1] W. Burger and M.J. Burge. Digital Image Processing: An Algorithmic Introduction Using Java. Springer, 2016.

[2] V. González-Ruiz. Information Theory.

[3] V. González-Ruiz. Scalar Quantization.

[4] V. González-Ruiz. Transform Coding.

[5] K. Sayood. Introduction to Data Compression (Slides). Morgan Kaufmann, 2017.

¹This will be explained later in this course.

²A component of a pixel in the \(\text {RGB}\) domain refers to one of the \(\text {R}\) (red), \(\text {G}\) (green) and \(\text {B}\) (blue) coordinates of the pixel in the 3D \(\text {RGB}\) color space.

³Most transforms, including the color ones, analyze the signal from a frequency perspective, generating the so-called coefficients, whose index in the transform domain is related to a different frequency of the signal.

⁴In general, color transforms can be considered lossless, although this is strictly true only if fixed-point arithmetic is used.

⁵It is worth understanding that the frequency concept in the color transform domain is not related to the frequency concept in the original pixel domain. For example, the \(\text {R}\) component of a pixel represents the amount of red in the pixel, and in the visible spectrum we are referring to frequencies that are lower than the frequencies that the \(\text {G}\) and \(\text {B}\) components represent. However, in a color-transformed domain, the luma measures the brightness level of the pixel, and we cannot find a subband of frequencies in the visible spectrum that represents such information, because we are using a different representation domain.

⁶In general, the information provided by the signals.

⁷Notice again that we will study this effect in a later session.

⁸The luma does not define the chroma, and vice versa.