Color Transforms

Vicente González Ruiz - Departamento de Informática - UAL

March 22, 2025

Contents

 1 Spectral decorrelation
 2 Luma and chromas
  2.1 Components, channels, coefficients and subbands
 3 Benefits of color transforms
 4 Scalar quantization in the color transform domain
 5 References

1 Spectral decorrelation

Most image and video compressors exploit the statistical correlation (and also the perceptual redundancy1) between the \(\text {RGB}\) color components2 of the pixels by means of a color transform.

Color transforms are pixel-wise operators. As a result, each pixel is represented in a different (color) domain where (usually) three new coefficients3 express the same4 information.
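As a hedged illustration of a pixel-wise operator that preserves the information exactly, the following sketch applies an integer (lifting-based) variant of the \(\text {YCoCg}\) transform, usually called YCoCg-R, to each pixel independently. This variant is not described in these notes and is an assumption made only for the example; the function names are hypothetical.

```python
import numpy as np

def rgb_to_ycocg_r(rgb):
    """Forward integer YCoCg-R transform of an (..., 3) array of integers."""
    rgb = np.asarray(rgb).astype(np.int32)  # signed, so >> is an arithmetic shift
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return np.stack([y, co, cg], axis=-1)

def ycocg_r_to_rgb(ycocg):
    """Inverse transform: undoes the lifting steps in reverse order."""
    ycocg = np.asarray(ycocg).astype(np.int32)
    y, co, cg = ycocg[..., 0], ycocg[..., 1], ycocg[..., 2]
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return np.stack([r, g, b], axis=-1)

# Each pixel is transformed on its own: no neighbouring pixel is involved.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
restored = ycocg_r_to_rgb(rgb_to_ycocg_r(image))
lossless = np.array_equal(restored, image)
```

Because every lifting step is undone exactly by its inverse, the round trip recovers the original integers, which is the sense in which the three coefficients express the same information.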

2 Luma and chromas

Most color transforms are designed to split the color information of a pixel into luminance (or simply luma) and chrominance (chroma). The luma is basically the low frequency5 information of the color of the pixel, and the chroma (logically) the high frequency information.

For example, in JPEG and H.264/AVC the color information of each pixel is transformed from the \(\text {RGB}\) color space to the \(\text {YCrCb}\) color space, and in JPEG XR, the destination color space is \(\text {YCoCg}\). In these luma/chroma-based color spaces, (the symbol) \(\text {Y}\) represents the luma (coefficient) of the pixel. The other two coefficients form the chroma. Note that the chrominance of a pixel is determined by two chromas.
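A minimal sketch of such a luma/chroma transform follows. It assumes the full-range BT.601 constants used by JFIF (JPEG files); other standards and codecs may use different matrices and value ranges, so the numbers below are an assumption rather than the definition used by every compressor. The coefficients are emitted in the order \(\text {Y}\), \(\text {Cr}\), \(\text {Cb}\) to match the naming of these notes, and the function names are hypothetical.

```python
import numpy as np

# Rows produce Y, Cr, Cb from an (R, G, B) pixel.
# Constants: full-range BT.601 as used by JFIF (assumed for this example).
RGB_TO_YCRCB = np.array([
    [ 0.299,     0.587,     0.114   ],
    [ 0.5,      -0.418688, -0.081312],
    [-0.168736, -0.331264,  0.5     ],
])
YCRCB_TO_RGB = np.linalg.inv(RGB_TO_YCRCB)
CHROMA_OFFSET = np.array([0.0, 128.0, 128.0])  # centers the chromas at 128

def rgb_to_ycrcb(rgb):
    """Transform an (..., 3) RGB array into Y, Cr, Cb coefficients."""
    return np.asarray(rgb, dtype=np.float64) @ RGB_TO_YCRCB.T + CHROMA_OFFSET

def ycrcb_to_rgb(ycrcb):
    """Inverse of rgb_to_ycrcb (exact up to floating-point rounding)."""
    return (np.asarray(ycrcb, dtype=np.float64) - CHROMA_OFFSET) @ YCRCB_TO_RGB.T

# A gray pixel carries no chrominance: both chromas sit at the offset.
gray = rgb_to_ycrcb([128, 128, 128])
```

In this sketch the luma row sums to 1 and each chroma row sums to 0, which is why a gray pixel maps to a luma equal to its gray level and to "neutral" chromas.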

2.1 Components, channels, coefficients and subbands

In image and video coding, most color transforms map 3 (color) components (\(\text {RGB}\)) into 3 coefficients. The set formed by the same component(-index) of all the pixels of an image is commonly called a channel (for example, the \(\text {R}\)-channel). In most of the literature, the set formed by the same coefficient(-index) of all the pixels of a transformed image is called a subband (for example, the \(\text {Y}\)-subband).

3 Benefits of color transforms

Color transforms applied to natural visual information generally have two key advantages:

  1. Energy concentration: In general, transforms “move” the energy6 between subbands, accumulating most of the energy in a reduced number of them (an aspect related to the so-called coding gain of the transform [4]). In our case, where the transforms are between color spaces, most of the energy in the transform domain is concentrated in the \(\text {Y}\) subband. As a consequence, the entropy [2] usually decreases and the dynamic range of the signal increases. The former means that we can encode the same information at a lower bit-rate, and the latter that we can use a wider range of quantization step sizes [3, 5], which also increases the number of feasible points in the RD curve [2].
  2. Luma/chroma perceptual analysis: Our visual system is more sensitive, in terms of spatial resolution, to the luma (“black and white”) information than to the chroma (“color”) information, which basically means that we can quantize the chroma7 more coarsely without generating perceptible distortion.
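The energy concentration described in the first item can be observed with a small experiment. The sketch below is only illustrative: it uses a synthetic image whose three components share a common brightness (an assumption standing in for a natural image) and an orthonormal 3-point DCT applied across the color components (an assumed choice of transform; all names are hypothetical).

```python
import numpy as np

def dct3_matrix():
    """Orthonormal 3-point DCT-II matrix (rows are the basis vectors)."""
    n = 3
    k = np.arange(n).reshape(-1, 1)  # coefficient (frequency) index
    i = np.arange(n).reshape(1, -1)  # component index
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] *= np.sqrt(1.0 / n)
    m[1:, :] *= np.sqrt(2.0 / n)
    return m

def energy_per_plane(img):
    """Sum of squares of each of the 3 planes of an (H, W, 3) array."""
    return np.sum(np.asarray(img, dtype=np.float64) ** 2, axis=(0, 1))

# Synthetic, strongly correlated RGB data: shared brightness plus small noise.
rng = np.random.default_rng(0)
brightness = rng.uniform(0, 255, size=(64, 64, 1))
rgb = brightness + rng.normal(0, 5, size=(64, 64, 3))

K = dct3_matrix()
coefs = rgb @ K.T                  # pixel-wise transform across the components

e_rgb = energy_per_plane(rgb)      # energy of the R, G and B channels
e_coefs = energy_per_plane(coefs)  # energy of the three subbands
share_first_subband = e_coefs[0] / e_coefs.sum()
```

In this setup the energy is spread almost evenly over the three channels, while the first subband (which plays the role of the luma) holds nearly all of it. The total energy is the same in both domains because the transform is orthonormal.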

4 Scalar quantization in the color transform domain

If the color transform is orthogonal (that is, the luma and the chromas are independent features of the signal8), the quantization noise generated in the subbands is additive with respect to the reconstructed signal [1]. Therefore, from a pure RD point of view, the quantization step sizes should be selected so that the RD slope is the same in all subbands (see the notebook Scalar Quantization of RGB images). Notice that this requires computing the RD curves.

Therefore, taking a generic (orthogonal) luma-chroma transform \(\text {YUV}\), we expect that \begin {equation} \lambda _{\text {Y}} \approx \lambda _{\text {U}} \approx \lambda _{\text {V}} \label {eq:optimal_lambda} \end {equation} when the quantization step sizes satisfy \begin {equation} \Delta _{\text {Y}} = \Delta _{\text {U}} = \Delta _{\text {V}}, \label {eq:optimal_delta} \end {equation} in which case an explicit Rate-Distortion Optimization (RDO) [2] of the step sizes can be skipped. In the notebook Scalar Quantization of RGB images we can explore (at least visually) the degree of compliance with Eq. \eqref{eq:optimal_lambda}.
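The additivity of the quantization noise that this argument relies on can be checked numerically. The following sketch is not the code of the cited notebook: it assumes a synthetic image, an orthonormal 3-point DCT across the color components as the color transform, and a mid-tread uniform quantizer with the same step size in the three subbands. It does not compute RD curves or slopes; all names are hypothetical.

```python
import numpy as np

def quantize(x, step):
    """Mid-tread uniform scalar quantization followed by dequantization."""
    return step * np.round(x / step)

# Orthonormal color transform (3-point DCT up to normalization of the rows).
K = np.array([[1.0, 1.0, 1.0], [1.0, 0.0, -1.0], [1.0, -2.0, 1.0]])
K /= np.linalg.norm(K, axis=1, keepdims=True)

# Synthetic RGB data (an assumption standing in for a real image).
rng = np.random.default_rng(1)
rgb = rng.uniform(0, 255, size=(32, 32, 1)) + rng.normal(0, 10, size=(32, 32, 3))

coefs = rgb @ K.T          # forward transform, pixel by pixel
step = 16.0                # same step size for the three subbands
q = quantize(coefs, step)
rec = q @ K                # inverse transform (K is orthonormal)

sse_subbands = np.sum((coefs - q) ** 2, axis=(0, 1))  # distortion per subband
sse_rgb = np.sum((rgb - rec) ** 2)                    # distortion in RGB
```

With an orthonormal transform, the distortion measured in the \(\text {RGB}\) domain equals the sum of the distortions introduced in each subband, so each subband contributes to the total distortion independently of the others.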

Resources

  1. YCrCb.ipynb: Notebook showing the use of YCrCb.py.
  2. YCoCg.ipynb: Notebook showing the use of YCoCg.py.
  3. color-DCT.ipynb: Notebook showing the use of color-DCT.py.
  4. Vector Quantization (in the color domain) of an RGB image.
  5. Vector Quantization (in the 2D domain) of a color (RGB) image.
  6. Removing RGB redundancy with the DCT.
  7. Removing RGB redundancy with the \(\text {YCoCg}\) transform.
  8. Removing RGB redundancy with the \(\text {YCrCb}\) transform.

5 References

[1]   W. Burger and M.J. Burge. Digital Image Processing: An Algorithmic Introduction Using Java. Springer, 2016.

[2]   V. González-Ruiz. Information Theory.

[3]   V. González-Ruiz. Scalar Quantization.

[4]   V. González-Ruiz. Transform Coding.

[5]   K. Sayood. Introduction to Data Compression (Slides). Morgan Kaufmann, 2017.

1This will be explained later in this course.

2A component of a pixel in the \(\text {RGB}\) domain refers to one of the \(\text {R}\) (red), \(\text {G}\) (green) and \(\text {B}\) (blue) coordinates of the pixel in the 3D \(\text {RGB}\) color space.

3Most transforms, including the color ones, analyze the signal from a frequency perspective, generating the so-called coefficients, whose index in the transform domain is related to a different frequency of the signal.

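4In general, color transforms can be considered lossless, although this is strictly true only if fixed-point (integer) arithmetic is used, because the rounding errors of floating-point arithmetic can make the inverse transform inexact.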

5It is worth understanding that the frequency concept in the color transform domain is not related to the frequency concept in the original pixel domain. For example, the \(\text {R}\) component of a pixel represents the amount of red in the pixel, and in the visible spectrum this corresponds to frequencies that are lower than those that the \(\text {G}\) and \(\text {B}\) components represent. However, in a color-transformed domain, the luma measures the brightness level of the pixel, and we cannot find a band of frequencies in the visible spectrum that represents such information because we are using a different representation domain.

6In general, the information provided by the signals.

7Notice again that we will study this effect in a later session.

8The luma does not define the chroma and vice versa.