In Information Theory [2], quantization [3, 4, 5, 6] is any mapping between two sets of elements \(A\) and \(B\), in which every element of \(A\) is associated with one (not necessarily the same) element of \(B\), with \(|A|>|B|\), where \(|\cdot |\) denotes the order (the number of elements) of a set.
Notice that, when \(|A|>|B|\), quantization is an irreversible process because at least two elements of \(A\) will be mapped to the same element of \(B\), and therefore there is no way to find the reverse mapping¹.
In the quantization of digital signals [3, 6], the elements of \(A\) and \(B\) are digital samples [4]. Since, by definition, digital samples have been quantized to be converted from the analog world to the digital one, we are actually applying a re-quantization. Again, if \(|A|>|B|\), such quantization implies a loss of information in the quantized signal.
When working with 1D signals, quantizers divide the range of possible input values that the samples can take into a collection of non-overlapping intervals². The values that define these intervals are called decision levels, and the value that, in the quantized domain, represents all the possible input values falling into an interval is called the representation level of that interval.
When quantization maps single elements of \(A\) to single elements of \(B\), the quantizer is said to be scalar [3] (Scalar Quantization, SQ). When tuples (vectors) of elements are mapped, Vector Quantization (VQ) is used [6]. Notice that this classification is not related to the dimensionality³ of the signal, but to whether the samples are processed independently or quantized in tuples. In the latter case, we can exploit the correlation between samples.
In a uniform quantizer [3], all the intervals have the same size, which is equal to the quantization step size, \(\Delta \).⁴ In a non-uniform quantizer [3], at least one of the intervals differs from the rest. For example, a deadzone quantizer [3] has an interval for the representation level 0 that doubles the size of the rest of the intervals. Another example of a non-uniform quantizer is a Lloyd-Max quantizer [3], which divides the range of input values into a set of intervals whose size is inversely proportional to the probability of using each interval.
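As an illustration, a minimal uniform (mid-tread) quantizer can be sketched in Python; the function names are ours, chosen only for this example:

```python
import numpy as np

def uniform_quantize(x, delta):
    # The decision levels sit at (k + 0.5) * delta, so every
    # interval has the same size, delta. Note that np.round
    # rounds halves to the nearest even integer.
    return np.round(x / delta).astype(int)

def uniform_dequantize(q, delta):
    # The representation level of interval k is its midpoint,
    # k * delta.
    return q * delta

x = np.array([-3.2, -0.4, 0.1, 2.7, 5.0])
q = uniform_quantize(x, delta=2)    # -> [-2  0  0  1  2]
y = uniform_dequantize(q, delta=2)  # -> [-4  0  0  2  4]
```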
Static quantizers do not modify the partitioning of the input signal’s dynamic range (the decision levels), nor the representation level assigned to each interval. On the contrary, adaptive quantizers adapt these parameters to the sequence of samples that they are quantizing.⁵ A Jayant quantizer [3] is an example of adaptive quantization.
Deadzone quantizers [3] are static quasi-uniform scalar quantizers. They are frequently used in lossy compression systems because, when the quantization step size is a power of two and the input sample values are integers⁶, the quantization indexes can be found simply by discarding the least-significant bits of the input samples (in other words, we only need a bit-shift operation to compute them). This also means that it is possible to build a progressive encoding system using a deadzone quantizer.
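For instance, assuming integer input samples and \(\Delta = 2^s\), a deadzone quantizer reduces to bit manipulation. The following Python sketch (our own helper names) handles the sign apart and shifts the magnitude:

```python
import numpy as np

def deadzone_quantize(x, s):
    # q = sign(x) * floor(|x| / 2**s): discarding the s
    # least-significant bits of |x| is the same as dividing
    # by delta = 2**s and taking the floor.
    return np.sign(x) * (np.abs(x) >> s)

def deadzone_dequantize(q, s):
    # Reconstruct at the midpoint of each interval. The interval
    # of the representation level 0 is (-2**s, 2**s), twice the
    # size of the rest (the "deadzone").
    delta = 1 << s
    return np.sign(q) * ((np.abs(q) << s) + delta // 2)

x = np.array([-5, -1, 0, 1, 5, 12])
q = deadzone_quantize(x, s=1)    # delta = 2 -> [-2  0  0  0  2  6]
y = deadzone_dequantize(q, s=1)  # -> [-5  0  0  0  5 13]
```

Notice that decreasing \(s\) by one simply appends one more bit to each quantization index, which is the property that enables progressive (bit-plane) encoding.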
Another reason why deadzone quantization is popular in lossy encoding systems is that, in the regions where the signal is weaker, it removes more electronic noise than other scalar quantizers. If we suppose that the signal has zero mean⁷ and that the deadzone is placed where the SNR is smallest, we basically replace the electronic noise with quantization noise⁸. Obviously, this does not improve the RD (rate-distortion) performance of the encoder, but the perceived (subjective) quality is increased for the same bit rate.⁹
A Lloyd-Max quantizer [3] minimizes the quantization noise for a signal whose probability distribution (histogram) of input samples is known. As a result, the density of quantization intervals is higher where the probability of the samples is higher, and vice versa.
Lloyd-Max quantizers are considered non-uniform quantizers. Notice that the histogram must also be known by the decoder to “restore”¹⁰ the information.
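A minimal sketch of the Lloyd iteration (equivalently, 1D k-means), assuming a set of training samples is available to compute the representation levels:

```python
import numpy as np

def lloyd_max(samples, K, iterations=50):
    # Start from K uniformly spaced representation levels.
    levels = np.linspace(samples.min(), samples.max(), K)
    for _ in range(iterations):
        # Decision levels: midpoints between consecutive
        # representation levels (nearest-neighbor condition).
        decisions = (levels[:-1] + levels[1:]) / 2
        indexes = np.digitize(samples, decisions)
        # Centroid condition: each representation level becomes
        # the mean of the samples that fall into its interval.
        for k in range(K):
            cell = samples[indexes == k]
            if cell.size:
                levels[k] = cell.mean()
    return levels

# The intervals become denser where the samples are denser:
samples = np.random.laplace(0, 10, 10000)
print(lloyd_max(samples, K=8))
```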
Vector quantizers [6] take blocks¹¹ of samples as input and output one quantization index per block. For example, in most natural images, the spatial correlation [7] makes some blocks of the image similar to other blocks. When this holds, we can compute a set of centroids (blocks) and use them to represent the original blocks. As a result, we obtain a matrix of quantization indexes that can be entropy coded.
Notice that VQ exploits the spatial correlation. For this reason, the encoding performance of a vector quantizer is usually superior to that of a scalar quantizer (the number of output quantization indexes¹² is smaller).
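A sketch of this kind of VQ, learning the centroids with k-means (we use scikit-learn here only for illustration; this is not necessarily how any particular codec computes its code-book):

```python
import numpy as np
from sklearn.cluster import KMeans

def vq_blocks(image, block=4, K=256):
    # Split a grayscale image into non-overlapping block x block
    # tiles and flatten each tile into a vector.
    h, w = image.shape
    h, w = h - h % block, w - w % block  # crop to a multiple of block
    tiles = (image[:h, :w]
             .reshape(h // block, block, w // block, block)
             .swapaxes(1, 2)
             .reshape(-1, block * block))
    kmeans = KMeans(n_clusters=K, n_init=4).fit(tiles)
    # The matrix of quantization indexes (to be entropy coded)
    # and the code-book that the decoder needs.
    indexes = kmeans.labels_.reshape(h // block, w // block)
    return indexes, kmeans.cluster_centers_
```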
SQ (Scalar Quantization) [3, 8] would be an optimal solution only if the image colors were uniformly distributed within the RGB cube. However, the typical color distribution in natural images is anything but uniform: some regions of the color space are densely populated, while many potentially usable colors are completely missing. In this case, depending on the quantization step size [4], SQ can be suboptimal because the colors that are actually used may not be sampled with sufficient density, while at the same time the encoding system allocates representation levels to colors that do not appear in the image at all [1].
On the other hand, VQ [6, 8] applied to the color domain does not treat the individual \(\text {RGB}\) components separately, as scalar quantization does; instead, each color vector \({\mathbf C}_i = (\text {R}_i, \text {G}_i, \text {B}_i )\) used in the image is treated as a minimal (indivisible) structure. VQ determines a code-book of \(K\) code-vectors (centroids) that minimizes the distortion between the original image and the reconstructed one. Notice that the code-book must be known by the decoder to find a reconstruction.
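A hedged sketch of this color VQ, using the same k-means machinery as above (the function name is ours):

```python
import numpy as np
from sklearn.cluster import KMeans

def color_vq(rgb_image, K=16):
    # Every pixel is a vector C_i = (R_i, G_i, B_i); find the
    # code-book of K code-vectors that minimizes the distortion.
    pixels = rgb_image.reshape(-1, 3).astype(float)
    kmeans = KMeans(n_clusters=K, n_init=4).fit(pixels)
    codebook = kmeans.cluster_centers_  # must be sent to the decoder
    indexes = kmeans.labels_            # one quantization index per pixel
    reconstruction = codebook[indexes].reshape(rgb_image.shape)
    return reconstruction.astype(np.uint8), codebook, indexes
```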
Quantizers generate quantization noise, which can be modeled as flat-spectrum, zero-mean uniform noise. On the contrary, most of the energy of natural images is concentrated in the low frequencies. Consequently, when an image is quantized, the SNR is higher in the low frequencies than in the high frequencies, which means that if the quantized image is low-pass filtered, the overall SNR should increase (the filter removes the high frequencies, which are basically generated by the quantization noise).
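This effect can be checked experimentally. A sketch using a scikit-image test image and a Gaussian low-pass filter (whether the SNR actually increases depends on the image content and on the filter strength, so the parameters below are only a plausible choice):

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage import data

def snr_db(original, degraded):
    # SNR = 10*log10(signal energy / noise energy).
    error = original.astype(float) - degraded.astype(float)
    return 10 * np.log10((original.astype(float) ** 2).sum()
                         / (error ** 2).sum())

image = data.camera().astype(int)
delta = 16
quantized = (image // delta) * delta + delta // 2  # uniform quantization
filtered = gaussian_filter(quantized.astype(float), sigma=0.8)

print(snr_db(image, quantized))  # SNR of the quantized image
print(snr_db(image, filtered))   # usually higher for smooth images
```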
Currently, VCF implements a collection of low-pass filters in the module blur.py. However, more efficient denoising filters, such as NLM (Non-Local Means), can also be used. Create a new Python module NLM.py that makes this filter available in VCF. 3 points.
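A possible starting point for this exercise, relying on the NLM implementation available in scikit-image (the wrapper name and signature below are only assumptions, since the interface expected by VCF is the one defined by the filters in blur.py):

```python
# NLM.py -- sketch of a Non-Local Means filter for VCF.
import numpy as np
from skimage.restoration import denoise_nl_means, estimate_sigma

def nlm_filter(image, patch_size=7, patch_distance=11):
    # Assumes a color image with the channels in the last axis;
    # use channel_axis=None for grayscale inputs.
    img = image.astype(float) / 255
    # Estimate the noise standard deviation from the image itself.
    sigma = float(np.mean(estimate_sigma(img, channel_axis=-1)))
    denoised = denoise_nl_means(img,
                                patch_size=patch_size,
                                patch_distance=patch_distance,
                                h=1.15 * sigma, sigma=sigma,
                                fast_mode=True, channel_axis=-1)
    return np.clip(denoised * 255, 0, 255).astype(np.uint8)
```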
[1] W. Burger and M.J. Burge. Digital Image Processing: An Algorithmic Introduction Using Java. Springer, 2016.
[2] V. González-Ruiz. Information Theory.
[3] V. González-Ruiz. Scalar Quantization.
[4] V. González-Ruiz. Signal Quantization.
[5] V. González-Ruiz. Trellis Coding Quantization.
[6] V. González-Ruiz. Vector Quantization.
[7] V. González-Ruiz. Visual Redundancy.
[8] K. Sayood. Introduction to Data Compression (Slides). Morgan Kaufmann, 2017.
¹ Dequantization does not exist.
² If the signal has more than one dimension, the process is exactly the same, but we speak of regions instead of intervals (for example, in the 2D case).
³ For example, a color image has 3 dimensions: height, width, and depth.
⁴ Notice that, in a uniform quantizer, the distance between all the decision levels is also \(\Delta \).
⁵ Notice that if a quantizer depends on characteristics of the signal that are known a priori, it should be considered static.
⁶ ... and negative integers are represented using two’s complement ...
⁷ The electronic noise has zero arithmetic mean and a flat spectrum.
⁸ Which also has zero mean and a flat spectrum.
⁹ Provided, obviously, that the electronic noise is perceived as a source of distortion comparable to the quantization noise.
¹⁰ Remember that quantization is an irreversible process and, therefore, the original signal is never restored (except when \(\Delta =1\) and we are quantizing digital samples).
¹¹ If we are removing spatial redundancy, the blocks are usually square tiles of pixels. If we remove color redundancy, the blocks are multicomponent pixels, for example, RGB values.
¹² Indexes of the centroids.