MRVC is a transform that inputs a video (a sequence of frames of pixels) and outputs a decomposition (a sequence of spatio-temporal subbands of coefficients). It consists of two stages:
MRVC can be applied recursively to the lower SRL, producing a multiresolution representation of the video while compensating the motion of the sequence. The output decomposition concentrates most of the energy in a small number of subbands (usually, those representing the low frequencies), which enables the lossy compression of the video.
Video data contain high amounts of redundancy, both spatial and temporal. For this reason, most video encoders compress an input sequence of images1, basically, in two stages: (1) a transform in the temporal and spatial domains that produces a collection of uncorrelated subbands, and (2) an entropy encoding stage which removes the statistical redundancy that may still remain after the decorrelation (transform). Usually, these coefficients have two interesting features:
The 2D-DWT (2-Dimensional Discrete Wavelet Transform)2 is a digital transform that, applied to an image, accumulates the energy of the image in a small number of subbands and obtains a multiresolution (generally dyadic) representation of such image. The subbands form a decomposition3 \(\{LL, LH, HL, HH\}\), where \(L\) stands for 1D low-pass analysis filtering and \(H\) for 1D high-pass analysis filtering. Thus, for example, \(LH\) is the result of applying the \(L\) filter to the rows and downsampling4 the output by a factor of 2, and then applying the \(H\) filter to the columns of the previous decomposition and, again, downsampling by a factor of 2. The subbands are populated by coefficients representing different spatial areas, depending on the localization of the coefficient and the resolution level. Notice that it is possible to compute the (2D-)DWT using 1D filters; therefore, the DWT is separable.
The downsampling operation generates a critically sampled structure, where the number of output DWT coefficients is equal to the number of input pixels. This is possible because, ideally, the \(L\) filter is designed to reject the frequencies that the \(H\) filter passes, and vice versa (it is said that the filters are complementary, or that they form a perfect-reconstruction filter bank). In practice, there is always some aliasing (overlapping between the responses of the filters in the frequency domain), but this does not affect5 the number of output coefficients. In terms of Linear Algebra, the analysis \(L\) and \(H\) filters, and the synthesis \(L^{-1}\) and \(H^{-1}\) filters (the filters that reconstruct the pixels from the coefficients generated by \(L\) and \(H\)) would be orthogonal, that is, the basis functions that they convolve with the signals are completely independent.
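These two properties (critical sampling and perfect reconstruction) can be checked numerically with PyWavelets, the library used by DWT.py below. A minimal 1D sketch:

import numpy as np
import pywt

x = np.random.rand(64)                              # a 1D test signal
L, H = pywt.dwt(x, "db5", mode="periodization")     # analysis filter bank
print(len(L) + len(H) == len(x))                    # True: critically sampled
y = pywt.idwt(L, H, "db5", mode="periodization")    # synthesis filter bank
print(np.allclose(x, y))                            # True: perfect reconstruction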
For the sake of simplicity, we will denote the 1-iteration DWT spatial decomposition as \(\{L, H\}\) where \(L=LL\) and \(H=\{LH, HL, HH\}\). See also DWT vs LPT.
DWT.py (see Fig. 1) implements the forward (DWT) and backward (iDWT) 1-iteration (step) DWT for color images.
import logging

import numpy as np
import pywt

import distortion
import image_3
import information

logger = logging.getLogger(__name__)
logger.setLevel(logging.WARNING)

# Wavelet (other options: "haar", "db1", "db20", "bior3.5", "bior6.8", ...).
_wavelet = pywt.Wavelet("db5")

# Number of levels of the DWT.
_N_levels = 5

# Signal extension mode ("periodization" generates the minimal number of
# coefficients; other options: "symmetric", "constant", "reflect", ...).
_extension_mode = "periodization"

logger.info(f"Wavelet={_wavelet}")
logger.info(f"DWT extension mode={_extension_mode}")

def analyze_step(color_image, wavelet=_wavelet):
    '''Color 1-level forward 2D-DWT.

    Parameters
    ----------
    color_image : [row, column, component] np.ndarray
        Color image to be analyzed.
    wavelet : pywt.Wavelet
        Wavelet name.

    Returns
    -------
    A 1-level color decomposition : tuple
        Subbands (LL, (LH, HL, HH)).
    '''
    logger.info(f"wavelet={wavelet}")
    N_comps = color_image.shape[2]
    decomposition_by_component = [None]*N_comps  # A color decomposition
    for c in range(N_comps):
        decomposition_by_component[c] = pywt.dwt2(data=color_image[:,:,c], wavelet=wavelet, mode=_extension_mode)
    assert decomposition_by_component[0][0].shape == decomposition_by_component[0][1][0].shape
    N_rows_subband, N_columns_subband = decomposition_by_component[0][0].shape  # All subbands have the same shape
    LL = np.empty(shape=(N_rows_subband, N_columns_subband, N_comps), dtype=np.float64)
    LH = np.empty(shape=(N_rows_subband, N_columns_subband, N_comps), dtype=np.float64)
    HL = np.empty(shape=(N_rows_subband, N_columns_subband, N_comps), dtype=np.float64)
    HH = np.empty(shape=(N_rows_subband, N_columns_subband, N_comps), dtype=np.float64)
    for c in range(N_comps):
        LL[:,:,c] = decomposition_by_component[c][0][:,:]
        LH[:,:,c] = decomposition_by_component[c][1][0][:,:]
        HL[:,:,c] = decomposition_by_component[c][1][1][:,:]
        HH[:,:,c] = decomposition_by_component[c][1][2][:,:]
    return (LL, (LH, HL, HH))  # Note: to be consistent, a list (a 1-level
                               # decomposition) should be returned instead
                               # of a tuple.

def synthesize_step(LL, H, wavelet=_wavelet):
    '''Color 1-level backward 2D-DWT.

    Parameters
    ----------
    LL : np.ndarray
        Low-pass subband.
    H : tuple
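A hypothetical round-trip check of this module (assuming that it can be imported as DWT and that synthesize_step returns the reconstructed image):

import numpy as np
import DWT

image = np.random.randint(0, 256, size=(64, 64, 3)).astype(np.float64)
LL, H = DWT.analyze_step(image)              # forward 1-level color 2D-DWT
reconstruction = DWT.synthesize_step(LL, H)  # backward 1-level color 2D-DWT
print(np.allclose(image, reconstruction))    # True (up to floating point error)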
An \(N\)-iterations6 dyadic DWT is the result of a recursive use of the 1-iteration DWT applied to the \(L\) subband, \(N\) times, producing an \((N+1)\)-level dyadic structure which is able to provide \(N+1\) Spatial Resolution Levels (SRLs). For example, in Fig. 2 the DWT has been applied 2 times, generating 3 SRLs: (1) \(L^2\), (2) \(L^1=\mathtt {iDWT}(L^2, H^2)\), and (3) \(L^0=\mathtt {iDWT}(L^1, H^1)\), where \(\mathtt {iDWT}\) stands for the inverse DWT.
+------+------+-------------+ 0     +------+------+-------------+
|      |      |             |       |      |      |             |
| LL^2 | LH^2 |             |       | L^2  |      |             |
|      |      |             |       |      |      |             |
+------+------+    LH^1     |       +------+      |             |
|      |      |             |       |             |             |
| HL^2 | HH^2 |             |       |     H^2     |             |
|      |      |             | Y/2-1 |             |             |
+------+------+-------------+   =   +-------------+             |
|             |             |       |                           |
|             |             |       |                           |
|             |             |       |                           |
|    HL^1     |    HH^1     |       |            H^1            |
|             |             |       |                           |
|             |             |       |                           |
|             |             | Y-1   |                           |
+-------------+-------------+       +---------------------------+
0           X/2-1         X-1
Depending on the number of subband levels (pyramid levels in the OLP), we can reconstruct an image at different spatial scales (resolutions). For example, if an image has been transformed using the DWT with \(2\) iterations (see Fig. 2), we get (a PyWavelets sketch of these reconstructions follows the list):
Subband \(L^2\), which is the scale \(2\) of the image. If such image is \(I\), we will denote the scale \(2\) as \(I.L^2\).
+------+
|      |
| L^2  | Y/4
|      |
+------+
  X/4
Subband \(L^1\coloneqq L=\mathtt {iDWT}(L^2, H^2)\), which is the scale \(I.L^1\).
+-------------+
|             |
|             |
|     L^1     | Y/2
|             |
|             |
+-------------+
      X/2
\(L^0=\mathtt {iDWT}(L, H)\) (the original image), which is the scale \(I.L^0=I\).
+---------------------------+
|                           |
|                           |
|                           |
|                           |
|                           |
|            L^0            | Y
|                           |
|                           |
|                           |
|                           |
|                           |
+---------------------------+
              X
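For example, assuming a grayscale image (DWT.py applies the same procedure to each color component), the three scales can be obtained with PyWavelets:

import numpy as np
import pywt

I = np.random.rand(64, 64)                   # stands for a Y x X grayscale image
wavelet, mode = "db5", "periodization"

L1, H1 = pywt.dwt2(I, wavelet, mode=mode)    # first iteration:  L^1, H^1
L2, H2 = pywt.dwt2(L1, wavelet, mode=mode)   # second iteration: L^2, H^2

scale_2 = L2                                             # I.L^2 (Y/4 x X/4)
scale_1 = pywt.idwt2((L2, H2), wavelet, mode=mode)       # I.L^1 = iDWT(L^2, H^2)
scale_0 = pywt.idwt2((scale_1, H1), wavelet, mode=mode)  # I.L^0 = iDWT(L^1, H^1)
print(np.allclose(I, scale_0))                           # True: the original image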
An image is transformed, generating four subbands \(LL\), \(LH\), \(HL\) and \(HH\).
# You must be in the 'src' directory.
# Copy an image (the output directory must be the same as the input one).
rm /tmp/*.png
cp ../sequences/stockholm/000.png /tmp
# 1-iteration 2D DWT.
python3 -O DWT.py -p /tmp/ -i 000
# Visualize the subbands.
display -normalize /tmp/LL000.png # <- LL^1
display -normalize /tmp/LH000.png # <- LH^1
display -normalize /tmp/HL000.png # <- HL^1
display -normalize /tmp/HH000.png # <- HH^1
# 1-iteration 2D iDWT.
python3 -O DWT.py -b -p /tmp/ -i 000
# Visualize the reconstruction.
python3 ../tools/substract_offset.py -i /tmp/000.png -o /tmp/1.png
display /tmp/1.png
# Show differences (PyWavelets uses floating point arithmetic).
../tools/show_differences.sh -1 /tmp/000.png -2 ../sequences/stockholm/000.png -o /tmp/diffs.png
display /tmp/diffs.png
# You must be in the 'src' directory.
# Copy an image (the output directory must be the same as the input one).
rm /tmp/*.png
cp ../sequences/stockholm/000.png /tmp
# First iteration.
python3 -O DWT.py -p /tmp/ -i 000
# Second iteration.
python3 -O DWT.py -p /tmp/LL -i 000
# Visualize the subbands.
ls -l /tmp/*.png
display -normalize /tmp/LL000.png   # <- LL^1
display -normalize /tmp/LH000.png   # <- LH^1
display -normalize /tmp/HL000.png   # <- HL^1
display -normalize /tmp/HH000.png   # <- HH^1
display -normalize /tmp/LLLL000.png # <- LL^2
display -normalize /tmp/LLLH000.png # <- LH^2
display -normalize /tmp/LLHL000.png # <- HL^2
display -normalize /tmp/LLHH000.png # <- HH^2
# Reverse the second iteration.
python3 -O DWT.py -b -p /tmp/LL -i 000
display -normalize /tmp/LL000.png
# Reverse the first iteration.
python3 -O DWT.py -b -p /tmp/ -i 000
python3 ../tools/substract_offset.py -i /tmp/000.png -o /tmp/1.png
display /tmp/1.png
# Show differences.
../tools/show_differences.sh -1 /tmp/000.png -2 ../sequences/stockholm/000.png -o /tmp/diffs.png
display /tmp/diffs.png
The 2D-DWT can be applied to a sequence of frames (images) by simply transforming each one independently. This is done, for example, in the Motion JPEG2000 video compression standard. Notice that only the spatial redundancy is exploited in MDWT (all the temporal redundancy still remains in the video). The decomposition structure generated by MDWT is shown in Fig. 3. The 1-iteration MDWT inputs a sequence of frames \(\{F_i\}\) and outputs a sequence of decompositions \(\{\{F_i\}.L, \{F_i\}.H\}\), also called a sequence of spatial subbands. Throughout this discussion, \(i\) is an integer starting at \(0\) (\(F_0\) is always the first frame of the video). Notice that \(\{F_i\}.L\) represents the scale \(1\) of the sequence (a version of the original video \(\{F_i\}\) but with a resolution of \(Y/2\times X/2\)), and that \(\{F_i\}.L^0 = \{F_i\} = \mathtt {iMDWT}(\{F_i\}.L,\{F_i\}.H )\). See Fig. 4 for more details.
class MDWT:

    def __init__(self, wavelet="bior3.5"):
        self.dwt = DWT(wavelet)

    def forward(self, prefix="/tmp/", N=5):
        '''1-iteration Motion 2D DWT of a sequence of images.

        Compute the forward 2D-DWT of each image of the sequence
        placed at <prefix>.

        Input
        -----
        prefix: the sequence of images to be transformed.
        N: number of images.

        Output
        ------
        (disk): the sequence of decompositions.
        '''
        for i in range(N):
            img = image.read(prefix, "{:03d}".format(i))
            pyr = self.dwt.forward(img)
            decomposition.write(pyr, prefix, "{:03d}".format(i))

    def backward(self, prefix="/tmp/", N=5):
        '''1-iteration Inverse Motion 2D DWT of a sequence of decompositions.

        Compute the inverse 2D-DWT of each decomposition of the
        sequence of decompositions placed at <prefix>.

        Input
        -----
        prefix: the sequence of decompositions to be transformed.
        N: the number of decompositions.

        Output
        ------
        (disk): the sequence of images.
        '''
        for i in range(N):
            pyr = decomposition.read(prefix, "{:03d}".format(i))
            img = self.dwt.backward(pyr)
            image.write(img, prefix, "{:03d}".format(i))
MDWT sequences (of (subband) decompositions) are scalable in space and in time. Spatial scalability is a direct consequence of the DWT (the number of SRLs can be controlled by the number of iterations of the DWT). On the other hand, it is trivial to observe that MDWT provides full temporal scalability (we can access the images randomly) because each image of the input sequence is transformed independently.
# You must be in the 'src' directory.
rm /tmp/*.png
# Create the output directory for the 1st-level decompositions.
yes | cp ../sequences/stockholm/*.png /tmp
# 2D 1-iteration MDWT.
python3 -O MDWT.py -p /tmp/ -N 5
# Visualize the subbands.
for i in /tmp/LL00?.png; do convert -normalize $i $i.norm; done
animate /tmp/LL*.norm
for i in /tmp/LH00?.png; do convert -normalize $i $i.norm; done
animate /tmp/LH*.norm
for i in /tmp/HL00?.png; do convert -normalize $i $i.norm; done
animate /tmp/HL*.norm
for i in /tmp/HH00?.png; do convert -normalize $i $i.norm; done
animate /tmp/HH*.norm
# 1-iteration iMDWT.
python3 -O MDWT.py -b -p /tmp/ -N 5
# Visualization of the reconstruction.
for i in /tmp/???.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/???.png.png
# Visualization of the residue.
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
# You must be in the 'src' directory.
rm /tmp/*.png
# Create the output directory for the 1st-level decompositions.
yes | cp ../sequences/stockholm/*.png /tmp
# 2-iterations MDWT.
python3 -O MDWT.py -p /tmp/ -N 5
python3 -O MDWT.py -p /tmp/LL -N 5
# 2-iterations iMDWT.
python3 -O MDWT.py -b -p /tmp/LL -N 5
python3 -O MDWT.py -b -p /tmp/ -N 5
# Visualization of the reconstruction.
for i in /tmp/???.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/???.png.png
# Visualization of the residue.
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
If we apply an inverse transform to the original frames7, supposing that they are the low-frequency subbands of the input decomposition, and that the high-frequency subbands are zero, we can interpolate each frame of the original sequence by a factor of \(2\) in each direction.
# You must be in the 'src' directory.
rm /tmp/*.png
# Copy the current images as LL subbands.
for i in {0..4}; do
  image_number=$(printf "%03d" $i)
  cp ../sequences/stockholm/$image_number.png /tmp/LL$image_number.png
done
# Apply the inverse DWT.
python3 -O MDWT.py -b -p /tmp/ -N 5
# Visualization of the reconstruction.
for i in /tmp/???.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/???.png.png
# Notice that the dynamic range of the sequence could have changed (the
# images could appear brighter or darker). This is a consequence of the
# relative gain of the LL subband with respect to the rest of subbands.
rm /tmp/*.norm
for i in /tmp/???.png; do convert -normalize $i $i.norm; done
animate /tmp/*.png.norm
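The same interpolation can be sketched directly with PyWavelets, treating each frame as the \(LL\) subband and synthesizing with zero \(H\) subbands (a minimal sketch, not part of the toolchain):

import numpy as np
import pywt

frame = np.random.rand(72, 88)                     # stands for a Y x X frame
zeros = np.zeros_like(frame)                       # empty LH, HL and HH subbands
interpolated = pywt.idwt2((frame, (zeros, zeros, zeros)),
                          "db5", mode="periodization")
print(interpolated.shape)                          # (144, 176), that is, 2Y x 2X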
To remove both the spatial and the temporal redundancy, two different alternatives are available:
Each choice has a number of pros and cons. For example, in a “t+2D” transform we can directly apply any frame predictor based on Motion Estimation (ME) because the input is a normal video. On the contrary, if we implement a “2D+t” transform, the input to the motion estimator is a sequence of decompositions. The overwhelming majority of DWTs are not shift invariant, which basically means that exactly the same object placed in two different frames at different positions will generate different wavelet coefficients, even if the displacement is an integer number of pixels and, therefore, the pixels of the object in both frames are identical. Consequently, motion estimators that compare coefficient values will not work accurately in the decomposition domain. On the other hand, if we want to provide true spatial scalability (processing only those spatial resolutions (scales) necessary to get a spatially scaled version of our video), a “t+2D” transformed video presents several drawbacks:
As will be seen later, the shift variance of the wavelet coefficients is a problem when we want to perform comparisons in the DWT domain. As might be expected, this problem can be mitigated if the downsampling operation is removed, as happens in the Overcomplete DWT and in the Laplacian Pyramid (LP). Fortunately, there is an equivalence between DWT subbands and a family of Laplacian Pyramids that we will call Orthogonal LPs (OLPs), and it is quite straightforward to convert a decomposition representation into a pyramid representation and vice versa, when the \(L\) and \(H\) filters used for both the DWT and the LP are orthogonal. If the filters used for generating the LP were not orthogonal, then it would not be possible to travel from the LP domain through the DWT domain without losing information.
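The shift variance of the (decimated) DWT, and the shift invariance of an overcomplete transform, can be verified numerically. A minimal 1D sketch, assuming PyWavelets (whose pywt.swt implements an undecimated transform):

import numpy as np
import pywt

x = np.random.rand(64)
x_shifted = np.roll(x, 1)                     # the same signal, displaced 1 sample

# Decimated DWT: the coefficients of the shifted signal are not a shifted
# copy of the original coefficients.
L, H = pywt.dwt(x, "db5", mode="periodization")
Ls, Hs = pywt.dwt(x_shifted, "db5", mode="periodization")
print(np.allclose(H, Hs))                     # False

# Undecimated (overcomplete) transform: the coefficients simply shift.
(cA, cD), = pywt.swt(x, "db5", level=1)
(cAs, cDs), = pywt.swt(x_shifted, "db5", level=1)
print(any(np.allclose(np.roll(cD, s), cDs) for s in (1, -1)))  # True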
MCOLP is a “2D+t” transform. The “2D” stage is a 1-iteration MDWT, and the “t” stage is basically a \(T\)-iteration 1D Motion Compensated (MC) DWT. The MDWT removes the spatial redundancy of the video, and the temporal DWT removes the temporal redundancy between the \(H\) subbands. The number of iterations \(T\) determines the GOP size \begin {equation} G=2^T. \label {eq:GOP_size} \end {equation}
We can think of MCOLP as an MCTF (Motion Compensated Temporal Filtering) applied to a given spatial resolution level, and because the motion is estimated using the previous lower resolution level (which is available at the decoder), it is not necessary to send the motion information in the code-stream. Notice also that this process can be used for any number of spatial resolution levels, simply by using MCOLP iteratively on the low spatial resolution level (resulting from the previous iteration of MRVC, which first applies an MDWT).
A peculiar feature of MCOLP is that the predict step used in the “t”-DWT is performed in the OLP domain, which is shift invariant because the coefficients have not been downsampled (as happens in the DWT domain).
The butterfly inputs three decompositions \(a=\{a.L, a.H\}\), \(b=\{b.L, b.H\}\) and \(c=\{c.L, c.H\}\), and outputs a residue subband \(\tilde {b}.H\), which replaces \(b.H\) in the original \(b\) decomposition. Therefore, after the use of the butterfly, we get \(a\), \(\tilde {b}=\{b.L, \tilde {b}.H\}\) (notice that only the high-frequency information of \(b\) has been compensated) and \(c\). This replacement is fully reversible because the forward transform uses only the information that the inverse transform will have access to.
Notice that the pyramid domain (which is near-invariant to pixel displacements) has been used to estimate and compensate the \(H\) subbands. This means that the motion has been estimated with integer (not sub-) pixel accuracy, and that the motion has been compensated also with integer-pixel accuracy in the OLP domain, which implies \(1/2\)-pixel accuracy in the DWT domain.
After the MC, a DWT is applied to \(\tilde {b}.H\) in order to obtain again a critically sampled version of these subbands. After this, the \(b.L\) subband is discarded (in fact, replaced by the original subband). The MC operation can generate information in this subband, because the MC can produce low-frequency information that cannot be represented by the \(H\) subband. However, hopefully the energy of the \(L\) subband is going to be small in most cases and, in any case, this is not an unsolvable problem because the decoder is going to replicate exactly this behaviour.
class MCOLP:

    def __init__(self, shape, wavelet="bior3.5"):
        self.zero_L = np.zeros(shape, np.float64)
        self.zero_H = (np.zeros(shape, np.float64),
                       np.zeros(shape, np.float64),
                       np.zeros(shape, np.float64))
        self.dwt = DWT(wavelet)

    def __forward_butterfly(self, aL, aH, bL, bH, cL, cH):
        '''Forward MCOLP butterfly.

        Input
        -----
        aL, aH, bL, bH, cL, cH: array[y, x, component], the
        decompositions of the images a, b and c.

        Output
        ------
        residue_bH: array[y, x, component], the residue of the
        high-frequency information of the image b.
        '''
        AL = self.dwt.backward((aL, self.zero_H))
        BL = self.dwt.backward((bL, self.zero_H))
        CL = self.dwt.backward((cL, self.zero_H))
        AH = self.dwt.backward((self.zero_L, aH))
        BH = self.dwt.backward((self.zero_L, bH))
        CH = self.dwt.backward((self.zero_L, cH))
        prediction_BH = predictor.generate_prediction(AL, BL, CL, AH, CH)
        residue_BH = BH - prediction_BH
        residue_bH = self.dwt.forward(residue_BH)
        return residue_bH[1]

    def __backward_butterfly(self, aL, aH, bL, residue_bH, cL, cH):
        '''Backward MCOLP butterfly.

        Input
        -----
        aL, aH, bL, residue_bH, cL, cH: array[y, x, component], the
        decompositions of the images a, \tilde{b} and c.

        Output
        ------
        bH: array[y, x, component], the high frequencies of the image b.
        '''
        AL = self.dwt.backward((aL, self.zero_H))
        BL = self.dwt.backward((bL, self.zero_H))
        CL = self.dwt.backward((cL, self.zero_H))
        AH = self.dwt.backward((self.zero_L, aH))
        residue_BH = self.dwt.backward((self.zero_L, residue_bH))
        CH = self.dwt.backward((self.zero_L, cH))
        prediction_BH = predictor.generate_prediction(AL, BL, CL, AH, CH)
        BH = residue_BH + prediction_BH
        bH = self.dwt.forward(BH)
        return bH[1]
def generate_prediction(aL, bL, cL, AH, CH):
    flow_AL_BL = motion_estimation(aL, bL)
    flow_CL_BL = motion_estimation(cL, bL)
    flow_AL_BL = DWT.backward((flow_AL_BL, [None, None, None]))
    flow_CL_BL = DWT.backward((flow_CL_BL, [None, None, None]))
The \(T\)-iteration MCOLP (see Fig. 6) is the result of applying the MCOLP butterfly to all the frames at \(T\) different temporal subsamplings (or scales). At iteration \(1\leq t\leq T\) of MCOLP, the indices of the involved frames are multiples of \(2^{t-1}\) (the compensated frames being the odd multiples).
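The following sketch (not part of MCOLP.py) simply prints which frames play the roles of \(a\), \(b\) and \(c\) at each temporal iteration, mirroring the loop of the forward transform listed below:

def print_schedule(N=5, T=2):
    '''Print the MCOLP butterfly schedule for N frames and T iterations.'''
    x = 2
    for t in range(T):                     # temporal scale
        print(f"iteration {t + 1}:")
        for i in range(N // x):
            a, b, c = x*i, x*i + x//2, x*i + x
            print(f"  a={a} b={b} c={c}")  # b.H is replaced by its residue
        x *= 2

print_schedule(N=5, T=2)
# iteration 1:
#   a=0 b=1 c=2
#   a=2 b=3 c=4
# iteration 2:
#   a=0 b=2 c=4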
def forward(self, prefix="/tmp/", N=5, T=2):
    '''Forward MCOLP.

    Compute a MC 1D-DWT, estimating in the L subbands and
    compensating in the H subbands of the Orthogonal Laplacian
    Pyramid.

    The input video (as a sequence of 1-iteration decompositions)
    must be stored on disk in the directory <prefix>, and the output
    (as a sequence of 1-iteration MC decompositions) will be
    generated in the same directory.

    Input
    -----
    prefix : str
        Localization of the input/output images. Example: "/tmp/".
    N : int
        Number of decompositions to process.
    T : int
        Number of iterations of the MCOLP (temporal scales).
        Controls the GOP size.

        T | GOP_size
        --+---------
        0 |  1
        1 |  2
        2 |  4
        3 |  8
        4 | 16
        5 | 32
        : |  :

    Returns
    -------
    The output motion compensated decompositions.
    '''
    x = 2
    for t in range(T):  # Temporal scale
        i = 0
        aL, aH = decomposition.read(prefix, "{:03d}".format(0))
        while i < (N//x):
            bL, bH = decomposition.read(prefix, "{:03d}".format(x*i+x//2))
            cL, cH = decomposition.read(prefix, "{:03d}".format(x*i+x))
            residue_bH = self.__forward_butterfly(aL, aH, bL, bH, cL, cH)
            decomposition.writeH(residue_bH, prefix, "{:03d}".format(x*i+x//2))
            aL, aH = cL, cH
            i += 1
        x *= 2

def backward(self, prefix="/tmp/", N=5, T=2):
    '''Backward MCOLP.

    Compute the inverse MC 1D-DWT.

    The input sequence of 1-iteration MC decompositions must be
    stored on disk in the directory <prefix>, and the output (as a
    sequence of 1-iteration decompositions) will be generated in the
    same directory.

    Input
    -----
    prefix : str
        Localization of the input/output images. Example: "/tmp/".
    N : int
        Number of decompositions to process.
    T : int
        Number of iterations of the MCOLP (temporal scales).

    Returns
    -------
    The sequence of 1-iteration decompositions.
    '''
    x = 2**T
    for t in range(T):  # Temporal scale
        i = 0
        aL, aH = decomposition.read(prefix, "{:03d}".format(0))
        while i < (N//x):
            bL, residue_bH = decomposition.read(prefix, "{:03d}".format(x*i+x//2))
            cL, cH = decomposition.read(prefix, "{:03d}".format(x*i+x))
            bH = self.__backward_butterfly(aL, aH, bL, residue_bH, cL, cH)
            decomposition.writeH(bH, prefix, "{:03d}".format(x*i+x//2))
            aL, aH = cL, cH
            i += 1
        x //= 2
Notice that, if the parameter \(N=G+1\) (where \(G\) is the GOP size, see Eq. 1), all the \(H\) subbands are compensated, except those of the first and the last frame of the sequence.
In this example (see Fig. 7), a sequence of five frames is first transformed with \(\mathtt {MDWT}(N=5)\), producing 2 SRLs per frame. Next, the \(H\) subbands of the odd frames (frames 1 and 3) are motion compensated using \(\mathtt {MCOLP}(T=1, N=5)\).
predictor=1
iterations=1
# You must be in the 'src' directory to run this.
rm -f /tmp/*.png
cp ../sequences/stockholm/*.png /tmp
# 1-iteration MDWT.
python3 -O MDWT.py -p /tmp/
# Show the length of the subbands.
for i in /tmp/LL???.png; do ls -l $i; done
for i in /tmp/LH???.png; do ls -l $i; done
for i in /tmp/HL???.png; do ls -l $i; done
for i in /tmp/HH???.png; do ls -l $i; done
# 1-iteration MCOLP.
python3 -O MCOLP.py -P $predictor -p /tmp/ -T $iterations
# Show the length of the subbands.
for i in /tmp/LL???.png; do ls -l $i; done
for i in /tmp/LH???.png; do ls -l $i; done
for i in /tmp/HL???.png; do ls -l $i; done
for i in /tmp/HH???.png; do ls -l $i; done
# Has the length of any of them changed? Remember that a change in length
# implies a change in content.
# Let's recover the original sequence ...
rm /tmp/???.png
# 1-iteration iMCOLP.
python3 -O MCOLP.py -P $predictor -b -p /tmp/ -T $iterations
# 1-iteration iMDWT.
python3 -O MDWT.py -b -p /tmp/
# Visualization of the reconstruction.
for i in /tmp/???.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/???.png.png
# Visualization of the residue.
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
This example is similar to the previous one, but now the number of iterations of MCOLP is \(T=2\) and, therefore, the second iteration of MCOLP also compensates the \(H\) subband of frame 2. This example matches exactly the one described in Fig. 8. Notice that the GOP size is \(4\) (\(3\) TRLs) and the number of SRLs is \(2\).
predictor=1
# You must be in the 'src' directory.
rm /tmp/*.png
cp ../sequences/stockholm/*.png /tmp
# 1-iteration MDWT.
python3 -O MDWT.py -p /tmp/
# Save a copy of the MDWT subbands (for comparing later).
rm -rf /tmp/tmp
mkdir /tmp/tmp
cp /tmp/*.png /tmp/tmp
# 1-iteration MCOLP.
python3 -O MCOLP.py -p /tmp/ -T 1
# Is the content of the MCOLP subbands different from the MDWT's output?
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/LL$ii.png /tmp/tmp/LL$ii.png; echo; done
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/LH$ii.png /tmp/tmp/LH$ii.png; echo; done
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/HL$ii.png /tmp/tmp/HL$ii.png; echo; done
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/HH$ii.png /tmp/tmp/HH$ii.png; echo; done

rm /tmp/*.png
cp ../sequences/stockholm/*.png /tmp
# 1-iteration MDWT.
python3 -O MDWT.py -p /tmp/
# Save a copy of the MDWT subbands (for comparing later).
rm -rf /tmp/tmp
mkdir /tmp/tmp
cp /tmp/*.png /tmp/tmp
# 2-iterations MCOLP.
python3 -O MCOLP.py -p /tmp/ -T 2
# Is the content of the MCOLP subbands different from the MDWT's output?
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/LL$ii.png /tmp/tmp/LL$ii.png; echo; done
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/LH$ii.png /tmp/tmp/LH$ii.png; echo; done
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/HL$ii.png /tmp/tmp/HL$ii.png; echo; done
for i in {0..4}; do ii=$(printf "%03d" $i); ls -l /tmp/HH$ii.png /tmp/tmp/HH$ii.png; echo; done
# Comparing LH visually.
for i in {0..4}; do ii=$(printf "%03d" $i); convert -normalize /tmp/tmp/LH$ii.png /tmp/tmp/MDWT_LH$ii.png; done
for i in {0..4}; do ii=$(printf "%03d" $i); convert -normalize /tmp/LH$ii.png /tmp/MCOLP_LH$ii.png; done
animate /tmp/tmp/MDWT_LH00?.png &
animate /tmp/MCOLP_LH00?.png &
# Comparing LH002 numerically.
echo MDWT > /tmp/1
printf "%61s\n" MCOLP > /tmp/2
python3 ../tools/show_statistics.py -i /tmp/tmp/LH002.png >> /tmp/1
python3 ../tools/show_statistics.py -i /tmp/LH002.png >> /tmp/2
paste /tmp/1 /tmp/2
# Comparing HL visually.
for i in {0..4}; do ii=$(printf "%03d" $i); convert -normalize /tmp/tmp/HL$ii.png /tmp/tmp/MDWT_HL$ii.png; done
for i in {0..4}; do ii=$(printf "%03d" $i); convert -normalize /tmp/HL$ii.png /tmp/MCOLP_HL$ii.png; done
animate /tmp/tmp/MDWT_HL00?.png &
animate /tmp/MCOLP_HL00?.png &
# Comparing HL002 numerically.
echo MDWT > /tmp/1
printf "%61s\n" MCOLP > /tmp/2
python3 ../tools/show_statistics.py -i /tmp/tmp/HL002.png >> /tmp/1
python3 ../tools/show_statistics.py -i /tmp/HL002.png >> /tmp/2
paste /tmp/1 /tmp/2
# Comparing HH visually.
for i in {0..4}; do ii=$(printf "%03d" $i); convert -normalize /tmp/tmp/HH$ii.png /tmp/tmp/MDWT_HH$ii.png; done
for i in {0..4}; do ii=$(printf "%03d" $i); convert -normalize /tmp/HH$ii.png /tmp/MCOLP_HH$ii.png; done
animate /tmp/tmp/MDWT_HH00?.png &
animate /tmp/MCOLP_HH00?.png &
# Comparing HH002 numerically.
echo MDWT > /tmp/1
printf "%61s\n" MCOLP > /tmp/2
python3 ../tools/show_statistics.py -i /tmp/tmp/HH002.png >> /tmp/1
python3 ../tools/show_statistics.py -i /tmp/HH002.png >> /tmp/2
paste /tmp/1 /tmp/2
# Now we recover the original video.
# 2-iterations iMCOLP.
python3 -O MCOLP.py -P $predictor -b -p /tmp/ -T 2
# 1-iteration iMDWT.
python3 -O MDWT.py -b -p /tmp/
# Visualization of the reconstruction.
for i in /tmp/???.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/???.png.png
# Visualization of the residue.
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
python3 ../tools/show_statistics.py -i /tmp/diff_002.png
To simplify the notation, we will represent \(\{F_t\}.L^s\) by \(L_t^s\), and \(\{F_t\}.H^s\) by \(H_t^s\). Thus, for example, the subbands shown in Fig. 8 can be represented as: (1) \(L\) (a version of the sequence with spatial resolution \(X/2\times Y/2\)), (2) \(H_{2^2i}\) (made up of the intra-coded high-frequency spatial information of the frames with numbers \(2^2i\)), (3) \(\tilde {H}_{2^2i+2}\) (with the high-frequency spatio-temporal prediction error resulting from decorrelating even frames with a distance of \(2^2\)), and (4) \(\tilde {H}_{2i+1}\) (high-frequency spatio-temporal prediction error resulting from decorrelating frames with a distance of \(2\)).
An example with \(G=2^4=16\), i.e. \(5\) TRLs and \(2\) SRLs. The subbands generated are:
The temporal decorrelation pattern followed by \(\mathtt {iMCOLP}(T=4, N=17)\) is shown in Fig. 9 (notice that hexadecimal notation is used for indexing the frames).
GOP0 GOP1
---- ------------------------------------------------
F0                      F8                      F10   -> H_{8i}   (= TRL3)
            F4                      Fc                 -> H_{8i+4} (+ TRL3 = TRL2)
      F2          F6          Fa          Fe          -> H_{4i+2} (+ TRL2 = TRL1)
   F1    F3    F5    F7    F9    Fb    Fd    Ff       -> H_{2i+1} (+ TRL1 = TRL0)
# You must be in the 'src' directory.
rm /tmp/*.png
video_file=/tmp/akiyo_cif.y4m
if test -f $video_file; then
  echo akiyo exists
else
  wget https://media.xiph.org/video/derf/y4m/akiyo_cif.y4m -O /tmp/akiyo_cif.y4m
fi
mplayer /tmp/akiyo_cif.y4m
ffmpeg -i /tmp/akiyo_cif.y4m -vframes 17 -start_number 0 /tmp/%03d.png
# Save the original frames (for computing the residue later).
rm -rf /tmp/original
mkdir /tmp/original
cp /tmp/???.png /tmp/original
# 1-iteration MDWT.
python3 -O MDWT.py -p /tmp/ -N 17
# Save a copy of the MDWT subbands (for comparing later).
rm -rf /tmp/tmp
mkdir /tmp/tmp
cp /tmp/LL*.png /tmp/tmp
cp /tmp/LH*.png /tmp/tmp
cp /tmp/HL*.png /tmp/tmp
cp /tmp/HH*.png /tmp/tmp
# 4-iterations MCOLP.
python3 -O MCOLP.py -p /tmp/ -T 4 -N 17
ls -l /tmp/LH*.png > /tmp/1
ls -l /tmp/tmp/LH*.png > /tmp/2
paste /tmp/1 /tmp/2
ls -l /tmp/HL*.png > /tmp/1
ls -l /tmp/tmp/HL*.png > /tmp/2
paste /tmp/1 /tmp/2
ls -l /tmp/HH*.png > /tmp/1
ls -l /tmp/tmp/HH*.png > /tmp/2
paste /tmp/1 /tmp/2
# Now we recover the original video.
# 4-iterations iMCOLP.
python3 -O MCOLP.py -b -p /tmp/ -T 4 -N 17
# 1-iteration iMDWT.
python3 -O MDWT.py -b -p /tmp/ -N 17
# Visualization of the reconstruction.
for i in /tmp/???.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/???.png.png
# Visualization of the residue.
for i in {0..16}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 /tmp/original/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
python3 ../tools/show_statistics.py -i /tmp/diff_002.png
MCOLP preserves the dyadic spatial decomposition generated by MDWT and, therefore, MCOLP provides the same spatial scalability as MDWT.
Unfortunately, like the rest of the video encoders based on MC, MCOLP reduces the temporal scalability provided by MDWT, allowing (in the case of MCOLP) only dyadic access to the frames.
predictor=1
iterations=2
# You must be in the 'src' directory.
rm /tmp/*.png
cp ../sequences/stockholm/* /tmp
python3 -O MDWT.py -p /tmp/
python3 -O MCOLP.py -p /tmp/ -P $predictor -T $iterations
python3 -O MDWT.py -p /tmp/LL
python3 -O MCOLP.py -p /tmp/LL -P $predictor -T $iterations
python3 -O MDWT.py -p /tmp/LLLL
python3 -O MCOLP.py -p /tmp/LLLL -P $predictor -T $iterations
rm /tmp/00?.png
python3 -O MCOLP.py -p /tmp/LLLL -P $predictor -b -T $iterations
python3 -O MDWT.py -p /tmp/LLLL -b
python3 -O MCOLP.py -p /tmp/LL -P $predictor -b -T $iterations
python3 -O MDWT.py -p /tmp/LL -b
python3 -O MCOLP.py -p /tmp/ -P $predictor -b -T $iterations
python3 -O MDWT.py -p /tmp/ -b
for i in /tmp/00?.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/00?.png.png
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
In this case \(G=4\), which generates \(3\) TRLs (see Fig. 10), and the number of SRLs is \(4\) (see Fig. 11).
TRLs:

GOP0 GOP1
---- ------------
F0          F4  -> H_{4i}   (= TRL2)
      F2        -> H_{4i+2} (+ TRL2 = TRL1)
   F1    F3     -> H_{2i+1} (+ TRL1 = TRL0)
+---+---+-------+---------------+
| 3 | 2 |   1   |       0       |
+---+   |       |               |
|       |       |               |
+-------+       |               |
|               |               |
|               |               |
|               |               |
+---------------+               |
|                               |
|                               |
|                               |
|                               |
|                               |
|                               |
|                               |
|                               |
|                               |
|                               |
+-------------------------------+
As can be seen in the MCOLP butterfly (see Fig. 5), both predictions \([b_a.H]\) and \([b_c.H]\) have the same weight in the prediction \begin {equation} [\hat {b.H}] = \frac {[b_a.H] + [b_c.H]}{2}. \end {equation} This simple computation, which can be very effective, for example, in video dissolve transitions, can also be counterproductive when objects appear in only one of the reference frames. For this reason, in this section an improved predictor is proposed, based on the estimated prediction error (see \([\tilde {b}.H]\) in Fig. 5) generated throughout the ME/MC. To estimate this error, we will suppose that the prediction error generated for the high frequencies of a frame is correlated with the prediction error generated for the low frequencies of such frame.
Let \([b_a.L]\) be the prediction generated for the subband \([b.L]\) using \([a.L]\) as reference and \([a.L]\rightarrow [b.L]\) as motion, and let \([b_c.L]\) be the prediction generated for \([b.L]\) using \([c.L]\) as reference and \([c.L]\rightarrow [b.L]\) as motion. We now compute the prediction errors \begin {equation} \begin {array}{l} {[e_a.L]} = [b.L] - [b_a.L]\\ {[e_c.L]} = [b.L] - [b_c.L]. \end {array} \end {equation}
We define the similarity matrices as \begin {equation} \begin {array}{l} {[s_a.L]} = \frac {1}{1+{|[e_a.L]|}}\\ {[s_c.L]} = \frac {1}{1+{|[e_c.L]|}}. \end {array} \label {eq:weighted_prediction} \end {equation} Notice that, if (for example) the error \([e_a.L]_{x,y}=0\), the similarity is \([s_a.L]_{x,y}=1\) (the maximum similarity). On the other hand, if the error is high, the similarity tends to be low (but never \(0\)).
With this information, which can be recovered by the decoder, the improved prediction is defined as \begin {equation} [\hat {b.H}] = \frac {[b_a.H][s_a.L]+[s_c.L][b_c.H]}{[s_a.L]+[s_c.L]}. \end {equation} Fig. 12 shows an implementation of this equation.
Notice that, if (for example) \([s_a.L]_{x,y}=1\) and \([s_c.L]_{x,y}\approx 0\), then \([\hat {b.H}]_{x,y} \approx [b_a.H]_{x,y}\), and vice versa. If \([s_a.L]_{x,y}=[s_c.L]_{x,y}\), then \([\hat {b.H}]_{x,y}=\frac {[b_a.H]_{x,y}+[b_c.H]_{x,y}}{2}\) (even if both similarities are small).
def generate_prediction(AL, BL, CL, AH, CH):
    flow_AL_BL = motion_estimation(AL, BL)
    flow_CL_BL = motion_estimation(CL, BL)
    BAH = estimate_frame(AH, flow_AL_BL)  # prediction [b_a.H]
    BCH = estimate_frame(CH, flow_CL_BL)  # prediction [b_c.H]
    BAL = estimate_frame(AL, flow_AL_BL)  # prediction [b_a.L]
    BCL = estimate_frame(CL, flow_CL_BL)  # prediction [b_c.L]
    EAL = BL - BAL                        # error [e_a.L]
    ECL = BL - BCL                        # error [e_c.L]
    SAL = 1/(1+abs(EAL))                  # similarity [s_a.L]
    SCL = 1/(1+abs(ECL))                  # similarity [s_c.L]
    return (BAH*SAL+BCH*SCL)/(SAL+SCL)
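A quick numerical check of these limiting behaviours, using hypothetical scalar values (assuming NumPy):

import numpy as np

BAH, BCH = np.array([10.0]), np.array([-6.0])    # the two H predictions
EAL, ECL = np.array([0.0]), np.array([100.0])    # low-frequency prediction errors
SAL, SCL = 1/(1 + abs(EAL)), 1/(1 + abs(ECL))    # similarities: 1.0 and ~0.01
print((BAH*SAL + BCH*SCL)/(SAL + SCL))           # ~[9.84]: almost [b_a.H]

EAL, ECL = np.array([5.0]), np.array([5.0])      # equal errors
SAL, SCL = 1/(1 + abs(EAL)), 1/(1 + abs(ECL))
print((BAH*SAL + BCH*SCL)/(SAL + SCL))           # [2.]: the plain average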
Quantization allows us to remove information from signals. We will use it in MCOLP to achieve higher compression ratios, but at the expense of losing visual information (ideally, the information that is less relevant for the HVS should be removed first).
When the signal cannot be recovered perfectly, we can use a distortion metric to quantify how much error has been generated.
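The examples below use ../tools/quantize.py and ../tools/MSE.py; the following sketch is a simplified stand-in (an assumed behaviour, not the actual tools) showing a uniform dead-zone quantizer and the MSE distortion:

import numpy as np

def quantize(x, q_step):
    return (x / q_step).astype(np.int32)     # quantization indices (dead-zone)

def dequantize(k, q_step):
    return (k * q_step).astype(np.float64)   # reconstructed values

def MSE(x, y):
    return np.mean((x.astype(np.float64) - y)**2)

subband = np.random.randn(64, 64) * 100      # stands for an H subband
k = quantize(subband, q_step=32)
reconstruction = dequantize(k, q_step=32)
print(MSE(subband, reconstruction))          # the distortion grows with q_step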
predictor=2
# You must be in the 'src' directory.
rm /tmp/*.png
cp ../sequences/stockholm/*.png /tmp
# 1-iteration MDWT.
python3 -O MDWT.py -p /tmp/
# 1-iteration MCOLP.
python3 -O MCOLP.py -p /tmp/
# Let's delete the motion compensated subbands and reconstruct the sequence ...
rm -f /tmp/LH001.png
rm -f /tmp/HL001.png
rm -f /tmp/HH001.png
rm -f /tmp/LH002.png
rm -f /tmp/HL002.png
rm -f /tmp/HH002.png
rm -f /tmp/LH003.png
rm -f /tmp/HL003.png
rm -f /tmp/HH003.png
# 1-iteration iMCOLP.
python3 -O MCOLP.py -P $predictor -b -p /tmp/
# 1-iteration iMDWT.
python3 -O MDWT.py -b -p /tmp/
# Visualization of the reconstruction.
for i in /tmp/00?.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/00?.png.png
# Visualization of the differences with the original sequence.
rm -f /tmp/diff*.png
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
# MSE per frame.
for i in {0..4}; do ii=$(printf "%03d" $i); MSE=$(python3 -O ../tools/MSE.py -x ../sequences/stockholm/$ii.png -y /tmp/$ii.png); printf "MSE(%s)=%s\n" $ii $MSE; done
# PSNR per frame.
for i in {0..4}; do ii=$(printf "%03d" $i); PSNR=$(python3 -O ../tools/PSNR.py -x ../sequences/stockholm/$ii.png -y /tmp/$ii.png); printf "PSNR(%s)=%s\n" $ii $PSNR; done
q_step=32
predictor=2
# You must be in the 'src' directory.
rm /tmp/*.png
cp ../sequences/stockholm/* /tmp
python3 -O MDWT.py -p /tmp/
python3 -O MCOLP.py -P $predictor -p /tmp/
#python3 ../tools/show_statistics.py -i /tmp/LH001.png
python3 ../tools/quantize.py -i /tmp/LH001.png -o /tmp/LH001.png -q $q_step
#python3 ../tools/show_statistics.py -i /tmp/LH001.png
python3 ../tools/quantize.py -i /tmp/HL001.png -o /tmp/HL001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH001.png -o /tmp/HH001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH002.png -o /tmp/LH002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL002.png -o /tmp/HL002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH002.png -o /tmp/HH002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH003.png -o /tmp/LH003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL003.png -o /tmp/HL003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH003.png -o /tmp/HH003.png -q $q_step
python3 -O MCOLP.py -P $predictor -p /tmp/ -b
python3 -O MDWT.py -p /tmp/ -b
for i in /tmp/00?.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/00?.png.png
rm -f /tmp/diff*.png
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
for i in {0..4}; do ii=$(printf "%03d" $i); MSE=$(python3 -O ../tools/MSE.py -x ../sequences/stockholm/$ii.png -y /tmp/$ii.png); printf "MSE(%s)=%s\n" $ii $MSE; done
q_step=32
predictor=2
# You must be in the 'src' directory.
rm /tmp/*.png
cp ../sequences/stockholm/* /tmp
python3 -O MDWT.py -p /tmp/
python3 -O MCOLP.py -P $predictor -p /tmp/
python3 ../tools/quantize.py -i /tmp/LH000.png -o /tmp/LH000.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL000.png -o /tmp/HL000.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH000.png -o /tmp/HH000.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH001.png -o /tmp/LH001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL001.png -o /tmp/HL001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH001.png -o /tmp/HH001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH002.png -o /tmp/LH002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL002.png -o /tmp/HL002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH002.png -o /tmp/HH002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH003.png -o /tmp/LH003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL003.png -o /tmp/HL003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH003.png -o /tmp/HH003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH004.png -o /tmp/LH004.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL004.png -o /tmp/HL004.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH004.png -o /tmp/HH004.png -q $q_step
python3 -O MCOLP.py -P $predictor -p /tmp/ -b
python3 -O MDWT.py -p /tmp/ -b
for i in /tmp/00?.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/00?.png.png
rm -f /tmp/diff*.png
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
for i in {0..4}; do ii=$(printf "%03d" $i); MSE=$(python3 -O ../tools/MSE.py -x ../sequences/stockholm/$ii.png -y /tmp/$ii.png); printf "MSE(%s)=%s\n" $ii $MSE; done
q_step=128
predictor=2
# You must be in the 'src' directory.
rm /tmp/*.png
cp ../sequences/stockholm/* /tmp
python3 -O MDWT.py -p /tmp/
python3 -O MCOLP.py -P $predictor -p /tmp/
python3 ../tools/quantize.py -i /tmp/LL000.png -o /tmp/LL000.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH000.png -o /tmp/LH000.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL000.png -o /tmp/HL000.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH000.png -o /tmp/HH000.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LL001.png -o /tmp/LL001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH001.png -o /tmp/LH001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL001.png -o /tmp/HL001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH001.png -o /tmp/HH001.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LL002.png -o /tmp/LL002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH002.png -o /tmp/LH002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL002.png -o /tmp/HL002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH002.png -o /tmp/HH002.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LL003.png -o /tmp/LL003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH003.png -o /tmp/LH003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL003.png -o /tmp/HL003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH003.png -o /tmp/HH003.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LL004.png -o /tmp/LL004.png -q $q_step
python3 ../tools/quantize.py -i /tmp/LH004.png -o /tmp/LH004.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HL004.png -o /tmp/HL004.png -q $q_step
python3 ../tools/quantize.py -i /tmp/HH004.png -o /tmp/HH004.png -q $q_step
python3 -O MCOLP.py -P $predictor -p /tmp/ -b
python3 -O MDWT.py -p /tmp/ -b
for i in /tmp/00?.png; do python3 ../tools/substract_offset.py -i $i -o $i.png; done
animate /tmp/00?.png.png
rm -f /tmp/diff*.png
for i in {0..4}; do ii=$(printf "%03d" $i); bash ../tools/show_differences.sh -1 /tmp/$ii.png -2 ../sequences/stockholm/$ii.png -o /tmp/diff_$ii.png; done
animate /tmp/diff*.png
for i in {0..4}; do ii=$(printf "%03d" $i); MSE=$(python3 -O ../tools/MSE.py -x ../sequences/stockholm/$ii.png -y /tmp/$ii.png); printf "MSE(%s)=%s\n" $ii $MSE; done
1Also called frames, which, from a purely mathematical point of view, are matrices of pixels.
2In this document, we will use the term DWT to refer to the 2D-DWT.
3The structure formed by the subbands is also called a decomposition.
4This operation is also called “decimation” and “subsampling”.
5It does, however, affect the independence of the coefficients in terms of their energy contribution to the reconstruction of the signal.
6Notice that PyWavelets uses the term “levels” to refer to the number of iterations of the 1-level DWT.
7In video coding, images are commonly referred to as frames.