Enhanced Image/Video Compression Using Diagonal Divide




Husnain Malik 1), F. Naeem 2), N. Arora 3)
Monash University, Caulfield Campus, CSSE Dept., Australia
1) [email protected], 2) [email protected], 3) [email protected]

Abstract

This paper introduces the concept of the Diagonal Divide for image compression, which is based on the Discrete Cosine Transform. Most image/signal compression schemes focus on energy compaction and decorrelation. The proposed technique exploits this energy compaction to further reduce the amount of data to be processed. In addition, results for images before and after the implementation of the Diagonal Divide are compared. The criterion used here for measuring the quality of the resulting images is the Mean Square Error (MSE). Keeping in mind that Mean Square Error is not a reliable criterion for measuring quality, images are also included so that the reader can better judge the outcome of introducing this compression scheme.

1. Introduction

Although significant gains in storage, transmission, and processor technology have been achieved in recent years, it is primarily the reduction of the amount of data that needs to be stored, transmitted, and processed that has made widespread use of digital video a possibility [4]. The proposed approach can be applied to all energy-compaction transforms typically used in video coding. A codec system uses a number of video compression techniques to reduce the amount of data. The modules usually present in such a system are shown in Figure 1 [3].

Figure 1: Video encoder

This paper briefly describes the phases through which the input image passes before our proposed module is encountered. More emphasis is given to our approach, which drastically reduces the amount of data resulting from the quantized DCT coefficients compared to the rest of the video codec modules.

2. Discrete Cosine Transform (DCT)

The first phase of the video encoder is to perform the Discrete Cosine Transform (DCT) on the input image. The DCT is a well-known transform whose strong energy-packing property can be exploited to greatly compress the data before transmission.


The DCT is performed on a block of horizontally and vertically adjacent pixels [1]. The outputs represent the amplitudes of two-dimensional spatial frequency components, called DCT coefficients. The coefficient for zero spatial frequency is called the DC coefficient, and it is the average value of all the pixels in the block. The remaining coefficients represent progressively higher horizontal and vertical spatial frequencies in the block [7]. Since adjacent pixel values tend to be similar or to vary slowly from one to another, DCT processing provides an opportunity for compression by forcing most of the energy into the lower spatial frequency components. In most cases, many of the higher-frequency coefficients will have zero (or near-zero) values and can therefore be ignored. The decoder performs the reverse process, but due to the transcendental nature of the DCT the reverse process can only be approximated, and hence some loss takes place. In this system an 8x8 DCT is used, converting an 8 by 8 (pixel) block into another 8 by 8 block of coefficients [4]. As explained above, after this transformation most of the energy is concentrated in the top-left corner of the resulting matrix. The DC coefficient, located at the upper-left corner, holds most of the image energy and represents the scaled average of the 64 pixels in the block. The remaining 63 coefficients denote the intensity changes within the block. The 2-D DCT is computed as [8]:

$$Y = C \, U \, C^{T} \qquad (1)$$

where Y is the output of the DCT process, C is the N x N DCT matrix, U is the 8x8 image block, and C^T is the transpose of C. The 8x8 DCT matrix C = c(k, n) is defined as [8]:

$$c(k, n) =
\begin{cases}
\sqrt{1/N}, & k = 0,\; 0 \le n \le N-1 \\
\sqrt{2/N}\,\cos\dfrac{(2n+1)k\pi}{2N}, & 1 \le k \le N-1,\; 0 \le n \le N-1
\end{cases} \qquad (2)$$
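As an illustration only, the following NumPy sketch builds the 8x8 DCT matrix of equation (2) and applies equation (1) to a block; the block contents and helper names are placeholders, not taken from the paper's implementation.

```python
import numpy as np

def dct_matrix(N=8):
    """Build the N x N DCT matrix C = c(k, n) of equation (2)."""
    C = np.zeros((N, N))
    for k in range(N):
        for n in range(N):
            if k == 0:
                C[k, n] = np.sqrt(1.0 / N)
            else:
                C[k, n] = np.sqrt(2.0 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
    return C

C = dct_matrix(8)

# Forward 2-D DCT of an 8x8 pixel block U (equation (1)): Y = C U C^T.
U = np.random.randint(0, 256, size=(8, 8)).astype(float)  # hypothetical pixel block
Y = C @ U @ C.T
```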

On the decoder side the inverse DCT is performed in order to reconstruct the image matrix. The inverse DCT is given by

$$U = C^{T} \, Y \, C \qquad (3)$$

where U is the reconstructed image block and Y is the DCT output from equation (1).
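Continuing the sketch above, the inverse transform of equation (3) recovers the block exactly (up to floating-point error) as long as no quantization has been applied:

```python
# Inverse 2-D DCT (equation (3)): U = C^T Y C.
U_rec = C.T @ Y @ C
assert np.allclose(U, U_rec)  # lossless round trip before quantization
```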

3. Quantization

In this phase the encoder applies variable quantization to the DCT coefficients to reduce the number of bits required to represent them. Quantization is a lossy compression technique, meaning that some data is lost in this phase and cannot be recovered at all. Special care must therefore be taken as to which frequency components are quantized coarsely and which are not. The encoder performs quantization as follows. Let DCT be an un-quantized DCT coefficient, QuantizationMatrix the corresponding value in the quantization matrix, Q_s the quantization step size, and QM the quantized DCT matrix value:

$$QM = \frac{8 \cdot DCT}{QuantizationMatrix \cdot Q_s} \qquad (4)$$


The above formula is for the AC coefficients of the un-quantized DCT matrix. Varying the quantization step size directly affects the bit rate and the Peak Signal-to-Noise Ratio (PSNR). The DC coefficient is quantized as follows [3]:

$$QM = \frac{DC\text{-}coefficient}{8} \qquad (5)$$

QM should not be a fractional number, so its value is rounded off to an integer. Equations (4) and (5) are used for intra-frame quantization. After the first frame is sent to the decoder, only the difference between frames is transmitted; to quantize this difference an inter-frame quantization matrix is used [1, 3]. On the decoder side, dequantization is performed by simply inverting the quantization formulas. The matrices mentioned above are suitable for the coefficients resulting from the discrete cosine transform. After DCT, quantization, inverse quantization and inverse DCT, the resultant image is almost the same as the original. But since these phases are lossy compression techniques, the reconstructed image and the original image are not exactly the same. The Mean Square Error (MSE) can be used to measure the loss incurred during the compression phases.
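A minimal sketch of intra-frame quantization and dequantization following equations (4) and (5); the quantization matrix is assumed to be supplied by the caller, since the paper does not reproduce a specific one, and the rounding step is where the irrecoverable loss occurs.

```python
import numpy as np

def quantize(Y, quant_matrix, Qs):
    """Quantize an 8x8 DCT block (equations (4) and (5)), rounding to integers."""
    QM = np.empty_like(Y, dtype=float)
    QM[0, 0] = Y[0, 0] / 8.0                          # DC coefficient, equation (5)
    ac = np.ones(Y.shape, dtype=bool)
    ac[0, 0] = False
    QM[ac] = 8.0 * Y[ac] / (quant_matrix[ac] * Qs)    # AC coefficients, equation (4)
    return np.rint(QM)

def dequantize(QM, quant_matrix, Qs):
    """Invert the quantization formulas (the rounding loss cannot be undone)."""
    Y = np.empty_like(QM, dtype=float)
    Y[0, 0] = QM[0, 0] * 8.0
    ac = np.ones(QM.shape, dtype=bool)
    ac[0, 0] = False
    Y[ac] = QM[ac] * quant_matrix[ac] * Qs / 8.0
    return Y
```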

4. Zigzag Scan and Run Length Coding

After DCT and quantization most AC values are zero. A zigzag scan gathers these zeros into longer consecutive runs, after which Run Length Encoding is used to further improve the compression ratio. An example zigzag scan is shown in Figure 2.

Figure 2: Zigzag scan

As is evident from Figure 2 [4], after the zigzag scan there are many consecutive zeros. Simply counting these consecutive zeros and sending only the count to the decoder drastically reduces the number of bits to be transferred.
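As a sketch of the traversal, the standard anti-diagonal zigzag order of Figure 2 can be generated as follows; the helper names are ours, not the paper's.

```python
def zigzag_indices(N=8):
    """(row, col) visiting order of the zigzag scan: anti-diagonals in turn,
    alternating direction so consecutive cells stay adjacent."""
    cells = [(r, c) for r in range(N) for c in range(N)]
    return sorted(cells, key=lambda rc: (rc[0] + rc[1],
                                         rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_scan(block):
    """Flatten an N x N block into a 1-D sequence along the zigzag order."""
    return [block[r, c] for r, c in zigzag_indices(block.shape[0])]
```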


4.1 Proposed Diagonal Divide

As already mentioned, after the DCT most of the image's/signal's energy is packed into the top-left corner of the resulting matrix. Using this assumption, we can separate the packed-energy part from the rest of the matrix by dividing it diagonally. This process is shown in Figure 3.

Figure 3: The Diagonal Divide

Only the values that lie above the diagonal are passed through the zigzag scan, while the rest of the matrix values are discarded. It is apparent from the diagram that this process roughly halves the amount of data to be transferred, and the same zigzag scan can still be used to gather the consecutive zeros for Run Length Encoding. After the zigzag scan, run length coding is performed. Run Length Coding is a statistical coding technique: instead of transmitting all of the identical consecutive entries of the resultant zigzag array, an escape code followed by the symbol and the number of repetitions is sent.
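A minimal sketch of the Diagonal Divide followed by run length coding, reusing the zigzag helpers above. It assumes the divide keeps the coefficients on or above the main anti-diagonal (row + column < N); the paper does not spell out the exact boundary convention. Under that assumption, because the zigzag scan visits anti-diagonals in increasing order, keeping the low-frequency triangle amounts to truncating the zigzag sequence.

```python
def diagonal_divide_scan(block):
    """Zigzag-scan a block but keep only the low-frequency triangle
    (36 of 64 coefficients for an 8x8 block); the rest are discarded."""
    N = block.shape[0]
    keep = N * (N + 1) // 2
    return zigzag_scan(block)[:keep]

def run_length_encode(seq):
    """Collapse consecutive identical entries into (value, count) pairs."""
    runs = []
    for v in seq:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]
```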

5. Decoder

After run length coding the data is transmitted to the decoder. The decoder simply performs the inverse of the modules implemented in the encoder. In the inverse zigzag scan only half of the matrix values are received, because on the encoder side the coefficients below the diagonal were discarded during the Diagonal Divide. In order to reconstruct the 8x8 quantized DCT matrix, the lower half of the reconstructed matrix is padded with zeros. This process clearly takes its toll on image quality: the Diagonal Divide slightly increases the MSE (and correspondingly lowers the PSNR), but it has almost no effect on the human visual system. A statistical analysis of the images after the Diagonal Divide is given in the next section.
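On the decoder side, a sketch of the rebuilding step: the received triangle is written back along the zigzag order and the discarded positions are left as zeros before dequantization and the inverse DCT (again, the helper names are ours).

```python
import numpy as np

def rebuild_block(kept_coeffs, N=8):
    """Place the received coefficients back along the zigzag order;
    the untouched positions stay zero, padding the discarded half."""
    block = np.zeros((N, N))
    for value, (r, c) in zip(kept_coeffs, zigzag_indices(N)):
        block[r, c] = value
    return block
```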

6. Statistical Analysis

This section shows the effect of the proposed Diagonal Divide on the compression of images. The results were obtained over 200 test images, using the well-known measures of reconstructed image quality, MSE (Mean Square Error) and PSNR (Peak Signal-to-Noise Ratio).

6.1 Mean Square Error

Quantitative criteria are, in general, not true evaluators of the performance of a video codec. In the literature, however, quantitative measures such as MSE and PSNR are often used to compare different systems. The mean square error is calculated as [8]:

$$MSE = \frac{1}{MN} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left[ x(m, n) - x'(m, n) \right]^{2} \qquad (6)$$

where x is the original image, x' is the reconstructed image, and M and N are the height and width of the images.
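Equation (6) and the corresponding PSNR can be computed with a few lines of NumPy; this sketch assumes 8-bit images with a peak value of 255.

```python
import numpy as np

def mse(x, x_rec):
    """Mean square error between original and reconstructed images, equation (6)."""
    x = np.asarray(x, dtype=float)
    x_rec = np.asarray(x_rec, dtype=float)
    return np.mean((x - x_rec) ** 2)

def psnr(x, x_rec, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    err = mse(x, x_rec)
    return float('inf') if err == 0 else 10.0 * np.log10(peak ** 2 / err)
```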


Figure 4: MSE for the Akiyo sequence before and after the Diagonal Divide

The graph shows that the MSE increases for the images processed with the Diagonal Divide. The point to note, however, is that this difference in MSE does not affect the human visual system, as is evident from the figures in the results section.

Figure 5: MSE for the Flower_Garden sequence before and after the Diagonal Divide

Figure 5 shows the comparison for the Flower_Garden images before and after the Diagonal Divide. The graph shows that the MSE increases considerably when the Diagonal Divide is applied, yet there is no visible difference when the two images are viewed. All of these images were compressed with the same quantization factor of 4. This again shows that MSE does not take the human visual system into consideration. Comparing the graphs in Figure 4 and Figure 5, the technique increases the MSE only slightly for the Akiyo sequence compared with the Flower_Garden sequence, because the Akiyo image has fewer radical frequency changes. Although the proposed system does increase the MSE in images with high variation in coefficients, this is not evident to the human visual system, because our visual system is less able to detect artifacts in areas with high frequency variation.

7. Conclusion

The Diagonal Divide can be used in all video codec applications that use the Discrete Cosine Transform (DCT) or a similar transform to reduce the amount of data to send. By using the Diagonal Divide, applications can achieve the same perceptual image quality with roughly half of the data they previously had to transfer. Another observation from the results is that the Mean Square Error, which is computed from pixel differences, is not a good criterion for rating images: MSE cannot capture artifacts such as blurring or blocking and does not correlate well with perceived visual error.


So far the proposed technique has only been tested on grayscale images. The next phase would be to implement such a system for color images.

References

[1] Roger J. Clarke, "Digital Compression of Still Images and Video".
[2] "Video Compression – An Introduction", Array Microsystems, Inc., 1997.
[3] Andrew B. Watson, "Image Compression Using the Discrete Cosine Transform", NASA Ames Research Center.
[4] Shahnawaz A. Basith and Stephen R. Done, "Digital Video, MPEG and Associated Artifacts", Imperial College London.
[5] Scott E. Umbaugh, "Computer Vision and Image Processing".
[6] "An Overview of H.261, Application Note", Zarlink Semiconductors, October 1996.
[7] Ahmet M. Eskicioglu, "Quality Measurement for Monochrome Compressed Images in the Past 25 Years", Thomson Consumer Electronics, 101 West 103rd Street, Indianapolis, IN 46290, USA.
[8] Pengwei Hao, Qingyun Shi, Ying Chen, "Co-histogram and Its Application in Remote Sensing Image Compression Evaluation", Center for Information Science, Peking University, Beijing.

Results:

Figure 6: Akiyo image after compression with a quantization factor of 4, without Diagonal Divide.
Figure 7: Akiyo image after compression with a quantization factor of 4, with Diagonal Divide.
Figure 8: Flower_Garden image after compression with a quantization factor of 4, without Diagonal Divide.
Figure 9: Flower_Garden image after compression with a quantization factor of 4, with Diagonal Divide.

