Principle of MPEG-2 Compression Encoder
An MPEG-2 compression encoder is a front-end device that compresses and encodes analog TV video and audio signals in MPEG-2 format and outputs a real-time transport stream (TS). It is suitable for digital TV transmission and front-end source coding, as well as applications such as video conferencing and distance education. An advanced encoder has not only a DVB interface but also a telecommunications interface, so that the equipment can be conveniently used in HFC networks, microwave MMDS or 8 GHz systems, and SDH or PDH networks, as shown in Figure 1.

Figure 1 Block diagram of the encoder.
MPEG-2 video coding system and key technology
MPEG-2 image compression exploits two characteristics of images: spatial correlation and temporal correlation. Any scene in a frame is composed of many pixels, so a pixel usually has a certain relationship in luminance and chrominance with the pixels around it. This relationship is called spatial correlation. A scene in a program is in turn composed of an image sequence of many consecutive frames, and within such a sequence there is also a certain relationship between preceding and following frames. This relationship is called temporal correlation. These two correlations mean that the image contains a great deal of redundant information. If this redundancy is removed and only the small amount of uncorrelated information is transmitted, the transmission bandwidth can be greatly reduced. The receiver then uses this uncorrelated information and a corresponding decoding algorithm to restore the original image at a guaranteed level of image quality. A good compression coding scheme removes as much of the redundant information in the image as possible.
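The effect of the two correlations can be illustrated with a small numeric sketch. The 8x8 frames below are synthetic data invented for this illustration (a smooth gradient and a slightly brighter copy of it), and the helper names `make_frame` and `mean_abs` are hypothetical. The point is only that differences between neighboring pixels, or between co-located pixels in consecutive frames, are much smaller than the raw pixel values and therefore cheaper to code.

```python
def make_frame(offset):
    # Smooth luminance gradient: neighboring pixels differ by a small step.
    return [[100 + 2 * x + 3 * y + offset for x in range(8)] for y in range(8)]

def mean_abs(values):
    # Mean absolute value of an iterable of numbers.
    values = list(values)
    return sum(abs(v) for v in values) / len(values)

frame0 = make_frame(0)
frame1 = make_frame(1)  # "next" frame: slightly brighter everywhere

# Average magnitude of the raw pixel values of the second frame.
raw = mean_abs(p for row in frame1 for p in row)

# Spatial correlation: difference to the left-hand neighbor in the same frame.
spatial = mean_abs(row[x] - row[x - 1] for row in frame1 for x in range(1, 8))

# Temporal correlation: difference to the co-located pixel in the previous frame.
temporal = mean_abs(frame1[y][x] - frame0[y][x]
                    for y in range(8) for x in range(8))

print(raw, spatial, temporal)
```

Real video is far less regular than this gradient, so the gap between raw values and differences is smaller in practice, but the same tendency is what the encoder relies on.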
The coded images in MPEG-2 are divided into 3 types, called I frame, P frame and B frame respectively.
I frame images adopt intra-frame coding, that is, only the spatial correlation within a single frame image is used, and the temporal correlation is not used. The I frame is mainly used for the initialization of the receiver, the acquisition of the channel, and the switching and insertion of the program. The compression factor of the I frame image is relatively low. The I frame image appears periodically in the image sequence, and the frequency of occurrence can be selected by the encoder.
P frames and B frames use inter-frame coding, that is, they exploit spatial and temporal correlation at the same time. A P frame uses only forward temporal prediction, which improves compression efficiency and image quality. A P frame may also contain intra-coded parts: each macroblock in a P frame may be either forward predicted or intra coded. A B frame uses bidirectional temporal prediction, which greatly increases the compression ratio. Note that because a B frame uses a future frame as a reference, the transmission order of the frames in an MPEG-2 bit stream differs from their display order.
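The difference between display order and transmission order can be sketched as follows. This is a minimal illustration, not an encoder implementation: the function names are hypothetical, the parameters `n` (number of frames) and `m` (spacing between I/P reference frames) follow a common convention rather than anything stated above, and trailing B frames are simply flushed at the end, whereas a real stream would send them after the next reference frame.

```python
def gop_display_order(n, m):
    # Build n frames in display order; every m-th frame is a reference frame.
    frames = []
    for i in range(n):
        if i == 0:
            kind = "I"
        elif i % m == 0:
            kind = "P"
        else:
            kind = "B"
        frames.append(f"{kind}{i}")
    return frames

def to_transmission_order(display):
    # A B frame needs the following reference frame, so that reference frame
    # must be sent first; B frames are held back until it has been emitted.
    out, pending_b = [], []
    for frame in display:
        if frame.startswith("B"):
            pending_b.append(frame)
        else:
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
    out.extend(pending_b)  # simplification: flush B frames with no later reference
    return out

display = gop_display_order(7, 3)
print(display)                         # ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6']
print(to_transmission_order(display))  # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```

The decoder reverses this step: it decodes frames in transmission order and buffers reference frames until it is their turn to be displayed.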
To represent the encoded data clearly, the MPEG-2 syntax specifies a hierarchical structure of 6 layers. From top to bottom they are: sequence, group of pictures (GOP), picture, slice, macroblock, and block.
The main applications of the MPEG-2 standard are: storage of video and audio data; non-linear editing systems and non-linear editing networks; microwave, satellite, and optical cable transmission; and TV program broadcasting.
In all-digital TV technology there are two key coding technologies, source coding and channel coding; MPEG-2 belongs to source coding. The main task of source coding is to solve the problem of compressing and storing the image signal. The main task of channel coding is to solve the problem of transmitting it. The image signal carries a large amount of data, and without compression a digital TV signal cannot be transmitted in real time. The main way to compress it is to remove redundancy, that is, those parts of the signal that carry no information or have little effect on image quality. This is the principle of MPEG-2 image compression. Four kinds of redundancy are exploited:
(1) Spatial redundancy. An image is composed of hundreds of thousands of pixels, and adjacent pixels are highly similar (correlated). Transmitting them one by one means sending many identical or nearly identical values in succession; this is called spatial redundancy. A suitable coding method, such as orthogonal transform coding, removes the redundant spatial information and reduces the bit rate for transmission and recording.
(2) Temporal redundancy. TV images are also strongly correlated in time. At 25 frames/s, the difference between one image and the next is usually very small and most of the picture content is the same, which means that two adjacent images are highly correlated. The correlation gradually decreases as the images get farther apart. Changes between such highly correlated images are generally regular, which means that the changes in each image are predictable. Exploiting this property and removing the temporally redundant information of the image signal also reduces the bit rate for transmission and recording.
(3) Statistical redundancy. After image and sound signals are digitized, they follow certain statistical laws. For example, in a predictive coding system the value of the current pixel is predicted from several previously coded neighboring pixels or from the value of the same pixel at the previous instant. Because of the spatial and temporal correlation of the image, small prediction errors occur with high probability and large prediction errors occur with low probability. Statistical coding assigns short codes to the frequent small error values and long codes to the rare large error values, thus removing the statistically redundant information of the signal.
(4) Perceptual redundancy. Human sight and hearing are insensitive to some information. Perceptual redundancy refers to the parts of the video and audio signals that people cannot perceive, or can barely distinguish, by sight or hearing. This insignificant information can be given greater distortion without people noticeably feeling any degradation of image and sound quality, or even being aware of it at all. Therefore, when encoding, different content can be coded with different precision, spending bits where they matter and saving them where they do not, so as to reduce the bit rate.
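The following sketch strings three of these ideas together on a single 8x8 block: an orthogonal transform (here the DCT, the transform used by MPEG-2) for spatial redundancy, frequency-dependent quantization for perceptual redundancy, and a Huffman code for statistical redundancy. It is a simplified illustration under stated assumptions, not the MPEG-2 algorithm: the block is synthetic, the quantizer step sizes (`base_step`, `slope`) are invented for this example rather than taken from the standard's quantization matrices, and MPEG-2 uses predefined variable-length code tables rather than a Huffman code built per block.

```python
import heapq
import math
from collections import Counter

N = 8

def dct_2d(block):
    # Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form).
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

def quantize(coeffs, base_step=8, slope=4):
    # Hypothetical step sizes that grow with frequency (u + v): high
    # frequencies, to which the eye is less sensitive, are coded more coarsely.
    return [[round(coeffs[u][v] / (base_step + slope * (u + v)))
             for v in range(N)] for u in range(N)]

def huffman_code_lengths(symbols):
    # Code length per symbol from a standard Huffman merge: frequent symbols
    # end up with short codes, rare symbols with long codes.
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    heap = [(n, i, [s]) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)  # unique tie-breaker so lists are never compared
    while len(heap) > 1:
        n1, _, s1 = heapq.heappop(heap)
        n2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (n1 + n2, tie, s1 + s2))
        tie += 1
    return lengths

# Synthetic smooth block (illustrative data only).
block = [[100 + 2 * x + 3 * y for x in range(N)] for y in range(N)]
levels = quantize(dct_2d(block))
flat = [q for row in levels for q in row]
nonzero = sum(1 for q in flat if q != 0)
lengths = huffman_code_lengths(flat)
total_bits = sum(lengths[q] for q in flat)

print(nonzero, "of", N * N, "coefficients are nonzero")
print("code lengths:", lengths, "total bits:", total_bits)
```

For this smooth block almost all energy lands in a few low-frequency coefficients, so after quantization most values are zero and the zero symbol receives the shortest code. Temporal redundancy is not shown here; in an encoder the same steps would be applied to the prediction error between frames rather than to the pixels themselves.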