Problem 2 [25 pts (3, 5, 6, 6, 5)]: Entropy
Question
Problem 2 [25 pts (3, 5, 6, 6, 5)]: Entropy. An MPEG file encodes video partly by using the previous frame of video as a reference. Each "macroblock" of 16x16 pixels gets one "motion vector" associated with it that indicates where to look in the previous frame for a block of 16x16 pixels that could be used as a starting point for the new 16x16 block of pixels. Suppose for this problem that we aren't interested in the magnitude of each motion vector - we want to encode, for each frame of video, just the rough direction of each motion vector. It's either up, up-right, right, down-right, down, down-left, left, up-left, or no motion (9 possibilities).

i. If the "no motion" possibility occurs half the time, how many bits should we use to encode this possibility in an efficient encoding of these directions?

ii. Of the remaining possibilities, left movement occurs 1/8 of the time, right movement occurs 1/8 of the time, down and up each occur 1/16 of the time, and the diagonals each occur 1/32 of the time apiece. Calculate the minimum number of bits necessary to represent each direction, assuming we have one code per symbol.

iii. What is the entropy of this directional data, using the frequencies given in the preceding steps? (Up to 3 decimal places.)

iv. Suppose instead of the frequencies just mentioned, we have more active video where all 9 symbols are equally likely. Calculate the entropy of the motion directions in this scenario. (Up to 2 decimal places.)

v. Explain why it makes sense from a qualitative point of view that the number you obtained in the preceding step is larger or smaller than the entropy that you calculated before.

Explanation / Answer
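As a quick, informal check on the arithmetic the problem asks for (the probabilities below are the ones given in the problem statement; the script and its variable names are only an illustration, not part of any official solution), the ideal per-symbol code lengths and the two entropies can be computed with a few lines of Python:

    import math

    # Direction frequencies given in the problem (parts i and ii).
    probs = {
        "none": 1/2,
        "left": 1/8, "right": 1/8,
        "up": 1/16, "down": 1/16,
        "up-left": 1/32, "up-right": 1/32,
        "down-left": 1/32, "down-right": 1/32,
    }

    # Ideal code length for a symbol of probability p is -log2(p) bits.
    for sym, p in probs.items():
        print(f"{sym:11s} {-math.log2(p):.0f} bits")

    # Entropy of the skewed distribution (part iii): H = sum of -p*log2(p).
    h_given = sum(-p * math.log2(p) for p in probs.values())
    print(f"entropy with the given frequencies: {h_given:.3f} bits")

    # Entropy when all 9 directions are equally likely (part iv): log2(9).
    print(f"entropy with 9 equally likely symbols: {math.log2(9):.2f} bits")

Because all of the given probabilities are powers of 1/2, the -log2(p) values come out as whole numbers of bits, and the comparison in part v reflects a general fact: for a fixed alphabet size, entropy is maximized when all symbols are equally likely, so the uniform case gives the larger value.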
A video scene captured as a sequence of frames can be efficiently coded by estimating and compensating for motion between frames prior to generating the interframe difference signal for coding. Since motion compensation is a key element in most video coders, it is worthwhile understanding the basic concepts in this processing step. For ease of processing, each frame of video is uniformly partitioned into smaller units called macroblocks (MBs, formally defined a bit later), where each macroblock consists of a 16 × 16 block of luma and the corresponding chroma blocks. The way that the motion estimator works is illustrated in Fig. 2.1. Each block of pixels (say the 16 × 16 luma block of an MB) in the current frame is compared with a set of candidate blocks of the same size in the previous frame to determine the one that best predicts the current block. The set of candidates includes those within a search region in the previous frame centered on the position of the current block in the current frame.

Uncompressed digital video of full component TV resolution requires a very high transmission bandwidth, while VHS VCR-grade equivalent raw digital video requires a transmission bandwidth of around 30 Mbits/s, with compression still necessary to reduce the bit-rate to suit most applications. The required degree of compression is achieved by exploiting the spatial and temporal redundancy present in a video signal. However, the compression process is inherently lossy, and the signal reconstructed from the compressed bit stream is not identical to the input video signal; compression typically introduces some artifacts into the decoded signal. The primary requirement of the MPEG-1 video standard was that it should achieve high quality of the decoded motion video at a given bit-rate. In addition to picture quality under normal play conditions, different applications have additional requirements. For instance, multimedia applications may require the ability to randomly access and decode any single video picture in the bitstream. Also, the ability to perform fast search directly on the bit stream, both forward and backward, is extremely desirable if the storage medium has "seek" capabilities. It is also useful to be able to edit compressed bit streams directly while maintaining decodability.
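As a concrete illustration of the block-matching search just described, the following Python sketch exhaustively compares a 16 × 16 block of the current frame against every candidate position in a search window of the previous frame. The function name, the ±15-pel default window, and the use of a sum-of-absolute-differences (SAD) cost are assumptions made for the example; the text above only says that the candidate that "best predicts" the current block is chosen, and real encoders use much faster search strategies.

    import numpy as np

    def best_motion_vector(prev, cur, bx, by, block=16, search=15):
        """Full-search block matching (illustrative sketch only).

        prev, cur : 2-D arrays of luma samples (previous / current frame)
        (bx, by)  : top-left corner of the current block
        search    : search range in pels around the co-located block (assumed)
        """
        target = cur[by:by + block, bx:bx + block].astype(int)
        best_mv, best_cost = (0, 0), float("inf")
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                x, y = bx + dx, by + dy
                # Skip candidates that fall outside the previous frame.
                if x < 0 or y < 0 or x + block > prev.shape[1] or y + block > prev.shape[0]:
                    continue
                cand = prev[y:y + block, x:x + block].astype(int)
                cost = np.abs(target - cand).sum()  # SAD matching cost
                if cost < best_cost:
                    best_cost, best_mv = cost, (dx, dy)
        return best_mv

For example, best_motion_vector(prev_luma, cur_luma, 32, 48) would return the (dx, dy) displacement of the best 16 × 16 match for the macroblock whose top-left corner is at column 32, row 48 of the current frame.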
The H.261 standard employs the interframe video coding described earlier. H.261 codes video frames using a DCT on blocks of size 8 × 8 pixels, much the same as used by the original JPEG coder for still images. An initial frame (called an INTRA frame) is coded and transmitted as an independent frame. Subsequent frames, which are modeled as changing slowly due to small motions of objects in the scene, are coded efficiently in the INTER mode using a technique called motion compensation (MC), in which the displacements of groups of pixels from their positions in the previous frame (represented by so-called motion vectors) are transmitted together with the DCT-coded difference between the predicted and original images.

A single motion vector (horizontal and vertical displacement) is transmitted for one Inter_MC MB; that is, the four Y blocks, one Cb block, and one Cr block all share the same motion vector. The range of motion vectors is ±15 Y pels with integer values. For the color blocks, the motion vector is obtained by halving the transmitted vector and truncating the magnitude to an integer value. Motion vectors are differentially coded using, in most cases, the motion vector of the MB to the left as a prediction. Zero is used as the prediction for the leftmost MBs of the GOB, and also if the MB to the left has no motion vector.

The transform coefficients of either the original (Intra) or the differential (Inter) pels are ordered according to a zigzag scanning pattern. These transform coefficients are selected and quantized at the encoder, and then coded using variable-length codewords (VLCs) and/or fixed-length codewords (FLCs), depending on the values. Just as with JPEG, successive zeros between two nonzero coefficients are counted and called a RUN, and the value of a transmitted nonzero quantized coefficient is called a LEVEL. The most likely occurring combinations of (RUN, LEVEL) are encoded with a VLC, with the sign bit terminating the RUN-LEVEL VLC codeword.

The standard requires a compatible IDCT (inverse DCT) to be close to the ideal 64-bit floating-point IDCT, and H.261 specifies a measuring process for checking a valid IDCT: the error in pel values between the ideal IDCT and the IDCT under test must be less than certain allowable limits given in the standard, e.g., peak error <= 1, mean error <= 0.0015, and mean square error <= 0.02. A few other items are also required by the standard. One of them is the image-block updating rate: to prevent mismatched-IDCT error as well as channel error propagation, every MB should be intra-coded at least once in every 132 transmitted picture frames. The contents of the transmitted video bit stream must also meet the requirements of the hypothetical reference decoder (HRD). For CIF pictures …
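Two of the H.261 rules above are small enough to sketch directly: deriving the chroma motion vector by halving the luma vector and truncating the magnitude, and coding motion vectors differentially against the MB to the left (with zero as the prediction for the leftmost MB). The helper names below are invented for the example, and the differential coder is simplified to a single row of MBs; treat this as a sketch of the idea rather than the normative H.261 procedure.

    def chroma_motion_vector(mv_luma):
        """Halve each luma MV component and truncate the magnitude
        toward zero, as described above (int() truncates toward zero)."""
        return tuple(int(c / 2) for c in mv_luma)

    def differential_mvs(row_of_mvs):
        """Differentially code one row of MB motion vectors: predict each
        MV from the MB to its left, using (0, 0) for the leftmost MB
        (simplified; the 'no motion vector' case is not handled here)."""
        diffs, pred = [], (0, 0)
        for mv in row_of_mvs:
            diffs.append((mv[0] - pred[0], mv[1] - pred[1]))
            pred = mv
        return diffs

For example, chroma_motion_vector((-7, 5)) gives (-3, 2), and differential_mvs([(3, 0), (4, 0), (4, -1)]) gives [(3, 0), (1, 0), (0, -1)].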