Polytechnic University, Dept. Electrical and Computer Engineering
Multimedia Communication System II
Fall 2005, Yao Wang
Second Exam (12/8, 11:00-12:50)
Closed-book, 1 sheet of notes (single or double sided) allowed, no peeking into neighbors!
1. Video coding standards (20 pt)
a) Describe two features incorporated in the H.263 video coding standard that helped to improve the
coding efficiency over the earlier H.261 video coding standard. (2.5pt for each feature)
i) Using half-pel accuracy motion estimation instead of integer-pel.
ii) Variable block size for motion compensation (allowing a 16x16 block to be divided into four
8x8 blocks and estimating a motion vector for each 8x8 block separately). This is helpful when the
16x16 block includes two objects with different motions.
If a student lists another legitimate difference, it is acceptable too.
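The benefit of feature i) can be illustrated with a minimal one-dimensional sketch. It is not the H.263 procedure itself: the sample values, the helper names `upsample2` and `best_match`, and the plain averaging used for interpolation (without the rounding rules of the standard) are all assumptions made for illustration. The current block is built to sit exactly 1.5 samples into the reference row, so an integer-pel search cannot match it exactly while a half-pel search can.

```python
def upsample2(ref):
    """Insert the average of each neighboring pair to get half-pel samples."""
    up = []
    for i, v in enumerate(ref):
        up.append(float(v))
        if i + 1 < len(ref):
            up.append((v + ref[i + 1]) / 2.0)
    return up


def best_match(cur, ref, half_pel):
    """Return (offset in pels, SAD) of the best match of cur within ref.

    The offset is the position of the block's first sample in the
    reference row. With half_pel=False only integer offsets are tested.
    """
    up = upsample2(ref)
    n = len(cur)
    step = 1 if half_pel else 2  # the upsampled grid is in half-pel units
    best = None
    for d in range(0, len(up) - 2 * (n - 1), step):
        # Block samples are one full pel (two half-pel steps) apart.
        sad = sum(abs(c - up[d + 2 * k]) for k, c in enumerate(cur))
        if best is None or sad < best[1]:
            best = (d / 2.0, sad)
    return best


# Hypothetical reference row and a block located 1.5 pels into it.
ref_row = [10, 20, 40, 80, 60, 30, 20, 10]
cur_block = [30.0, 60.0, 70.0, 45.0]

integer_result = best_match(cur_block, ref_row, half_pel=False)
half_result = best_match(cur_block, ref_row, half_pel=True)
print("integer-pel:", integer_result)
print("half-pel:   ", half_result)
```

In this toy case the half-pel search drives the prediction error to zero, whereas the integer-pel search leaves a residual that would have to be coded.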
b) Why do MPEG-1 and MPEG-2 use the GOP structure with periodic I-frames? (2 pt) For video
conferencing or video phone applications, can the encoder insert I-frames periodically? What may
be the problem?
The GOP structure enables random access, which is important for video broadcasting, video
streaming, and DVD playback, the target applications of MPEG-1/2.
Inserting I-frames periodically generally causes the bit stream to have rate spikes at the I-frames.
When the bit stream is sent through a constant-rate channel, the I-frame data take longer to send,
which causes variable delay at the receiver. To display the video at a constant frame rate, a large
smoothing buffer is needed at the receiver. This significantly increases the delay between
when a frame is sent at the sender and when it is decoded and displayed. The delay may exceed
several seconds. For the video distribution applications targeted by MPEG-1/2, this delay is typically
acceptable. However, for video conferencing/telephony applications, the acceptable delay is
between 150 ms and 400 ms. Therefore, inserting I-frames periodically is not advisable for video
conferencing/telephony applications.
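The delay argument in the answer to b) can be checked with a rough sketch. All numbers below (frame sizes, frame rate, GOP length) are hypothetical, and the model is deliberately simple: each frame becomes available at its capture time, is sent over a constant-rate channel in order, and the function `worst_case_delay` (a name chosen here) reports the largest gap between capture and complete arrival. That gap is the smallest start-up delay the receiver would need for smooth playback, ignoring encoding, decoding, and propagation time.

```python
def worst_case_delay(frame_bits, channel_bps, fps):
    """Largest time from frame capture to full arrival over a constant-rate channel."""
    channel_free = 0.0  # time at which the channel finishes the previous frame
    worst = 0.0
    for i, bits in enumerate(frame_bits):
        capture = i / fps
        start = max(channel_free, capture)  # cannot send before capture
        channel_free = start + bits / channel_bps
        worst = max(worst, channel_free - capture)
    return worst


FPS = 30
GOP_LEN = 15
I_BITS = 150_000   # hypothetical I-frame size
P_BITS = 10_000    # hypothetical P-frame size

# Periodic I-frames: one large frame followed by 14 small ones, repeated.
gop = [I_BITS] + [P_BITS] * (GOP_LEN - 1)
periodic = gop * 4

# Channel rate set to the average bit rate of the stream.
channel_bps = sum(gop) * FPS / GOP_LEN

# Same average rate, but every frame has the same size (no spikes).
flat = [sum(gop) / GOP_LEN] * len(periodic)

delay_periodic = worst_case_delay(periodic, channel_bps, FPS)
delay_flat = worst_case_delay(flat, channel_bps, FPS)
print(f"periodic I-frames: {delay_periodic * 1000:.1f} ms")
print(f"equal-size frames: {delay_flat * 1000:.1f} ms")
```

With these assumed sizes the I-frame alone takes several frame intervals to transmit, so the required buffering delay is several times larger than for a stream with evenly sized frames at the same average rate. Larger I-frame to P-frame ratios or longer GOPs make the gap grow further.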
c) What is scalable coding? (2 pt)
Why is it beneficial for video streaming applications? (3pt)
Scalable coding generates, for each group of video frames, a bit stream that can be truncated either
at any point or at several defined points. When a user receives a truncated bit stream, he/she will
see a correspondingly lower video quality (in spatial resolution, temporal frame rate, color
accuracy, or a combination of these). In video streaming applications, the same video may be
requested by users with different access bandwidths or decoding/display capabilities. Without scalable
coding, multiple versions of this video have to be encoded at different bit rates, with different