DVD Forum FAQS
Audio
Audio data specifications

                           Linear PCM      Dolby AC-3            MPEG-2 audio
Sampling frequency         48 or 96 kHz    48 kHz                48 kHz
Number of bits per sample  16/20/24        compressed (16 bits)  compressed (16 bits)
Max transfer rate          6.144 Mbit/sec  448 kbits/sec         640 kbits/sec
Max number of channels     8               5.1                   5.1 or 7.1

           NTSC                          PAL
Mandatory  Dolby AC-3 and/or Linear PCM  MPEG-2 audio and/or Linear PCM
Optional   MPEG-2 Audio                  Dolby AC-3

Philips provided three practical scenarios for audio.

Use                                  Channels    kbits/sec

Case 1: One mono language channel to be mixed with the Center multichannel set.
Multichannel music & effects         5.1 or 7.1  384
Mono English dialogue                1           64
Mono French dialogue                 1           64
Mono German dialogue                 1           64

Case 2: One of the stereo lingual signals mixed with the L & R channel of the playback multichannel set.
Multichannel music & effects         5.1 or 7.1  384
Mono English dialogue                2           128
Mono French dialogue                 2           128
Mono German dialogue                 2           128

Case 3: One to be selected for playback.
Multichannel with English dialogue   5.1 or 7.1  384
Multichannel with French dialogue    5.1 or 7.1  384
Multichannel with German dialogue    5.1 or 7.1  384
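As a rough comparison of the three scenarios, the total audio bitrate each one stores on the disc can be added up. The sketch below is illustrative only: the dictionary layout and the names CASES and total_kbits are assumptions, and the figures are simply copied from the scenarios above.

```python
# Bitrates in kbit/s, copied from the three Philips scenarios.
# The dictionary layout and names are illustrative assumptions.
CASES = {
    "case1": [384, 64, 64, 64],     # multichannel M&E + three 1-channel dialogue streams
    "case2": [384, 128, 128, 128],  # multichannel M&E + three 2-channel dialogue streams
    "case3": [384, 384, 384],       # three complete multichannel mixes
}

def total_kbits(case: str) -> int:
    """Sum of all audio stream bitrates stored on disc for one scenario."""
    return sum(CASES[case])

for name in CASES:
    print(name, total_kbits(name), "kbit/s")
```

Case 1 is the cheapest in total bitrate and Case 3 the most expensive, since Case 3 repeats the full multichannel mix once per language.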
Audio Signal Decoding System
General
Up to a maximum of 8 audio streams can be multiplexed into the same cell with a single video stream. Each stream is designated, for example, for a particular language or for special effects & music tracks. Dolby AC-3 is mandatory for 525/60 (NTSC) players and MPEG-2 audio is mandatory for 625/50 (PAL) players, but both are optional on discs themselves. LPCM (Linear Pulse Code Modulation) is mandatory for all players, but optional on discs themselves.
48 kHz and 96 kHz uncompressed PCM audio High Definition Audio Experience
A 525/60 disc must contain either Dolby AC-3 or LPCM.
A 625/50 disc must contain either MPEG-2 audio or LPCM.
Due to bandwidth efficiency, most titles will use the more compact Dolby AC-3 or MPEG-2 audio.
Extendibility is reserved for new algorithms such as DTS, Sony SDDS, et al.
IEC-958 Digital Audio Interface for external decoder/receiver. Output types: compressed AC-3 or MPEG stream, two channel LPCM.
DVD players are required only to output a full reconstruction of the Left and Right channels. An external AC-3 decoder would optionally decode all 5.1 channels. A more expensive DVD player would output all 5.1 reconstructed channels.
Dolby AC-3 parameters
Sampling frequency  48 kHz
Bitrate             64 kbits/sec to 448 kbits/sec per stream
Audio coding mode   1/0, 2/0, 3/0, 2/1, 2/2, 3/1, and 3/2 (acmod)
Characteristics     dialog normalization
                    dynamic range compression
                    downmixing (5.1 -> 2 channel) capability
                    Dolby Pro-Logic Encoding (5.1 -> 2 channel)
                    Karaoke mode (voice overlay)
MPEG Audio parameters
Sampling frequency  48 kHz
MPEG-1              Layer II only
                    Mono (32 to 192 kb/s) and Stereo (64 to 384 kb/s)
MPEG-2              main stream (same as MPEG-1)
                    extension stream (up to 528 kbit/sec)
                    sum of main and extension stream up to 912 kb/s
                    unmatrix mode excluded (always MPEG-1 compatible)
LPCM Coding
Lossless/uncompressed PCM audio
Sampling frequency: either 48 kHz or 96 kHz
Bits/sample: 16, 20, or 24 bits
Up to 8 PCM channels.

Due to the user rate bandwidth limitation of 6.144 Mbit/sec for any LPCM audio stream, not all combinations of channel count, sample precision and sample rate are permitted. However, up to 8 separate streams are permitted, as long as the combined stream rate is less than or equal to 9.8 Mbit/sec. DVD nomenclature states that a single LPCM stream consists of one to 8 channels.

Sample Rate  Sample Prec.  Channel Count
(Hz)         (bits)        Mono  2 CH  5 CH  8 CH
48,000       16            Yes   Yes   Yes   Yes
             20            Yes   Yes   Yes   No
             24            Yes   Yes   Yes   No
96,000       16            Yes   Yes   No    No
             20            Yes   Yes   No    No
             24            Yes   Yes   No    No

Source: Pioneer
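The Yes/No entries in the table are consistent with a plain product of sample rate, sample precision and channel count compared against the 6.144 Mbit/sec limit. The following is a minimal sketch under that assumption; the function names are hypothetical, and the actual specification may impose further constraints that are not described here.

```python
MAX_LPCM_STREAM_BPS = 6_144_000  # 6.144 Mbit/sec limit for one LPCM stream

def lpcm_rate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw bitrate of an uncompressed PCM stream (no packet overhead counted)."""
    return sample_rate_hz * bits_per_sample * channels

def lpcm_permitted(sample_rate_hz: int, bits_per_sample: int, channels: int) -> bool:
    """True if the raw rate fits within the per-stream limit."""
    return lpcm_rate_bps(sample_rate_hz, bits_per_sample, channels) <= MAX_LPCM_STREAM_BPS

# Rebuild the table rows: one line per (rate, precision), columns Mono/2/5/8 channels.
for rate in (48_000, 96_000):
    for bits in (16, 20, 24):
        row = ["Yes" if lpcm_permitted(rate, bits, ch) else "No" for ch in (1, 2, 5, 8)]
        print(rate, bits, *row)
```

For instance, 48 kHz x 16 bits x 8 channels is exactly 6.144 Mbit/sec and is allowed, while 48 kHz x 20 bits x 8 channels needs 7.68 Mbit/sec and is not.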
Basic User interface:
Control: ten keys and cursor keys
Display: menu graphics and highlight
GUI Display: Menu picture with subpicture and MPEG graphics highlighted area
------------------------------------------------------------------------
Menu:
Basic
1. Title A
2. Title B
3. Title C
4. Previous
5. Next
Multi-page Menu
1. Title A     4. Title D        7. Title G
2. Title B     5. Title E        8. Title H
3. Title C     6. Title F        9. Title I
Exit Next      Prev Exit Next    Prev Exit
------------------------------------------------------------------------
Interactivity
Level of functionality
1. Simply play
2. Interactivity similar to Video-CD
3. Interactivity similar to PC applications
------------------------------------------------------------------------
Functions
Information Control
parental control
copy management
Menu
Title: sub-picture Root: Angle Audio: part of title
Search functions:
program search
time search
angle search
part of title search
Seamless play function
Still picture function
Search Functions by User
There are 6 search functions defined for DVD. Two are present in most of today's VCRs: the linear style Time Search and Scan (Fast forward, rewind). The other 4 are made possible thanks to the non-linear, random-access playback capability of DVD.
User operation (ability to scan through or play) can be prohibited by content. This is signalled by such attributes as the parental control level. For example, certain Part_of_Title's can be skipped over which contain R-rated (US) scenes.
Title Search: User can select the exact title to shuttle to.
Part_of_Title Search: User can go to a specific version (PG-13, R, director's cut, children's version) or camera angle by either title name or number.
Program Search: User can go to a specific scene (car chase, opening credits, gun fight, etc.) within a program chain.
Time Search: User can go to a specific SMPTE style time code (HH:MM:SS:FF) location within a program chain.
Scan: Scan (linearly) forward or backwards in time.
GoUp: Within the current program chain, jump to the next program chain. This command traverses the DVD control information hierarchy.

For Time Searches, all DVD players are required to arrive at the nearest I picture. It is optional that DVD players be capable of arriving at the exact picture (regardless of its picture coding type).
Navigation Commands and Parameters
The author (content provider) is given the freedom of creating an arbitrary branching structure for a given title. Of course some restraint should be exercised since, thanks to interframe MPEG coding dependencies and physical servo mechanism limitations, a program chain cannot be constructed of 30 pictures/sec of totally randomly located information on the disc.
However, the constant DVD transfer rate of 11 Mbit/sec provides some flexibility when the average program rate is kept lower. For example, if the average bit rate is only 5 Mbit/sec, then the player can waste 6 Mbit/sec of potential transfer rate in random access overhead.
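The arithmetic in this example can be written out as a small helper. This is only a sketch of the simple model described above (a constant read rate with the surplus available for seeking); the function names are hypothetical and real players also depend on track buffer size and seek times, which are not modelled here.

```python
def seek_headroom_mbps(transfer_rate_mbps: float, average_program_mbps: float) -> float:
    """Transfer rate left over for random-access overhead, in Mbit/sec."""
    if average_program_mbps > transfer_rate_mbps:
        raise ValueError("program rate exceeds the disc transfer rate")
    return transfer_rate_mbps - average_program_mbps

def idle_fraction(transfer_rate_mbps: float, average_program_mbps: float) -> float:
    """Fraction of time the pickup may spend seeking instead of reading."""
    return seek_headroom_mbps(transfer_rate_mbps, average_program_mbps) / transfer_rate_mbps

print(seek_headroom_mbps(11.0, 5.0))       # headroom for the example in the text
print(round(idle_fraction(11.0, 5.0), 3))  # share of time available for seeks
```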
Player Settings:
There are 24 system parameters for player setting:
SPRM  Meaning
0     Menu Description Language Code
1     Audio stream number
2     Sub-picture Stream number
3     Angle Number
4     Title Number
5     VTS title Number
6     Title PGC Number
7     Part of title number for one sequential_PGC_Title
8     Highlighted Button number
9     Navigation Timer
10    Title PGC number for Navigation Timer
11    Audio Mixing Mode for Karaoke
12    Country Code for Parental Management
13    Parental Level
14    Player Configuration for Video
15    Player Configuration for Audio
16    Initial Language Code for Audio
17    Initial Language Code for Sub-picture
18    Initial Language Code Extension for Sub-picture
19    Initial Language Code for Sub-picture
20    Reserved
21    Reserved
22    Reserved
23    Reserved
General Parameters:
Used for interactive operation of titles, such as quizzes, or games. 16 general parameters for navigation. These are RAM variables in the DVD players for use as, e.g., arithmetic scratch pads, counters, etc. Arithmetical operations are available (add, compare, etc.)
Navigation Commands
Each command consists of a single instruction or a combination of two or three instructions.
Instruction Groups:
Goto       branch between commands
Link       transfer within the same Domain
Jump       transfer between Domains
Compare    recognition of parameter value
SetSystem  player system setting
Set        calculate GPRM values

Letterbox
Picture Size Conversion
All DVD players are required to have built-in vertical filters which scale a 16:9 coded video image to fit within a traditional 4:3 display. This player feature is needed since it is anticipated that a majority of movies will be coded for the 16:9 aspect ratio, while at the same time most TV displays (in the early years) will be 4:3. In the same vein as multilingual audio, a single coded aspect ratio in market distribution reduces confusion and bolsters economy of scale.
525/60 (NTSC-rate display):
(Note: 480*(4/3)/(16/9) = 480*0.75 = 360)
 _____________________
|         60          |
|---------------------|
|                     |
|        360          |   480 lines total
|                     |
|---------------------|
|         60          |
 ---------------------
625/50 (PAL-rate display):
(Note: 576*(4/3)/(16/9) = 576*0.75 = 432)
 _____________________
|         72          |
|---------------------|
|                     |
|        432          |   576 lines total
|                     |
|---------------------|
|         72          |
 ---------------------
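The line counts in both diagrams follow from the same ratio of display to source aspect ratio. A minimal sketch of that calculation is given below; the function name and default arguments are illustrative assumptions, not part of any DVD specification text.

```python
from fractions import Fraction

def letterbox(total_lines: int,
              source_ar: Fraction = Fraction(16, 9),
              display_ar: Fraction = Fraction(4, 3)) -> tuple[int, int]:
    """Return (active_lines, bar_lines) for a letterboxed picture.

    active_lines is the height of the scaled picture; bar_lines is the
    height of each black bar above and below it.
    """
    active = total_lines * display_ar / source_ar  # e.g. 480 * 0.75
    if active.denominator != 1:
        raise ValueError("active height is not a whole number of lines")
    active_lines = int(active)
    return active_lines, (total_lines - active_lines) // 2

print(letterbox(480))  # 525/60 display
print(letterbox(576))  # 625/50 display
```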
A simple bi-linear vertical filter can be applied, yielding good visual results. Here, two source samples (s[n],s[n+1]) are weighted by simple complementary factors and added together to form the destination sample value (d[m]). These weights are easily implemented with shifters. For interlaced displays, vertical filtering occurs only within the same field parity.
d[0] = (3/4)*s[0] + (1/4)*s[1]
d[1] = (1/2)*s[1] + (1/2)*s[2]
d[2] = (1/4)*s[2] + (3/4)*s[3]

A decoder can determine whether inter or intra-parity vertical filtering is applied by testing the progressive_frame flag of the MPEG-2 video stream. (MPEG-1 frames are always progressive by definition). This flag indicates whether a picture contains interlaced or progressive vertically correlated information. Almost all MPEG-2 coded movies consist exclusively of progressive frames. In a sense, MPEG-2's interlaced prediction modes are underutilized by DVD.

Presentation
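The three equations map every 4 source lines to 3 destination lines. The sketch below applies them with integer shifts, in the spirit of the "shifters" remark. The rounding offsets, the list-of-lists picture representation and the function names are assumptions made for illustration; a real decoder would do this in hardware and may round differently.

```python
def filter_group(s):
    """Map 4 source lines to 3 destination lines with the 3/4-1/4 weights."""
    s0, s1, s2, s3 = s
    d0 = [(3 * a + b + 2) >> 2 for a, b in zip(s0, s1)]  # 3/4*s0 + 1/4*s1
    d1 = [(a + b + 1) >> 1 for a, b in zip(s1, s2)]      # 1/2*s1 + 1/2*s2
    d2 = [(a + 3 * b + 2) >> 2 for a, b in zip(s2, s3)]  # 1/4*s2 + 3/4*s3
    return [d0, d1, d2]

def letterbox_filter(lines):
    """Vertically scale a picture (list of lines of samples) by 3/4."""
    if len(lines) % 4:
        raise ValueError("line count must be a multiple of 4")
    out = []
    for i in range(0, len(lines), 4):
        out.extend(filter_group(lines[i:i + 4]))
    return out

# One-sample-wide picture, just to show the weights at work.
print(letterbox_filter([[0], [100], [200], [40]]))
```

For interlaced material the same routine would be run on each field separately (for example 240 lines in, 180 lines out per 525/60 field), so that filtering stays within one field parity.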
Combinations of presentation types
Type         Count                 Representation
Video        1 stream only         MPEG-1 or MPEG-2 Video
Audio        maximum of 8 streams  Linear PCM and/or: Dolby AC-3 (NTSC), MPEG audio (PAL)
Sub-picture  max 32 streams        Run-length encoded with bitmap of 2 bits/pixel
Presentation stream rates
max total of combined audio and video: 9.8 Mbit/sec
max sum of elementary streams + systems overhead: 10.08 Mbit/sec
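An authoring tool could check a planned multiplex against these two limits along the following lines. This is a hedged sketch: the function name, the argument layout and the idea of lumping sub-pictures into an "other" figure are assumptions, and rates are kept in whole kbit/s to avoid floating-point comparisons.

```python
MAX_AV_KBPS = 9_800    # combined audio + video limit (9.8 Mbit/sec)
MAX_MUX_KBPS = 10_080  # elementary streams + systems overhead (10.08 Mbit/sec)

def check_budget(video_kbps, audio_kbps, other_kbps=0, overhead_kbps=0):
    """Return (av_ok, mux_ok) for one multiplex; all rates in kbit/s.

    audio_kbps is a list with one entry per audio stream; other_kbps
    stands for the remaining elementary streams (e.g. sub-pictures).
    """
    av_total = video_kbps + sum(audio_kbps)
    mux_total = av_total + other_kbps + overhead_kbps
    return av_total <= MAX_AV_KBPS, mux_total <= MAX_MUX_KBPS

# 8 Mbit/sec video with one 384 kbit/s mix and three 64 kbit/s dialogue streams.
print(check_budget(8_000, [384, 64, 64, 64]))
# 9.5 Mbit/sec video with one 384 kbit/s audio stream.
print(check_budget(9_500, [384]))
```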
Presentation of PGC
The program chain (PGC) can be presented either serially (linear) or in random/shuffle (non-linear) fashion.
For example, a quiz title should break each question into separate programs. The next program chain branched to would be determined by the answer provided by the user.
Still image presentation
Still pictures are coded as MPEG intra frames. They may be displayed for indefinite duration. They can be accompanied by background music, or audio can be muted.
The still function is created by the action of the navigation system. The same video frame and sub-picture are frozen (displayed over and over again on the TV) while audio is muted or playing in the background.
There are three types of the Still Function:
Type        Timing                            Still time in seconds
PGC Still   Stills at end of the PGC          0-254, limitless
Cell Still  Stills at end of the Cell         0-254, limitless
VOBU Still  Stills in every VOBU in the Cell  limitless
VOBU: Video Object Unit.
Location of each command
Within a program chain (PGC), commands can be located at the front of the chain, in between cells of the chain, and at the end of the chain.
Program chain
[Pre-Commands] [Cell] [Cell] [Cell-Command] [Cell] [Post-Commands]
Each cell can have one command. There is a restriction that no more than 128 commands can be contained within a program chain:
Pre-commands + Cell Commands + Post Commands <= 128
Further, there are a maximum of 36 buttons, each of which can have one associated command.
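The two limits just stated are easy to check when assembling a program chain. The helper below is a minimal sketch; its name and the choice to pass plain counts rather than command objects are assumptions for illustration.

```python
MAX_PGC_COMMANDS = 128  # pre + cell + post commands per program chain
MAX_BUTTONS = 36        # buttons, each with at most one command

def pgc_within_limits(pre: int, cell: int, post: int, buttons: int = 0) -> bool:
    """True if the command and button counts respect both limits."""
    if min(pre, cell, post, buttons) < 0:
        raise ValueError("counts must be non-negative")
    return pre + cell + post <= MAX_PGC_COMMANDS and buttons <= MAX_BUTTONS

print(pgc_within_limits(pre=10, cell=100, post=18))  # exactly 128 commands
print(pgc_within_limits(pre=10, cell=100, post=19))  # one command too many
```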
Example of a PGC transition
[taken from the Hitachi overheads]
3 quiz problems are presented to the user. Each quiz problem/question is coded as a separate program chain. One of the questions prompts the user for a "Yes" or "No" answer.
The Link command is used to branch from the original top-level menu to one of the three program chains. The Set command is used to tally a score. Finally, the CompareLink command (which consists of two commands, Compare & Link) branches to a particular Program depending on the user's answer.
File Structure Hierarchy
The DVD is broken into two separate types of information: Navigation Data (control) and Presentation (object) data. Control data acts as pointers (like an operating system's File Allocation Table) to the actual video and audio object data on the disc.
In the DVD reference player model, Presentation and Navigation data packets are separated at the track buffer.
Control data can be expressed as a series of nested layers:
Title: Distinguishes multiple movies or TV episodes on one disc. Each title is one of two types: a single program chain (One_Sequential_PGC_Title) or a collection of different program chains (Multi_PGC_Title).
Program Chain: A collection of programs with, e.g., a particular theme in common.
Part_of_Title: Links to one or more Program (PG) units on the disc. Like PGC, this mechanism can be used to create different versions (camera angle, ratings, outcomes, etc.) of the same program chain. POTs can also be used to mark scenes.
Program: Usually a scene. Consists of multiple cells.
Cell: A navigation packet followed by alternating video and audio packets. A cell is typically all the video and audio data associated with an integer number of groups of pictures.
VOBU: Video Object Unit: nominally a group of pictures (GOP).
GOP: 1. Smallest granularity of random access on disc (groups of pictures begin with a coded Intra frame). 2. Largest interframe dependent coding unit (interframe compression is bounded within a GOP). Usually 15 display frames of data (0.5 seconds duration) for NTSC-rate (525/60) content.
Packet: DVD packets are 2048 bytes (sector payload size) large. As per MPEG-2 PES/Program streams, they contain data from only one elementary stream (video, audio, etc.).
NAV packet: Contains the optional Button-Command defining the playback behaviour of the current cell.

1. Logical structure of Video Manager and Video Title Set [notes from Hitachi]
=========================================================

A disc volume may contain up to 99 different titles, each with an initial Navigation Menu allowing the user to select among different versions of the title. The root menu which branches to all titles on the disc is coded within the Video Manager. Each title is organized as a Video Title Set (VTS).
DVD: [VM][VTS #1][VTS #2] ..... [VTS #n] where n<=99
The VM's VMGI includes: Attributes for the Menu, Title Search Pointers, and the PGCI for the Menu.
VM: [VMGI][VOBS for Menu][Back up for VMGI]
The Control Data (VTSI) for the title (VTS) includes: Attributes for Menu, Attributes for Title, Part of Title Search Pointer, Time Map Table, PGCI for Menu, and PGCI for Title. The Video Objects (VOBS) contain the actual program chains, Part_of_Titles, programs, and so forth.
VTS: [VTSI][VOBS for Menu][VOBS for Title][Back up for VTSI]
Legend:
VM    Video Manager: sets up menus for a series of titles (1 through n)
VTS   Video Title Set: a collection of video objects
VMGI  Video Manager Information
VOBS  Video Object Set
PGCI  Program Chain Information
Structure of Title
A title begins with the entry program chain (Entry PGC). It can branch to a single program chain (One_Sequential_PGC_Title) or multiple program chains (Multi_PGC_Title). The location of the branch is determined by the link condition.
Structure of a Program Chain (PGC)
The program chain is broken into two separate entities:
program chain information (PGCI)
video object (VOB)
The PGCI defines the playback order of Programs by acting as a table of addresses which point to the sector locations of the program cells on the DVD. A program cell is essentially a group of pictures (GOP), spanning multiple sectors, and contains the actual interleaved packets of compressed bits for video and audio data.
Part_of_Title (PTT)
The Part_of_Title divides a title into a maximum of 99 different pieces. The intent of the PTT is to aid in the construction of multiple versions of the same title.

One_Sequential_PGC_Title: The Part_of_Title and Program numbers are synchronized.

[ PTT #1  | PTT #2  | .... | PTT #n  ]    Part_of_Title
[ [PG #1] | [PG #2] | .... | [PG #n] ]    Program Chain (PGC)
Multi_PGC_Title:
branch PTT #2 --> [PG #1] (PGC1) PTT #3 PTT #m PTT #1 --> [PG #1] [PG #j] ... [PG #k] (PGC2) [PG #1] --> [PG #1] (PGC3)
Subpictures
Run-length compressed bitmaps that are overlaid on top of the MPEG reconstructed video.
Applications include: menus, sub-titles, karaoke, and simple animation.
Pixels are divided into four types: 1. Background 2. Foreground 3. Emphasis-1 4. Emphasis-2
4 colors out of a 16 color palette (4 colors are determined once per PGC).
4 out of 16 contrast values.
Up to a maximum of 32 sub-picture bitstreams. Each subpicture stream could, for example, contain text from a particular language.
Subpicture buffer size is restricted to 62 Kbytes. This means a maximum of 62 KB per GOP/cell. 32 Kbytes of this is control data.
Maximum number of bits per run-length coded line is 1440 bits.
Display area maximum: 720x480 (525/60) and 720x576 (625/50).
Area, content, color, and contrast can be changed every video field.
Sub-Picture Display Control Sequences (SP_DCSQ) control the presentation of sub-pictures. Presentation effects include: scroll up/down, fade in/out, etc.
Structure of Sub-picture Decoding Unit (SPU):
[ SPUH ][ PXD ][ DCSQT ]
SPUH: Sub-picture Unit Header (size of SPU, start address of DCSQT)
PXD: Pixel Data (variable length run-length coded)
DCSQT: Display Control Sequence Table (one or more display control command sequences).
DCSQT: [DCSQ 0][DCSQ 1][DCSQ 2] ... [DCSQ n]
DCSQ: [Start time] [ Pointer to next DCSQ] [Command Sequence]
Command Sequence: [DCC 0][DCC 1]... [DCC m]
Display Control Commands (DCC):
Set start address in PXD
Set colors
Set contrast
Set SP screen position
Start/stop display
Set CHG_COLCON areas
VBI Decoding
The Vertical Blanking Interval (VBI) packet (multiplexed at the Cell level along with Navigation, Video, and Audio packets) contains information which is directly inserted into the reconstructed video signal, sans level adjustments (16 levels into a, e.g. 256 nominal level video signal).
Only 1 VBI channel per program (sub-pictures have up to 32).
Line range is from 10 to 23 for NTSC and 6 to 23.5 for PAL.
Separate palette (16 Y values, Cr=Cb=128) from subpictures.
No highlight.
Restricted DCSQ command set.
Closed captioning for NTSC-rate (525/60) is coded exclusively in the user_data() field of the group_of_pictures_header().
VBI information is losslessly represented as a waveform, and coded into packets. The 525/60 player uses a far more efficient alternative: the source character stream is coded in the MPEG video user_data() field. The NTSC/PAL modulator chip then creates the VBI closed caption (Line 21) signal from this character stream.
This brings our tally of closed caption representations to THREE ways!!
as packets of 16-level sampled VBI waveforms (PAL)
as user_data() character streams (NTSC)
as rendered subpictures (NTSC and PAL)
Video Data Specifications
DVD adds many additional restrictions to the popular compliance parameter sets of MPEG. One good example is the restriction on the coded size of a picture: MPEG-2 Main Profile @ Main Level allows any coded frame size between 16 and 720 pixels horizontally and 16 and 576 pixels vertically. DVD, however, restricts the coded frame sizes to a very limited but practical subset.
In MPEG, audio can be coded at a sample rate of 32, 44.1 or 48 kHz. In DVD, the rates of both Dolby AC-3 and MPEG audio are strictly set to 48 kHz.
MPEG is a generic representation meant for a wide variety of applications. DVD has taken a practical subset to promote interoperability by simplifying implementations and ensuring features (such as random accessibility).

Coded representation  MPEG-1 (SIF combo)
                      MPEG-2 (Main Profile @ Main Level)
Frame rate            29.97 or 25 Hz
TV system             525/60 or 625/50
Aspect ratio          4:3 (all video formats)
                      16:9 (all formats except 352 pixels/line)
Display mode          pan & scan, letterbox
User_data             closed caption
Coded frame sizes     525/60: 720x480, 704x480, 352x480, 352x240
                      625/50: 720x576, 704x576, 352x576, 352x288
                      (MPEG-1 is allowed only in 352x240 or 352x288 resolution)
GOP size              max 36 fields or 18 frames (NTSC)
                      max 30 fields or 15 frames (PAL)
Buffer size           1.835008 Mbits (MPEG-2)
                      max 327680 bits (MPEG-1)
Transfer method       VBR, CBR (MPEG-2)
                      only CBR for MPEG-1
Maximum bitrate       9.8 Mbit/sec
Low_delay             NOT permitted !!!!
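The frame-size and aspect-ratio rows of this table translate directly into a lookup. The sketch below is one hedged way to express them; the names LEGAL_SIZES, frame_size_ok and wide_aspect_allowed are hypothetical, and only the restrictions listed in the table are encoded.

```python
# Legal coded frame sizes per TV system, as listed in the table.
LEGAL_SIZES = {
    "525/60": {(720, 480), (704, 480), (352, 480), (352, 240)},
    "625/50": {(720, 576), (704, 576), (352, 576), (352, 288)},
}
# MPEG-1 is allowed only at the smallest size of each system.
MPEG1_SIZES = {"525/60": (352, 240), "625/50": (352, 288)}

def frame_size_ok(tv_system: str, width: int, height: int, codec: str = "MPEG-2") -> bool:
    """True if (width, height) is a legal coded frame size for the system."""
    if tv_system not in LEGAL_SIZES:
        raise ValueError(f"unknown TV system: {tv_system}")
    if codec == "MPEG-1":
        return (width, height) == MPEG1_SIZES[tv_system]
    return (width, height) in LEGAL_SIZES[tv_system]

def wide_aspect_allowed(width: int) -> bool:
    """16:9 is permitted for all formats except 352 pixels/line."""
    return width != 352

print(frame_size_ok("525/60", 720, 480))
print(frame_size_ok("625/50", 352, 576, codec="MPEG-1"))
```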
Notes:
The frame rate is the intended display frame rate. The number of coded frames in a sequence may vary due to 3:2 pulldown (the DVD MPEG decoder performs this function). The permitted values in DVD are more restrictive than MPEG-2 MP@ML, which also includes 23.976, 24, and 30 frames/sec rates.
Aspect ratio is the display aspect ratio. Only 16:9 and 4:3 are permitted. Note: MP@ML's 2.21:1 is not included.
MP@ML has no GOP size restriction. In fact, the GOP() is considered to be an insignificant layer in MPEG-2. Instead the sequence() layer serves as the most important boundary in the generic MPEG sense.
The MPEG-2 and MPEG-1 vbv_buffer_size limits are the same as MP@ML and Constrained Parameters Bitstreams, respectively.
The maximum bitrate of 9.8 Mbit/sec is more restrictive than MP@ML's 15 Mbit/sec limit. However, the point of diminishing returns (no visual difference between original video and compressed video) is widely known to be around 9 Mbit/sec.
user_data() fields in MPEG video picture headers contain closed captioning (similar to Grand Alliance and DVB methods). See this ATSC (Advanced Television Systems Committee) site for more information: http://www.atsc.org/ [is this the same in DVB, DVD, and ATSC ? ]
For picture sizes, only a very limited set of coded dimensions is legal.
Variable bit rate is permitted only in MPEG-2 streams since the VBV model in MPEG-2 has provisions for it. MPEG-1 was an earlier standard (by two years) and did not develop the VBV model to handle 3:2 pulldown cases.
Contrary to popular belief, all DVD players are required to decode video streams up to 9.8 Mbit/sec for indefinite periods of time. The popular average rates of 3.5 Mbit/sec or 4.7 Mbit/sec are merely canonical figures created by the notion that only single sided, single layer discs will hold feature length films. Should Single Sided, Double Layer discs prevail, the average rate would be almost twice as great.

ALL DVD PLAYERS MUST SUSTAIN A 9.8 MBIT/SEC VIDEO DECODE RATE!!!!!!! Hardwired (Application Specific Integrated Circuits---ASICs) implementations of MPEG-2 MP@ML decoders are generally capable of handling 15 Mbit/sec sustained rates.
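The "canonical" average rates mentioned in the notes come from dividing disc capacity by running time, not from any decoder limit. The sketch below shows that relationship. The capacities used (4.7 billion bytes for a single sided, single layer disc and 8.5 billion bytes for a single sided, double layer disc) are nominal figures assumed here and are not stated in this document; the function names are likewise illustrative.

```python
SINGLE_LAYER_BYTES = 4.7e9  # assumed nominal capacity, single sided single layer
DUAL_LAYER_BYTES = 8.5e9    # assumed nominal capacity, single sided double layer

def playing_time_minutes(capacity_bytes: float, average_mbps: float) -> float:
    """Running time that fits on a disc at a given average total bitrate."""
    return capacity_bytes * 8 / (average_mbps * 1e6) / 60

def average_rate_mbps(capacity_bytes: float, minutes: float) -> float:
    """Average total bitrate that fills a disc in the given running time."""
    return capacity_bytes * 8 / (minutes * 60) / 1e6

minutes = playing_time_minutes(SINGLE_LAYER_BYTES, 4.7)
print(round(minutes, 1), "minutes at 4.7 Mbit/sec on a single layer")
print(round(average_rate_mbps(DUAL_LAYER_BYTES, minutes), 2), "Mbit/sec for the same film on two layers")
```

Under these assumed capacities the dual layer figure is about 1.8 times the single layer one, in line with "almost twice as great", and both remain below the 9.8 Mbit/sec peak that every player must sustain.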
MPEG Display Formats
MPEG-2 video decoder chips have implemented pan & scan for a few years already since it has been a requirement for cable TV and direct broadcast satellite applications. The letterbox requirement (vertical filter) is a relatively new addition to the MPEG decoder universe. The second generation DVD MPEG-2 video decoders will most likely perform "on-chip" sub-picture reconstruction.
             Display 4:3                                 Display 16:9
Source 4:3   No conversion                               horizontal filtering accomplished by TV monitor
Source 16:9  letterbox (vertical filter) or Pan & Scan   No conversion
Note: Letterbox Conversion is a mandatory feature in the DVD Player !!!