DVD Forum FAQS

 

Audio

Audio data specifications

                            Linear PCM        Dolby AC-3             MPEG-2 audio
Sampling frequency          48 or 96 kHz      48 kHz                 48 kHz
Number of bits per sample   16/20/24          compressed (16 bits)   compressed (16 bits)
Max transfer rate           6.144 Mbit/sec    448 kbits/sec          640 kbits/sec
Max number of channels      8                 5.1                    5.1 or 7.1

 

            NTSC                            PAL
Mandatory   Dolby AC-3 and/or Linear PCM    MPEG-2 audio and/or Linear PCM
Optional    MPEG-2 audio                    Dolby AC-3

 

 

 

Philips provided three practical scenarios for audio.

 

 

 

 

Use                                    Channels      kbits/sec

Case 1: One mono language channel to be mixed with the Center channel of the multichannel set.
  Multichannel music & effects         5.1 or 7.1    384
  Mono English dialogue                1             64
  Mono French dialogue                 1             64
  Mono German dialogue                 1             64

Case 2: One of the stereo language signals mixed with the L & R channels of the playback multichannel set.
  Multichannel music & effects         5.1 or 7.1    384
  Stereo English dialogue              2             128
  Stereo French dialogue               2             128
  Stereo German dialogue               2             128

Case 3: One to be selected for playback.
  Multichannel with English dialogue   5.1 or 7.1    384
  Multichannel with French dialogue    5.1 or 7.1    384
  Multichannel with German dialogue    5.1 or 7.1    384
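To compare the three scenarios, the stream rates in each case can be added up. This is only a sketch: it assumes the streams of one case are multiplexed side by side so that their rates simply add, and the dictionary layout and names are illustrative, not from Philips.

```python
# Per-stream rates in kbits/sec, copied from the three cases above.
# Assumption: all streams of a case are carried together, so the rates add.
scenarios = {
    "Case 1": [384, 64, 64, 64],     # multichannel M&E + three mono dialogue streams
    "Case 2": [384, 128, 128, 128],  # multichannel M&E + three two-channel dialogue streams
    "Case 3": [384, 384, 384],       # three complete multichannel mixes
}

totals = {name: sum(rates) for name, rates in scenarios.items()}

for name, total in totals.items():
    print(f"{name}: {total} kbits/sec")
```

Under that assumption, Case 1 is the cheapest way to carry three languages and Case 3 the most expensive.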

 

 

 

 

Audio Signal Decoding System

 

General

 

 Up to a maximum of 8 audio streams can be multiplexed into the same cell with a single video stream. Each stream is designated, for example, for a particular language or for special effects & music tracks.

 Dolby AC-3 is mandatory for 525/60 (NTSC) players and MPEG-2 audio is mandatory for 625/50 (PAL) players, but both are optional on discs themselves.

 LPCM (Linear Pulse Code Modulation) is mandatory for all players, but optional on discs themselves.

 

 

 48 kHz and 96 kHz uncompressed PCM audio

 High Definition Audio Experience

 

 A 525/60 disc must contain either Dolby AC-3 or LPCM.

 A 625/50 disc must contain either MPEG-2 audio or LPCM. Due to bandwidth efficiency, most titles will use the more compact Dolby AC-3 or MPEG-2 audio.

 Extendibility is reserved for new algorithms such as DTS, Sony SDDS, et al.

 IEC-958 Digital Audio Interface for external decoder/receiver. Output types: compressed AC-3 or MPEG stream, two channel LPCM. DVD players are required only to output a full reconstruction of the Left and Right channels. An external AC-3 decoder would optionally decode all 5.1 channels. A more expensive DVD player would output all 5.1 reconstructed channels.

 

 

Dolby AC-3 parameters

 

 

Sampling frequency   48 kHz
Bitrate              64 kbits/sec to 448 kbits/sec per stream
Audio coding mode    1/0, 2/0, 3/0, 2/1, 2/2, 3/1, and 3/2 (acmod)
Characteristics      dialog normalization
                     dynamic range compression
                     downmixing (5.1 -> 2 channel) capability
                     Dolby Pro-Logic Encoding (5.1 -> 2 channel)
                     Karaoke mode (voice overlay)

 

 

MPEG Audio parameters

 

 

Sampling frequency   48 kHz
MPEG-1               Layer II only
                     Mono (32 to 192 kb/s) and Stereo (64 to 384 kb/s)
MPEG-2               main stream (same as MPEG-1)
                     extension stream (up to 528 kbit/sec)
                     sum of main and extension stream up to 912 kb/s
                     unmatrix mode excluded (always MPEG-1 compatible)

 

 

LPCM Coding

 

 Lossless/uncompressed PCM audio

 Sampling frequency: either 48 kHz or 96 kHz

 bits/sample: 16, 20, or 24 bits

 up to 8 PCM channels.

 

Due to the user rate bandwidth limitation of 6.144 Mbit/sec for any LPCM audio stream, not all combinations of channel count, sample precision and sample rate are permitted. However, up to 8 separate streams are permitted, as long as the combined stream rate is less than or equal to 9.8 Mbit/sec. DVD nomenclature states that a single LPCM stream consists of 1 to 8 channels.

 

Sample Rate   Sample Prec.   Channel Count
(Hz)          (bits)         Mono   2 CH   5 CH   8 CH
48,000        16             Yes    Yes    Yes    Yes
              20             Yes    Yes    Yes    No
              24             Yes    Yes    Yes    No
96,000        16             Yes    Yes    No     No
              20             Yes    Yes    No     No
              24             Yes    Yes    No     No

Source: Pioneer
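The Yes/No entries in the table follow from the 6.144 Mbit/sec per-stream limit. The sketch below assumes that the only constraint is the raw PCM rate (sample rate x bits x channels) and ignores any packet overhead; the function names are illustrative.

```python
# Per-stream LPCM limit quoted in the text: 6.144 Mbit/sec.
MAX_LPCM_STREAM_BPS = 6_144_000


def lpcm_bitrate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw PCM rate in bits/sec (assumption: no packet overhead is counted)."""
    return sample_rate_hz * bits_per_sample * channels


def lpcm_permitted(sample_rate_hz: int, bits_per_sample: int, channels: int) -> bool:
    """True when the raw rate fits within the per-stream limit."""
    return lpcm_bitrate(sample_rate_hz, bits_per_sample, channels) <= MAX_LPCM_STREAM_BPS


# Print the same grid as the table: one row per (sample rate, precision) pair.
for rate in (48_000, 96_000):
    for bits in (16, 20, 24):
        row = ["Yes" if lpcm_permitted(rate, bits, ch) else "No" for ch in (1, 2, 5, 8)]
        print(rate, bits, *row)
```

For example, 48 kHz x 16 bits x 8 channels is exactly 6.144 Mbit/sec and is allowed, while 48 kHz x 20 bits x 8 channels would need 7.68 Mbit/sec and is not.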

 

Basic User interface:

 Control: ten keys and cursor keys

 Display: menu graphics and high-light

 

GUI Display:

 Menu picture with subpicture and MPEG graphics

 highlighted area

------------------------------------------------------------------------

Menu:

 

 

  Basic

 

    1. Title A

    2. Title B

    3. Title C

    4. Previous   5. next

 

  Multi-page Menu

 

    1. Title A          4. Title D              7. Title G

    2. Title B          5. Title E              8. Title H

    3. Title C          6. Title F              9. Title I

    Exit  Next          Prev Exit Next          Prev  Exit

 

------------------------------------------------------------------------

   

 

 

Interactivity

 

 

 Level of functionality

   1. simply play

   2. interactivity similar to Video-CD

   3. interactivity similar to PC applications

 

 

 

 

------------------------------------------------------------------------

 

 

Functions

 

Information Control

 

 parental control

 copy management

 

Menu

 

 Title: sub-picture

 Root: Angle

 Audio: part of title

 

Search functions:

 

 program search

 time search

 angle search

 part of title search

 

 

 

Seamless play function

 

Still picture function

 

Search Functions by User

 

There are 6 search functions defined for DVD. Two are present in most of today's VCRs: the linear style Time Search and Scan (Fast forward, rewind). The other 4 are made possible thanks to the non-linear, random-access playback capability of DVD.

 

User operation (the ability to scan through or play) can be prohibited by content. This is signalled by such attributes as the parental control level. For example, certain Part_of_Titles that contain R-rated (US) scenes can be skipped over.

 

Title Search: User can select the exact title to shuttle to.

Part_of_Title Search: User can go to a specific version (PG-13, R, director's cut, children's version) or camera angle by either title name or number.

Program Search: User can go to a specific scene (car chase, opening credits, gun fight, etc.) within a program chain.

Time Search: User can go to a specific SMPTE style time code (HH:MM:SS:FF) location within a program chain.

Scan: Scan (linearly) forward or backwards in time.

GoUp: Within the current program chain, jump to the next program chain. This command traverses the DVD control information hierarchy.

 

 

 

For Time Searches, all DVD players are required to arrive at the nearest I picture. It is optional that DVD players be capable of arriving at the exact picture (regardless of its picture coding type).

 

Navigation Commands and Parameters

 

The author (content provider) is given the freedom of creating an arbitrary branching structure for a given title. Of course some restraint should be exercised since, thanks to interframe MPEG coding dependencies and physical servo mechanism limitations, a program chain cannot be constructed of 30 pictures/sec of totally randomly located information on the disc.

 

However, the constant DVD transfer rate of 11 Mbit/sec provides some flexibility when the average program rate is kept lower. For example, if the average bit rate is only 5 Mbit/sec, then the player can waste 6 Mbit/sec of potential transfer rate in random access overhead.
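The arithmetic in this example can be written out as a small helper. This is a sketch only: it takes the 11 Mbit/sec figure from the text and adds the assumption that, over the long run, the share of transfer capacity not spent on program data equals the share of time the player can spend seeking. The function name is illustrative.

```python
def seek_budget(transfer_mbps: float, average_program_mbps: float) -> tuple[float, float]:
    """Return (spare rate in Mbit/sec, spare fraction of total transfer capacity)."""
    if average_program_mbps > transfer_mbps:
        raise ValueError("average program rate exceeds the disc transfer rate")
    spare = transfer_mbps - average_program_mbps
    return spare, spare / transfer_mbps


spare_rate, spare_fraction = seek_budget(11.0, 5.0)
print(f"{spare_rate} Mbit/sec spare, about {spare_fraction:.0%} of capacity")
```

The lower the average program rate, the more freedom the author has to scatter cells across the disc.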

 

 

 

Player Settings:

 

There are 24 system parameters for player setting:

 

SPRM  Meaning

0-Menu Description Language Code

1-Audio stream number

2-Sub-picture Stream number

3-Angle Number

4-Title Number

5-VTS title Number

6-Title PGC Number

7-Part of title number for one sequential_PGC_Title

8-Highlighted Button Number

9-Navigation Timer

10-Title PGC number for Navigation Timer

11-Audio Mixing Mode for Karaoke

12-Country Code for Parental Management

13-Parental Level

14-Player Configuration for Video

15-Player Configuration for Audio

16-Initial Language Code for Audio

17-Initial Language Code for Sub-picture

18-Initial Language Code Extension for Sub-picture

19-Initial Language Code for Sub-picture

20-Reserved

21-Reserved

22-Reserved

23-Reserved

 

 

 

General Parameters:

 

Used for interactive operation of titles, such as quizzes, or games.

16 general parameters for navigation. These are RAM variables in the DVD players for use as, e.g., arithmetic scratch pads, counters, etc.

 Arithmetical operations are available (add, compare, etc.)

 

Navigation Commands

 

Each command consists of a single instruction or a combination of two or three instructions.

 

Instruction Groups:

 

Goto        branch between commands

Link        transfer within the same Domain

Jump        transfer between Domains

Compare     recognition of parameter value

SetSystem   player system setting

Set         calculate GPRM (general parameter) values

Letterbox

 

 

Picture Size Conversion

 

All DVD players are required to have built-in vertical filters which scale a 16:9 coded video image to fit within a traditional 4:3 display. This player feature is needed since it is anticipated that a majority of movies will be coded for the 16:9 aspect ratio, while at the same time most TV displays (in the early years) will be 4:3. In the same vein as multilingual audio, a single coded aspect ratio in market distribution reduces confusion and bolsters economy of scale.

 

525/60 (NTSC-rate display):

 

(Note: 480*(4/3)/(16/9) = 480*0.75 = 360)

 

 

 

 

   _____________________

  |  60                 |

  |---------------------|

  |                     |

  |   360               |  480 lines total

  |                     |

  |---------------------|

  |   60                |

  -----------------------

 

 

 

 

 

 

625/50 (PAL-rate display):

 

(Note: 576*(4/3)/(16/9) = 576*0.75 = 432)

 

 

 

 

   _____________________

  |  72                 |

  |---------------------|

  |                     |

  |  432                |  576 lines total

  |                     |

  |---------------------|

  |  72                 |

  -----------------------
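The line counts in both diagrams come from the same ratio. A minimal sketch, using exact fractions and assuming the black bars are split evenly above and below the picture (the function name is illustrative):

```python
from fractions import Fraction


def letterbox_lines(total_lines: int,
                    display_aspect: Fraction = Fraction(4, 3),
                    source_aspect: Fraction = Fraction(16, 9)) -> tuple[int, int]:
    """Return (active picture lines, lines in each black bar)."""
    active = total_lines * display_aspect / source_aspect
    if active.denominator != 1:
        raise ValueError("active height is not a whole number of lines")
    active_lines = int(active)
    # Assumption: the remaining lines are divided equally between top and bottom bars.
    bar_lines = (total_lines - active_lines) // 2
    return active_lines, bar_lines


print(letterbox_lines(480))  # 525/60 display
print(letterbox_lines(576))  # 625/50 display
```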

 

 

 

 

A simple bi-linear vertical filter can be applied, yielding good visual results. Here, two source samples (s[n],s[n+1]) are weighted by simple complementary factors and added together to form the destination sample value (d[m]). These weights are easily implemented with shifters. For interlaced displays, vertical filtering occurs only within the same field parity.

 

 

 

 

     d[0] = (3/4)*s[0] + (1/4)*s[1]

     d[1] = (1/2)*s[1] + (1/2)*s[2]

     d[2] = (1/4)*s[2] + (3/4)*s[3]
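The three equations turn every group of four source lines into three destination lines. The sketch below applies them to one column of samples using integer shifts; the rounding offset and the requirement that the input length be a multiple of four are assumptions of this example, not part of the DVD specification. For interlaced material the input would be the lines of a single field.

```python
# (offset of first source line, weight on s[n], weight on s[n+1]), weights in quarters.
WEIGHTS = ((0, 3, 1), (1, 2, 2), (2, 1, 3))


def letterbox_column(src: list[int]) -> list[int]:
    """Scale a column of samples vertically by 3/4 with the bi-linear weights above."""
    if len(src) % 4 != 0:
        raise ValueError("this sketch expects a multiple of 4 source lines")
    dst = []
    for base in range(0, len(src), 4):
        for offset, w0, w1 in WEIGHTS:
            a = src[base + offset]
            b = src[base + offset + 1]
            # Divide by 4 with a shift; the +2 rounds to nearest (an assumption).
            dst.append((w0 * a + w1 * b + 2) >> 2)
    return dst


print(letterbox_column([0, 4, 8, 12]))
print(len(letterbox_column(list(range(480)))))
```

A flat input stays flat, and 480 input lines yield 360 output lines, matching the 525/60 letterbox height.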

 

 

 

A decoder can determine whether inter- or intra-parity vertical filtering should be applied by testing the progressive_frame flag of the MPEG-2 video stream. (MPEG-1 frames are always progressive by definition.) This flag indicates whether a picture contains interlaced or progressive vertically correlated information. Almost all MPEG-2 coded movies consist exclusively of progressive frames. In a sense, MPEG-2's interlaced prediction modes are underutilized by DVD.

Presentation

 

 

Combinations of presentation types

 

Type          Count                  Representation
Video         1 stream only          MPEG-1 or MPEG-2 Video
Audio         maximum of 8 streams   Linear PCM and/or:
                                     Dolby AC-3 (NTSC)
                                     MPEG audio (PAL)
Sub-picture   max 32 streams         Run-length encoded with bitmap of 2 bits/pixel

 

 

 

Presentation stream rates

 

max total of combined audio and video: 9.8 Mbit/sec

max sum of Elementary streams + systems overhead: 10.08 Mbit/sec

 

Presentation of PGC

 

The program chain (PGC) can be presented either serially (linear) or in random/shuffle (non-linear) fashion.

 

For example, a quiz title should break each question into separate programs. The next program chain branched to would be determined by the answer provided by the user.

 

Still image presentation

 

Still pictures are coded as MPEG intra frames. They may be displayed for indefinite duration. They can be accompanied by background music, or audio can be muted.

 

 

 

 The still function is created by the action of the navigation system.

 The same video frame and sub-picture are frozen (displayed over and over again on the TV) while audio is muted or playing in the background.

 

There are three types of the Still Function:

 

Type         Timing                             Still time in seconds
PGC Still    Stills at end of the PGC           0-254, limitless
Cell Still   Stills at end of the Cell          0-254, limitless
VOBU Still   Stills in every VOBU in the Cell   limitless

 

VOBU: Video Object Unit.

 

Location of each command

 

Within a program chain (PGC), commands can be located at the front of the chain, in between cells of the chain, and at the end of the chain.

 

 

                          Program chain

 

    [Pre-Commands] [Cell] [Cell] [Cell-Command] [Cell] [Post-Commands]

 

  Each cell can have one command.  There is a restriction that

  no more than 128 commands can be contained within a program chain:

 

       Pre-commands + Cell Commands + Post Commands <= 128

 

  Further, there are a maximum of 36 buttons, each of which can

  have one associated command.
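An authoring tool could check these limits before a program chain is written out. A minimal sketch; the function and parameter names are hypothetical and only the two limits come from the text.

```python
MAX_PGC_COMMANDS = 128  # pre-commands + cell commands + post-commands
MAX_BUTTONS = 36        # each button can carry one command


def pgc_within_limits(pre: int, cell: int, post: int, buttons: int = 0) -> bool:
    """True when the command and button counts respect the limits quoted above."""
    if min(pre, cell, post, buttons) < 0:
        raise ValueError("counts cannot be negative")
    return (pre + cell + post) <= MAX_PGC_COMMANDS and buttons <= MAX_BUTTONS


print(pgc_within_limits(pre=10, cell=100, post=18))  # exactly 128 commands
print(pgc_within_limits(pre=10, cell=100, post=19))  # one too many
```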

 

 

 

Example of a PGC transition

 

[taken from the Hitachi overheads]

 

3 quiz problems are presented to the user. Each quiz problem/question is coded as a separate program chain. One of the questions prompts the user for a "Yes" or "No" answer.

 

The Link command is used to branch from the original top-level menu to one of the three program chains. The Set command is used to tally a score. Finally, the CompareLink command (which consists of two commands, Compare & Link) branches to a particular Program depending on the user's answer.
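The flow of this quiz can be mimicked with a toy model. Everything below is hypothetical: the class, the method names, the PGC labels and the choice of general parameter are invented for illustration and do not reproduce the real command encoding. Only the roles of Link, Set and Compare and the count of 16 general parameters come from the text.

```python
NUM_GPRM = 16  # the text gives 16 general parameters
SCORE = 0      # hypothetical choice: general parameter 0 holds the score


class ToyPlayer:
    """Toy stand-in for a player's navigation state (not the real command set)."""

    def __init__(self) -> None:
        self.gprm = [0] * NUM_GPRM
        self.visited: list[str] = []

    def link(self, pgc: str) -> str:
        """Link: transfer to another program chain."""
        self.visited.append(pgc)
        return pgc

    def set_add(self, index: int, amount: int) -> None:
        """Set: arithmetic on a general parameter."""
        self.gprm[index] += amount

    def compare_link(self, index: int, value: int, if_equal: str, otherwise: str) -> str:
        """Compare followed by Link: branch on a parameter value."""
        return self.link(if_equal if self.gprm[index] == value else otherwise)


def run_quiz(answers: list[bool], correct: list[bool]) -> tuple[int, str]:
    """Play three Yes/No questions, then branch on the final score."""
    player = ToyPlayer()
    for number, (given, expected) in enumerate(zip(answers, correct), start=1):
        player.link(f"PGC_question_{number}")
        if given == expected:
            player.set_add(SCORE, 1)
    final = player.compare_link(SCORE, len(correct), "PGC_all_correct", "PGC_try_again")
    return player.gprm[SCORE], final


print(run_quiz([True, False, True], [True, False, True]))
print(run_quiz([True, True, True], [True, False, True]))
```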

 

File Structure Hierarchy

 

The DVD is broken into two separate types of information: Navigation Data (control) and Presentation (object) data. Control data acts as pointers (like an operating system's File Allocation Table) to the actual video and audio object data on the disc.

 

In the DVD reference player model, Presentation and Navigation data packets are separated at the track buffer.

 

Control data can be expressed as a series of nested layers:

 

Title: distinguishes multiple movies or TV episodes on one disc. Each title is one of two types: a single program chain (One_Sequential_PGC_Title) or a collection of different program chains (Multi_PGC_Title).

Program Chain: A collection of programs with, e.g., a particular theme in common.

Part_of_Title: Links to one or more Program (PG) units on the disc. Like PGC, this mechanism can be used to create different versions (camera angle, ratings, outcomes, etc.) of the same program chain. Part_of_Titles can also be used to mark scenes.

Program: Usually a scene. Consists of multiple cells.

Cell: Preceded by a navigation packet, followed by alternating video and audio packets. A cell is typically all the video and audio data associated with an integer number of groups of pictures.

VOBU: Video Object Unit: nominally a group of pictures (GOP).

GOP: 1. smallest granularity of random access on disc (a Group of Pictures begins with a coded Intra frame)

2. largest interframe dependent coding unit. (Interframe compression is bounded within a GOP)

Usually 15 display frames of data (0.5 seconds duration) for NTSC-rate (525/60) content.

Packet: DVD packets are 2048 bytes (sector payload size) large. As per MPEG-2 PES/Program streams, they contain data from only one elementary stream (video, audio, etc.)

NAV packet: contains the optional Button-Command defining the playback behaviour of the current cell.

 

1. Logical structure of Video Manager and Video Title Set [notes from Hitachi]

 

A disc volume may contain up to 99 different titles, each with an initial Navigation Menu allowing the user to select among different versions of the title. The root menu which branches to all titles on the disc is coded within the Video Manager. Each title is organized as a Video Title Set (VTS).

 

 

 

  DVD:   [VM][VTS #1][VTS #2] ..... [VTS #n]       where n<=99

 

 

 

The VM's VMGI includes: Attributes for the Menu, Title Search Pointers, and the PGCI for the Menu.

 

 

  VM:    [VMGI][VOBS for Menu][Back up for VMGI]

 

 

 

 

 

The Control Data (VTSI) for the title (VTS) includes: Attributes for Menu, Attributes for Title, Part of Title Search Pointer, Time Map Table, PGCI for Menu, and PGCI for Title. The Video Objects (VOBS) contain the actual program chains, Part_of_Titles, programs, and so forth.

 

 

  VTS:   [VTSI][VOBS for Menu][VOBS for Title][Back up for VTSI]

 

 

 

 

 Legend:

  VM    Video Manager: sets up menus for a series of titles (1 through n)

  VTS   Video Title Set: a collection of video objects.

  VMGI  Video Manager Information

  VOBS  Video Object Set

  PGCI  Program Chain Information

 

 

 

Structure of Title

 

A title begins with the entry program chain (Entry PGC). It can branch to a single program chain (One_Sequential_PGC_Title) or multiple program chains (Multi_PGC_Title). The location of the branch is determined by the link condition.

 

Structure of a Program Chain (PGC)

 

The program chain is broken into two separate entities:

 program chain information (PGCI)

 video object (VOB)

 

The PGCI defines the playback order of Programs by acting as a table of addresses which point to the sector locations of the program cells on the DVD. A program cell is essentially a group of pictures (GOP), spanning multiple sectors, and contains the actual interleaved packets of compressed bits for video and audio data.

 

Part_of_Title (PTT)

 

The Part_of_Title divides a title into a maximum of 99 different pieces. The intent of the PTT is to aid in the construction of multiple versions of the same title.

 

One_Sequential_PGC_Title: The Part_of_Title and Program numbers are synchronized.

 

 

   [ PTT #1  |  PTT #2  | .... | PTT #n  ]    Part_of_Title

   [ [PG #1] |  [PG #2] | .... | [PG #n] ]    Program Chain (PGC)

 

 

Multi_PGC_Title:

 

             branch   PTT #2

               -->    [PG #1]                            (PGC1)

                                PTT #3      PTT #m

    PTT #1     -->    [PG #1]   [PG #j] ... [PG #k]      (PGC2)

    [PG #1]

               -->    [PG #1]                            (PGC3)

 

 

Subpictures

 

 

 

 

 run-length compressed bitmaps that are overlaid on top of the MPEG reconstructed video.

 Applications include: Menus, sub-titles, karaoke, and simple animation.

 Pixels are divided into four types: 1. Background 2. Foreground 3. Emphasis-1 4. Emphasis-2

 4 colors out of 16 color palette (4 colors are determined once per PGC).

 4 out of 16 contrast values

 up to a maximum of 32 sub-picture bitstreams. Each subpicture stream could, for example, contain text from a particular language.

 subpicture buffer size is restricted to 62 Kbytes. This means a maximum of 62 KB per GOP/cell. 32 Kbytes of this is control data.

 Maximum number of bits per run-length coded line is 1440 bits.

 Display area maximum: 720x480 (525/60) and 720x576 (625/50)

 area, content, color, and contrast can be changed every video field

 Sub-Picture Display Control Sequences (SP_DCSQ) control the presentation of Sub-pictures.

 Presentation effects include: scroll up/down, fade in/out, etc.

 

 

 

  Structure of Sub-picture Decoding Unit (SPU):

 

    [ SPUH   ][        PXD        ][ DCSQT  ]

 

   SPUH:        Sub-picture Unit Header (size of SPU, start address of DCSQT)

 

   PXD:         Pixel Data (variable length run-length coded)

 

   DCSQT:       Display Control Sequence Table (one or more display control

                command sequences).

  

 

   DCSQT:   [DCSQ 0][DCSQ 1][DCSQ 2] ... [DCSQ n]

 

   DCSQ:  [Start time] [ Pointer to next DCSQ] [Command Sequence]

 

   Command Sequence:  [DCC 0][DCC 1]... [DCC m]

 

 

 

 

Display Control Commands (DCC):

 

 Set start address in PXD

 Set colors

 Set contrast

 Set SP screen position

 Start/stop display

 Set CHG_COLCON areas.

 

VBI Decoding

 

 

The Vertical Blanking Interval (VBI) packet (multiplexed at the Cell level along with Navigation, Video, and Audio packets) contains information which is directly inserted into the reconstructed video signal, sans level adjustments (16 levels into a, e.g. 256 nominal level video signal).

 

 

 

 only 1 VBI channel per program (sub-pictures have up to 32)

 Line range is from 10 to 23 for NTSC and 6 to 23.5 for PAL.

 Separate palette (16 Y values, Cr=Cb=128) from subpictures.

 No highlight

 Restricted DCSQ command set

 closed captioning for NTSC-rate (525/60) is coded exclusively in the user_data() field of the group_of_pictures_header()

 

 

 

VBI information is losslessly represented as a waveform, and coded into packets. The 525/60 player uses a far more efficient alternative: the source character stream is coded in the MPEG video user_data() field. The NTSC/PAL modulator chip then creates the VBI closed caption (Line 21) signal from this character stream.

 

This brings our tally of closed caption representations to THREE ways!!

 

 

 

 as packets of 16-level sampled VBI waveforms (PAL)

 as user_data() character streams (NTSC)

 as rendered subpictures (NTSC and PAL)

 

 

Video Data Specifications

 

 

DVD adds many additional restrictions to the popular compliance parameter sets of MPEG. One good example is the restriction on the coded size of a picture: MPEG-2 Main Profile @ Main Level allows any coded frame size between 16 and 720 pixels horizontally and 16 and 576 pixels vertically. DVD, however, restricts the coded frame sizes to a very limited but practical subset.

 

In MPEG, audio can be coded at a sample rate of 32, 44.1 or 48 kHz. In DVD, the rates of both Dolby AC-3 and MPEG audio are strictly set to 48 kHz.

 

MPEG is a generic representation meant for a wide variety of applications. DVD has taken a practical subset to promote interoperability by simplifying implementations and ensuring features (such as random accessibility).

 

Coded representation   MPEG-1 (SIF combo)
                       MPEG-2 (Main Profile @ Main Level)
Frame rate             29.97 or 25 Hz
TV system              525/60 or 625/50
Aspect ratio           4:3 (all video formats)
                       16:9 (all formats except 352 pixels/line)
Display mode           pan & scan, letterbox
User_data              closed caption
Coded frame sizes      525/60: 720x480, 704x480, 352x480, 352x240
                       625/50: 720x576, 704x576, 352x576, 352x288
                       (MPEG-1 is allowed only in 352x240 or 352x288 resolution)
GOP size               max 36 fields or 18 frames (NTSC)
                       max 30 fields or 15 frames (PAL)
Buffer size            1.835008 Mbits (MPEG-2)
                       max 327680 bits (MPEG-1)
Transfer method        VBR, CBR (MPEG-2)
                       only CBR for MPEG-1
Maximum bitrate        9.8 Mbit/sec
Low_delay              NOT permitted !!!!
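The frame size and aspect ratio rules in the table can be expressed as a small lookup. This is a sketch of those rules only; the function name, argument names and string labels are illustrative, and no other DVD constraints are checked.

```python
# Coded frame sizes permitted per TV system, as listed in the table.
ALLOWED_SIZES = {
    "525/60": {(720, 480), (704, 480), (352, 480), (352, 240)},
    "625/50": {(720, 576), (704, 576), (352, 576), (352, 288)},
}
# MPEG-1 is allowed only at the smallest size of each system.
MPEG1_SIZE = {"525/60": (352, 240), "625/50": (352, 288)}


def frame_size_allowed(tv_system: str, width: int, height: int,
                       codec: str = "MPEG-2", aspect: str = "4:3") -> bool:
    """Check a coded frame size against the table's size and aspect rules."""
    if aspect not in ("4:3", "16:9"):
        return False
    if (width, height) not in ALLOWED_SIZES.get(tv_system, set()):
        return False
    if codec == "MPEG-1" and (width, height) != MPEG1_SIZE[tv_system]:
        return False
    # 16:9 is not available for the 352 pixels/line formats.
    if aspect == "16:9" and width == 352:
        return False
    return True


print(frame_size_allowed("525/60", 720, 480, aspect="16:9"))
print(frame_size_allowed("625/50", 352, 576, codec="MPEG-1"))
```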

 

 

 

Notes:

 

 

 

 the frame rate is the intended display frame rate. The number of coded frames in a sequence may vary due to 3:2 pulldown (the DVD MPEG decoder performs this function). The permitted values in DVD are more restrictive than MPEG-2 MP@ML which also includes 23.976, 24, and 30 frames/sec rates.

 aspect ratio is the display aspect ratio. Only 16:9 and 4:3 are permitted. Note: MP@ML's 2.21:1 is not included.

 MP@ML has no GOP size restriction. In fact, the GOP() is considered to be an insignificant layer in MPEG-2. Instead the sequence() layer serves as the most important boundary in the generic MPEG sense.

 The MPEG-2 and MPEG-1 vbv_buffer_size limits are the same as those of MP@ML and Constrained Parameters Bitstreams, respectively.

 The maximum bitrate of 9.8 Mbit/sec is more restrictive than MP@ML's 15 Mbit/sec limit. However, the point of diminishing returns (no visual difference between original video and compressed video) is widely known to be around 9 Mbit/sec.

 user_data() fields in MPEG video picture headers contain closed captioning (similar to Grand Alliance and DVB methods). See this ATSC (Advanced Television Systems Committee) site for more information: http://www.atsc.org/

[is this the same in DVB, DVD, and ATSC ? ]

 For picture sizes, only a very limited set of coded dimensions are legal.

 Variable bit rate is permitted only in MPEG-2 streams since the VBV model in MPEG-2 has provisions for it. MPEG-1 was an earlier standard (by two years) and did not develop the VBV model to handle 3:2 pulldown cases.

 Contrary to popular belief, all DVD players are required to decode video streams up to 9.8 Mbit/sec for indefinite periods of time. The popular average rates of 3.5 Mbit/sec or 4.7 Mbit/sec are merely canonical figures created by the notion that only single sided, single layer discs will hold feature length films. Should Single Sided, Double Layer discs prevail, the average rate would be almost twice as great. ALL DVD PLAYERS MUST SUSTAIN A 9.8 MBIT/SEC VIDEO DECODE RATE!!!!!!! Hardwired (Application Specific Integrated Circuit, ASIC) implementations of MPEG-2 MP@ML decoders are generally capable of handling 15 Mbit/sec sustained rates.

 

 

 

MPEG Display Formats

 

MPEG-2 video decoder chips have implemented pan & scan for a few years already since it has been a requirement for cable TV and direct broadcast satellite applications. The letterbox requirement (vertical filter) is a relatively new addition to the MPEG decoder universe. The second generation DVD MPEG-2 video decoders will most likely perform "on-chip" sub-picture reconstruction.

 

              Display 4:3                                 Display 16:9
Source 4:3    No conversion                               horizontal filtering accomplished by TV monitor
Source 16:9   letterbox (vertical filter) or Pan & Scan   No conversion

 

 

 

Note: Letterbox Conversion is a mandatory feature in the DVD Player !!!