Digital Compression for Multimedia

Principles & Standards

By

  • Jerry Gibson, Professor, University of California, Santa Barbara, CA, USA
  • Toby Berger
  • Tom Lookabaugh
  • Rich Baker
  • David Lindbergh

Drawing on their experience in industry, research, and academia, the authors provide an accessible guide to data compression standards, techniques, and applications. The book presents the essential ideas and motivation behind the various compression methods and offers insight into the evolution of the standards. It discusses standards-compliant design alternatives and also treats some noncompliant designs.

Covering the fundamental underpinnings of the most widely used compression methods, this book is intended for engineers and computer scientists designing, manufacturing, and implementing compression systems, as well as system integrators, technical managers, and researchers. It provides, in a single source, an overview of the current standards for speech, audio, video, image, fax, and file compression.

  • Authored by five experts from industry and academia who are heavily involved in research, development, and standards-setting activities
  • Covers the full spectrum of multimedia compression standards including those for lossless data compression, speech coding, high-quality audio coding, still image compression, facsimile, and video compression
  • Provides enough theory for you to understand the building blocks of the compression systems discussed, with appendices containing necessary algorithmic details and mathematical foundations

Book information

  • Published: January 1998
  • Imprint: MORGAN KAUFMANN
  • ISBN: 978-1-55860-369-1


Table of Contents

Preface

1 Introduction to Data Compression
1.1 Why Compress?
1.2 The Data Compression Problem
1.2.1 Synonyms for Data Compression
1.2.2 Components of a Data Compression Problem
1.2.3 Types of Compression Problems
1.3 Input Source Formats
1.4 Reconstructed Source Quality
1.4.1 Performance Measurement
1.4.2 Perceptual Distortion Measures
1.5 System Issues and Performance Comparisons
1.6 Applications and Standards
1.7 Outline of the Book

2 Lossless Source Coding
2.1 Introduction
2.2 Instantaneous Variable-Length Codes
2.3 Unique Decipherability
2.4 Huffman Codes
2.5 Nonbinary Huffman Codes
2.6 The Kraft Inequality and Optimality
2.7 Group 3 and Group 4 Fax Standards
2.7.1 Group 3 Fax
2.7.2 Group 4 Fax
2.7.3 Noise and Half-Toning
2.8 Line Drawing Compression
2.9 Entropy and a Bound on Performance
2.9.1 Some Inequalities
2.9.2 Entropy
2.9.3 Entropy Lower Bounds Achievable Compression
2.10 Conditional Entropy and Mutual Information
2.11 Entropy Rate of a Stationary Source
2.11.1 Joint Entropy and the Chain Rule
2.11.2 Definitions of Entropy Rate
2.11.3 Shannon-Fano Codes

3 Universal Lossless Source Coding
3.1 Adaptivity and Universality
3.2 Parsing
3.3 LZ Compression
3.3.1 LZ78
3.3.2 LZW
3.3.3 LZY
3.3.4 LZ77
3.4 Elias Coding, Arithmetic Coding, and JBIG Fax
3.4.1 Elias Coding
3.4.2 Arithmetic Coding
3.4.3 The JBIG Fax Standard

4 Quantization
4.1 Introduction
4.2 Scalar Quantization
4.2.1 Uniform Quantization
4.2.2 Nonuniform Quantization
4.2.3 Logarithmic Companding
4.2.4 Adaptive Quantization
4.2.5 Embedded Quantization
4.3 Vector Quantization
4.3.1 VQ Structure, Design, and Performance
4.3.2 Optimal VQ
4.3.3 Structured VQ
4.4 Summary

5 Predictive Coding
5.1 Introduction
5.2 The Linear Prediction Model and Linear Predictive Coding
5.2.1 Coefficient Calculation
5.2.2 Other Parameters
5.2.3 Voiced/Unvoiced Decision and Excitation Signal
5.2.4 Pitch Period Estimation
5.2.5 Excitation Gain
5.2.6 LPC Performance
5.3 Delta Modulation and Differential PCM
5.3.1 Delta Modulation
5.3.2 Nyquist-Sampled Predictive Coders
5.3.3 Short-Term Predictor Adaptation
5.4 Embedded DPCM
5.5 Multipulse Linear Predictive Coding (MPLPC)
5.6 Code Excited Linear Predictive Coding
5.7 Perceptual Weighting and Postfiltering
5.8 Summary

6 Linear Predictive Speech Coding Standards
6.1 Introduction
6.2 ITU G.721/G.726/G.727
6.3 U.S. Federal Standard 1015
6.4 U.S. Federal Standard 1016
6.5 GSM 13-kbps Coder
6.6 TIA 8-kbps VSELP
6.7 TIA QCELP
6.8 LD-CELP, ITU G.728
6.9 ITU G.729
6.10 ITU G.723.1
6.11 JDC (PDC) Full Rate, GSM Half Rate, and JDC Half Rate
6.12 U.S. Federal Standard at 2.4 kbps
6.13 Additional and Forthcoming Standards

7 Frequency Domain Coding
7.1 Introduction
7.2 Subband Coding of Speech
7.2.1 Example 1
7.2.2 Example 2
7.3 Subband Coding of Images
7.4 Transform Coding of Speech and Images
7.4.1 Discrete Transforms
7.5 Wavelet Coding
7.6 Fractal Coding
7.7 Summary

8 Frequency Domain Speech and Audio Coding Standards
8.1 Introduction
8.2 ITU G.722 Wideband Audio and Lower Rate Extensions
8.3 Simultaneous Masking and Temporal Masking in Audio
8.4 High-Quality Audio for Video Standards
8.4.1 MPEG-1 Audio
8.4.2 MPEG-2 Audio
8.4.3 Dolby AC-2 and AC-3
8.4.4 AT&T's Perceptual Audio Coder
8.5 Coding for Audio Storage Devices
8.5.1 DCC PASC Coder
8.5.2 Minidisc ATRAC Coder
8.6 INMARSAT Speech Coder
8.7 Summary

9 JPEG Still-Image Compression Standard
9.1 Introduction
9.2 Baseline JPEG
9.3 Progressive Encoding
9.4 Hierarchical (Pyramidal) Encoding
9.5 Entropy Coding
9.5.1 Example of DCT Coefficient Encoding
9.6 Image Data Conventions
9.7 Lossless Encoding Mode
9.8 Summary

10 Multimedia Conferencing Standards
10.1 Introduction
10.2 H.320 for ISDN Videoconferencing
10.2.1 The H.320 Standards Suite
10.2.2 H.221 Multiplex
10.2.3 System Control Protocol
10.2.4 Audio Coding
10.2.5 Video Coding
10.2.6 H.231 and H.243--Multipoint
10.2.7 H.233 and H.234--Encryption
10.2.8 H.224 and H.281--Real-Time Far-End Camera Control
10.2.9 H.331 Broadcast
10.3 H.320 Network Adaptation Standards: H.321 and H.322
10.3.1 H.321--Adaptation of H.320 to ATM and B-ISDN
10.3.2 H.322--Adaptation of H.320 to IsoEthernet
10.4 A New Generation: H.323, H.324, and H.310
10.4.1 H.245 Control Protocol
10.4.2 Audio and Video Codecs
10.4.3 H.323 for Packet Switched Networks
10.4.4 H.324 for Low-Bit-Rate Circuit Switched Networks
10.4.5 H.310 for ATM and B-ISDN Networks
10.5 T.120 Data Conferencing and Conference Control
10.5.1 T.120 Infrastructure
10.5.2 T.120 Application Protocols
10.6 Delay in Multimedia Conferencing Systems
10.6.1 Sources of Audio Delay
10.7 Summary

11 MPEG Compression
11.1 Introduction
11.2 The MPEG Model
11.2.1 Key Applications and Problems
11.2.2 Strategy for Standardization
11.2.3 Parts of the MPEG-1 and MPEG-2 Standards
11.3 MPEG Video
11.3.1 The Basic Algorithm
11.3.2 Temporal Prediction
11.3.3 Frequency Domain Decomposition
11.3.4 Quantization
11.3.5 Variable-Length Coding
11.3.6 Syntactical Layering in MPEG
11.3.7 Rate Control
11.3.8 Constrained Parameters, Levels, and Profiles
11.4 MPEG Audio
11.4.1 Layers
11.4.2 The Basic Algorithm
11.4.3 Subband Decomposition
11.4.4 Scaling, Quantization, and Coding
11.4.5 Multichannel Compression
11.5 MPEG Systems
11.5.1 Timing
11.5.2 System and Program Streams
11.5.3 Transport Streams
11.5.4 Packetized Elementary Stream (PES) and MPEG-1 Packets
11.5.5 Program-Specific Information
11.6 More MPEG
11.6.1 MPEG-4
11.6.2 Digital Storage Media Command and Control
11.6.3 Advanced Audio Coding
11.6.4 The Professional or 4:2:2 Profile
11.7 Summary

Appendix A - Speech Quality and Intelligibility
A.1 Introduction
A.2 Phases of Speech Coder Evaluation
A.3 Informal Tests
A.3.1 Objective Measures
A.3.2 Subjective Tests
A.4 Formal Tests
A.4.1 Intelligibility
A.4.2 Quality
A.5 Important Considerations

Appendix B - Proof That Huffman Codes Minimize

Appendix C - Proof That Every UD Code Satisfies the Kraft Inequality

Appendix D - Behavior of Approximations to Entropy Rate

Appendix E - Proof of Forward March Property for LZY

Appendix F - Efficient Coding of Lk for LZ77

References

Glossary

Index

About the Authors