Guide to Digital compression

History of compression

In the early days of the internet, when analog modems were the only means of connecting to the virtual world and bandwidth was a valuable commodity, the ability to transmit large amounts of information using the fewest possible bits and bytes was paramount. Even today, in the new world of broadband and instant computing, this is still the case.

Audio, video and images inherently take up an enormous amount of space in the digital world of ones and zeros. If one created and transmitted these files at their raw size, they would take too long to send and use up too much valuable bandwidth.

Hence compression techniques came into play, reducing the size of media files so that they can be transmitted more quickly and efficiently. The most famous, and probably the most controversial, of them all is the MPEG-1 Audio Layer 3 format, more commonly known as Mp3.

A single song in its raw format is about 30MB to 50MB, but Mp3 compression can reduce it to barely a tenth of that size. This allowed music files to be shared easily, which led to the boom in peer-to-peer sharing programs and the granddaddy of them all, Napster.

Main types of compression

There are two main terms to remember when dealing with compression: Lossy and Lossless. Both refer to the method of compression used, which, together with the codec of choice, determines the final quality of the file.

Lossless data compression is a class of data compression algorithms that allow the original data to be exactly reconstructed from the compressed data.

Lossy data compression, by contrast, is a method where compressing a file and then decompressing it retrieves a file that may well be different from the original, but is 'close enough' to be useful in some way.

The advantage of Lossy methods over Lossless methods is that a Lossy method will usually produce a much smaller compressed file than any known Lossless method.
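
The difference between the two classes can be sketched in a few lines of Python. This is only an illustration: the lossless half uses the standard zlib module, while the lossy half uses simple quantization (rounding each sample down to a multiple of a step size), which is a stand-in for what real lossy codecs do and not how any particular codec works. The sample values and the step size are made up for the example.

```python
import zlib


def lossless_roundtrip(data: bytes) -> bytes:
    """Compress with zlib, then decompress again."""
    return zlib.decompress(zlib.compress(data))


def lossy_roundtrip(samples: list[int], step: int = 16) -> list[int]:
    """Round each sample down to a multiple of `step`, discarding detail."""
    return [(s // step) * step for s in samples]


# Hypothetical 8-bit sample values, chosen only for illustration.
samples = [0, 3, 18, 37, 52, 70, 85, 101, 120, 127]
raw = bytes(samples)

# Lossless: the reconstructed bytes are identical to the original.
restored = lossless_roundtrip(raw)

# Lossy: the values are close to the original, but not identical.
approx = lossy_roundtrip(samples)
max_error = max(abs(a - b) for a, b in zip(samples, approx))
```

Here `restored` equals `raw` exactly, while every value in `approx` is within one step (16) of the original. Fewer distinct values remain after quantization, which is what lets a lossy method reach smaller sizes.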

Compressed Audio format

Lossy audio format

1. Mp3
Mp3 stands for MPEG-1 Audio Layer 3, and is only one of several ways of converting music into digital files. MPEG (Moving Picture Experts Group) is a family of standards for encoding video and audio using lossy compression, published by the International Organization for Standardization (ISO).

It is based on a psychoacoustic model which recognizes that the human ear cannot hear all the audio frequencies in a recording. Human hearing ranges from about 20Hz to 20kHz and is most sensitive between 2kHz and 4kHz. The extreme compression from a raw file of around 45MB down to an Mp3 is achieved largely by discarding data that represents sound the human ear is unlikely to detect.

When encoding a file into Mp3 format, a variety of bitrates can be set. For instance, an Mp3 encoded at 128 kilobits per second will be of higher quality and larger file size than one encoded at 56 kilobits per second. In other words, the more heavily a file is compressed, the lower its sound quality will be.
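
The relationship between bitrate and file size is simple arithmetic, sketched below. The four-minute song length and the CD-quality parameters (44.1kHz, 16-bit, stereo) are assumptions chosen for the example, and the calculation ignores file headers and tags.

```python
def stream_size_bytes(bitrate_kbps: float, seconds: float) -> float:
    """Size of a constant-bitrate stream: bits per second x duration / 8."""
    return bitrate_kbps * 1000 * seconds / 8


def pcm_size_bytes(sample_rate_hz: int, bits_per_sample: int,
                   channels: int, seconds: float) -> float:
    """Size of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels * seconds / 8


duration = 4 * 60  # assumed song length: 4 minutes, in seconds

raw_size = pcm_size_bytes(44_100, 16, 2, duration)  # 42,336,000 bytes
mp3_128 = stream_size_bytes(128, duration)          # 3,840,000 bytes
mp3_56 = stream_size_bytes(56, duration)            # 1,680,000 bytes

ratio = raw_size / mp3_128  # about 11, i.e. roughly a tenth of the raw size
```

Under these assumptions the raw song is about 42MB and the 128 kilobit version is about 3.8MB.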

2. WMA
Windows Media Audio (WMA) is a proprietary compressed audio file format developed by Microsoft, initially meant to be a competitor to Mp3. As with Mp3, you can set a WMA encoder's compression level to attain smaller file sizes at the cost of lower sound quality.

3. OGG
The development of the OGG standard began in 1993, when it was known as 'Squish'. It is an open source project designed to be free of any patents. The files are backward compatible and can be played with older players as well.

4. AAC
Advanced Audio Coding (AAC) is a lossy data compression scheme intended for audio streams. AAC was designed as an improved-performance codec relative to Mp3. It supports sample frequencies from 8kHz to 96kHz and up to 48 channels, and it is designed so that files sound better and more consistent than Mp3 files at equivalent or lower bitrates.

5. ATRAC
A proprietary compression format from Sony, Adaptive Transform Acoustic Coding (ATRAC) is primarily used in its MiniDisc products and other portable players.

6. AC-3
Developed by Dolby Laboratories, Audio Code Number 3 (AC-3) refers to a multichannel music compression technology.

7. DTS
Created by Digital Theater Systems (DTS), the format was introduced in theaters in 1993 with Steven Spielberg's blockbuster movie Jurassic Park. DTS is compressed at a ratio of about 3:1 from PCM audio recorded at a sample rate of 96 kHz with a 20-bit sample size. DTS is also capable of encoding PCM with a sample rate of up to 192 kHz and a 24-bit sample size, with 8 discrete channels.

Lossless audio format

1. APE
APE files are created by Monkey's Audio, which is considered the best Windows-only lossless audio encoder for archiving music because of its efficient file sizes and encoding/decoding speeds. Unlike lossy methods that permanently discard data to save space, Monkey's Audio compresses audio in a mathematically perfect way, creating bit-for-bit copies.

Therefore, an APE file compressed with Monkey's Audio always sounds the same as the source file, no matter how many times the resulting file is burnt to a CD, ripped and re-encoded.

2. ALAC
Apple Lossless Audio Codec (ALAC) is an audio codec developed by Apple for lossless encoding of music. Supported on the iPod and in QuickTime and iTunes, it is not a variant of AAC but a totally new codec.

3. FLAC
Free Lossless Audio Codec (FLAC) is a multiplatform, open source lossless compression codec. While not as efficient as APE, its primary advantage is its cross-platform support. FLAC has become the preferred lossless format for transferring live music online because its lossless algorithm ensures the highest fidelity to the source material.

Compressed video format

Video compression refers to reducing the quantity of data used to represent digital video images, and is a combination of spatial image compression and temporal motion compensation. Video compression is an example of the concept of source coding in information theory.
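
The temporal half of that combination relies on consecutive frames being largely alike. The sketch below illustrates the idea with plain frame differencing, which is a much simpler technique than the motion compensation used by real codecs. The two one-dimensional 'frames' are made-up brightness values in which a bright object moves one pixel to the right.

```python
def frame_delta(previous: list[int], current: list[int]) -> list[int]:
    """Store only the per-pixel change from the previous frame."""
    return [c - p for p, c in zip(previous, current)]


def apply_delta(previous: list[int], delta: list[int]) -> list[int]:
    """Rebuild the current frame from the previous frame and the delta."""
    return [p + d for p, d in zip(previous, delta)]


# Hypothetical frames: a two-pixel bright object shifts right by one pixel.
frame1 = [10, 10, 10, 200, 200, 10, 10, 10]
frame2 = [10, 10, 10, 10, 200, 200, 10, 10]

delta = frame_delta(frame1, frame2)   # mostly zeros
rebuilt = apply_delta(frame1, delta)  # identical to frame2
```

Because most entries of `delta` are zero, it compresses far better than the full second frame would. Motion compensation goes a step further by describing the change as 'this block moved one pixel to the right'.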

Lossy video format

1. MPEG
The Moving Picture Experts Group (MPEG, pronounced M-peg) is a working group charged with the development of video and audio encoding standards. Using lossy data compression, samples of picture or sound are taken, chopped into small segments, transformed into a frequency space, and quantized. There are several MPEG standards (as shown below), each offering a different level of picture and sound quality.

MPEG-1 - Initial video and audio compression standard. Later used as the standard for video CD, its output quality is comparable to VHS.

MPEG-2 - Transport, video and audio standards for broadcast-quality television. With some modifications, it is also the coding format used by standard commercial DVD movies.

MPEG-4 - Expands MPEG-1 to support video/audio 'objects', 3D content, low bitrate encoding and Digital Rights Management. Primarily used for web-streaming and broadcast television.
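
The 'transformed into a frequency space, and quantized' step described for MPEG above can be sketched with a one-dimensional discrete cosine transform (DCT). This is a toy illustration, not the MPEG algorithm itself: the eight sample values and the quantization step of 10 are made up, and real encoders work on two-dimensional blocks with tuned quantization tables followed by entropy coding.

```python
import math


def dct(x: list[float]) -> list[float]:
    """Orthonormal DCT-II: time/space samples -> frequency coefficients."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


def idct(c: list[float]) -> list[float]:
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1 / n)
        s += sum(c[k] * math.sqrt(2 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


def quantize(coeffs: list[float], step: int) -> list[int]:
    """Lossy step: keep only the nearest multiple of `step` for each value."""
    return [round(c / step) for c in coeffs]


def dequantize(levels: list[int], step: int) -> list[float]:
    return [v * step for v in levels]


block = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical segment of samples
step = 10                                 # assumed quantization step

levels = quantize(dct(block), step)
restored = [round(v) for v in idct(dequantize(levels, step))]
max_error = max(abs(a - b) for a, b in zip(block, restored))
```

After quantization most of the higher-frequency entries in `levels` are small integers or zero, which is what makes them cheap to store. The restored samples differ slightly from the originals, and a larger step gives a smaller file with a larger error.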

2. XviD
XviD is an open source MPEG-4 video codec released under the GPL. Originally based on OpenDivX, XviD was started by a group of volunteer programmers after the OpenDivX source was closed.

3. DivX
DivX is a video codec created at DivX Networks Inc., known for its ability to compress lengthy video segments into small sizes. A typical feature-length movie on DVD is around 5GB to 6GB in size. With DivX, a movie can be compressed to around 700MB with minimal loss in quality, which is small enough to fit on a CD. Most videos you download in AVI format are encoded in DivX.

4. WMV
Windows Media Video (WMV) is a generic name for the set of proprietary streaming video technologies from Microsoft. The video stream is often combined with an audio stream of Windows Media Audio (WMA).

Lossless video format

Video compression is necessary for the efficient coding of video data in video file formats and streaming video formats. However, the methods used usually achieve their savings by discarding data. So while lossless video compression is possible, in practice it is virtually never used. Video data contains a great deal of spatial and temporal redundancy, and all standard methods of reducing the video data rate involve removing data.

Compressed image formats

Lossy image formats

1. JPEG
Joint Photographic Experts Group (JPEG) is the most common format used for storing and transmitting photographs in the digital world. JPEG itself specifies only how an image is transformed into a stream of bytes. A further standard, created by the Independent JPEG Group and called JPEG File Interchange Format (JFIF), specifies how to produce a file suitable for computer storage and transmission.

The JPEG format supports the RGB, CMYK and grayscale colour spaces. However, even though it is the most common format used for storing and transmitting photographs on the internet, it is not as well suited to line drawings and other textual or iconic graphics, because its compression method performs badly on these types of images.

2. JPEG 2000
JPEG 2000 is a wavelet-based image compression standard. It was created by the same group of people who came up with the JPEG standard, with the intention of superseding the original. JPEG 2000 is able to operate at higher compression ratios without generating the characteristic 'blocky and blurry' artifacts of the original JPEG standard. However, it has yet to become popular.

Lossless image formats

1. TIFF
Tagged Image File Format (TIFF) files come in several versions, but the most common is the uncompressed TIFF. Unlike JPEG, which compresses (and throws away data) on every re-save, re-saving a TIFF image does not result in any data loss or quality degradation. It is also the most common file format found in imaging applications, and is popularly used in the printing industry. Furthermore, it is supported by both the PC and Mac.

2. GIF
Graphics Interchange Format (GIF) is a bitmap image format that is widely used on the World Wide Web, both for still images and for animations. First introduced in 1987, GIF is palette based, and the maximum number of colours available for each frame is only 256.
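
'Palette based' means that each pixel stores a small index into a table of colours and not the colour itself. The sketch below shows the idea only; it does not read or write actual GIF files, and the function name and pixel values are hypothetical.

```python
def to_indexed(pixels: list[tuple[int, int, int]], max_colours: int = 256):
    """Split RGB pixels into a colour table and one index per pixel."""
    palette: list[tuple[int, int, int]] = []
    index_of: dict[tuple[int, int, int], int] = {}
    indices: list[int] = []
    for colour in pixels:
        if colour not in index_of:
            if len(palette) == max_colours:
                raise ValueError("image uses more colours than the palette allows")
            index_of[colour] = len(palette)
            palette.append(colour)
        indices.append(index_of[colour])
    return palette, indices


red, blue = (255, 0, 0), (0, 0, 255)
palette, indices = to_indexed([red, blue, red, red])
# palette -> [(255, 0, 0), (0, 0, 255)]
# indices -> [0, 1, 0, 0]
```

With at most 256 entries, every index fits in a single byte, compared with three bytes for a full RGB value. An image with more than 256 distinct colours has to be reduced to a palette first, which is where GIF loses colour detail.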

3. PNG
Portable Network Graphics (PNG) is a relatively new bitmap image format that is becoming popular on the internet and elsewhere. It was largely developed to deal with some of the shortcomings of the GIF format, and it addresses several issues that plague GIF, such as colour depth and patent issues.

PNG uses a non-patented lossless data compression method known as DEFLATE. This method is combined with prediction: for each image line, a filter method is chosen that predicts the colour of each pixel based on the colours of previous pixels, and subtracts the predicted colour from the actual colour.
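
The effect of prediction can be sketched with the simplest case, where each pixel is predicted to equal the pixel to its left and only the difference is stored. This is a minimal sketch that assumes one greyscale byte per pixel and a made-up gradient row; it applies the filter to a single row and compresses it with Python's zlib module, and it does not produce a valid PNG file.

```python
import zlib


def sub_filter(row: bytes) -> bytes:
    """Replace each byte with its difference from the previous byte (mod 256)."""
    out = bytearray()
    previous = 0
    for value in row:
        out.append((value - previous) % 256)
        previous = value
    return bytes(out)


def sub_unfilter(filtered: bytes) -> bytes:
    """Undo sub_filter() by adding the differences back up."""
    out = bytearray()
    previous = 0
    for diff in filtered:
        previous = (previous + diff) % 256
        out.append(previous)
    return bytes(out)


row = bytes(range(200))      # hypothetical smooth gradient: 0, 1, 2, ...
filtered = sub_filter(row)   # becomes 0, 1, 1, 1, ...

plain_size = len(zlib.compress(row))
filtered_size = len(zlib.compress(filtered))
```

The gradient contains no repeated bytes, so it compresses poorly on its own. After filtering it is almost entirely the value 1, which compresses to a fraction of the size, and `sub_unfilter` recovers the original row exactly, so the whole process stays lossless.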

4. BMP
BMP is a bitmapped graphics format used internally by the Microsoft Windows graphics subsystem, and commonly used as a simple graphics file format on that platform. Images are generally stored with 2 (1-bit), 16 (4-bit), 256 (8-bit), 65,536 (16-bit) or 16.7 million (24-bit) colours.

8-bit images can also be grayscale instead of colour. Although BMP files are usually not compressed, they can be compressed using the lossless RLE algorithm. A BMP image can be quite large depending on its colour depth and size. A true colour bitmap of 800x600 resolution can occupy as much as 1.5 MB, making such files unsuitable for use on websites and for transfer over slow connections.
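
Both the size figure and the idea behind RLE (run-length encoding) can be sketched briefly. The run-length scheme below is a generic (count, value) version for illustration and is not the exact byte layout used inside BMP files; the size calculation covers pixel data only and leaves out the file header.

```python
def bitmap_pixel_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Uncompressed pixel data size; BMP pads each row to a multiple of 4 bytes."""
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    return row_bytes * height


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Collapse runs of identical bytes into (count, value) pairs."""
    runs: list[tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][1] == value and runs[-1][0] < 255:
            runs[-1] = (runs[-1][0] + 1, value)
        else:
            runs.append((1, value))
    return runs


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Expand (count, value) pairs back into the original bytes."""
    return bytes(value for count, value in runs for _ in range(count))


size = bitmap_pixel_bytes(800, 600, 24)        # 1,440,000 bytes
runs = rle_encode(b"\x00" * 5 + b"\xff" * 3)   # [(5, 0), (3, 255)]
```

The 800x600 true colour image works out to 1,440,000 bytes of pixel data, or roughly 1.4 MB. RLE pays off on images with long runs of a single colour, such as icons and diagrams, and does little for photographs where neighbouring pixels rarely match exactly.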

