When we stream movies online or share video clips through social media, we likely don’t think twice about the underlying technology enabling the transmission of dynamic visual content. However, video compression and encoding technologies are essential for efficient storage and transmission of digital videos.At its core, video compression aims to reduce redundancy and irrelevancy in video data to minimize its size. Since raw uncompressed video contains an immense amount of data, compression discards redundant bits and encodes only the essential data. This compressed bitstream can then be efficiently transmitted or stored.Several techniques are employed to achieve high compression ratios. Spatial redundancy within image frames can be reduced using transform coding techniques that concentrate signal information into fewer coefficients. Motion compensation methods exploit the temporal correlation between consecutive frames – pixels that do not change are not recoded.

The Magic Behind the Pixels

Every moment of video, no matter how brief, is composed of thousands, even millions, of tiny pixels. Each pixel has data—colors, brightness, and more. If we were to store video without compressing it, the file sizes would be enormous, making streaming a slow and cumbersome experience. Enter video codecs.

Psycho-visual optimizations take advantage of the human visual system’s reduced sensitivity to certain visual components. For instance, color information can be subsampled since the eye perceives luminance more sharply than color.After compressing video frames using the best available methods, these frames are assembled into a coded video sequence known as the bitstream. Additional compression occurs by assigning variable-length codes to frequently occurring symbols and patterns.To decode this compact bitstream, video codecs specify compression and decompression processes. Well-known video codecs include H.264, HEVC, VP9 and more. Codecs balance performance in terms of compression efficiency, complexity, latency and visual quality. Newer codecs like HEVC aim to improve compression rates.

Compression Strategies

The decades of research and development into video compression technologies has enabled the streaming and sharing of high definition content we enjoy today. Ongoing improvements focus on supporting emerging applications like high-resolution video, VR and AR content. Efficient video compression will remain crucial for delivering rich visual experiences into the future.

Transform Coding

Motion Compensation

Motion Estimation

Intra Prediction

Inter Prediction

Psychovisual Optimization

Rate Control

Entropy Coding

These techniques reduce spatial, temporal, and statistical redundancy. Combining them allows modern video codecs to achieve high compression ratios. Ongoing research improves efficiency of each strategy.

The Concept of Saliency Mask

A saliency mask is a computer vision technique that attempts to determine the most important regions in an image from a perceptual perspective. The goal is to create a binary mask that highlights the salient or most visually attention-grabbing parts of the scene.Saliency detection models aim to predict where humans would focus when presented with a scene. This is inspired by visual attention mechanisms in the human visual system. Areas of greater visual complexity, contrast, color variation or distinctness tend to attract more attention.

Common approaches to generating saliency masks include:

Saliency masks have applications in areas like image cropping, thumbnail generation, image compression by preserving salient regions at higher quality, and general computer vision understanding of human perception. They provide supplementary information to models on the most relevant parts of the scene.Ongoing research aims to improve accuracy of models in predicting human visual attention across diverse images. Evaluations use datasets of images labeled with human gaze patterns. More robust saliency detection remains an active area in computer vision and psychology.

Here are some key optimization strategies for video compression based on use case:

Short Video

Live Streaming

Real-Time Communications

The constraints and priorities vary – short video favors maximum quality, live streaming smoothness and responsiveness, and real-time communications minimizing end-to-end latency above all else. Codec settings and strategies are tailored to the use case.

Why Compression is Crucial

Video codecs use advanced algorithms to shrink file sizes by identifying and eliminating redundant data. For instance, in a scene where the sky takes up half the frame and remains a consistent shade of blue, rather than storing information for every pixel, the codec will recognize the redundancy and store that section of the frame more efficiently.

The Balancing Act: Quality vs. Size

But it’s not just about making files smaller. It’s about striking the right balance between file size and video quality. Over-compression can result in loss of detail and visual artifacts. Modern codecs like H.265 and AV1 are designed to deliver high-quality videos at a fraction of the original size, but selecting the right settings is key.

The Future is Even Smaller

As we march forward into an era of 4K, 8K, and even virtual reality videos, the demands on data and storage will only grow. Thankfully, the evolution of video codecs is keeping pace. Research into more efficient algorithms and better compression techniques promises to deliver even smaller files with even bigger pictures.

In essence, the complex dance of video compression is an ongoing tango between innovation and necessity. As our world grows increasingly digital, the power of video codecs ensures that our experiences remain seamless, vivid, and deeply engaging.