Visual data, i.e., images and videos, are at the heart of current research and applied topics. Their acquisition, processing, storage, and retrieval represent a major challenge for present-day systems because of their sheer volume. Compression has been, and remains, an essential way to cope with visual data whose size usually exceeds the transmission capacity of networks and the storage capacity of devices (e.g., hard drives) and platforms (e.g., clouds). Here, I introduce visual data compression, one of my main research topics.
Visual data compression is a multidisciplinary field at the crossroads of many innovative applications and domains. It spans nearly every image- and video-centric industry where information must be both compact and relevant. From education to healthcare, agriculture to manufacturing, and beyond, stakeholders rely on compression to store and communicate their visual data. Such data are considered a gold mine of information and knowledge for current research and engineering themes. It is therefore necessary to develop efficient compression algorithms that preserve the image regions that are semantically important for each specific application domain.
Compression is a vast and very active field that has developed considerably over the last decades as one of the main branches of Information and Communication Technologies (ICT). Whatever the field of application, it addresses the sheer volume of visual data: it dramatically reduces the number of bytes, which saves storage capacity, speeds up data transmission and retrieval, and lowers the cost of storage and communication hardware. Compression codecs are indeed experiencing unprecedented growth, driven by ever-increasing image and video traffic demands.
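As a rough illustration of how fewer bytes translate into shorter transfers, here is a minimal sketch. It rests on several assumptions of mine: the image is a synthetic grayscale gradient, the link rate is a hypothetical low-rate IoT link, and the compressor is Python's general-purpose lossless `zlib`, used only because it ships with the standard library. Dedicated image codecs work differently and are usually lossy, so the figures this produces say nothing about real codecs on natural images.

```python
import zlib

WIDTH, HEIGHT = 256, 256

# Synthetic 8-bit grayscale image: a smooth horizontal gradient.
# Illustrative data only; natural images compress far less easily.
raw = bytes((x * 255) // (WIDTH - 1) for _ in range(HEIGHT) for x in range(WIDTH))

# Lossless general-purpose compression at the highest effort level.
compressed = zlib.compress(raw, 9)

# Size of the original divided by size of the compressed stream.
ratio = len(raw) / len(compressed)

# Hypothetical link rate, chosen only to make the numbers tangible.
LINK_BITS_PER_SECOND = 50_000


def transfer_seconds(n_bytes, bits_per_second):
    """Idealized transfer time: payload bits divided by link rate."""
    return n_bytes * 8 / bits_per_second


print(f"raw: {len(raw)} bytes, compressed: {len(compressed)} bytes")
print(f"compression ratio: {ratio:.1f}x")
print(f"raw transfer: {transfer_seconds(len(raw), LINK_BITS_PER_SECOND):.2f} s")
print(f"compressed transfer: {transfer_seconds(len(compressed), LINK_BITS_PER_SECOND):.2f} s")
```

The transfer time scales linearly with the payload size, so in this idealized model any compression ratio carries over directly to the time spent on the link.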
Beyond bytes: Trends to defy
Compression in itself is not a new concept. What is new and ever-changing are the emerging trends that compression has to contend with. Research projects aim to address the topical challenges raised by current architectures, applications, use cases, image and video formats, communication networks, scaling, devices, and so on. This is all the more true with the emergence of the ever-evolving Internet of Things (IoT), where storage nodes (sensors, phones, ...) and transmission links (LoRaWAN, 5G, ...) are increasingly constrained in terms of computing power, memory, and energy consumption. Faced with this persistent need for compression, a multitude of standardization groups, such as the Moving Picture Experts Group (MPEG), aim to develop image and video compression standards that adapt well to the volume and heterogeneity of data (content-aware codecs) and to the diversity of current and future applications (task-aware codecs). In this sense, MPEG has launched calls for proposals for compression standards targeting different video formats, e.g., multi-view and 3D videos, using several approaches, e.g., scalability and online streaming. Lately, MPEG's interest has also turned towards Video Coding for Machines (VCM) which, unlike the compression of videos aimed at human consumption, is intended to improve the performance of machines, enabling them to efficiently analyze videos for a multitude of computer vision tasks.
Compression as a sustainability practice
Besides its practical benefits, compression plays a pivotal role in GreenCoding and Green IoT, sustainability practices that all ICT practitioners should adopt. It is noteworthy that the ICT sector is estimated to account for approximately 5-9% of global electricity usage and about 3.5% of CO2 emissions, and that some projections put its share of the world's electrical power at around 21% by 2030. By leveraging compression codecs, data size can often be reduced by more than half, which in turn reduces energy consumption and shortens processing times. It therefore becomes essential to discern when high-quality visual data must be prioritized and when smaller files can accomplish the same objectives.
Conventional compression standards
Over the years, numerous compression standards have emerged based on conventional transforms, mainly the discrete cosine transform (DCT) and wavelet transforms. Some standards are dedicated to image compression, while others focus on video. For images, several codecs have been designed and optimized by domain experts, such as JPEG (Joint Photographic Experts Group), JPEG 2000, and BPG (Better Portable Graphics). Regarding video compression, I can mention the High Efficiency Video Coding (HEVC) standard, also known as H.265, and its successor, Versatile Video Coding (VVC), or H.266. With various improvements, VVC achieves a bitrate reduction of around 50% compared to HEVC for the same quality. VVC also supports High Dynamic Range (HDR) content and 360° videos. These traditional codecs have long been used in most visual data processing applications. They have been highly successful with the general public thanks to their fast execution, performance, and compatibility with most hardware devices.
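To make the transform idea concrete, here is a minimal sketch of DCT-based block coding, the principle behind JPEG-style codecs. It is an illustration under simplifying assumptions, not an implementation of any standard: the 8x8 block size, the single uniform quantization step, the synthetic gradient block, and all function names are my own choices. Real codecs add per-frequency quantization tables, entropy coding, and much more.

```python
import math

N = 8  # block size (assumption: 8x8, as in JPEG-style coding)


def dct_1d(v):
    """Orthonormal DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        out.append(scale * s)
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            s += scale * c[k] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
        out.append(s)
    return out


def transpose(m):
    return [list(row) for row in zip(*m)]


def apply_2d(block, transform):
    """Apply a 1-D transform to every row, then to every column."""
    rows = [transform(r) for r in block]
    cols = [transform(c) for c in transpose(rows)]
    return transpose(cols)


def compress_block(block, step):
    """Transform the block, then quantize: this rounding is the lossy part."""
    coeffs = apply_2d(block, dct_1d)
    return [[round(c / step) for c in row] for row in coeffs]


def decompress_block(quantized, step):
    """Rescale the quantized coefficients and invert the transform."""
    coeffs = [[q * step for q in row] for row in quantized]
    return apply_2d(coeffs, idct_1d)


# Synthetic smooth block (a 2-D gradient), standing in for real pixel data.
block = [[16 * r + 2 * c for c in range(N)] for r in range(N)]
step = 16  # hypothetical uniform quantization step

quantized = compress_block(block, step)
restored = decompress_block(quantized, step)

nonzero = sum(1 for row in quantized for q in row if q != 0)
max_err = max(abs(block[r][c] - restored[r][c]) for r in range(N) for c in range(N))
print(f"{nonzero} of {N * N} coefficients are nonzero after quantization")
print(f"largest pixel error after reconstruction: {max_err:.2f}")
```

On smooth content, the transform concentrates the signal in a few coefficients, and quantization zeroes out the rest. The gain comes from only needing to store those few values, and a larger step discards more at the cost of a larger reconstruction error.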
New generation of compression methods
Artificial Intelligence has impacted nearly every domain of information processing and communication, and image compression is no exception. In recent years, researchers have been introducing deep learning to image compression, an application at which deep learning is expected to be efficient. The aim is to offer substantially better compression efficiency than available image codecs, with models learned from a large number of images that can efficiently represent the wide variety of visual content available nowadays. Compared to conventional compression standards, this domain has a short history and is still taking its first steps, so it requires further research and extensions. In this sense, the JPEG committee has launched the JPEG AI (JPEG Artificial Intelligence) project, whose aim is to publish a learning-based image compression standard.
Deep learning-based methods have achieved performance comparable to that of classical compression codecs, and have even come to outperform them. However, using deep learning models for image compression makes performance dependent on the nature of the images used for training. For instance, a model trained on images of cars will perform well when compressing an image of a car, but its performance will not be the same when compressing an image of nature. Training on a more general dataset could be a solution, but this requires a significant amount of hardware resources and processing time.