SVCL - ROI

Image compression using Object-based Regions of Interest

Because many important communication channels (e.g. wireless, underwater, low-power camera networks) have very limited bandwidth, video communication over them requires substantial compression of the video signal. Current compression paradigms are unlikely to meet the challenge posed by this very narrowband scenario efficiently: while advances continue to be possible in MPEG-style coding, they are mostly incremental and will not enable a drastic increase in compression ratios. One of the most promising answers to this challenge is to adopt a new compression paradigm that, unlike MPEG, relies heavily on scene understanding. Of special interest is the possibility of producing video of increased subjective quality within current bandwidth ranges.

One path towards this goal is to exploit the well-known fact that viewers assign different penalties to loss of video fidelity in different image areas. Typically, the magnitude of these penalties is correlated with the importance, to the viewer, of the different objects that compose the scene: while loss of fidelity in the reproduction of a talking face is likely to be highly annoying to users of a cell-phone-based video-conferencing system, tree leaves moving in the background tend to be considered irrelevant (and can therefore be severely compressed). In this sense, the ability to identify the regions of interest to the viewer, and to allocate more bits to them, is likely to enable substantial subjective compression gains, especially for bandwidth-starved applications.
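As a rough illustration of allocating more bits to regions of interest, the sketch below splits a fixed bit budget across image blocks in proportion to a per-block saliency score. It is a minimal sketch under stated assumptions, not the project's algorithm: the function name `allocate_bits`, the per-block scores in [0, 1], and the `background_weight` floor are all hypothetical.

```python
def allocate_bits(saliency, total_bits, background_weight=0.25):
    """Split a bit budget across blocks in proportion to saliency.

    saliency: per-block scores in [0, 1] (hypothetical input format).
    background_weight: floor applied to every block, so that background
        blocks are compressed heavily but still receive some bits.
    """
    if not saliency:
        return []
    weights = [max(s, background_weight) for s in saliency]
    total_weight = sum(weights)
    # Proportional share of the budget for each block, rounded down.
    bits = [int(total_bits * w / total_weight) for w in weights]
    # Rounding down leaves a few bits unassigned; hand them out one at a
    # time, starting with the most salient blocks.
    leftover = total_bits - sum(bits)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits


# One salient block (e.g. a face) and two background blocks.
budget = allocate_bits([1.0, 0.0, 0.0], total_bits=1200)
```

A real codec would express the same preference through its own mechanisms, such as per-block quantizer settings or the region-of-interest coding supported by the target standard, rather than through an explicit per-block bit count.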

A number of recent advances in object recognition make this path worth exploring. One possibility is to formulate the problem as one of discriminant saliency detection. A universe of objects of interest is defined, and salient regions are determined by the attributes that best distinguish the object classes of interest from random background scenes. Detectors of these attributes are then trained on collections of example images gathered through simple web search. Salient regions are identified as the regions of the video to be compressed where these detectors respond strongly. Recent advances enable the design of the components of such saliency detectors (feature selection, identification of salient responses, and detection of objects) with small numbers of training examples and significant computational efficiency. This makes it possible to support applications where the objects of interest are completely user-defined.
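The feature-selection step can be illustrated with a small sketch that ranks candidate features by how differently they respond on object examples and on background examples. The ranking criterion used here, a symmetric Kullback-Leibler divergence between response histograms, is an assumption chosen for illustration; the text does not state which criterion the project uses. The function names and the dictionary-of-responses input format are also hypothetical.

```python
import math


def histogram(values, bins, lo, hi):
    """Normalized histogram with add-one smoothing (avoids log(0))."""
    counts = [1.0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(bins - 1, max(0, int((v - lo) / width)))
        counts[idx] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def symmetric_kl(p, q):
    """KL(p||q) + KL(q||p) for two discrete distributions."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))


def rank_features(object_responses, background_responses,
                  bins=8, lo=0.0, hi=1.0):
    """Return feature names, most discriminant first.

    Both arguments map a feature name to a list of responses in [lo, hi],
    measured on object images and on background images respectively.
    """
    scores = {}
    for name in object_responses:
        p = histogram(object_responses[name], bins, lo, hi)
        q = histogram(background_responses[name], bins, lo, hi)
        scores[name] = symmetric_kl(p, q)
    return sorted(scores, key=scores.get, reverse=True)


# Toy data: "edge" responds differently on objects, "flat" does not.
ranking = rank_features(
    {"edge": [0.9, 0.8, 0.95, 0.85], "flat": [0.1, 0.5, 0.9, 0.3]},
    {"edge": [0.1, 0.2, 0.05, 0.15], "flat": [0.1, 0.5, 0.9, 0.3]},
)
```

Features at the top of the ranking are the ones whose detectors would be worth training; a feature that responds identically on objects and background scores zero and carries no information about where the objects of interest are.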

The goals of this project are to design robust and computationally efficient saliency-based compression algorithms that can be tuned for the compression of specific visual classes of interest with minimal human supervision. This includes the design of robust top-down saliency mechanisms and the ability to detect instances where saliency is unreliable. All resulting algorithms should be compatible with current standards, e.g. JPEG-2000 or MPEG.

Selected Publications:

Demos/Results: results of ROI coding for various classes of cluttered images

Contact: Sunhyoung Han, Nuno Vasconcelos
