Domain-Specific Compression of Endoscopic Videos

Common video compression methods have already reached a very high performance level and are approaching their natural limit. However, their compression efficiency is still insufficient for certain domains in which large amounts of video data must be archived. One example is medical endoscopy, where surgeries are recorded with high visual quality for documentation purposes. A crucial requirement for a comprehensive video documentation system is an adequate strategy for handling the large data volume, so that storage requirements are balanced against the potential benefit. We identified several domain-specific characteristics of endoscopic videos that compression algorithms can exploit to improve compression efficiency and hence reduce storage costs. In particular, we identified four dimensions (spatial, temporal, perceptual and long-term) that need to be taken into account for that purpose [1].

Spatial Dimension: Circle Detection and Border Overlay

Endoscopic videos usually do not cover the entire rectangular area of a frame but are restricted to a central circular content area. The outer part carries no relevant information but still takes up a considerable amount of the available encoding bitrate due to the random noise captured by the image sensor. We propose to identify these irrelevant areas and to superimpose a uniform black overlay in order to improve compression efficiency [2,3,4].

We developed an efficient circle detection algorithm to determine the exact position and size of the circular content area. Our algorithm exploits domain-specific knowledge about the possible position and size of the circle. Hence, it is faster and more accurate than general-purpose circle detectors such as the Hough transform. The first step of our algorithm is to detect zoomed images that contain no black border and to exclude these frames from further processing. Afterwards, the algorithm applies the Canny edge detector to obtain an edge image. In this image, edge points that are potentially part of the border area are detected. Each possible combination of three edge points is then used to calculate a circle candidate. Since the valid position and radius of the circular area are approximately known for endoscopic videos, it is possible to identify plausible circle candidates and to select the circle that best matches the edge image. The area outside the selected circle is then replaced by a uniform black mask. This significantly improves compression efficiency (e.g., through skipped macroblocks in H.264/AVC).
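The geometric core of this procedure can be sketched as follows. This is a minimal illustration, not the published implementation: it assumes the three edge points have already been picked from a Canny edge image, and the function names as well as the tolerance values in is_plausible are hypothetical.

```python
import numpy as np


def circle_from_points(p1, p2, p3):
    """Return (cx, cy, r) of the circle through three points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-9:  # collinear points define no circle
        return None
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, float(np.hypot(x1 - cx, y1 - cy))


def is_plausible(circle, width, height, max_offset=0.15, min_r=0.3, max_r=0.75):
    """Hypothetical prior: centre near the frame centre, radius in a fixed range."""
    cx, cy, r = circle
    near_centre = (abs(cx - width / 2) <= max_offset * width
                   and abs(cy - height / 2) <= max_offset * height)
    return near_centre and min_r * height <= r <= max_r * height


def apply_black_mask(frame, circle):
    """Return a copy of the frame with every pixel outside the circle set to black."""
    cx, cy, r = circle
    h, w = frame.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    outside = (xx - cx) ** 2 + (yy - cy) ** 2 > r ** 2
    masked = frame.copy()
    masked[outside] = 0  # works for grayscale and multi-channel frames
    return masked
```

In a full pipeline, every plausible candidate would additionally be scored against the edge image, and only the best-matching circle would be passed to apply_black_mask.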

The original video has a noisy border and a size of 15.0 MB.

The modified video, whose border is covered by a black overlay, has a size of 9.84 MB (about 34% smaller than the original).

Temporal Dimension: Relevance Segmentation

Our second contribution to improve video compression is to detect and mark irrelevant segments which do not provide any useful information about the surgery [5]. Such irrelevant segments may be skipped completely or encoded with a lower quality, which improves compression efficiency. We differentiate between three types of irrelevant frames: dark frames, out-of-patient frames and blurry frames.

The recording is often started before the actual procedure begins, while the endoscope is still in a warming device or no lens is attached. This results in very dark images that contain no relevant information, only noise. Similar situations often occur after a procedure. Such frames can easily be identified by analysing the brightness information.
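A brightness check of this kind might look like the following sketch. It assumes 8-bit RGB frames held as NumPy arrays; the function name and the threshold of 30 are hypothetical and would have to be tuned on real recordings.

```python
import numpy as np


def is_dark_frame(frame_rgb, threshold=30.0):
    """Return True if the mean luma of an 8-bit RGB frame is below the threshold."""
    frame = frame_rgb.astype(np.float64)
    # Rec. 601 luma weights for the R, G and B channels
    luma = 0.299 * frame[..., 0] + 0.587 * frame[..., 1] + 0.114 * frame[..., 2]
    return float(luma.mean()) < threshold
```

Averaging over the whole frame keeps the decision robust against the sensor noise mentioned above, since isolated bright noise pixels barely move the mean.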

During the procedure, the endoscope is sometimes withdrawn and cleaned, or removed during certain steps that do not require an endoscopic view. We identify such out-of-patient (oop) frames with a global color feature based on the HSV color space. We observed that the average hue of frames showing scenes inside a patient is very different from that of oop frames. The former are characterized by a red or sometimes yellow hue, since they mostly show organic structures, human tissue or blood. In contrast, the hue of oop frames is mostly green or blue, since these are the predominant colors in nearly all emergency rooms. To verify this observation, we analyzed five hours of laparoscopic videos, which contain about 40% out-of-patient frames. The average hue histograms for in-patient and out-of-patient frames show that the two color distributions are almost complementary.
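The hue-based distinction can be illustrated with a short sketch. Note that it deviates from the description above in one respect: because hue is an angle and red sits at both ends of the scale, the sketch counts the share of pixels in the green-to-blue range instead of averaging the hue. The hue range (90 to 270 degrees), the chroma cut-off and the 50% decision threshold are assumptions chosen for illustration.

```python
import numpy as np


def hue_degrees(frame_rgb):
    """Return per-pixel hue in degrees [0, 360) and the per-pixel chroma (max - min)."""
    rgb = frame_rgb.astype(np.float64) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx, mn = rgb.max(axis=-1), rgb.min(axis=-1)
    chroma = mx - mn
    safe = np.where(chroma == 0, 1.0, chroma)  # avoid division by zero for gray pixels
    hue = np.where(mx == r, ((g - b) / safe) % 6.0,
                   np.where(mx == g, (b - r) / safe + 2.0, (r - g) / safe + 4.0))
    return hue * 60.0, chroma


def is_out_of_patient(frame_rgb, min_chroma=0.1, min_fraction=0.5):
    """Hypothetical rule: oop if most clearly colored pixels are green or blue."""
    hue, chroma = hue_degrees(frame_rgb)
    colored = chroma >= min_chroma
    if not colored.any():
        return False  # no usable color information; left to the dark-frame check
    green_blue = (hue >= 90.0) & (hue < 270.0) & colored
    return green_blue.sum() / colored.sum() > min_fraction
```

A frame filled with tissue-like red is classified as in-patient, while a frame dominated by green or blue surgical drapes is classified as out-of-patient.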

The third class of irrelevance concerns blurry frames. As common endoscope lenses have a fixed focus, the image appears blurry if the distance between endoscope and tissue is too small or too large. Although these images are mostly in-patient frames, they do not contain any valuable information, since their content is hardly recognizable. Measuring the sharpness of an image is a challenging task. In particular, it is hard to differentiate between actual blurriness and the absence of salient details. Furthermore, we are not interested in detecting frames that are only partially blurry but still contain enough detail to be potentially valuable for a surgeon. We use the Difference of Gaussians (DoG) as an indicator of blurriness. In particular, we subtract a strongly blurred version of the image from a less blurred version. The intuition behind this approach is that the difference will be larger when the original image was sharp and contained a lot of detail. Conversely, the DoG will be smaller when the original image was already blurry.
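The DoG indicator can be sketched in a few lines of NumPy. This is an illustrative version under stated assumptions: the input is a grayscale frame, the two sigma values (1.0 and 2.0) are hypothetical, and the score is taken as the mean absolute DoG response, which is only one of several possible ways to reduce the DoG image to a single number.

```python
import numpy as np


def gaussian_blur(gray, sigma):
    """Separable Gaussian blur with reflect padding; output has the input's shape."""
    radius = max(1, int(3 * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(gray.astype(np.float64), radius, mode="reflect")
    rows = np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)


def dog_score(gray, sigma_small=1.0, sigma_large=2.0):
    """Mean absolute Difference of Gaussians; larger values suggest a sharper image."""
    dog = gaussian_blur(gray, sigma_small) - gaussian_blur(gray, sigma_large)
    return float(np.abs(dog).mean())
```

A frame would be flagged as blurry when its score falls below a threshold that has to be determined empirically. A completely uniform image also scores close to zero, which illustrates the difficulty noted above of separating blurriness from a mere absence of detail.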

Perceptual Dimension: Impact of Compression on the Perception Quality

Lossy compression may lead to the loss of medically relevant information. However, there are no encoding guidelines available for endoscopic videos that specify how to efficiently encode such videos so that the visual quality is appropriate. Hence, most users use inefficient default configurations. Even worse, most commercially available video documentation systems use the legacy MPEG-2 video format, although it is not designed to handle HD content and requires high bitrates. In order to improve storage efficiency, we identified appropriate encoding configurations and studied their impact on the perceptual quality of laparoscopic videos [6].

In order to investigate how video compression with the state-of-the-art H.264/AVC encoding format and reduced bitrate and resolution settings affects the perceived quality, we performed a subjective study with 37 surgeons and surgical residents. In particular, we addressed the question of to what extent laparoscopic videos can be compressed without any noticeable quality loss. Additionally, we wanted to find out whether it is feasible to reach a significantly higher compression rate without essential loss of “semantic quality”, which refers to the extent to which a video conveys relevant medical information to the observer. The original recordings were encoded using MPEG-2 with a resolution of 1080p and a bitrate of 20 Mbit/s. The evaluation results show that using H.264/AVC with a constant rate factor of 26 produces videos with a bitrate of 8 Mbit/s that provide the same visual quality as the original version. Furthermore, we found that lowering the resolution to 720p, which results in a bitrate of 2.5 Mbit/s, still achieves a good quality. Finally, even a resolution of 640×360 with a bitrate of only 1.4 Mbit/s is acceptable. Based on these evaluation results, we propose encoding parameter recommendations for the efficient compression of laparoscopic video archives.
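To put these bitrates into perspective, the following sketch converts them into storage per hour of video. The bitrates are the ones reported above; the profile names, the helper function and the use of decimal gigabytes (10^9 bytes) are choices made for this illustration, and audio and container overhead are ignored.

```python
# Bitrates in Mbit/s as reported in the study; the profile names are hypothetical.
BITRATES_MBIT_S = {
    "original_mpeg2_1080p": 20.0,
    "visually_lossless_h264_1080p": 8.0,
    "good_h264_720p": 2.5,
    "acceptable_h264_360p": 1.4,
}


def gigabytes_per_hour(mbit_per_s):
    """Storage in decimal gigabytes needed for one hour of video at this bitrate."""
    bits = mbit_per_s * 1e6 * 3600
    return bits / 8 / 1e9


def saving_vs_original(mbit_per_s, original=20.0):
    """Relative storage saving compared with the original MPEG-2 recording."""
    return 1.0 - mbit_per_s / original


if __name__ == "__main__":
    for name, rate in BITRATES_MBIT_S.items():
        print(f"{name}: {gigabytes_per_hour(rate):.2f} GB/h, "
              f"{saving_vs_original(rate):.1%} saved")
```

One hour of video thus shrinks from 9 GB in the original format to 3.6 GB at visually lossless quality, about 1.1 GB at good quality and about 0.6 GB at acceptable quality.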

Long-term Dimension: Archiving Strategy

Through many discussions with domain experts we learned that most recorded procedures are never reviewed, since there were no exceptional situations or complaints. If a procedure is reviewed afterwards, this mostly happens within a limited time span. Nevertheless, older procedures may still become relevant, e.g., in case of a follow-up intervention or for forensic investigations. Thus, there is a need to archive the videos obtained during an endoscopic procedure. Since such archives grow large over time, storage efficiency must be improved. Based on our evaluations of the impact of compression on the perceived quality, we suggest an archiving strategy that increases the compression ratio over time in order to reduce storage requirements [1]. The basic idea is to keep only the most recent videos in high quality and to gradually re-encode older videos with lower-quality compression parameters until a minimum acceptable quality is reached. In particular, we propose the following three-layered archiving strategy. In the first x months, a video is kept in visually lossless quality. After these x months, the video is transcoded using parameters that still provide good quality. After this second period, the video is transcoded to an acceptable quality for long-term archiving. Together with the aforementioned spatial and temporal filtering techniques, this greatly reduces the storage requirements of medical video archives without losing valuable information about the medical cases.
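The three layers can be expressed as a simple age-based rule, sketched below. The text only names the first threshold (x months); the second threshold (called y_months here), the default values of 6 and 24 months, and the tier names are hypothetical placeholders, not recommendations from the study.

```python
from datetime import date


def months_between(start, end):
    """Number of full months elapsed between two dates (never negative)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the current month is not yet complete
    return max(months, 0)


def target_tier(recorded, today, x_months=6, y_months=24):
    """Return the quality tier a video should be stored in, given its age."""
    age = months_between(recorded, today)
    if age < x_months:
        return "visually_lossless"
    if age < y_months:
        return "good"
    return "acceptable"
```

A periodic job could call target_tier for every archived video and transcode those whose stored tier is higher than the returned one. Since each transcoding step is lossy, videos only ever move towards lower tiers.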

References

[1] Bernd Münzer, Klaus Schoeffmann, Laszlo Böszörmenyi, “Domain-Specific Video Compression for Long-term Archiving of Endoscopic Surgery Videos”, In Proc. IEEE Int. Symp. on Computer Based Medical Systems (CBMS), pp. 312-317, 2016.
[2] Bernd Münzer, Klaus Schoeffmann, Laszlo Böszörmenyi, “Detection of Circular Content Area in Endoscopic Videos for Efficient Encoding and Improved Content Analysis”, ITEC, Technical Report No. TR/ITEC/12/2.03, pp. 1-20, 2012.
[3] Bernd Münzer, Klaus Schoeffmann, Laszlo Böszörmenyi, “Detection of Circular Content Area in Endoscopic Videos”, In Proc. IEEE Int. Symp. on Computer Based Medical Systems (CBMS), pp. 534-536, 2013.
[4] Bernd Münzer, Klaus Schoeffmann, Laszlo Böszörmenyi, “Improving Encoding Efficiency of Endoscopic Videos by using Circle Detection based Border Overlays”, In Proc. IEEE Int. Conf. on Multimedia and Expo Workshops (ICMEW), pp. 1-4, 2013.
[5] Bernd Münzer, Klaus Schoeffmann, Laszlo Böszörmenyi, “Relevance Segmentation of Laparoscopic Videos”, In Proc. IEEE Int. Symp. on Multimedia (ISM), pp. 84-91, 2013.
[6] Bernd Münzer, Klaus Schoeffmann, Laszlo Böszörmenyi, Jack J Jakimowicz, “Investigation of the Impact of Compression on the Perceptional Quality of Laparoscopic Videos”, In Proc. IEEE Int. Symp. on Computer Based Medical Systems (CBMS), pp. 153-158, 2014.
