Visual Fixation Patterns when Judging Image Quality - CiteSeerX

Visual Fixation Patterns when Judging Image Quality: Effects of Distortion Type, Amount, and Subject Experience

C. T. Vu, E. C. Larson, and D. M. Chandler
Image Coding and Analysis Lab, School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, OK 74078

Abstract Human visual fixation patterns can provide important insights into how biological systems address the image-analysis problem. This paper presents the results of two eye-tracking experiments designed to investigate how normal visual fixations may be affected when judging image quality. We asked (1) whether people look at different regions when judging image quality vs. just looking; and (2) how different types and amounts of distortion affect fixations. We found that white noise and blurring do not change fixations relative to the task-free condition, whereas compression artifacts can influence fixations, depending on the amount of distortion.

1. INTRODUCTION

Digitized images are often subject to distortion during acquisition, processing, and display. Yet, even if the parameters of these stages are held constant (e.g., a constant signal-to-noise ratio or a fixed compression bitrate), why are some images more resilient to distortion than others? Indeed, when subjects are asked why they assigned a particular fidelity rating to an image, they often attribute their rating to distortion of specific objects or regions within the image, suggesting that perceived fidelity might also be affected by what in the image is actually distorted. Several studies have investigated the utility of visual fixation patterns for image processing (e.g., for image quality [1], image compression [2]). To investigate the relationship between image quality and visual gaze points, Miyata et al. [3] tracked the eyes of subjects who were asked to rate the perceived quality of images containing white noise, blurring, or color-shift distortion. They concluded that these types of distortion do not influence where subjects fixate. However, in a similar study, Ninassi et al. [4] concluded that JPEG2000 and JPEG distortion do indeed affect fixations. Neither of these studies addressed the effects of different amounts of distortion, and both were limited to a small number of distortion types. In this paper, we expand on these previous studies by presenting two experiments designed to investigate the influence of noise, blurring, and compression artifacts on visual fixations. By using eye-tracking and images from the LIVE image database [5], we asked:

1. Do people fixate on different regions when judging the quality of an image vs. just looking?

2. Do different types and amounts of distortion influence where people fixate?

This paper is organized as follows: Sections 2 and 3 describe the methods and results of the experiments, respectively. A discussion of the results is presented in Section 4. Conclusions are provided in Section 5.

2. METHODS

Two experiments were performed using an eye tracker to examine visual fixation patterns. In Experiment I, subjects simply looked at original images (task-free condition). In Experiment II, subjects rated the quality of distorted images.

2.1. Apparatus and Subjects

Stimuli were displayed on a high-resolution ViewSonic VA912B 19-inch monitor. The display yielded a minimum and maximum luminance of 2.7 and 207 cd/m², respectively, and an overall gamma of 2.9. The eye tracker used in the experiments was a ViewPoint PC-60. A head and chin rest was used to prevent movement during the experiments; the distance from the chin rest to the screen was 60 cm. Software developed by the authors was used to display the images and collect the data. Five subjects participated in Experiment I; one of them was naïve to the purpose of the experiment, and the others were members of our lab. The first and third authors participated in Experiment II. Subjects ranged in age from 24 to 32 years. All subjects had either normal or corrected-to-normal visual acuity.

2.2. Stimuli Stimuli were color images chosen from the LIVE image database [5]. The LIVE database contains 29 original images, 26 to 29 distorted versions of each original image, and subjective ratings of fidelity for each distorted image. The types of distortions used in the database were: Gaussian blurring, additive white noise, JPEG compression, JPEG2000 compression, and simulated data packet loss of transmitted JPEG2000-compressed images. The images ranged in size from 408x704 pixels to 768x512 pixels.

2.3. Procedure In Experiment I, subjects viewed all 29 original images. Each experimental session was divided into four subsessions (each with 7-8 images) in order to give the subjects a resting period and to perform recalibration of the eye tracker between each subsession. In this experiment, subjects were instructed only to look at the images; no additional task was given. In Experiment II, subjects rated the quality of various distorted images. Distorted versions of 10 original images, selected randomly from the 29 total original images, were used in this experiment. The ten images were: caps, cemetary, manfishing, lighthouse, monarch, parrots, plane, rapids, sailing2, and womanhat. For each of the five distortion types (blurring, white noise, JPEG, JPEG2000, and packet loss), two distorted versions of each of the 10 original images were used. The two distorted versions were chosen such that one of the images contained just-noticeable distortion and the other image contained clearly visible (suprathreshold) distortion. Thus, there were a total of 100 images used in Experiment II (10 original images × 5 types of distortion × 2 amounts of distortion). Each image was rated on an integer scale from 1 to 5, where 5 denoted perfect quality. The experimental session was divided into 14 subsessions to give the subjects a resting period and to perform recalibration of the eye tracker. In both experiments, each image was displayed for 7 seconds in its original size. During the experiments, subjects were instructed to perform the calibration carefully and to refrain from unnecessary movement.
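The stimulus count for Experiment II follows directly from the factorial design described above. The short sketch below enumerates the 10 × 5 × 2 conditions and splits them into 14 subsessions. The image names, distortion types, and counts come from the text; the function names, the trial ordering, and the near-equal split are assumptions made for illustration, since the text does not say how images were assigned to subsessions.

```python
from itertools import product

# Image names exactly as listed in the text.
IMAGES = ["caps", "cemetary", "manfishing", "lighthouse", "monarch",
          "parrots", "plane", "rapids", "sailing2", "womanhat"]
DISTORTIONS = ["blurring", "white noise", "JPEG", "JPEG2000", "packet loss"]
LEVELS = ["just-noticeable", "suprathreshold"]


def build_trials():
    """Enumerate every (image, distortion, level) combination."""
    return [
        {"image": img, "distortion": dist, "level": lvl}
        for img, dist, lvl in product(IMAGES, DISTORTIONS, LEVELS)
    ]


def split_into_subsessions(trials, n_subsessions):
    """Split trials into consecutive, near-equal groups (assumed scheme)."""
    base, extra = divmod(len(trials), n_subsessions)
    groups, start = [], 0
    for k in range(n_subsessions):
        size = base + (1 if k < extra else 0)
        groups.append(trials[start:start + size])
        start += size
    return groups


if __name__ == "__main__":
    trials = build_trials()
    groups = split_into_subsessions(trials, 14)
    print(len(trials), [len(g) for g in groups])
```

Under this assumed split, each subsession holds seven or eight images, which is comparable to the 7-8 images per subsession reported for Experiment I.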

3. RESULTS AND ANALYSIS 3.1. Experiment I The gaze points obtained from the eye tracker* were averaged over all five subjects. We used squares of size 10x10 pixels to denote these gaze points, which were composited onto the corresponding images to facilitate analysis. The results are shown in the left-hand column of Figure 1; brighter squares denote longer fixations. These results reveal that subjects usually gaze at particular regions within images (regions of interest). For example, subjects tend to look at faces, or more specifically, the eyes of people and animals. Text, numbers, and other objects that occupy a prominent position or have high contrast also attract people’s attention.
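The gaze-map construction described above (discard low-confidence samples, bin gaze points into 10x10-pixel squares, average over subjects, and render longer fixations as brighter squares) can be sketched as follows. This is only an illustration under stated assumptions: the sample format `(x, y, duration, confidence)`, the threshold `MIN_CONFIDENCE`, and all function names are hypothetical, and the paper does not specify the eye tracker's output format or the cutoff used.

```python
from collections import defaultdict

CELL = 10             # square size in pixels, as stated in the text
MIN_CONFIDENCE = 0.5  # hypothetical cutoff; the paper gives no value


def accumulate_gaze(samples, cell=CELL, min_conf=MIN_CONFIDENCE):
    """Sum dwell time per cell for one subject.

    samples: iterable of (x, y, duration_s, confidence) tuples
    (an assumed format, not the tracker's actual output).
    """
    dwell = defaultdict(float)
    for x, y, duration, confidence in samples:
        if confidence < min_conf:
            continue  # ignore unreliable data points
        dwell[(int(x) // cell, int(y) // cell)] += duration
    return dict(dwell)


def average_maps(maps):
    """Average per-cell dwell time over subjects (missing cells count as 0)."""
    total = defaultdict(float)
    for m in maps:
        for key, value in m.items():
            total[key] += value
    return {key: value / len(maps) for key, value in total.items()}


def to_brightness(dwell):
    """Scale dwell times to [0, 1] so the longest fixation is brightest."""
    peak = max(dwell.values(), default=0.0)
    if peak == 0.0:
        return {}
    return {key: value / peak for key, value in dwell.items()}


if __name__ == "__main__":
    subject_a = [(12, 7, 0.2, 0.9), (15, 3, 0.3, 0.9), (100, 100, 0.5, 0.1)]
    subject_b = [(14, 5, 0.4, 0.8), (230, 40, 0.1, 0.7)]
    maps = [accumulate_gaze(s) for s in (subject_a, subject_b)]
    print(to_brightness(average_maps(maps)))
```

Keying the map by cell index rather than storing a full pixel array keeps the sketch independent of image size, which varies across the stimuli.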

3.2. Experiment II Figure 1 (right-hand column) depicts the results from Experiment II averaged across all five distortion types and both levels (low and high) of distortion. Green squares correspond to results for subject C.V., who has experience in photography but little experience in judging image quality. Red squares correspond to results for subject D.C., who has experience in judging image quality but little experience in photography. Brighter squares denote regions of longer fixation. These results reveal that, when gaze points are averaged over all types of distortion, the task of judging image quality changes the locations at which subjects fixate relative to the task-free condition. Figures 2 and 3 depict the results for two images (lighthouse and womanhat) for separate distortion types and separate amounts of distortion. Images in the left-hand column correspond to the low-distortion (high-quality) condition; images in the right-hand column correspond to the high-distortion (low-quality) condition.

Figure 1: Fixations from the task-free condition (left-hand column) vs. the distortion-rating condition averaged over all distortion types (right-hand column).

Figure 2: Comparing gaze points among different types and amounts of distortion while rating quality. (Columns: low and high distortion. Rows: blurring, noise, packet loss, JPEG, JPEG2000.)

Figure 3: Comparing gaze points among different types and amounts of distortion while rating quality. (Columns: low and high distortion. Rows: blurring, noise, packet loss, JPEG, JPEG2000.)

We note the following observations from these results:

1. When judging the quality of images with blurring or white noise, subjects tend to look at the same regions as those in the task-free condition. However, this is not true for other types of distortion (JPEG, JPEG2000, packet loss).

2. Different types of distortion affect where people look. Specifically, with blurring, subjects tended to look at the same regions as those in the task-free condition. Similar results were observed for white noise except that subjects also looked at smooth regions (e.g., sky, skin). However, different trends were observed for other distortion types. Subjects tended to focus on the ghosting in packet-loss distortion. With JPEG, subjects looked at blocking (if present), at the edges of objects, and/or at the same regions as in the task-free condition. For JPEG2000, subjects tended to focus on edges (for ringing), textures (for blurring), and/or on the same regions as in the task-free condition.

3. Large amounts of distortion prompted both subjects to focus on the distorted regions, but only if the distortion was localized in space (concentrated); this occurred for the compression-type distortions, but not for blurring or white noise. For these latter two distortions, subjects looked largely at the same regions as in the task-free condition. Similarly, in the low-distortion condition, many of the images looked perfect; for these images, subjects also tended to look primarily at the same regions as in the task-free condition.

* The ViewPoint eye tracker provides a confidence level for each obtained data point. During analysis, data judged unreliable on the basis of this confidence level were ignored.

4. DISCUSSION 4.1. Task-Free vs. Quality Judgment The results of Experiments I and II revealed that, when judging image quality, subjects tended to look at the same regions as those in the task-free condition for distortions consisting of blurring or white noise. We believe that these results arise because blurring and white noise are uniformly distributed throughout the image. Because the distortion is not spatially localized, subjects could not find any particularly degraded region on which to fixate, so they simply looked at the same regions as in the task-free condition. Note that white noise was easier for subjects to detect in smooth regions; thus, fixations tended to include such regions as sky or skin. For JPEG, JPEG2000, and packet-loss distortion, the results were quite different. These types of distortion tend to create artifacts that are spatially localized. Consequently, if such artifacts were visible, subjects tended to fixate on those regions, e.g., ghosting in packet loss, blocking in JPEG, or ringing in JPEG2000. However, if the artifacts were below the threshold of detection, subjects tended to fixate on the same regions as in the task-free condition. These results indicate that for spatially localized distortion, fixations indeed depend on the amount of distortion (near-threshold vs. suprathreshold).
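The comparisons above between the task-free and quality-judgment conditions were made by visual inspection of the fixation maps. As a hedged illustration of how such a comparison could be quantified, the sketch below computes the fraction of quality-task dwell time that falls in cells that were also fixated in the task-free condition. This measure, the map format (a dictionary from grid cell to dwell time), and the function name are assumptions for illustration and are not part of the method reported in the paper.

```python
def shared_dwell_fraction(task_map, free_map):
    """Fraction of dwell time in task_map spent on cells fixated in free_map.

    Both maps are dicts of {(col, row): dwell_time_s} (an assumed format).
    Returns 0.0 when task_map contains no dwell time.
    """
    total = sum(task_map.values())
    if total == 0:
        return 0.0
    shared = sum(
        dwell for cell, dwell in task_map.items()
        if free_map.get(cell, 0.0) > 0.0
    )
    return shared / total


if __name__ == "__main__":
    task_free = {(1, 0): 1.0, (2, 2): 0.5}
    # Hypothetical blur-like case: fixations stay on task-free regions.
    rating_uniform = {(1, 0): 2.0, (2, 2): 2.0}
    # Hypothetical localized-artifact case: a new region draws most fixations.
    rating_localized = {(1, 0): 1.0, (5, 5): 3.0}
    print(shared_dwell_fraction(rating_uniform, task_free))
    print(shared_dwell_fraction(rating_localized, task_free))
```

A value near 1 would correspond to the pattern described for blurring and white noise, while a lower value would correspond to fixations drawn toward localized artifacts.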

4.2. Image-Processing Expert vs. Photographer Generally, the results from Experiment II demonstrated that the two subjects C.V. (photographer) and D.C. (image-processing researcher) were consistent with each other, both in terms of regions of fixation and quality ratings. For the ratings, the correlation coefficient between the two subjects was 0.8. Of the 100 distorted images, only 12 received ratings that differed by more than one unit (e.g., 3 vs. 5), and none of the ratings differed by more than two units (e.g., 2 vs. 5). It is interesting to note that even for those images in which C.V. and D.C. fixated on different regions, the final ratings were generally consistent. For the 12 images that C.V. and D.C. rated differently, the difference was always attributable to personal preference. Subject D.C. expressed particular dislike of suprathreshold compression-type artifacts, distortions which are common in image coding. Subject C.V. expressed particular dislike of blurring and noise, distortions which are common in photography.
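The agreement statistics quoted above can be computed from two lists of ratings as sketched below. The text does not state which correlation coefficient was used; Pearson's is assumed here. The function names and the short rating lists are hypothetical and are not the experimental data.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_a * var_b)


def count_disagreements(a, b, more_than):
    """Count items whose ratings differ by more than `more_than` units."""
    return sum(1 for x, y in zip(a, b) if abs(x - y) > more_than)


if __name__ == "__main__":
    # Hypothetical ratings on the 1-5 scale, for illustration only.
    ratings_cv = [5, 4, 3, 2, 1, 5]
    ratings_dc = [5, 3, 3, 4, 1, 4]
    print(pearson(ratings_cv, ratings_dc))
    print(count_disagreements(ratings_cv, ratings_dc, more_than=1))
    print(count_disagreements(ratings_cv, ratings_dc, more_than=2))
```

With the full set of 100 rating pairs, `count_disagreements(..., more_than=1)` and `count_disagreements(..., more_than=2)` correspond to the two counts reported in the text (12 and 0, respectively).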

5. CONCLUSIONS In this paper, we presented two experiments designed to examine visual fixation patterns when judging image quality. The results revealed that the regions where people fixate while judging image quality can differ from those obtained under the task-free condition. For blurring and white noise, our results were similar to those of Miyata et al. [3], who concluded that fixations while rating image quality do not change relative to the task-free condition. However, we also found that this trend does not hold for compression-type distortions, in which the distortions are spatially localized; for these distortions, fixations varied depending on distortion type and amount. We are currently working to improve the subject pool: for the results presented here, only one subject out of five was naïve to the purpose of our experiment; all other subjects came from our lab. Clearly, it is crucial to employ more subjects to better determine any resulting trends. However, we anticipate that the general findings will be consistent with those presented here.

References

1. A. Ninassi, O. Le Meur, P. Le Callet, D. Barba, and A. Tirel, “Does where you gaze on an image affect your perception of quality? Applying visual attention to image quality metric,” ICIP07, vol. 2, pp. 169–172, 2007.

2. N. Tsumura, C. Endo, H. Haneishi, and Y. Miyake, “Image compression and decompression based on gazing area,” Proc. IS&T/SPIE Symposium on Electronic Imaging 96: Human Vision and Electronic Imaging, vol. 2657, pp. 361–367, 1996.

3. K. Miyata, M. Saito, N. Tsumura, H. Haneishi, and Y. Miyake, “Eye Movement Analysis and its Application to Evaluation of Image Quality,” The Fifth Color Imaging Conference: Color Science, Systems, and Applications, pp. 116–119, 1997.

4. A. Ninassi, O. Le Meur, P. Le Callet, D. Barba, and A. Tirel, “Task impact on the visual attention in subjective image quality assessment,” EUSIPCO-06, Florence, Italy, 2006.

5. H. R. Sheikh, Z. Wang, A. C. Bovik, and L. K. Cormack, “Image and Video Quality Assessment Research at LIVE,” http://live.ece.utexas.edu/research/quality/.