Coloring Gray-Scale Image Using Artificial Neural Networks

Bekir Karlık and Mustafa Sarıöz
Haliç University, Department of Computer Engineering, 34381, Istanbul, Turkey
[email protected]
Fatih University, Computer Technology and Programming, 34500, Istanbul, Turkey
[email protected]

Abstract: This paper presents a novel method for coloring grayscale images. For this purpose, a combination of artificial neural networks and several image processing algorithms was developed to transfer colors from a user-selected source image to a target grayscale image. To evaluate this combined method, a survey was performed in which volunteers were asked to rate the plausibility of the automatically generated colorings for individual images. In most cases, the automatically colored images were rated as totally or mostly acceptable.

Keywords: Gray-scale image, neural networks, image processing, coloring.

1. Introduction
Although the fundamental process followed by the human brain in perceiving color is a psychological phenomenon that is not yet fully understood, the physical nature of color can be expressed on a formal basis supported by experimental and theoretical results. Basically, the colors we perceive in an object are determined by the nature of the light reflected from the object. Due to the structure of the human eye, all colors are seen as variable combinations of the three so-called primary colors Red, Green and Blue (RGB). The task of coloring a grayscale image involves assigning RGB values to an image which varies only in luminance. Since different colors may have the same luminance value but vary in hue or saturation, the problem of coloring grayscale images has no single correct solution. Due to these ambiguities, human interaction usually plays a large role in the coloring process [1].

Adding color to grayscale images and movies in a way that seems realistic to most human observers is a problem that greatly challenged the motion picture industry in the 1980s and has recently attracted renewed interest within the Computer Graphics community. While coloring of historic photos and classic movies has usually been done with the purported intent of increasing their visual appeal, there are certain coloring techniques, such as luminance-preserving pseudo-coloring, that have been specifically developed to facilitate the visualization of scientific, industrial, medical and security images [2-4]. Colorization techniques can also exploit textural information. The work of Welsh et al. [5], inspired by color transfer and by image analogies [6], examines the luminance values in the neighborhood of each pixel in the target image and adds to its luminance the chromatic information of the pixel from a source image with the best-matching neighborhood.

In any of these cases, the coloring problem amounts to replacing a scalar value stored at each pixel of a grayscale image (e.g. luminance) by a vector in a multi-dimensional color space (e.g. a three-dimensional vector with luminance, saturation and hue). Thus, this is in general a severely under-constrained and ambiguous problem for which it makes no sense to try to find an optimum solution, and for which even obtaining reasonable solutions requires some combination of strong prior knowledge about the scene depicted and decisive human intervention [7]. The color scale selected for visualization, that is, the sequence of colors used to represent the values of the data range, can have a substantial impact on the effectiveness of the visualization [8].

978-1-4244-3523-4/09/$25.00 ©2009 IEEE

In this paper, we present a new methodology to color grayscale images in a fully automatic way that compensates for the lack of human intervention by using a database of color images as a source of implicit prior knowledge about color statistics in natural images. Here, we move another step towards automatic coloring of grayscale images, with a methodology where source color images are automatically selected from an image database. More specifically, we designed, implemented and experimentally assessed techniques to choose images from a database, obtained from Fatih University employees, to be used as source images in color transfer.


2. The Proposed Algorithm
In this section, the general structure of the proposed algorithm for transferring color is described. The general procedure for converting a grayscale image to a color image involves two ideas. First, a digital image is a two-dimensional image represented as a set of picture elements, or pixels. Each pixel of an image is associated with a specific position in a 2D region and has a value consisting of one or more quantities (samples) related to that position. Second, digital images are classified according to the number and nature of those quantities:

• binary (bi-level)
• grayscale
• color
• false-color
• multi-spectral
• thematic

The RGB color model is a model in which red, green, and blue are combined in various ways to represent other colors. The name 'RGB' comes from the three primary colors: red, green, and blue. These three colors are different from the primary pigments of red, blue, and yellow, known in the art world as 'primary colors'. As can be seen in Fig. 1, a color in the RGB color model can be described by indicating how much of each of red, green and blue it contains. Each red, green and blue value can vary between the minimum (no color) and the maximum (full intensity). The result is black when all components are at minimum, and white when all components are at maximum.

Colors in the RGB color model can be represented in several different ways. In color science, components are represented in the range 0.0 (minimum) to 1.0 (maximum), and most color formulae take these values; full-intensity red is (1.0, 0.0, 0.0). The component values may also be written as percentages between 0% (minimum) and 100% (maximum); full-intensity red is then (100%, 0%, 0%). Finally, the values may be represented as numbers in the range 0 to 255, obtained simply by multiplying the value in the 0.0 to 1.0 range by 255; each component can then be stored in one byte (8 bits) in a digital medium, and full-intensity red is (255, 0, 0). The same 0 to 255 range is sometimes, commonly in web color representations, written in hexadecimal, so that full-intensity red #ff, #00, #00 is contracted to #ff0000.
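The three numeric representations above (unit range, percentages, 8-bit bytes, hexadecimal) can be illustrated with a short sketch. The helper names below are our own illustration, not part of the paper's implementation:

```python
def unit_to_byte(value):
    """Map a color component from the 0.0-1.0 range to the 0-255 range."""
    return round(value * 255)

def rgb_to_hex(r, g, b):
    """Format unit-range RGB components as a web-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(unit_to_byte(r), unit_to_byte(g), unit_to_byte(b))

# Full-intensity red in the representations discussed above:
red_unit = (1.0, 0.0, 0.0)                           # color-science range
red_byte = tuple(unit_to_byte(c) for c in red_unit)  # 8-bit range: (255, 0, 0)
red_hex = rgb_to_hex(*red_unit)                      # web notation: "#ff0000"
```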

Figure 1 RGB image and components

A grayscale digital image is an image in which each pixel is represented by a single value. Such an image is a combination of shades of gray between black and white. Grayscale images are distinct from black-and-white images: a grayscale image has many shades of gray, typically stored with 8 bits per sampled pixel, which allows 256 intensities, whereas a black-and-white image uses only two colors, black and white [9].

2.1. The process of digital image processing
Digital image processing is data processing on a 2-D array of numbers which form a numeric representation of an image. The components of an image processing system are a source of image data, a processing element and a destination for the processed results. The source of image data may be a camera, a scanner, a mathematical equation, statistical data, etc.; in short, anything able to generate or acquire two-dimensional data. Furthermore, the data may change as a function of time. The processing element is a computer. The output of the processing may be a display for the human visual system [10].

2009 2nd International Conference on Adaptive Science & Technology

An RGB image can be converted into a grayscale image (see Fig. 2). The correctness of the conversion from RGB to grayscale depends on the sensitivity response curve of the detector to light as a function of wavelength. The common formula is:

Y = 0.3*R + 0.59*G + 0.11*B    (1)
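Equation (1) can be applied per pixel. The sketch below (our own illustration, not the paper's code) also shows how two distinct RGB triples can collapse to the same gray value, which is why the inverse mapping is ambiguous:

```python
def rgb_to_gray(r, g, b):
    """Luminance conversion of Eq. (1), with 8-bit RGB inputs."""
    return round(0.3 * r + 0.59 * g + 0.11 * b)

# Two different colors with the same luminance:
g1 = rgb_to_gray(100, 100, 100)  # a mid gray
g2 = rgb_to_gray(59, 130, 51)    # a green-ish color
# 0.3*59 + 0.59*130 + 0.11*51 = 100.01, which rounds to 100 as well
```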

Figure 2 RGB shade color wheel and its gray scale

The conversion of an RGB image to a grayscale image reduces the color space from 16,777,216 (256*256*256) values to 256. As a result of this reduction of the color space, many RGB color values are represented by the same grayscale value. Hence, it is impossible to convert a grayscale image back to an RGB image using linear conversion methods. To model this non-linear mapping, neural networks may be used.

2.2. Coloring the Gray-Scale Image Using Neural Networks
In this work, coloring the grayscale image is realized with a MATLAB application using neural networks. The application is executed in two phases: training and testing. The same neural network structure is used for both training and testing; it has one input layer, one hidden layer and one output layer. The network has 6 input nodes and 1 output node, and the hidden layer has 10 nodes. A pure linear (purelin) function is used as the activation function for all layers. The related neural network structure is shown in Fig. 3.

Figure 3 Neural Network Architecture

In both training and testing, for each pixel of the grayscale image, 6 values are generated as inputs to the neural network using different methods. As can be seen in Fig. 4, three different neural networks are used to determine the red, green and blue components of the resulting RGB image. All three networks have the same structure described above: the network for the red component is denoted 'network1', the network for the green component 'network2' and the network for the blue component 'network3'. All of these networks are Multi-Layer Perceptron (MLP) architectures trained with the back-propagation algorithm.

Figure 4 Pre-training and training phase
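The structure described above, three identical 6-10-1 networks with linear activations, can be sketched in a few lines of NumPy. This is an illustrative forward pass only, with randomly initialized weights; the paper's MATLAB implementation and trained weights are not reproduced here:

```python
import numpy as np

def make_network(n_in=6, n_hidden=10, n_out=1, seed=0):
    """One 6-10-1 MLP with purely linear (purelin) activations, as described above."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(n_hidden, n_in)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_out, n_hidden)),
        "b2": np.zeros(n_out),
    }

def forward(net, x):
    """Forward pass with a linear activation in both layers."""
    h = net["W1"] @ x + net["b1"]
    return net["W2"] @ h + net["b2"]

# Three networks, one per color component:
networks = {name: make_network() for name in ("network1", "network2", "network3")}
features = np.zeros(6)  # the 6 per-pixel input values described in Section 2.2
rgb = [float(forward(net, features)[0]) for net in networks.values()]
```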


During the training phase, the weights are optimized using an RGB image and its grayscale conversion. In this scheme, the input is the grayscale image data generated by converting the RGB image. The target is the red component of the RGB image for network1, the green component for network2 and the blue component for network3 (see Fig. 5). The input variables are as follows:

1. The original grayscale image (see Fig. 6).
2. A 2D matrix containing, for each pixel, the ratio of the sum of the distances of its abscissa and ordinate from the middle point to the sum of the middle coordinates. The size of the matrix varies depending on the size of the training image:
Matrix(i,j) = (abs(i-midx) + abs(j-midy)) / (midx + midy);
3. A 2D blurred image matrix, composed by computing the averages of sub-matrices of a given size and writing each average to the first element (1,1) of its sub-matrix.
4. A 2D matrix which helps to separate the background from the person in the image. To compose it, the following steps are performed:
• find the sum of each abscissa
• find the sum of each ordinate
• for each pixel, add together the abscissa sum and the ordinate sum
• find the average value over all pixels
• for each pixel, find the difference from the average
• assemble these values into a matrix and return this matrix as the output
5. A 2D matrix of values representing the difference from the average.
6. A new image matrix generated from the grayscale image using sigmoid functions as sub-functions. The mathematical expression of the sigmoid function is:

y = exp(x) / (exp(x) + 1)

Figure 5 Original image and target images
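Input 2 above can be generated directly from the stated formula. The NumPy version below is our own transcription of the MATLAB-style expression, using 0-based indices:

```python
import numpy as np

def distance_ratio_matrix(rows, cols):
    """Input 2: ratio of |i-midx| + |j-midy| to midx + midy for each pixel."""
    midx, midy = rows // 2, cols // 2
    i = np.arange(rows)[:, None]   # column vector of row indices
    j = np.arange(cols)[None, :]   # row vector of column indices
    return (np.abs(i - midx) + np.abs(j - midy)) / (midx + midy)

m = distance_ratio_matrix(5, 5)
# The middle pixel has ratio 0; the corners have the largest ratios.
```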

Figure 6 Training inputs


The sigmoid function can be used to separate the values which are close to 0.5 in the image (see Fig. 7). After experimenting with different values, we found good results near 0.5. If the intensity of the values close to 0.5 is higher than that of the others, a single sigmoid function can be used to separate the differences between pixels. If the centre of gravity of the intensities in the image is different from 0.5, the intensive values can be separated using two sigmoid functions. The mathematical expressions of these two functions are:

y = (exp(x) + (2*origin - 1)) / (exp(x) + 1)    (2)

Figure 7 Sample function used to separate data

and

y = exp(x) * (1 + 2*origin - 0.5) / (exp(x) + 1)    (3)

Finally, we can draw a function which has two components; these components are functions derived from the sigmoid function.
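Equations (2) and (3) can be checked numerically. The sketch below transcribes them directly, and verifies that for origin = 0.5, Eq. (2) reduces to the ordinary sigmoid exp(x)/(exp(x)+1); the function names are our own:

```python
import math

def sigmoid(x):
    """Ordinary sigmoid, written in the paper's notation."""
    return math.exp(x) / (math.exp(x) + 1)

def shifted_sigmoid_2(x, origin):
    """Equation (2): sigmoid shifted by the intensity centre 'origin'."""
    return (math.exp(x) + (2 * origin - 1)) / (math.exp(x) + 1)

def shifted_sigmoid_3(x, origin):
    """Equation (3): sigmoid scaled by the intensity centre 'origin'."""
    return math.exp(x) * (1 + 2 * origin - 0.5) / (math.exp(x) + 1)

# For origin = 0.5, Eq. (2) coincides with the ordinary sigmoid:
assert abs(shifted_sigmoid_2(0.7, 0.5) - sigmoid(0.7)) < 1e-12
```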

Fig. 8 shows the testing phase of the ANN structures. As can be seen in this figure, the weights have already been optimized for the three networks (network1, network2, network3). Then, grayscale images are used as inputs to the three networks in the test phase. After the test, a colored image is composed from the grayscale input image: the outputs of the three networks are the red, green and blue values of the colored image.

Figure 8 Testing phase
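The recombination step of the testing phase, stacking the three network outputs into one RGB image, can be sketched as below. The `predict_*` callables stand in for the trained networks' per-pixel predictions and are hypothetical placeholders:

```python
import numpy as np

def colorize(gray_features, predict_r, predict_g, predict_b):
    """Assemble per-pixel outputs of the three networks into an RGB image.

    gray_features: array of shape (H, W, 6) holding the 6 inputs per pixel.
    predict_*: callables mapping a 6-vector to one channel value in [0, 1].
    """
    h, w, _ = gray_features.shape
    rgb = np.zeros((h, w, 3))
    for i in range(h):
        for j in range(w):
            x = gray_features[i, j]
            rgb[i, j] = (predict_r(x), predict_g(x), predict_b(x))
    return np.clip(rgb, 0.0, 1.0)

# A trivial stand-in predictor that returns the first feature (the gray value):
identity = lambda x: float(x[0])
img = colorize(np.full((2, 2, 6), 0.5), identity, identity, identity)
```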

3. Application Results
Grayscale images are used as test data. Each grayscale image is the input to the three networks; when we combine the outputs of these three networks, we obtain the output images. Figure 9 shows some test images and outputs. We note that the training data is very important for success: if we use images similar to the test image for training, performance increases. This technique can be used for coloring old photographs, although we note that there will be some loss in the data because of old technology or the conversion of the old photo to a digital medium.

a. test1 input, output and original image


b. test2 input, output and original image

c. test3 input, output and original image

Figure 9 Test inputs, outputs and original images

4. Conclusion
In this paper we have proposed a novel, fast, and user-friendly approach to the problem of coloring grayscale images. We found that neural networks can be used successfully to solve this coloring problem in image processing. Old (historical) pictures can also be colored well using the proposed method, although it has some specific disadvantages. One is obtaining training input data similar to the old picture format; the other is the training time of the neural networks. In this study, we used the well-known back-propagation algorithm, which requires considerable training time (although testing is on-line). In future work, we will try other neural network algorithms such as RBF and LVQ, and compare their results.

References
[1] R. C. Gonzales and R. E. Woods, 1987. Digital Image Processing. Addison-Wesley Publishing, USA.
[2] Luiz Filipe M. Vieira et al., 2003. Automatically choosing source color images for coloring grayscale images. In Proc. SIBGRAPI'03, pp. 1530-1538.
[3] S. M. Pizer and J. B. Zimmerman, 1983. Color Display in Ultrasonography. Ultrasound in Medicine and Biology, v. 9, no. 4, pp. 331-345.
[4] P. Rheingans, 1992. Color, Change, and Control for Quantitative Data Display. In Proc. Visualization '92, IEEE Computer Society Press, Los Alamitos CA, pp. 252-259.
[5] T. Welsh, M. Ashikhmin, and K. Mueller, 2002. Transferring color to grayscale images. In Proc. ACM SIGGRAPH, pp. 277-280.
[6] E. Reinhard, M. Ashikhmin, B. Gooch, and P. Shirley, 2001. Color transfer between images. IEEE Computer Graphics and Applications, special issue on Applied Perception, pp. 34-41.
[7] V. Karthikeyani, K. Duraiswamy and P. Kamalakkannan, 2007. Conversion of Gray-scale image to Color Image with and without Texture Synthesis. In Proc. IJCSNS, v. 7, no. 4, pp. 11-16.
[8] P. Rheingans and C. Landreth, 1995. Perceptual Principles for Effective Visualizations. Perceptual Issues in Visualization, G. Grinstein and H. Levkowitz eds., Springer, pp. 59-73.
[9] R. Gonzales, R. Woods and S. Eddins, 2001. Digital Image Processing Using Matlab. Pearson Prentice Hall, USA.
[10] Z. N. Li and M. S. Drew, 2004. Fundamentals of Multimedia. Pearson Prentice Hall, USA.
