Journal of Scientific Computing https://doi.org/10.1007/s10915-018-0888-2
A Fast Algorithm for Solving Linear Inverse Problems with Uniform Noise Removal
Xiongjun Zhang1 · Michael K. Ng2
Received: 2 June 2018 / Revised: 22 November 2018 / Accepted: 26 November 2018 © Springer Science+Business Media, LLC, part of Springer Nature 2018
Abstract

In this paper, we develop a fast algorithm for solving an unconstrained optimization model for uniform noise removal, which is an important task in inverse problems. The optimization model consists of an $\ell_\infty$ data fitting term and a total variation regularization term. By utilizing the alternating direction method of multipliers (ADMM) for this optimization model, we demonstrate that one of the ADMM subproblems can be formulated as a projection onto the $\ell_1$ ball, which can be solved efficiently by iterations. The convergence of the ADMM method can be established under some mild conditions. In practice, the balance between the $\ell_\infty$ data fitting term and the total variation regularization term is controlled by a regularization parameter. We present numerical experiments that use the L-curve method, based on the logarithms of the data fitting term and the total variation regularization term, to select regularization parameters for uniform noise removal. Numerical results for image denoising and deblurring, inverse source, inverse heat conduction problems and second derivative problems show the effectiveness of the proposed model.

Keywords Uniform noise · Linear inverse problems · Total variation · $\ell_\infty$-norm · Alternating direction method of multipliers

Mathematics Subject Classification 49N45 · 65F22 · 90C25
Xiongjun Zhang: Research supported in part by the National Natural Science Foundation of China under Grants 11801206, 11571098, 11871026, Hubei Provincial Natural Science Foundation of China under Grant 2018CFB105, and Self-Determined Research Funds of CCNU from the Colleges’ Basic Research and Operation of MOE under Grant CCNU17XJ031. Michael K. Ng: Research supported in part by the HKRGC GRF 1202715, 12306616, 12200317 and HKBU RC-ICRS/16-17/03.
Xiongjun Zhang: [email protected]
Michael K. Ng: [email protected]
1 School of Mathematics and Statistics, Hubei Key Laboratory of Mathematical Sciences, Central China Normal University, Wuhan 430079, China
2 Department of Mathematics, Centre for Mathematical Imaging and Vision, Hong Kong Baptist University, Kowloon Tong, Hong Kong
1 Introduction

Linear inverse problems are foundational problems in applied mathematics and arise in many real-world applications such as signal processing [1,28], image processing [4,5,19], and acoustic wave propagation [11], to name only a few. In this paper, we focus on the following discrete linear system:

$$ f = Kx + v, $$

where $f \in \mathbb{R}^n$ is a vector containing the observed data, $K \in \mathbb{R}^{n \times m}$ is a given matrix, $x \in \mathbb{R}^m$ is a vector containing the true data to be estimated, and $v \in \mathbb{R}^n$ is a noise vector whose entries are uniformly distributed on $[-c, c]$. Here $c$ denotes the noise level. We remark that the matrix $K$ can represent a blur matrix in signal and image processing [5], a compact linear operator obtained by discretizing an integral equation in inverse heat conduction problems [21], a Laplacian matrix with a homogeneous Dirichlet boundary condition in inverse source problems in two dimensions (2D) [9], etc.

A widely used approach to this problem is maximum likelihood (ML) estimation [49]. Assume that the $v_i$ are independent and identically distributed, where $v_i$ is the $i$th component of $v$, $i = 1, \ldots, n$. By the probability density of the uniform distribution, the likelihood function $p_{f|x}(x, f)$ is given by

$$ p_{f|x}(x, f) = \begin{cases} \dfrac{1}{(2c)^n}, & \text{if } v_i \in [-c, c],\ i = 1, \ldots, n, \\ 0, & \text{otherwise.} \end{cases} $$

An ML estimate is any $x$ satisfying [49]

$$ \|f - Kx\|_\infty \le c. $$

Because the problem is ill-posed, a regularization term should be added to stabilize the solution. Traditional regularization includes Tikhonov-like regularization [43] and total variation (TV) regularization [39]. Although Tikhonov-like regularization is easy to compute, it tends to make the restored solution excessively smooth and fails to preserve features such as sharp edges. In contrast, the TV regularization, first proposed by Rudin et al.
[39] for image denoising and then generalized to image deconvolution [38], is very effective at preserving piecewise constant regions in signal and image processing. Owing to its ability to preserve sharp edges, TV regularization is a very successful and popular approach in image processing tasks such as image denoising, deblurring, and segmentation [5,19]. We use TV as the prior information of $x$ to preserve more piecewise constant regions, that is, the prior density $p_x(x)$ of $x$ is given by

$$ p_x(x) = \exp\Big( -\sum_{i=1}^{m} \|D_i x\| \Big), $$

where $D_i : \mathbb{R}^m \to \mathbb{R}^s$ is a linear operator with $s \ge 1$. Here $D_i$ can represent the first-order difference of a signal or the gradient operator of an image, and $D_i x$ describes the prior information of the given data. Notice that the conditional density $p_{f|x}(x, f)$ of $f$, given $x$, is given by

$$ p_{f|x}(x, f) = \frac{p(x, f)}{p_x(x)}, $$

where $p(x, f)$ is the joint probability density of $x$ and $f$. Therefore, the joint probability density $p(x, f)$ is given explicitly by

$$ p(x, f) = \begin{cases} \dfrac{\exp\big( -\sum_{i=1}^{m} \|D_i x\| \big)}{(2c)^n}, & \text{if } v_i \in [-c, c],\ i = 1, \ldots, n, \\ 0, & \text{otherwise.} \end{cases} $$

The maximum a posteriori (MAP) estimation maximizes the conditional density $p_{x|f}(x, f)$ of $x$, given $f$, which is given by

$$ p_{x|f}(x, f) = \frac{p(x, f)}{p_f(f)}. $$
By the Bayesian formulation, the MAP estimate is found by solving the following problem:

$$ \min_x \ \sum_{i=1}^{m} \|D_i x\| \quad \text{s.t.} \quad \|Kx - f\|_\infty \le c. \tag{1.1} $$

We reformulate (1.1) as an unconstrained optimization problem and propose a variational model composed of an $\ell_\infty$ data fitting term combined with the TV regularization term (L∞TV) as follows:

$$ \min_x \ \mu \|Kx - f\|_\infty + \sum_{i=1}^{m} \|D_i x\|, \tag{1.2} $$
where $\mu$ is the regularization parameter that balances the data fitting term and the regularization term. For suitable choices of $c$ and $\mu$, the solutions of (1.1) and (1.2) coincide. However, the values of the two parameters for which (1.1) is equivalent to (1.2) are not known a priori [44].

Many variational models based on TV regularization have been introduced for removing other kinds of noise, such as Gaussian noise [15,39], multiplicative noise [2,13,25,26,42], impulse noise [8,27,32,33,51], Poisson noise [29,41,48,52], and Cauchy noise [40,45], to name only a few. Some interesting properties of minimizers of these variational models have also been studied. For instance, Nikolova [32] analyzed the properties of a variational model with $\ell_1$ data fitting for impulse noise removal and showed that a certain number of data points can be fitted exactly. Moreover, Chan et al. [8] investigated some analytical properties of minimizers of TV regularization for image decomposition and image denoising. Nikolova [34] also analyzed the properties of minimizers of cost functions composed of an $\ell_2$ data-fidelity term and an edge-preserving regularization term for signal and image recovery involving constant regions. It is interesting to note that $\ell_2$ data fitting is employed for Gaussian noise removal. When some data points are required to be fitted exactly, the $\ell_1$ data fitting term is more useful [32,33]. Moreover, when the observed data are corrupted by uniform noise, it is more suitable to use the $\ell_\infty$ data fitting term [9,49].

The computational challenge of (1.2) is that both the $\ell_\infty$ data fitting term and the TV regularization term are nondifferentiable. Recently, minimization of cost functions involving $\ell_\infty$ data fitting, which is effective for uniform noise removal, has received much interest in linear inverse problems [9,47,49]. Clason [9] proposed a Moreau-Yosida approximation of the $\ell_\infty$ constraint and then applied a semi-smooth Newton method to solve the resulting optimality conditions.
Moreover, Wen et al. [49] developed an efficient semi-smooth Newton method for solving linear inverse problems without any approximation, where the $\ell_\infty$ constraint is handled by active set constraints arising from the optimality conditions. Note that the objective function in their model is a quadratic term, which is differentiable, while the two terms of the proposed model in (1.2) are nondifferentiable. These algorithms therefore cannot be applied directly to our proposed model.

In this paper, the alternating direction method of multipliers (ADMM) [6,16–18] is developed to solve the L∞TV model. ADMM is efficient for many nonsmooth convex optimization problems with equality constraints and has been used widely in linear inverse problems and related areas such as image processing [3,50], system identification [16], and machine learning [6]. The main advantage of ADMM is that the resulting subproblems are much easier to solve than the original problem. By variable splitting techniques, each subproblem in ADMM for the L∞TV model can be solved directly or by efficient solvers. Since one subproblem in ADMM involves an $\ell_\infty$ minimization problem, by the Moreau decomposition [36], it can be solved efficiently and exactly by a fast projection algorithm onto the $\ell_1$ ball [12]. The convergence of ADMM can also be established under some mild conditions. Extensive numerical results show the effectiveness of the proposed model in image denoising and deblurring, inverse source problems in 2D, inverse heat conduction problems, and second derivative problems. Furthermore, we compare the proposed model with the $\ell_1$ data fitting plus TV regularization (L1TV) [32,33] and $\ell_2$ data fitting plus TV regularization (L2TV) [39] models.

In addition, we propose to use the L-curve method to select the regularization parameter automatically, based on the graph of the logarithms of the data fitting term and the TV regularization term. Unlike the traditional L-curve, which uses the $\ell_2$-norm of the data fitting term and Tikhonov regularization [20,23], we use the logarithms of the $\ell_\infty$-norm of the data fitting term and of the TV term. Very recently, Wang et al. [46] proposed to employ the $\ell_1$-norm of the data fitting term in the L-curve method for multiplicative noise removal. The use of the $\ell_\infty$-norm in the L-curve method thus differs from [23,46].
Numerical experiments will be presented to show the effectiveness of the L-curve method in linear inverse problems.

The remaining parts of this paper are organized as follows. In Sect. 2, we introduce some notation and notions used throughout this paper. In Sect. 3, the ADMM is developed to solve the L∞TV model and a fast projection algorithm onto the $\ell_1$ ball is reviewed. Moreover, the convergence of ADMM is established under some mild conditions. Extensive numerical examples are presented to demonstrate the effectiveness of the proposed model in Sect. 4. Finally, we conclude this paper in Sect. 5.
2 Preliminaries

Throughout this paper, we use $\mathbb{R}^n$ to denote the $n$-dimensional Euclidean space. For any vector $x := (x_1, \ldots, x_n)^T \in \mathbb{R}^n$, $\|x\|$ denotes the $\ell_2$-norm (Euclidean norm) of $x$, $\|x\|_\infty := \max_{1 \le i \le n} |x_i|$ denotes the $\ell_\infty$-norm of $x$, and $\|x\|_1 := \sum_{i=1}^{n} |x_i|$ denotes the $\ell_1$-norm of $x$, where $|\cdot|$ is the absolute value of a real number. The set of all $m \times n$ matrices with real entries is denoted by $\mathbb{R}^{m \times n}$. Let $D := (D_1^T, \ldots, D_m^T)^T$, where the superscript $T$ denotes the transpose. The identity matrix is denoted by $I$, whose dimension should be clear from the context.

The effective domain of $f : \mathbb{R}^n \to \mathbb{R} \cup \{+\infty\}$ is defined as $\mathrm{dom}(f) := \{x : f(x) < +\infty\}$. We say that $f$ is proper if it never equals $-\infty$ and $\mathrm{dom}(f) \neq \emptyset$. Let $f$ be a closed proper convex function. The proximal operator $\mathrm{Prox}_f : \mathbb{R}^n \to \mathbb{R}^n$ of $f$ is defined by

$$ \mathrm{Prox}_f(y) = \arg\min_x \Big\{ f(x) + \frac{1}{2} \|x - y\|^2 \Big\}. $$

By the Moreau decomposition [36, Theorem 31.5], we have

$$ y = \mathrm{Prox}_{\lambda f}(y) + \lambda \, \mathrm{Prox}_{\lambda^{-1} f^*}(y / \lambda), \tag{2.1} $$

where $\lambda > 0$ is a given parameter and the Fenchel conjugate function $f^*$ of $f$ is defined by

$$ f^*(x) := \sup_y \{ \langle x, y \rangle - f(y) \}. $$

For any nonempty closed convex set $C$, its indicator function is defined by

$$ \delta_C(x) = \begin{cases} 0, & \text{if } x \in C, \\ +\infty, & \text{otherwise.} \end{cases} $$
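As a quick numerical sanity check of (2.1), the sketch below takes $f$ to be the $\ell_1$-norm. This is an illustrative choice only, not the function used later in the model. Under that assumption, $\mathrm{Prox}_{\lambda f}$ is componentwise soft-thresholding, and $f^*$ is the indicator of the unit $\ell_\infty$ ball, whose proximal operator is clipping to $[-1, 1]$. All function names are hypothetical.

```python
import numpy as np

def prox_l1(y, lam):
    """Prox of lam * ||.||_1: componentwise soft-thresholding."""
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

def prox_conj_l1(w):
    """Prox of the conjugate of ||.||_1, i.e. projection onto the unit
    l_inf ball. Scaling an indicator function by 1/lam leaves it unchanged."""
    return np.clip(w, -1.0, 1.0)

def moreau_residual(y, lam):
    """Largest deviation from y = Prox_{lam f}(y) + lam * Prox_{f*/lam}(y/lam)."""
    recombined = prox_l1(y, lam) + lam * prox_conj_l1(y / lam)
    return np.max(np.abs(recombined - y))

y = np.array([3.0, -0.2, 0.7, -1.5])
residual = moreau_residual(y, lam=0.5)  # should be at rounding-error level
```

The same identity is what later allows the proximal mapping of the $\ell_\infty$-norm to be computed from a projection onto the $\ell_1$ ball.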
3 Solving L∞TV via ADMM

In this section, we apply the alternating direction method of multipliers (ADMM) [16–18] to solve the proposed model (1.2). Let $z = Kx - f$ and $y_i = D_i x$, $i = 1, \ldots, m$. Then (1.2) can be rewritten equivalently as follows:

$$ \min_{x, y, z} \ \mu \|z\|_\infty + \sum_{i=1}^{m} \|y_i\| \quad \text{s.t.} \quad z = Kx - f, \ y_i = D_i x, \ i = 1, \ldots, m. \tag{3.1} $$

The augmented Lagrangian function for (3.1) is defined by

$$ L(y, z, x, u_1, u_2) := \mu \|z\|_\infty + \sum_{i=1}^{m} \|y_i\| - \langle u_1, z - (Kx - f) \rangle - \sum_{i=1}^{m} \langle (u_2)_i, y_i - D_i x \rangle + \frac{\gamma_1}{2} \|z - (Kx - f)\|^2 + \frac{\gamma_2}{2} \|y - Dx\|^2, \tag{3.2} $$

where $u_1, u_2$ are the Lagrangian multipliers and $\gamma_1, \gamma_2 > 0$ are the penalty parameters. Here $y := (y_1^T, \ldots, y_m^T)^T$ and $u_2 := ((u_2)_1^T, \ldots, (u_2)_m^T)^T$. Since $y$ and $z$ are separable, we can treat $(y, z)$ as a single block. Since the objective function of (3.1) is the sum of a function of $x$ and a function of $(y, z)$, the two-block ADMM is applicable [16–18]. The iteration scheme of ADMM for solving (3.1) can be described as follows:

$$ (y^{k+1}, z^{k+1}) = \arg\min_{y, z} \ L(y, z, x^k, u_1^k, u_2^k), \tag{3.3} $$

$$ x^{k+1} = \arg\min_{x} \ L(y^{k+1}, z^{k+1}, x, u_1^k, u_2^k), \tag{3.4} $$

$$ u_1^{k+1} = u_1^k - \tau \gamma_1 \big( z^{k+1} - (K x^{k+1} - f) \big), \tag{3.5} $$

$$ u_2^{k+1} = u_2^k - \tau \gamma_2 \big( y^{k+1} - D x^{k+1} \big), \tag{3.6} $$

where $\tau \in (0, (1 + \sqrt{5})/2)$ is the step-size. The subproblem with respect to $y$ in (3.3) is the shrinkage operator [3,51] and its solution can be given explicitly by
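$$ y_i^{k+1} = \max\Big\{ \Big\| D_i x^k + \frac{(u_2)_i^k}{\gamma_2} \Big\| - \frac{1}{\gamma_2}, \ 0 \Big\} \frac{D_i x^k + (u_2)_i^k / \gamma_2}{\| D_i x^k + (u_2)_i^k / \gamma_2 \|}, \quad i = 1, \ldots, m, \tag{3.7} $$

where $0 \cdot (0/0) = 0$ is assumed. The subproblem with respect to $z$ in (3.3) can be rewritten as

$$ z^{k+1} = \arg\min_z \Big\{ \mu \|z\|_\infty + \frac{\gamma_1}{2} \Big\| z - \Big( K x^k - f + \frac{1}{\gamma_1} u_1^k \Big) \Big\|^2 \Big\}, \tag{3.8} $$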
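which is the proximal mapping of the $\ell_\infty$-norm [35]. Let $g(z) := \|z\|_\infty$ and $\xi^k := K x^k - f + \frac{1}{\gamma_1} u_1^k$. Then the minimizer of (3.8) can be rewritten as follows:

$$ z^{k+1} = \mathrm{Prox}_{\frac{\mu}{\gamma_1} g}(\xi^k). \tag{3.9} $$

By (2.1), we have

$$ \xi^k = \mathrm{Prox}_{\frac{\mu}{\gamma_1} g}(\xi^k) + \frac{\mu}{\gamma_1} \, \mathrm{Prox}_{\frac{\gamma_1}{\mu} g^*}\Big( \frac{\gamma_1}{\mu} \xi^k \Big). \tag{3.10} $$

Let $C := \{x \mid \|x\|_1 \le 1\}$. Note that $g^*(x) = \delta_C(x)$. Then we get

$$ \mathrm{Prox}_{\frac{\gamma_1}{\mu} g^*}\Big( \frac{\gamma_1}{\mu} \xi^k \Big) = \arg\min_x \Big\{ \frac{\gamma_1}{\mu} \delta_C(x) + \frac{1}{2} \Big\| x - \frac{\gamma_1}{\mu} \xi^k \Big\|^2 \Big\} = \arg\min_{\|x\|_1 \le 1} \frac{1}{2} \Big\| x - \frac{\gamma_1}{\mu} \xi^k \Big\|^2. \tag{3.11} $$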
Notice that (3.11) is the projection onto the $\ell_1$ ball. Although no closed form is available, the projection onto the $\ell_1$ ball can be carried out very quickly [12,14,24,44]. We apply Algorithm 2 to compute the proximal mapping of $\delta_C$ exactly; $z^{k+1}$ is then obtained by combining (3.9) and (3.10). We also refer the reader to [12] for more detailed discussions about the projection onto the simplex and the $\ell_1$ ball.

The subproblem with respect to $x$ in (3.4) is a least squares problem and can be computed by solving the following normal equation:

$$ (\gamma_1 K^T K + \gamma_2 D^T D) x = K^T \big( \gamma_1 (z^{k+1} + f) - u_1^k \big) + D^T \big( \gamma_2 y^{k+1} - u_2^k \big). \tag{3.12} $$
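To illustrate how (3.12) can be solved with fast Fourier transforms, the sketch below treats a one-dimensional periodic setting in which both $K$ and $D$ are circulant, so each is described by its first column. This is a simplified stand-in for the two-dimensional image case, and it is not optimized for the number of transforms. The function names and the test kernels are assumptions made for illustration.

```python
import numpy as np

def circulant(first_col):
    """Dense circulant matrix C with C[i, j] = first_col[(i - j) mod n]."""
    n = first_col.size
    return np.stack([np.roll(first_col, j) for j in range(n)], axis=1)

def x_update_fft(k_col, d_col, z, f, y, u1, u2, gamma1, gamma2):
    """Solve (3.12) when K and D are circulant with first columns k_col, d_col."""
    k_hat, d_hat = np.fft.fft(k_col), np.fft.fft(d_col)
    # For real circulant matrices, the transpose acts as the conjugate multiplier.
    rhs_hat = (np.conj(k_hat) * np.fft.fft(gamma1 * (z + f) - u1)
               + np.conj(d_hat) * np.fft.fft(gamma2 * y - u2))
    denom = gamma1 * np.abs(k_hat) ** 2 + gamma2 * np.abs(d_hat) ** 2
    return np.real(np.fft.ifft(rhs_hat / denom))

def x_update_dense(K, D, z, f, y, u1, u2, gamma1, gamma2):
    """Reference solution of (3.12) by a dense linear solve."""
    A = gamma1 * K.T @ K + gamma2 * D.T @ D
    b = K.T @ (gamma1 * (z + f) - u1) + D.T @ (gamma2 * y - u2)
    return np.linalg.solve(A, b)

n = 16
k_col = np.zeros(n)
k_col[[0, 1, -1]] = 1.0 / 3.0        # 3-point periodic average (rows sum to one)
d_col = np.zeros(n)
d_col[0], d_col[-1] = -1.0, 1.0      # periodic forward difference x[i+1] - x[i]
rng = np.random.default_rng(0)
z, f, y, u1, u2 = rng.standard_normal((5, n))
x_fft = x_update_fft(k_col, d_col, z, f, y, u1, u2, 5.0, 50.0)
x_ref = x_update_dense(circulant(k_col), circulant(d_col), z, f, y, u1, u2, 5.0, 50.0)
```

In this toy setting the two solutions should agree up to rounding, while the FFT route avoids forming or factorizing the coefficient matrix.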
Remark 3.1 On the one hand, if $K$ has some special structure, we can solve (3.12) by known fast solvers. For example, in image restoration problems, $K$ is usually a blurring matrix under periodic boundary conditions that can be diagonalized by fast Fourier transforms, so (3.12) can be solved via three fast Fourier transforms [30,31]. On the other hand, if $K$ has no special structure, we can apply the conjugate gradient method to solve this equation.

Now the ADMM for solving (3.1) can be stated in Algorithm 1.

Algorithm 1: Alternating direction method of multipliers for solving (3.1).
Step 0. Input $u_1^0$, $u_2^0$, $x^0$. Let $\tau \in (0, (1 + \sqrt{5})/2)$ and $\gamma_1, \gamma_2 > 0$ be given parameters. Set $k := 0$.
Step 1. Compute $y^{k+1}$, $z^{k+1}$ by (3.7), (3.9).
Step 2. Compute $x^{k+1}$ via solving (3.12).
Step 3. Update $u_1^{k+1}$, $u_2^{k+1}$ by (3.5), (3.6).
Step 4. If a termination criterion is not met, set $k := k + 1$ and go to Step 1.
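The steps of Algorithm 1 can be sketched for a small one-dimensional problem as follows. This is a minimal illustration under several assumptions that are not taken from the paper: scalar differences ($s = 1$), a dense solve of (3.12), a sort-based $\ell_1$-ball projection used in place of Algorithm 2, the starting point $x^0 = K^T f$, and arbitrary values of $\mu$, $\gamma_1$, $\gamma_2$. All function names are hypothetical.

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection onto {x : ||x||_1 <= radius}, sort-based variant."""
    a = np.abs(v)
    if a.sum() <= radius:
        return v.copy()
    u = np.sort(a)[::-1]
    css = np.cumsum(u) - radius
    idx = np.arange(1, v.size + 1)
    k = idx[u - css / idx > 0][-1]
    theta = css[k - 1] / k
    return np.sign(v) * np.maximum(a - theta, 0.0)

def objective(x, K, D, f, mu):
    """Value of (1.2) when every D_i x is a scalar."""
    return mu * np.max(np.abs(K @ x - f)) + np.sum(np.abs(D @ x))

def linf_tv_admm(K, D, f, mu, gamma1=1.0, gamma2=1.0, tau=1.618,
                 tol=1e-4, max_iter=1000):
    x = K.T @ f                                  # assumed starting point
    u1, u2 = np.zeros(K.shape[0]), np.zeros(D.shape[0])
    A = gamma1 * K.T @ K + gamma2 * D.T @ D      # coefficient matrix of (3.12)
    for _ in range(max_iter):
        # Step 1: y by shrinkage (3.7); z by (3.9)-(3.11), since
        # Prox_{t g}(xi) = xi - projection of xi onto the l1 ball of radius t.
        w = D @ x + u2 / gamma2
        y = np.sign(w) * np.maximum(np.abs(w) - 1.0 / gamma2, 0.0)
        xi = K @ x - f + u1 / gamma1
        z = xi - project_l1_ball(xi, mu / gamma1)
        # Step 2: x from the normal equation (3.12).
        b = K.T @ (gamma1 * (z + f) - u1) + D.T @ (gamma2 * y - u2)
        x_new = np.linalg.solve(A, b)
        # Step 3: multiplier updates (3.5) and (3.6).
        u1 = u1 - tau * gamma1 * (z - (K @ x_new - f))
        u2 = u2 - tau * gamma2 * (y - D @ x_new)
        # Step 4: stop on small relative change of the iterate.
        change = np.linalg.norm(x_new - x) / max(np.linalg.norm(x), 1e-12)
        x = x_new
        if change <= tol:
            break
    return x

rng = np.random.default_rng(0)
n = 100
x_true = np.concatenate([np.zeros(30), np.ones(40), np.zeros(30)])
K = np.eye(n)                                    # pure denoising
D = np.diff(np.eye(n), axis=0)                   # forward differences, (n-1) x n
f = K @ x_true + rng.uniform(-0.2, 0.2, n)       # uniform noise with c = 0.2
x_hat = linf_tv_admm(K, D, f, mu=5.0)
```

The dense solve keeps the sketch short; for structured $K$ the options in Remark 3.1 would replace it.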
3.1 Fast Projection onto the $\ell_1$ Ball

Since the computation of $z^{k+1}$ involves the projection onto the $\ell_1$ ball, in this subsection we briefly review a fast projection algorithm onto the $\ell_1$ ball from [12]. We also refer the reader to [12] for more details about the projection onto the $\ell_1$ ball. Unlike the proximal operators of the $\ell_1$- and $\ell_2$-norms, the projection onto the $\ell_1$ ball cannot be computed in one step. Let $W := \{x \in \mathbb{R}^n \mid \|x\|_1 \le a\}$ and $G := \{x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n \mid \sum_{i=1}^{n} x_i = a \text{ and } x_i \ge 0\}$, where $a > 0$ is a given parameter. For any $y := (y_1, \ldots, y_n)^T \in \mathbb{R}^n$, we review a fast algorithm for solving the following problem exactly:

$$ \mathrm{Prox}_{\delta_W}(y) := \arg\min_x \Big\{ \delta_W(x) + \frac{1}{2} \|x - y\|^2 \Big\}. \tag{3.13} $$

By [14, Lemma 3], we have

$$ \mathrm{Prox}_{\delta_W}(y) = \begin{cases} y, & \text{if } y \in W, \\ (\mathrm{sgn}(y_1) s_1, \ldots, \mathrm{sgn}(y_n) s_n)^T, & \text{otherwise,} \end{cases} $$

where $s := (s_1, \ldots, s_n)^T = \mathrm{Prox}_{\delta_G}(|y|)$ with $|y| := (|y_1|, \ldots, |y_n|)^T$ and $\mathrm{sgn}(\cdot)$ is the signum function, i.e.,

$$ \mathrm{sgn}(x) := \begin{cases} 1, & \text{if } x > 0, \\ 0, & \text{if } x = 0, \\ -1, & \text{if } x < 0. \end{cases} $$

By [12, Proposition 2.2] (see also [24]), there exists a unique $\rho \in \mathbb{R}$ such that $s_i = \max\{|y_i| - \rho, 0\}$, $i = 1, \ldots, n$. Therefore, the computational cost of the projection algorithm is dominated by the search for $\rho$. In the following, we briefly review a fast method to search for $\rho$ [12].

Assume that we have already read the previous entries $y_j$, $j = 1, \ldots, i - 1$, and are about to read the entry $y_i$. We have a subsequence $Y$ of $\{y_j\}_{j=1}^{i-1}$ containing all entries potentially larger than $\rho$, and we set $\zeta = (\sum_{y_j \in Y} y_j - a) / |Y|$, where $|Y|$ denotes the number of entries of $Y$. Note that $\zeta \le \rho$. Then, if $y_i \le \zeta$, we obtain that $y_i \le \rho$, so we can ignore this entry and do nothing. On the contrary, if $y_i > \zeta$, we add $y_i$ to $Y$ since $y_i$ is potentially larger than $\rho$. Then $\zeta = (\sum_{y_j \in Y} y_j - a) / |Y|$ with the new $Y$, which is larger than before. Afterwards, we continue the pass with the next entry $y_{i+1}$.

Now we have the subsequence $Y$ of all the entries of $y$ potentially larger than $\rho$. Since we have calculated the value $\zeta$, we can use it to remove entries from $Y$ in the next pass. Assume that we have read the entry $y_i \in Y$. If $y_i > \zeta$, we do nothing; otherwise, we remove $y_i$ from $Y$. Consequently, $\zeta$ is assigned the new value of $(\sum_{y_j \in Y} y_j - a) / |Y|$, which is strictly larger than before. After that, we continue the pass with the next entry of $Y$. We summarize the fast projection algorithm onto the $\ell_1$ ball in the following; see [12] for details.
Algorithm 2: Fast algorithm for solving (3.13) [12].
Step 1. Set $Y := (y_1)$, $\tilde{Y}$ as an empty list, $\zeta := y_1 - a$.
Step 2. For $i = 2, \ldots, n$, do
    If $y_i > \zeta$,
        Set $\zeta := \zeta + \frac{y_i - \zeta}{|Y| + 1}$.
        If $\zeta > y_i - a$, add $y_i$ to $Y$.
        Else, add $Y$ to $\tilde{Y}$, set $Y := (y_i)$, $\zeta := y_i - a$.
Step 3. If $\tilde{Y}$ is not empty, for every entry $y_i$ of $\tilde{Y}$, do
    If $y_i > \zeta$, add $y_i$ to $Y$ and set $\zeta := \zeta + (y_i - \zeta)/|Y|$.
Step 4. Do, while $|Y|$ changes,
    For every entry $y_i$ of $Y$, do
        If $y_i \le \zeta$, remove $y_i$ from $Y$ and set $\zeta := \zeta + (\zeta - y_i)/|Y|$.
Step 5. Set $\rho := \zeta$, $K = |Y|$.
Step 6. For $i = 1, \ldots, n$, set $s_i := \max\{y_i - \rho, 0\}$.

Condat [12] showed the efficiency of Algorithm 2 and analyzed the complexity of this projection algorithm. It is very efficient for large-scale problems in image processing and statistical learning.
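The steps of Algorithm 2 can be sketched in Python as below. The sketch assumes, as the derivation before the algorithm suggests, that the simplex routine is applied to $|y|$ and that the signs are restored afterwards; the list names `active` and `waiting` stand for $Y$ and $\tilde{Y}$, and the function names are hypothetical. Plain Python lists are used for readability rather than speed.

```python
import numpy as np

def project_simplex_condat(y, a=1.0):
    """Projection of y onto {s : s_i >= 0, sum(s) = a}, following Algorithm 2."""
    active = [y[0]]            # Y: entries potentially larger than rho
    waiting = []               # Y-tilde: entries set aside for a second look
    zeta = y[0] - a
    for yi in y[1:]:                                   # Step 2
        if yi > zeta:
            zeta += (yi - zeta) / (len(active) + 1)
            if zeta > yi - a:
                active.append(yi)
            else:
                waiting.extend(active)
                active = [yi]
                zeta = yi - a
    for yi in waiting:                                 # Step 3
        if yi > zeta:
            active.append(yi)
            zeta += (yi - zeta) / len(active)
    changed = True
    while changed:                                     # Step 4
        changed = False
        kept = []
        remaining = len(active)
        for yi in active:
            if yi <= zeta:
                remaining -= 1                         # |Y| after the removal
                zeta += (zeta - yi) / remaining
                changed = True
            else:
                kept.append(yi)
        active = kept
    return np.maximum(y - zeta, 0.0)                   # Steps 5-6 with rho = zeta

def project_l1_ball(y, a=1.0):
    """Solve (3.13): keep y if it lies in W, else project |y| and restore signs."""
    y = np.asarray(y, dtype=float)
    if np.abs(y).sum() <= a:
        return y.copy()
    return np.sign(y) * project_simplex_condat(np.abs(y), a)

p = project_l1_ball([3.0, 1.0, -2.0], a=1.0)
```

With this routine, the $z$-update (3.9) follows from (3.10) as $\xi^k$ minus the projection of $\xi^k$ onto the $\ell_1$ ball of radius $\mu/\gamma_1$.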
3.2 Convergence Analysis

In this subsection, we establish the convergence of ADMM in Algorithm 1. First, we consider the following mild assumption throughout this paper.

Assumption 3.1 $\mathrm{Ker}(K) \cap \mathrm{Ker}(D) = \{0\}$, where $\mathrm{Ker}(\cdot)$ denotes the null space of a matrix.

Remark 3.2 Assumption 3.1 is easy to satisfy. The null space of the difference matrix $D$ (under appropriate boundary conditions) consists of constant vectors with all entries being equal. If $K$ is a blur matrix, then its rows sum to one, so $K$ maps a nonzero constant vector to a nonzero vector. Thus the joint kernel contains only the zero vector.

We now show that the optimal solution set of (3.1) is nonempty and compact in the following proposition.

Proposition 3.1 Assume that Assumption 3.1 holds. Then the optimal solution set of (3.1) is nonempty and compact.

Proof Let $M := \{(z, x) : z = Kx - f\}$ and $N := \{(y, x) : y_i = D_i x, \ i = 1, \ldots, m\}$. Then (3.1) can be rewritten as

$$ \min \ Q(y, z, x) := \mu \|z\|_\infty + \sum_{i=1}^{m} \|y_i\| + \delta_M(z, x) + \delta_N(y, x). $$

Note that $Q(y, z, x)$ is proper and lower semicontinuous. Under Assumption 3.1, we have $Q(y, z, x) \to +\infty$ as $\|(y^T, z^T, x^T)^T\| \to +\infty$. By [37, Theorem 1.9], we obtain that the optimal solution set of (3.1) is nonempty and compact.
Since Algorithm 1 is a direct application of ADMM with two blocks of variables $(y, z)$ and $x$, its convergence can be guaranteed [6,16–18]. For the sake of completeness, we summarize the convergence of Algorithm 1 in the following theorem.

Theorem 3.1 Suppose that Assumption 3.1 holds. Let the sequence $\{(y^k, z^k, x^k, u_1^k, u_2^k)\}$ be generated by Algorithm 1. Then, for any $\gamma_1, \gamma_2 > 0$ and $\tau \in (0, (1 + \sqrt{5})/2)$, $\{(y^k, z^k, x^k)\}$ converges to an optimal solution of (3.1) and $\{(u_1^k, u_2^k)\}$ converges to an optimal solution of the dual problem of (3.1).

Proof By Proposition 3.1, we know that the optimal solution set is nonempty. Note that (3.1) can be rewritten as

$$ \min \ \mu \|z\|_\infty + \sum_{i=1}^{m} \|y_i\| \quad \text{s.t.} \quad \begin{pmatrix} I & 0 \\ 0 & I \end{pmatrix} \begin{pmatrix} y \\ z \end{pmatrix} - \begin{pmatrix} D \\ K \end{pmatrix} x = \begin{pmatrix} 0 \\ -f \end{pmatrix}. $$

Since $D^T D + K^T K$ is positive definite by Assumption 3.1, the conditions of [16, Theorem B.1] are satisfied. Hence we obtain the conclusion of this theorem by [16, Theorem B.1].
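Assumption 3.1 can be checked numerically for small problems through the equivalent condition used in the proof, namely that $K^T K + D^T D$ is positive definite. The sketch below does this for an assumed one-dimensional example with a periodic forward difference and a 3-point periodic averaging blur; the function name, the matrices, and the tolerance are illustrative choices.

```python
import numpy as np

def joint_kernel_is_trivial(K, D, tol=1e-10):
    """True when Ker(K) and Ker(D) share only the zero vector, tested via the
    smallest eigenvalue of the symmetric matrix K^T K + D^T D."""
    eigvals = np.linalg.eigvalsh(K.T @ K + D.T @ D)   # ascending order
    return bool(eigvals[0] > tol)

n = 16
eye = np.eye(n)
shift_fwd = np.roll(eye, -1, axis=0)      # (shift_fwd @ x)[i] = x[i+1], periodic
shift_bwd = np.roll(eye, 1, axis=0)       # (shift_bwd @ x)[i] = x[i-1], periodic
D = shift_fwd - eye                       # kernel: constant vectors
K = (shift_bwd + eye + shift_fwd) / 3.0   # blur whose rows sum to one

holds_for_blur = joint_kernel_is_trivial(K, D)
holds_for_zero_K = joint_kernel_is_trivial(np.zeros((n, n)), D)
```

With the blur the check passes, whereas replacing $K$ by the zero matrix leaves the constant vectors in the joint kernel and the check fails.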
4 Numerical Results

In this section, we present numerical results to validate the effectiveness of the proposed model for uniform noise removal. We test many linear inverse problems including image denoising and deblurring, inverse source problems in 2D, inverse heat conduction problems, and second derivative problems. All the experiments are performed under Windows 7 and MATLAB R2015b running on a laptop (AMD A10-5750M, @2.50GHz, 8G RAM). To evaluate the performance of the proposed model, the peak signal-to-noise ratio (PSNR) is adopted to measure the quality of the estimated solution, and it is defined as follows:

$$ \mathrm{PSNR} := 10 \log_{10} \frac{m (x_{\max} - x_{\min})^2}{\|x - \hat{x}\|^2}, $$

where $x \in \mathbb{R}^m$ is the exact solution with each entry in the range $[x_{\min}, x_{\max}]$ and $\hat{x} \in \mathbb{R}^m$ is the estimated solution. The termination criterion of Algorithm 1 is that the relative change between two consecutive iterates of the estimated solution should satisfy the following inequality:

$$ \frac{\|x^{k+1} - x^k\|}{\|x^k\|} \le \mathrm{tol}, \tag{4.1} $$
where tol is a tolerance given in advance. We set tol $= 1 \times 10^{-4}$ for the image denoising and deblurring, inverse source problems in 2D, and inverse heat conduction problems, and set tol $= 5 \times 10^{-4}$ for the second derivative problems. The maximum number of iterations of Algorithm 1 is set to 1000. In the ADMM method, the step-size $\tau$ can be chosen from $(0, (1 + \sqrt{5})/2)$ for its convergence [6,16–18]. We simply set $\tau$ to 1.618 in all experiments, since the results are not sensitive to this choice for the testing data. For the penalty parameters $\gamma_1, \gamma_2$ in ADMM, since $\|Kx - f\|$ is smaller than $\|Dx\|$ for the testing problems, the penalty parameter $\gamma_1$ may be smaller than $\gamma_2$. We choose $\gamma_1$ around the range $(0.1\sqrt{n}/\|f\|, \sqrt{n}/\|f\|)$ and $\gamma_2$ around $(100\sqrt{n}/\|f\|, 1000\sqrt{n}/\|f\|)$. For simplicity, some choices of $\gamma_1, \gamma_2$ are just given for the testing problems.

We compare our proposed model with the L1TV model [8,32,33] and the L2TV model [39]. For the L1TV and L2TV models, we employ the ADMM in [50] to solve the minimization problem. The termination criterion for the L1TV and L2TV models is the same as (4.1) and the maximum number of iterations is also set to 1000.

Note that the regularization term of our model is the TV term, which is nondifferentiable, while the regularization term in [9] and [49] is $\|x\|^2$, which is differentiable. We give an example for an inverse heat conduction problem (see Sect. 4.5 for details) to compare L∞TV with the methods in [9] and [49], i.e., the Moreau-Yosida semi-smooth Newton method (MY-SSN) [9] and the semi-smooth Newton method combined with the method of moments for solving the $\ell_\infty$-norm constrained problem (SL-SSN) [49]. Figure 1 shows the restored results by different methods for a piecewise constant signal with $\sigma = 0.1$. From Fig. 1, it can be seen that L∞TV can preserve piecewise constant regions, while MY-SSN and SL-SSN cannot preserve piecewise constant regions for this signal. In addition, the CPU times of MY-SSN and SL-SSN are 6.36 and 0.082 seconds, respectively, while the CPU time of ADMM is 1.01 seconds. We remark that we mainly focus on the quality of the solutions of each method. Since MY-SSN and SL-SSN cannot preserve piecewise constant regions of the signal, we do not compare L∞TV with the two methods in the following experiments.

Fig. 1 Restored results of MY-SSN, SL-SSN, and L∞TV for the piecewise constant signal with $\sigma = 0.1$ in the inverse heat conduction problem (curves shown: Original, Observation, MY-SSN, SL-SSN, L∞TV)
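The PSNR definition and the stopping rule (4.1) used in this section can be written compactly as below. This is a small sketch with hypothetical function names; it assumes that $x_{\min}$ and $x_{\max}$ default to the smallest and largest entries of the exact solution when they are not supplied.

```python
import numpy as np

def psnr(x_true, x_est, x_min=None, x_max=None):
    """PSNR in dB: 10 * log10(m * (x_max - x_min)^2 / ||x_true - x_est||^2)."""
    x_true = np.asarray(x_true, dtype=float).ravel()
    x_est = np.asarray(x_est, dtype=float).ravel()
    # Assumed default: take the range from the exact solution itself.
    x_min = x_true.min() if x_min is None else x_min
    x_max = x_true.max() if x_max is None else x_max
    err = np.sum((x_true - x_est) ** 2)
    return 10.0 * np.log10(x_true.size * (x_max - x_min) ** 2 / err)

def has_converged(x_new, x_old, tol=1e-4):
    """Relative-change test (4.1) between two consecutive iterates."""
    return bool(np.linalg.norm(x_new - x_old) / np.linalg.norm(x_old) <= tol)

x = np.array([0.0, 1.0, 0.0, 1.0])
value = psnr(x, x + 0.1)   # constant error 0.1 on a signal with range 1
```

For images normalized to $[0, 1]$, passing `x_min=0.0` and `x_max=1.0` explicitly avoids depending on the extreme values of a particular image.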
4.1 The Convergence of ADMM

In this subsection, we present convergence curves to show the convergence of ADMM. The testing image is the Cameraman image (see Sect. 4.3 for details) corrupted by Average blur and uniform noise with noise level $c = 0.05$, where the intensity range of the original image is scaled into [0,1] before generating $f$. Figure 2 shows the objective function values and PSNR values versus the number of iterations. It can be observed that both the objective function values and the PSNR values converge as the number of iterations increases.

Fig. 2 Objective function values and PSNR values versus number of iterations in ADMM for the Cameraman image corrupted by Average blur and uniform noise with $c = 0.05$
4.2 L-Curve Method

For the regularization parameter $\mu$ in our model, we use the L-curve method to select it automatically [20]. The L-curve method has been widely used for the selection of regularization parameters, e.g., see the papers [23,46] and references therein. For the valid regularization parameters, we plot the logarithm of the regularization term $\sum_{i=1}^{m} \|D_i x\|$ against the logarithm of the data fitting term $\|Kx - f\|_\infty$, evaluated at the solutions of L∞TV. The L-curve is the graph of the points

$$ \Big( \log(\|Kx - f\|_\infty), \ \log\Big( \sum_{i=1}^{m} \|D_i x\| \Big) \Big), $$

where $x$ is the solution of L∞TV for different $\mu$. In order to determine a reasonable regularization parameter, the best choice is the corner of the L-curve [23], which balances the regularization term and the data fitting term. We use the method in [23] to define the corner as the point of the L-curve with the maximal curvature. The regularization toolbox (http://www2.compute.dtu.dk/~pcha/Regutools/) is used to select the regularization parameter efficiently [22]. The L-curve method finds the corner of the L-curve and the corresponding regularization parameter $\mu$ [7,23]. In the following, if the regularization parameter $\mu$ is selected by the L-curve method, (1.2) is denoted by L∞TV-L for short, and if the regularization parameter $\mu$ is selected manually, we denote (1.2) as L∞TV.

In Fig. 3, we plot the L-curves of the Cameraman image corrupted by Gaussian blur and uniform noise with noise level $c = 0.15$ and the House image corrupted by Average blur and uniform noise with noise level $c = 0.025$, respectively, where the red squares are the corners of the L-curves found by the regularization toolbox. It can be observed that we select $\mu = 8.5 \times 10^3$ and $\mu = 8.7 \times 10^3$ as the regularization parameters for the two images, which correspond to the points closest to the corners of the L-curves.
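The idea of picking the corner as the point of maximal curvature can be sketched as follows. This is a simplified, hypothetical stand-in and not the routine of the regularization toolbox: it assumes the L-curve points are already available for a grid of $\mu$ values, estimates derivatives by finite differences along the grid index, and uses the absolute value of the curvature.

```python
import numpy as np

def lcurve_corner(residuals, regs, mus):
    """Return (mu at the corner, curvature array) for the L-curve
    (log residual, log regularization term), using discrete curvature."""
    a = np.log(np.asarray(residuals, dtype=float))   # log ||Kx - f||_inf
    b = np.log(np.asarray(regs, dtype=float))        # log sum ||D_i x||
    da, db = np.gradient(a), np.gradient(b)          # first derivatives
    dda, ddb = np.gradient(da), np.gradient(db)      # second derivatives
    curvature = np.abs(da * ddb - db * dda) / (da ** 2 + db ** 2) ** 1.5
    best = int(np.argmax(curvature))
    return mus[best], curvature

# Synthetic curve with a single rounded corner in the middle of the grid.
t = np.linspace(-1.0, 1.0, 201)
mus = np.linspace(1e3, 1e4, 201)
mu_star, curvature = lcurve_corner(np.exp(t), np.exp(np.sqrt(t ** 2 + 0.01)), mus)
```

In practice each point of the curve requires one solve of (1.2), so the grid of $\mu$ values is usually kept coarse.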
Fig. 3 The L-curves of Cameraman and House images corrupted by blurs and uniform noise (horizontal axis: $\log(\|Kx - f\|_\infty)$; vertical axis: $\log(\sum_{i=1}^{m} \|D_i x\|)$). a Cameraman image corrupted by Gaussian blur and uniform noise with $c = 0.15$ (marked corner: $\mu = 8.5 \times 10^3$). b House image corrupted by Average blur and uniform noise with $c = 0.025$ (marked corner: $\mu = 8.7 \times 10^3$)
4.3 Image Denoising and Deblurring

In this subsection, we present some numerical results for image denoising and deblurring, where the matrix $K$ is a spatially invariant blur matrix [30]. The testing images include Cameraman (256 × 256), House (256 × 256), Pepper (512 × 512), Bridge (512 × 512), Jetplane (512 × 512), and Pirate (512 × 512), which are shown in Fig. 4. In these experiments, we test two kinds of blurs, i.e., Average blur (11 × 11) and Gaussian blur (19 × 19, standard deviation = 3). In our experiments, the original images are normalized to [0, 1]. For the penalty parameters in Algorithm 1, $\gamma_1$ is set to 5 and $\gamma_2$ is selected from {5000, 8000, 10000} to get the highest PSNR values. In the L∞TV model, we select $\mu$ from $\{8 \times 10^5, 5 \times 10^5, 4 \times 10^5, 3 \times 10^5, 2 \times 10^5, 1 \times 10^5, 8 \times 10^4\}$ to get the highest PSNR values. For equation (3.12) in image denoising and deblurring, both $K$ and $D$ are block circulant matrices with circulant blocks under periodic boundary conditions. Then the coefficient matrix in (3.12) can be diagonalized by fast Fourier transforms [30,31]. The computational cost of the method is dominated by three fast discrete transforms for solving the linear equation in (3.12). For more details on how to deal with other boundary conditions, interested readers are referred to [30,31].

Table 1 shows the image restoration results obtained by the L∞TV-L, L∞TV, L1TV, and L2TV models, where the observed images are corrupted by Average blur and uniform noise with noise levels 0.025, 0.05, 0.15, 0.3. The CPU time and number of iterations (Ite) of L∞TV-L refer to the CPU time and number of iterations of Algorithm 1, assuming that a reasonable regularization parameter has been chosen by the L-curve method. It can be seen that the PSNR values obtained by L∞TV and L∞TV-L are higher than those obtained by L1TV and L2TV, and that the PSNR values obtained by L2TV are higher than those obtained by L1TV.
The performance of L∞TV and L∞TV-L is almost the same in terms of PSNR values. Moreover, L∞TV and L∞TV-L need more CPU time (in seconds) than L1TV and L2TV since the computation of the proximal mapping of the $\ell_\infty$-norm is more expensive than those of the proximal mappings of the $\ell_1$- and $\ell_2$-norms. The number of iterations of L2TV is less than those of L1TV, L∞TV, and L∞TV-L in general.

Fig. 4 Original images. From left to right and top to bottom: Cameraman, House, Pepper, Bridge, Jetplane, and Pirate

The visual comparisons of the restored images obtained by L∞TV-L, L∞TV, L1TV, and L2TV are shown in Fig. 5. Here we test the different methods on the Cameraman, House, Pepper, Bridge, Jetplane, and Pirate images corrupted by Average blur and uniform noise with noise level $c = 0.05$. From Fig. 5, we can observe that L∞TV-L and L∞TV reach better visual quality and higher PSNR values than L1TV and L2TV. Moreover, the images restored by L∞TV-L and L∞TV keep more details than those of the other two models.

Table 2 shows the restoration results of L∞TV-L, L∞TV, L1TV and L2TV for the Cameraman, House, Pepper, Bridge, Jetplane, and Pirate images corrupted by Gaussian blur and uniform noise with noise levels 0.025, 0.05, 0.15, 0.3. We can observe that the performance of L∞TV and L∞TV-L is better than that of L1TV and L2TV in terms of PSNR values. The performance of L2TV is better than that of L1TV in terms of PSNR values and CPU time. The CPU time required by L∞TV and L∞TV-L is higher than that required by L2TV, which is similar to the case of Average blur. Figure 6 shows the visual comparisons of L∞TV-L, L∞TV, L1TV and L2TV for the Cameraman, House, Pepper, Bridge, Jetplane, and Pirate images corrupted by Gaussian blur and uniform noise with noise level $c = 0.15$. We can observe that the images recovered by L∞TV-L and L∞TV preserve many more details than those recovered by L1TV and L2TV. The visual quality of L2TV is better than that of L1TV for these reference images.
4.4 Inverse Source Problem in 2D

In this subsection, we consider an inverse source problem in 2D for an elliptic partial differential operator with homogeneous Dirichlet boundary conditions [9,49], i.e., $K = A^{-1}$,
Table 1 PSNR (dB) values, CPU time (s), and number of iterations of different methods for the testing images corrupted by Average blur and uniform noise with different noise levels

| Image | c | L1TV PSNR | L1TV CPU | L1TV Ite | L2TV PSNR | L2TV CPU | L2TV Ite | L∞TV PSNR | L∞TV CPU | L∞TV Ite | L∞TV-L PSNR | L∞TV-L CPU | L∞TV-L Ite |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cameraman | 0.025 | 23.56 | 2.26 | 126 | 24.02 | 1.96 | 135 | 25.14 | 6.09 | 141 | 25.16 | 6.03 | 141 |
| Cameraman | 0.05 | 22.62 | 2.73 | 143 | 23.46 | 2.21 | 143 | 24.23 | 7.23 | 171 | 23.95 | 7.57 | 146 |
| Cameraman | 0.15 | 21.16 | 5.80 | 261 | 21.88 | 2.38 | 164 | 22.76 | 12.45 | 289 | 22.87 | 15.93 | 341 |
| Cameraman | 0.30 | 20.13 | 9.69 | 365 | 20.99 | 4.03 | 187 | 22.19 | 12.10 | 386 | 22.05 | 19.08 | 390 |
| House | 0.025 | 28.17 | 2.53 | 124 | 29.39 | 1.70 | 113 | 30.06 | 8.49 | 199 | 30.17 | 10.64 | 135 |
| House | 0.05 | 26.55 | 4.41 | 233 | 27.86 | 1.91 | 129 | 28.99 | 7.02 | 165 | 28.42 | 6.36 | 139 |
| House | 0.15 | 24.09 | 4.26 | 234 | 25.22 | 1.90 | 133 | 27.11 | 11.12 | 249 | 26.84 | 16.94 | 316 |
| House | 0.30 | 22.73 | 7.86 | 330 | 23.89 | 3.27 | 161 | 25.92 | 17.69 | 376 | 25.20 | 24.83 | 500 |
| Pepper | 0.025 | 28.58 | 18.43 | 137 | 29.24 | 12.20 | 109 | 30.17 | 36.54 | 144 | 30.12 | 61.39 | 191 |
| Pepper | 0.05 | 27.27 | 25.26 | 186 | 28.48 | 14.38 | 124 | 29.43 | 41.49 | 161 | 29.38 | 91.90 | 211 |
| Pepper | 0.15 | 25.45 | 47.01 | 254 | 26.59 | 20.05 | 162 | 27.85 | 61.90 | 198 | 28.00 | 81.23 | 273 |
| Pepper | 0.30 | 24.45 | 66.25 | 359 | 25.21 | 33.12 | 179 | 26.59 | 96.96 | 341 | 26.77 | 95.98 | 342 |
| Bridge | 0.025 | 22.81 | 17.15 | 129 | 23.01 | 14.47 | 130 | 23.92 | 28.27 | 111 | 23.33 | 114.76 | 227 |
| Bridge | 0.05 | 22.10 | 23.67 | 181 | 22.72 | 14.71 | 138 | 23.31 | 29.62 | 122 | 23.00 | 136.47 | 239 |
| Bridge | 0.15 | 20.81 | 42.91 | 248 | 21.45 | 26.19 | 168 | 22.31 | 66.52 | 196 | 22.29 | 89.29 | 294 |
| Bridge | 0.30 | 20.35 | 98.53 | 362 | 20.73 | 28.57 | 178 | 21.68 | 98.16 | 355 | 21.70 | 102.01 | 362 |
| Jetplane | 0.025 | 27.00 | 18.85 | 125 | 27.74 | 10.37 | 87 | 28.42 | 31.13 | 122 | 28.60 | 45.50 | 187 |
| Jetplane | 0.05 | 25.65 | 15.73 | 119 | 26.83 | 10.83 | 101 | 27.80 | 32.08 | 130 | 27.66 | 49.13 | 205 |
| Jetplane | 0.15 | 23.82 | 36.18 | 212 | 24.83 | 15.61 | 130 | 26.08 | 53.21 | 185 | 25.96 | 68.97 | 245 |
| Jetplane | 0.30 | 22.87 | 43.29 | 268 | 23.65 | 18.95 | 140 | 25.42 | 88.83 | 251 | 25.00 | 94.67 | 308 |
| Pirate | 0.025 | 26.28 | 15.14 | 111 | 27.18 | 12.91 | 108 | 27.81 | 39.98 | 137 | 27.41 | 50.93 | 201 |
| Pirate | 0.05 | 25.39 | 17.83 | 132 | 26.31 | 14.34 | 112 | 26.99 | 41.60 | 149 | 26.85 | 54.29 | 216 |
| Pirate | 0.15 | 23.99 | 40.72 | 243 | 24.66 | 23.01 | 161 | 25.75 | 77.52 | 232 | 25.76 | 87.26 | 276 |
| Pirate | 0.30 | 23.22 | 62.53 | 328 | 23.75 | 24.82 | 169 | 25.18 | 92.00 | 261 | 24.90 | 108.42 | 349 |
Fig. 5 Restored images (with PSNR (dB)) of different methods on Cameraman, House, Pepper, Bridge, Jetplane, and Pirate images corrupted by Average blur and uniform noise with c = 0.05. First column: Corrupted images. Second column: Restored images by L1TV. Third column: Restored images by L2TV. Fourth column: Restored images by L∞TV. Fifth column: Restored images by L∞TV-L. PSNR values of the restored images, in the order L1TV, L2TV, L∞TV, L∞TV-L: Cameraman 22.62, 23.46, 24.23, 23.95; House 26.55, 27.86, 28.99, 28.42; Pepper 27.27, 28.48, 29.43, 29.38; Bridge 22.10, 22.72, 23.31, 23.00; Jetplane 25.65, 26.83, 27.80, 27.66; Pirate 25.39, 26.31, 26.99, 26.85
Table 2 PSNR (dB) values, CPU time (s), and number of iterations of different methods for the testing images corrupted by Gaussian blur and uniform noise with different c

| Image | c | L1TV PSNR | L1TV CPU | L1TV Ite | L2TV PSNR | L2TV CPU | L2TV Ite | L∞TV PSNR | L∞TV CPU | L∞TV Ite | L∞TV-L PSNR | L∞TV-L CPU | L∞TV-L Ite |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cameraman | 0.025 | 22.91 | 3.14 | 142 | 23.56 | 2.22 | 148 | 23.89 | 12.07 | 271 | 23.90 | 11.92 | 262 |
| Cameraman | 0.05 | 22.57 | 2.91 | 151 | 22.94 | 2.29 | 157 | 23.43 | 11.04 | 259 | 23.58 | 13.64 | 294 |
| Cameraman | 0.15 | 21.47 | 3.80 | 239 | 21.96 | 2.45 | 161 | 22.71 | 12.52 | 289 | 22.46 | 16.32 | 399 |
| Cameraman | 0.30 | 20.63 | 5.50 | 320 | 21.19 | 3.03 | 227 | 22.21 | 12.23 | 285 | 21.62 | 17.18 | 414 |
| House | 0.025 | 27.66 | 3.36 | 178 | 28.59 | 2.05 | 128 | 29.01 | 8.44 | 198 | 29.19 | 18.24 | 238 |
| House | 0.05 | 26.54 | 4.41 | 235 | 27.63 | 2.06 | 127 | 28.21 | 8.24 | 185 | 28.36 | 18.34 | 280 |
| House | 0.15 | 24.59 | 4.14 | 219 | 25.60 | 1.99 | 132 | 26.91 | 12.47 | 301 | 26.32 | 23.77 | 374 |
| House | 0.30 | 22.99 | 6.86 | 298 | 24.26 | 2.51 | 191 | 25.97 | 11.56 | 259 | 25.65 | 25.64 | 370 |
| Pepper | 0.025 | 28.81 | 23.67 | 171 | 29.59 | 11.52 | 107 | 29.97 | 50.57 | 164 | 29.98 | 49.82 | 158 |
| Pepper | 0.05 | 27.92 | 34.50 | 250 | 28.54 | 11.35 | 115 | 29.41 | 59.13 | 192 | 29.44 | 52.86 | 177 |
| Pepper | 0.15 | 25.93 | 34.55 | 256 | 27.05 | 15.82 | 142 | 28.35 | 71.77 | 273 | 28.26 | 65.72 | 227 |
| Pepper | 0.30 | 24.22 | 43.27 | 316 | 25.60 | 21.88 | 207 | 27.43 | 85.22 | 261 | 27.20 | 73.23 | 282 |
| Bridge | 0.025 | 22.41 | 30.20 | 227 | 22.91 | 19.90 | 141 | 23.32 | 71.83 | 238 | 22.98 | 44.45 | 143 |
| Bridge | 0.05 | 22.05 | 40.81 | 304 | 22.59 | 21.47 | 145 | 22.92 | 51.49 | 187 | 22.78 | 46.85 | 152 |
| Bridge | 0.15 | 21.07 | 30.85 | 231 | 21.70 | 21.59 | 154 | 22.27 | 68.20 | 250 | 22.31 | 53.50 | 191 |
| Bridge | 0.30 | 20.29 | 57.62 | 315 | 20.97 | 22.03 | 209 | 21.63 | 77.36 | 230 | 21.81 | 72.10 | 240 |
| Jetplane | 0.025 | 26.98 | 29.55 | 154 | 27.83 | 9.82 | 91 | 28.30 | 50.34 | 195 | 28.02 | 56.58 | 135 |
| Jetplane | 0.05 | 26.13 | 30.59 | 206 | 27.00 | 10.84 | 97 | 27.56 | 44.52 | 180 | 27.51 | 74.12 | 151 |
| Jetplane | 0.15 | 24.37 | 31.27 | 212 | 25.45 | 12.54 | 110 | 26.40 | 59.96 | 226 | 26.44 | 78.35 | 196 |
| Jetplane | 0.30 | 23.07 | 38.62 | 256 | 24.18 | 18.36 | 169 | 25.50 | 93.91 | 258 | 25.59 | 132.56 | 241 |
| Pirate | 0.025 | 26.38 | 16.52 | 122 | 26.98 | 12.16 | 112 | 27.33 | 46.69 | 171 | 27.21 | 43.28 | 152 |
| Pirate | 0.05 | 25.69 | 18.50 | 140 | 26.35 | 12.37 | 114 | 26.88 | 55.17 | 196 | 26.80 | 44.48 | 169 |
| Pirate | 0.15 | 24.23 | 31.86 | 239 | 25.06 | 14.55 | 134 | 25.93 | 64.11 | 252 | 25.98 | 57.27 | 211 |
| Pirate | 0.30 | 23.08 | 46.53 | 328 | 24.03 | 22.33 | 201 | 25.17 | 66.40 | 260 | 25.29 | 63.73 | 251 |
Fig. 6 Restored images (with PSNR (dB)) of different methods on Cameraman, House, Pepper, Bridge, Jetplane, and Pirate images corrupted by Gaussian blur and uniform noise with c = 0.15. First column: Corrupted images. Second column: Restored images by L1TV. Third column: Restored images by L2TV. Fourth column: Restored images by L∞TV. Fifth column: Restored images by L∞TV-L. PSNR values of the restored images, in the order L1TV, L2TV, L∞TV, L∞TV-L: Cameraman 21.47, 21.96, 22.71, 22.46; House 24.59, 25.60, 26.91, 26.32; Pepper 25.93, 27.05, 28.35, 28.26; Bridge 21.07, 21.70, 22.27, 22.31; Jetplane 24.37, 25.45, 26.40, 26.44; Pirate 24.23, 25.06, 25.93, 25.98
where $A f = -a\,\Delta f + \langle b, \nabla f \rangle + \nu f$. Here $a$, $b$, $\nu$ are chosen such that $A$ is an isomorphism. We set $a = 1$, $b = (-2, 0)^T$, and $\nu = 0$ in our experiments. In general, the size of $A$ is very large for the inverse source problem in 2D, so we do not deal with $A^{-1}$ directly in (3.12). By substituting $K = A^{-1}$ into (3.12), we obtain

$$(\gamma_1 I + \gamma_2 A^T D^T D A)(A^{-1} x) = \gamma_1 (z^{k+1} + f) - u_1^k + A^T D^T (\gamma_2 y^{k+1} - u_2^k). \tag{4.2}$$

We first apply the conjugate gradient method [30] to (4.2) to get $A^{-1} x$, and then obtain $x$ by applying $A$. For the inverse source problem in 2D, the exact solution is given explicitly by

$$x(t_1, t_2) = \begin{cases} 1, & \text{if } |t_1| \le \tfrac{1}{3} \text{ and } |t_2| \le \tfrac{1}{3}, \\ 0, & \text{otherwise}, \end{cases}$$

which is shown in Fig. 7a. The difference operators are discretized by using the standard finite differences on a uniform mesh of size 64 × 64. For the noise level, we choose $c = \sigma \|A^{-1} x\|_\infty$ with different $\sigma > 0$. For the penalty parameters in Algorithm 1, we set $\gamma_1 = 1$ and $\gamma_2 = 1 \times 10^5$, respectively. The regularization parameter μ is selected from {100000, 100150, 100200, 100250, 100300, 100400, 100500} to get the highest PSNR values.

The visual comparisons of the reconstructions obtained by L∞TV-L, L∞TV, L1TV, and L2TV are shown in Fig. 7, where the exact solution is corrupted by uniform noise with σ = 0.1. We can observe that the estimated solutions of L∞TV-L and L∞TV fit the exact solution better than those of L1TV and L2TV, and that L∞TV-L and L∞TV perform better than the other two models in terms of visual quality. The differences between the exact solution and the corresponding estimated solutions of the four methods are shown in Fig. 7d, f, h, j. We can see that the differences of L∞TV-L and L∞TV are much smaller than those of L1TV and L2TV. Furthermore, Table 3 shows the PSNR values, CPU time, and number of iterations of different methods with different noise levels in the inverse source problem in 2D. It can be seen that L∞TV-L and L∞TV get higher PSNR values than the other two models. The CPU time required by L∞TV-L and L∞TV is more than that required by L1TV and L2TV, since the proximal mapping of the ℓ∞-norm is much more expensive to compute than those of the ℓ1- and ℓ2-norms. The PSNR values obtained by L2TV are slightly higher than those obtained by L1TV for low noise levels.
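The two-stage solve around (4.2) can be illustrated with a small sketch: conjugate gradients are run on the symmetric positive definite operator γ1 I + γ2 AᵀDᵀDA to get w = A⁻¹x, and x is then recovered as x = A w. Everything below is an assumption for illustration only: a 1D toy analog of A (finite differences for −f″ − 2f′ with Dirichlet boundaries), a forward-difference D, toy values of γ1 and γ2, dense list-based matrices, and the helper names themselves. It is not the authors' 2D implementation.

```python
def matvec(M, v):
    """Dense matrix-vector product M v."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]


def rmatvec(M, v):
    """Dense product M^T v."""
    return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(len(M[0]))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def apply_system(A, D, w, gamma1, gamma2):
    """Apply (gamma1*I + gamma2*A^T D^T D A) to w without forming the matrix."""
    DAw = matvec(D, matvec(A, w))
    AtDtDAw = rmatvec(A, rmatvec(D, DAw))
    return [gamma1 * wi + gamma2 * ti for wi, ti in zip(w, AtDtDAw)]


def conjugate_gradient(apply_op, b, rel_tol=1e-10, max_iter=500):
    """Plain CG for a symmetric positive definite operator, started from zero."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - op(x) for x = 0
    p = list(r)
    rs = dot(r, r)
    stop = (rel_tol ** 2) * rs
    for _ in range(max_iter):
        if rs <= stop:
            break
        q = apply_op(p)
        alpha = rs / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def x_update(A, D, rhs, gamma1, gamma2):
    """Solve the system for w = A^{-1} x by CG, then return (x, w) with x = A w."""
    w = conjugate_gradient(lambda v: apply_system(A, D, v, gamma1, gamma2), rhs)
    return matvec(A, w), w


def toy_operators(n, h=0.5):
    """Hypothetical 1D analog: A ~ -f'' - 2 f' (Dirichlet), D = forward difference."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0 / h ** 2
        if i > 0:
            A[i][i - 1] = -1.0 / h ** 2 + 1.0 / h
        if i < n - 1:
            A[i][i + 1] = -1.0 / h ** 2 - 1.0 / h
    D = [[0.0] * n for _ in range(n - 1)]
    for i in range(n - 1):
        D[i][i], D[i][i + 1] = -1.0, 1.0
    return A, D
```

Here `rhs` stands for the right-hand side of (4.2) at the current ADMM iterate. Working with `apply_system` rather than an explicit matrix mirrors the point made in the text that A⁻¹ is never formed.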
4.5 Inverse Heat Conduction Problem

In this subsection, we consider an inverse heat conduction problem [9,49], which is formulated as a Volterra integral equation of the first kind [21]. The observed data is described as

$$f(t) = \int_0^t k(s - t)\, x(s)\, ds + v(t),$$

where the kernel $k(t)$ is given by

$$k(t) = \frac{t^{-3/2}}{2\kappa\sqrt{\pi}}\, e^{-\frac{1}{4\kappa^2 t}},$$
Fig. 7 The estimated solutions by different methods in the inverse source problem in 2D with σ = 0.1. a Exact data. b Observation. c Estimated solution by L1TV. d Difference between exact and estimated solutions. e Estimated solution by L2TV. f Difference between exact and estimated solutions. g Estimated solution by L∞ TV. h Difference between exact and estimated solutions. i Estimated solution by L∞ TV-L. j Difference between exact and estimated solutions
Table 3 PSNR (dB) values, CPU time (s), and number of iterations of different methods for estimated solutions with different σ in the inverse source problem in 2D

| σ | L1TV PSNR | L1TV CPU | L1TV Ite | L2TV PSNR | L2TV CPU | L2TV Ite | L∞TV PSNR | L∞TV CPU | L∞TV Ite | L∞TV-L PSNR | L∞TV-L CPU | L∞TV-L Ite |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 19.62 | 132.86 | 860 | 20.05 | 19.09 | 215 | 23.73 | 148.46 | 985 | 22.70 | 137.01 | 967 |
| 0.2 | 18.77 | 152.95 | 810 | 19.48 | 22.07 | 354 | 22.98 | 131.37 | 976 | 22.41 | 156.91 | 957 |
| 0.3 | 18.37 | 187.45 | 966 | 18.90 | 18.83 | 287 | 21.21 | 158.49 | 936 | 22.04 | 99.15 | 895 |
| 0.4 | 18.08 | 251.58 | 1000 | 17.71 | 17.18 | 268 | 20.91 | 132.07 | 853 | 20.46 | 119.27 | 915 |
| 0.5 | 17.32 | 101.70 | 675 | 15.77 | 18.82 | 243 | 20.54 | 134.68 | 904 | 20.03 | 117.34 | 912 |
| 0.6 | 16.12 | 102.97 | 709 | 13.87 | 17.22 | 241 | 20.11 | 91.64 | 808 | 19.58 | 143.45 | 801 |
| 0.7 | 15.82 | 181.83 | 952 | 13.05 | 15.88 | 186 | 20.04 | 89.60 | 780 | 19.23 | 144.26 | 840 |
| 0.8 | 15.71 | 187.85 | 988 | 12.38 | 18.98 | 248 | 19.90 | 117.85 | 665 | 19.12 | 143.85 | 892 |
| 0.9 | 14.99 | 159.20 | 1000 | 11.94 | 17.83 | 235 | 19.12 | 149.51 | 820 | 19.05 | 167.74 | 926 |
Fig. 8 Restored results of different methods for the piecewise constant signal with σ = 0.1 in the inverse heat conduction problem (curves shown: L1TV, L2TV, L∞TV, L∞TV-L, Observation, Original)
where $\kappa$ controls the ill-conditioning of the matrix. The integral equation is discretized by means of simple quadrature (midpoint rule). Then the non-symmetric Toeplitz matrix $K$ is given by

$$K_{i,1} = k\!\left(\frac{2i - 1}{2n}\right), \quad i = 1, \ldots, n, \qquad \text{and} \qquad K_{1,j} = 0, \quad j = 2, \ldots, n.$$

The exact solution is a piecewise constant signal, which is shown in Fig. 8. For the uniformly distributed noise vector, we set $c = \sigma \|K x\|_\infty$ with different $\sigma > 0$ to control the noise levels. In our test, $n$ and $\kappa$ are set to be 500 and 1, respectively. The regularization parameter is selected from {400, 600, 800, 1000, 1100, 1300, 1500}. For the penalty parameters, $\gamma_1$ is set to be 5 and $\gamma_2$ is selected from {5000, 8000, 10000}.
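The test setup just described can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the matrix entries follow the formula for K exactly as printed (no extra quadrature weight is applied), the noise is assumed to be uniform on [−c, c] with c = σ‖Kx‖∞, and the piecewise constant signal and the names `heat_kernel`, `heat_matrix`, and `observe` are hypothetical.

```python
import math
import random


def heat_kernel(t, kappa=1.0):
    """k(t) = t^(-3/2) / (2*kappa*sqrt(pi)) * exp(-1 / (4*kappa^2*t)) for t > 0."""
    return t ** -1.5 / (2.0 * kappa * math.sqrt(math.pi)) * math.exp(-1.0 / (4.0 * kappa ** 2 * t))


def heat_matrix(n, kappa=1.0):
    """Lower-triangular Toeplitz K with first column k((2i-1)/(2n)), i = 1..n."""
    col = [heat_kernel((2 * i - 1) / (2.0 * n), kappa) for i in range(1, n + 1)]
    return [[col[i - j] if i >= j else 0.0 for j in range(n)] for i in range(n)]


def observe(K, x, sigma, rng):
    """Return (f, Kx, c) with f = K x + v, v uniform on [-c, c], c = sigma*||K x||_inf."""
    Kx = [sum(kij * xj for kij, xj in zip(row, x)) for row in K]
    c = sigma * max(abs(t) for t in Kx)
    f = [t + rng.uniform(-c, c) for t in Kx]
    return f, Kx, c


if __name__ == "__main__":
    n = 500
    K = heat_matrix(n, kappa=1.0)
    # Hypothetical piecewise constant signal; the one in Fig. 8 may differ.
    x = [1.0 if n // 3 <= i < 2 * n // 3 else 0.0 for i in range(n)]
    f, Kx, c = observe(K, x, sigma=0.1, rng=random.Random(0))
    print(len(f), c >= 0.0)
```

Passing an explicit `random.Random` instance keeps the noise reproducible, which helps when comparing methods across noise levels.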
Table 4 PSNR (dB) values, CPU time (s), and number of iterations of different methods for estimated solutions with different σ in the inverse heat conduction problem

| σ | L1TV PSNR | L1TV CPU | L1TV Ite | L2TV PSNR | L2TV CPU | L2TV Ite | L∞TV PSNR | L∞TV CPU | L∞TV Ite | L∞TV-L PSNR | L∞TV-L CPU | L∞TV-L Ite |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 15.13 | 0.56 | 456 | 16.46 | 1.05 | 195 | 21.39 | 0.79 | 577 | 20.92 | 0.95 | 632 |
| 0.2 | 13.72 | 0.53 | 451 | 14.03 | 1.29 | 242 | 20.20 | 1.13 | 746 | 19.40 | 1.15 | 807 |
| 0.3 | 12.88 | 0.51 | 472 | 13.48 | 1.43 | 274 | 19.07 | 0.94 | 706 | 18.70 | 1.01 | 579 |
| 0.4 | 12.73 | 0.63 | 608 | 13.23 | 1.52 | 286 | 17.88 | 1.09 | 813 | 17.48 | 1.02 | 688 |
| 0.5 | 12.08 | 0.50 | 473 | 12.91 | 1.66 | 316 | 16.75 | 1.04 | 672 | 16.37 | 1.46 | 902 |
| 0.6 | 11.74 | 0.69 | 642 | 12.44 | 1.75 | 332 | 15.65 | 1.11 | 825 | 15.36 | 1.08 | 737 |
| 0.7 | 11.44 | 0.52 | 487 | 12.13 | 1.76 | 337 | 14.41 | 1.20 | 895 | 14.31 | 1.14 | 781 |
| 0.8 | 11.17 | 0.61 | 596 | 11.93 | 1.42 | 270 | 13.59 | 1.30 | 968 | 13.76 | 1.17 | 805 |
| 0.9 | 11.11 | 0.63 | 596 | 11.81 | 1.56 | 298 | 12.76 | 1.24 | 943 | 13.70 | 1.46 | 962 |
Figure 8 shows the restored results of L∞TV-L, L∞TV, L1TV, and L2TV with noise level σ = 0.1. We can observe that the estimated solutions of L∞TV-L and L∞TV fit the exact solution better than those of L1TV and L2TV. Moreover, L∞TV-L and L∞TV restore the piecewise constant regions and sharp edges better than L1TV and L2TV. We also report the restored results of L∞TV-L, L∞TV, L1TV, and L2TV with different noise levels in Table 4. It can be seen that the PSNR values obtained by L∞TV-L and L∞TV are much higher than those obtained by L1TV and L2TV. Furthermore, L1TV requires less CPU time than the other three models, while L2TV requires fewer iterations than the other three models.
4.6 Fredholm Integral Equation of the First Kind

In this subsection, we consider the following Fredholm integral equation of the first kind [10,21], i.e.,

$$f(t) = \int_0^1 k(s, t)\, x(s)\, ds + v(t),$$

where the kernel $k(s, t)$ is Green's function for the second derivative:

$$k(s, t) = \begin{cases} s(t - 1), & \text{if } s < t, \\ t(s - 1), & \text{otherwise}. \end{cases}$$

The integral equation is discretized by means of the Galerkin method with orthogonal box functions. Then the symmetric matrix $K$ is given by

$$K_{i,i} = h^2 \left[ h\left(i^2 - i + \tfrac{1}{4}\right) - \left(i - \tfrac{2}{3}\right) \right], \quad i = 1, \ldots, n,$$

$$K_{i,j} = h^2 \left(j - \tfrac{1}{2}\right) \left[ h\left(i - \tfrac{1}{2}\right) - 1 \right], \quad j < i,$$

where $h = \frac{1}{n}$. The exact solution with piecewise constant regions is shown in Fig. 9, where $n = 600$. The range of the uniform distribution is set to $c = \sigma \|K x\|_\infty$ with different $\sigma > 0$. The penalty parameter $\gamma_1$ is set to be 5 and $\gamma_2$ is selected from
Fig. 9 Restored results of different methods for the piecewise constant signal with σ = 0.1 in the second derivative problem (curves shown: L1TV, L2TV, L∞TV, L∞TV-L, Observation, Original)
Table 5 PSNR (dB) values, CPU time (s), and number of iterations of different methods for estimated solutions with different σ in the second derivative problem

| σ | L1TV PSNR | L1TV CPU | L1TV Ite | L2TV PSNR | L2TV CPU | L2TV Ite | L∞TV PSNR | L∞TV CPU | L∞TV Ite | L∞TV-L PSNR | L∞TV-L CPU | L∞TV-L Ite |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 18.27 | 0.54 | 355 | 19.03 | 0.75 | 95 | 21.14 | 0.66 | 359 | 20.86 | 0.87 | 458 |
| 0.2 | 16.21 | 0.51 | 323 | 17.03 | 0.77 | 96 | 20.00 | 1.42 | 825 | 19.75 | 1.57 | 844 |
| 0.3 | 14.08 | 0.71 | 452 | 16.16 | 0.82 | 105 | 18.95 | 1.09 | 617 | 18.85 | 1.28 | 658 |
| 0.4 | 13.51 | 0.58 | 355 | 15.61 | 0.85 | 107 | 18.02 | 1.00 | 562 | 17.08 | 1.37 | 723 |
| 0.5 | 13.11 | 0.70 | 455 | 15.14 | 0.87 | 111 | 17.53 | 0.93 | 517 | 16.10 | 0.92 | 451 |
| 0.6 | 12.57 | 0.78 | 534 | 14.75 | 0.82 | 103 | 16.11 | 0.80 | 439 | 15.56 | 0.99 | 506 |
| 0.7 | 12.43 | 0.72 | 445 | 14.40 | 0.82 | 104 | 15.49 | 0.90 | 495 | 15.36 | 1.44 | 776 |
| 0.8 | 12.38 | 0.64 | 405 | 14.11 | 0.91 | 117 | 14.77 | 1.01 | 490 | 14.94 | 1.46 | 773 |
| 0.9 | 12.08 | 0.76 | 502 | 13.92 | 0.91 | 119 | 14.62 | 1.08 | 587 | 14.85 | 1.47 | 718 |
{5000, 7000, 10000} in Algorithm 1, respectively. The regularization parameter μ is selected from {2000, 4000, 5000, 6000, 8000, 9000, 10000, 12000} with respect to L∞TV. In Fig. 9, we show the restored results obtained by different methods with σ = 0.1. We can observe that the estimated solutions of L∞TV-L and L∞TV are better than those of L1TV and L2TV in terms of the fitting accuracy. Furthermore, the estimated solutions of L∞TV-L and L∞TV preserve many more sharp edges than those of the other two models for this signal. We also list the PSNR values, CPU time, and number of iterations of the different models in Table 5. We can observe that the PSNR values of L∞TV-L and L∞TV are higher than those of L1TV and L2TV for different noise levels. L∞TV-L and L∞TV need more CPU time than L1TV and L2TV in general, since the proximal mapping of the ℓ∞-norm is more expensive to compute than those of the ℓ1- and ℓ2-norms. The number of iterations required by L2TV is much smaller than those required by L1TV, L∞TV, and L∞TV-L.
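For readers who want to reproduce the test matrix of this subsection, the following is a minimal sketch of how the symmetric matrix K of the second derivative problem could be assembled from the entry formulas K_{i,i} = h²[h(i² − i + 1/4) − (i − 2/3)] and K_{i,j} = h²(j − 1/2)[h(i − 1/2) − 1] for j < i, with h = 1/n. The function name `second_derivative_matrix` and the dense list-of-lists storage are assumptions made for illustration.

```python
def second_derivative_matrix(n):
    """Assemble the symmetric n-by-n matrix K from its diagonal and lower entries."""
    h = 1.0 / n
    K = [[0.0] * n for _ in range(n)]
    for i in range(1, n + 1):  # 1-based indices, as in the formulas
        K[i - 1][i - 1] = h ** 2 * (h * (i * i - i + 0.25) - (i - 2.0 / 3.0))
        for j in range(1, i):
            value = h ** 2 * (j - 0.5) * (h * (i - 0.5) - 1.0)
            K[i - 1][j - 1] = value
            K[j - 1][i - 1] = value  # fill the upper triangle by symmetry
    return K


if __name__ == "__main__":
    K = second_derivative_matrix(600)  # n = 600 as in the experiment
    print(len(K), len(K[0]))
```

For n = 600 the dense matrix has 360000 entries, which is small enough for this direct assembly; a larger n would call for a structured or matrix-free representation.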
5 Concluding Remarks

In this paper, we have proposed a variational model for linear inverse problems corrupted by uniform noise. By MAP estimation, we combine an ℓ∞ data fitting term with a TV regularization term for the uniform noise. Then the ADMM is applied to solve the resulting model by means of variable splitting. By the Moreau decomposition [36], the subproblem with respect to the ℓ∞-norm can be solved efficiently by a fast projection algorithm onto the ℓ1 ball. The convergence of ADMM is established under some mild conditions. Moreover, we propose to use the L-curve method to select the regularization parameter automatically. Our preliminary numerical experiments on several linear inverse problems demonstrate the superior performance of the proposed model over state-of-the-art methods including L1TV and L2TV.

Acknowledgements The authors would like to thank the anonymous referees for their constructive comments and suggestions that have helped improve the presentation of the paper greatly.
References

1. Alliney, S.: A property of the minimum vectors of a regularizing functional defined by means of the absolute norm. IEEE Trans. Signal Process. 45(4), 913–917 (1997)
2. Aubert, G., Aujol, J.-F.: A variational approach to removing multiplicative noise. SIAM J. Appl. Math. 68(4), 925–946 (2008)
3. Bai, M., Zhang, X., Shao, Q.: Adaptive correction procedure for TVL1 image deblurring under impulse noise. Inverse Probl. 32(8), 085004 (2016)
4. Bertero, M., Boccacci, P.: Introduction to Inverse Problems in Imaging. CRC Press, Boca Raton (1998)
5. Bovik, A.: Handbook of Image and Video Processing. Academic Press, New York (2000)
6. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
7. Castellanos, J.L., Gómez, S., Guerra, V.: The triangle method for finding the corner of the L-curve. Appl. Numer. Math. 43(4), 359–373 (2002)
8. Chan, T.F., Esedoglu, S.: Aspects of total variation regularized L1 function approximation. SIAM J. Appl. Math. 65(5), 1817–1837 (2005)
9. Clason, C.: L∞ fitting for inverse problems with uniform noise. Inverse Probl. 28(10), 104007 (2012)
10. Clason, C., Jin, B., Kunisch, K.: A semismooth Newton method for L1 data fitting with automatic choice of regularization parameters and noise calibration. SIAM J. Imaging Sci. 3(2), 199–231 (2010)
11. Colton, D., Coyle, J., Monk, P.: Recent developments in inverse acoustic scattering theory. SIAM Rev. 42(3), 369–414 (2000)
12. Condat, L.: Fast projection onto the simplex and the l1 ball. Math. Program. 158(1–2), 575–585 (2016)
13. Dong, Y., Zeng, T.: A convex variational model for restoring blurred images with multiplicative noise. SIAM J. Imaging Sci. 6(3), 1598–1625 (2013)
14. Duchi, J., Shalev-Shwartz, S., Singer, Y., Chandra, T.: Efficient projections onto the ℓ1-ball for learning in high dimensions. In: Proceedings of the International Conference on Machine Learning, pp. 272–279. ACM (2008)
15. Durand, S., Nikolova, M.: Denoising of frame coefficients using ℓ1 data-fidelity term and edge-preserving regularization. Multiscale Model. Simul. 6(2), 547–576 (2007)
16. Fazel, M., Pong, T.K., Sun, D., Tseng, P.: Hankel matrix rank minimization with applications to system identification and realization. SIAM J. Matrix Anal. Appl. 34(3), 946–977 (2013)
17. Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2(1), 17–40 (1976)
18. Glowinski, R., Marrocco, A.: Sur l’approximation par éléments finis d’ordre un, et la résolution, par pénalisation-dualité, d’une classe de problèmes de Dirichlet non linéaires. Revue Française d’Autom. Informat. Rech. Opér. Anal. Numér. 9(R2), 41–76 (1975)
19. Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 3rd edn. Pearson, London (2008)
20. Hansen, P.C.: Analysis of discrete ill-posed problems by means of the L-curve. SIAM Rev. 34(4), 561–580 (1992)
21. Hansen, P.C.: Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion. SIAM, Philadelphia (1998)
22. Hansen, P.C.: Regularization tools version 4.0 for Matlab 7.3. Numer. Algorithms 46(2), 189–194 (2007)
23. Hansen, P.C., O’Leary, D.P.: The use of the L-curve in the regularization of discrete ill-posed problems. SIAM J. Sci. Comput. 14(6), 1487–1503 (1993)
24. Held, M., Wolfe, P., Crowder, H.P.: Validation of subgradient optimization. Math. Program. 6(1), 62–88 (1974)
25. Huang, Y.-M., Lu, D.-Y., Zeng, T.: Two-step approach for the restoration of images corrupted by multiplicative noise. SIAM J. Sci. Comput. 35(6), A2856–A2873 (2013)
26. Huang, Y.-M., Moisan, L., Ng, M.K., Zeng, T.: Multiplicative noise removal via a learned dictionary. IEEE Trans. Image Process. 21(11), 4534–4543 (2012)
27. Huang, Y.-M., Ng, M.K., Wen, Y.-W.: A fast total variation minimization method for image restoration. Multiscale Model. Simul. 7(2), 774–795 (2008)
28. Kay, S.M.: Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice-Hall PTR, Englewood Cliffs (1993)
29. Le, T., Chartrand, R., Asaki, T.J.: A variational approach to reconstructing images corrupted by Poisson noise. J. Math. Imaging Vis. 27(3), 257–263 (2007)
30. Ng, M.K.: Iterative Methods for Toeplitz Systems. Oxford University Press, London (2004)
31. Ng, M.K., Chan, R.H., Tang, W.-C.: A fast algorithm for deblurring models with Neumann boundary conditions. SIAM J. Sci. Comput. 21(3), 851–866 (1999)
32. Nikolova, M.: Minimizers of cost-functions involving non-smooth data-fidelity terms. Application to the processing of outliers. SIAM J. Numer. Anal. 40(3), 965–994 (2002)
33. Nikolova, M.: A variational approach to remove outliers and impulse noise. J. Math. Imaging Vis. 20(1–2), 99–120 (2004)
34. Nikolova, M.: Weakly constrained minimization: application to the estimation of images and signals involving constant regions. J. Math. Imaging Vis. 21(2), 155–175 (2004)
35. Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1(3), 127–239 (2014)
36. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
37. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis, 3rd edn. Springer, Berlin (2009)
38. Rudin, L.I., Osher, S.: Total variation based image restoration with free local constraints. In: Proceedings of the IEEE International Conference on Image Processing, volume 1, pp. 31–35. Austin, TX (1994)
39. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D 60(1–4), 259–268 (1992)
40. Sciacchitano, F., Dong, Y., Zeng, T.: Variational approach for restoring blurred images with Cauchy noise. SIAM J. Imaging Sci. 8(3), 1894–1922 (2015)
41. Setzer, S., Steidl, G., Teuber, T.: Deblurring Poissonian images by split Bregman techniques. J. Vis. Commun. Image Rep. 21(3), 193–199 (2010)
42. Steidl, G., Teuber, T.: Removing multiplicative noise by Douglas–Rachford splitting methods. J. Math. Imaging Vis. 36(2), 168–184 (2010)
43. Tikhonov, A.N., Arsenin, V.Y.: Solutions of Ill-Posed Problems. Winston, Washington, DC (1977)
44. van den Berg, E., Friedlander, M.P.: Probing the Pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput. 31(2), 890–912 (2008)
45. Wan, T., Canagarajah, N., Achim, A.: Segmentation of noisy colour images using Cauchy distribution in the complex wavelet domain. IET Image Process. 5(2), 159–170 (2011)
46. Wang, F., Zhao, X.-L., Ng, M.K.: Multiplicative noise and blur removal by framelet decomposition and ℓ1-based L-curve method. IEEE Trans. Image Process. 25(9), 4222–4232 (2016)
47. Weiss, P., Aubert, G., Blanc-Féraud, L.: Some Applications of ℓ∞-Constraints in Image Processing. INRIA Research Report 6115 (2006)
48. Wen, Y.-W., Chan, R.H., Zeng, T.: Primal-dual algorithms for total variation based image restoration under Poisson noise. Sci. China Math. 59(1), 141–160 (2016)
49. Wen, Y.-W., Ching, W.-K., Ng, M.K.: A semi-smooth Newton method for inverse problem with uniform noise. J. Sci. Comput. 75(2), 713–732 (2018)
50. Yang, J., Zhang, Y., Yin, W.: An efficient TVL1 algorithm for deblurring multichannel images corrupted by impulsive noise. SIAM J. Sci. Comput. 31(4), 2842–2865 (2009)
51. Zhang, X., Bai, M., Ng, M.K.: Nonconvex-TV based image restoration with impulse noise removal. SIAM J. Imaging Sci. 10(3), 1627–1667 (2017)
52. Zhang, X., Ng, M.K., Bai, M.: A fast algorithm for deconvolution and Poisson noise removal. J. Sci. Comput. 75(3), 1535–1554 (2018)