 Open Access
 Authors : A.Suban, V.Karthick, Mallikarjuna Nandi, M. Pradeep
 Paper ID : IJERTCONV5IS09002
 Volume & Issue : NCIECC – 2017 (Volume 5 – Issue 09)
Published (First Online): 24-04-2018
ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Compressed Signal Recovery by Convex Optimization and Non-Convex Regularization
[1] A. Suban, [1] V. Karthick, [2] Mallikarjuna Nandi, [3] M. Pradeep
[1], [2] Assistant Professor, [3] Lecturer
[1], [3] Department of Electronics and Communication Engineering, [2] Department of Computer Science and Engineering
[1], [2], [3] Velammal College of Engineering and Technology, Madurai

Abstract – A standard approach for estimating sparse signals in noise is convex optimization with a sparsity-promoting convex regularization term. In order to promote sparsity more strongly than convex regularization, it is also standard practice to employ non-convex optimization. In this paper, we take a third approach: we utilize a non-convex regularization term chosen such that the total cost function (consisting of data-consistency and regularization terms) is convex. Therefore, sparsity is promoted more strongly than in the standard convex formulation, but without sacrificing the attractive aspects of convex optimization. We use this idea to improve the recently developed overlapping group shrinkage (OGS) algorithm for the denoising of group-sparse signals. The algorithm is applied to the problem of speech enhancement with favorable results in terms of both SNR and perceptual quality.
Index Terms – Convex optimization, sparse optimization, denoising, threshold function, non-convex optimization, speech enhancement.

I. INTRODUCTION
This paper aims to develop an approach that promotes sparsity more strongly than the standard convex formulation. In particular, we target group-sparse signals, whose large-magnitude values tend to form clusters rather than occur in isolation. An example of such a signal (in 2-D) is the spectrogram of a speech waveform: it exhibits areas and ridges of large magnitude, but not isolated large values. The method proposed in this work will be demonstrated on the problem of speech denoising.
Convex and non-convex optimization are both common practice for the estimation of sparse vectors from noisy data. Generally, convex approaches are based on sparsity-promoting convex penalty functions (e.g., the l1 norm), while non-convex approaches are based on non-convex penalty functions.
The algorithm we present is derived according to the principle of majorization-minimization (MM). The proposed approach:

1) does not underestimate large-amplitude components of sparse solutions to the extent that convex penalties do,
2) is translation-invariant (because the groups in the proposed method are fully overlapping),
3) is computationally efficient (O(N) per iteration) with a monotonically decreasing cost function,
4) requires no algorithmic parameters (step size, Lagrange multipliers, etc.).
A. Related Work
This paper develops a specific threshold function designed to have three properties commonly advocated for thresholding: unbiasedness (of large coefficients), sparsity, and continuity. Further, the threshold function and its corresponding penalty function are parameterized by two parameters: the threshold T and the right-sided derivative of the threshold function at the threshold, a measure of the threshold function's sensitivity.


II. PRELIMINARIES

A. Notation
We will work with finite-length discrete signals, which we denote in lower-case bold. The N-point signal is written as

x = [x(0), ..., x(N−1)] ∈ R^N.
We use the notation
x_{i,K} = [x(i), ..., x(i+K−1)] ∈ R^K
to denote the i-th group of size K. We consistently use K (a positive integer) to denote the group size. At the boundaries (i.e., for i < 0 and i > N − K), the groups are zero-padded; that is, x(i) is taken to be zero outside the interval [0, N − 1].
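As a concrete illustration, the group-extraction convention above can be sketched in a few lines of Python (NumPy); the zero-padding at the boundaries is the convention assumed here:

```python
import numpy as np

def group(x, i, K):
    """Return the group x_{i,K} = [x(i), ..., x(i+K-1)], taking x to be
    zero outside [0, N-1] (the boundary convention assumed here)."""
    N = len(x)
    return np.array([x[i + j] if 0 <= i + j < N else 0.0 for j in range(K)])

x = np.array([1.0, 2.0, 3.0, 4.0])
print(group(x, 1, 3))   # [2. 3. 4.]
print(group(x, 3, 3))   # [4. 0. 0.]  (zero-padded past the right boundary)
```

Note that every index i starts its own group, so adjacent groups overlap in K − 1 samples; this full overlap is what makes the method translation-invariant.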

B. Penalty Functions
We will make the following assumptions on the penalty function φ : R → R:

1) φ is continuous on R,
2) φ is twice differentiable on R \ {0},
3) φ(−x) = φ(x) (symmetric),
4) φ'(x) > 0 for x > 0 (increasing on R+*),
5) φ'(0+) = 1 (unit slope at zero),
6) φ''(x) ≤ 0 for x > 0 (concave on R+*),
7) φ''(0+) ≤ φ''(x) for all x > 0,
8) φ''(0+) is finite.
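As a sanity check, the assumptions above can be verified numerically for a concrete penalty. The arctangent penalty used later in this paper is one example; the closed form below is the standard parameterization and should be treated as an assumption of this sketch:

```python
import numpy as np

def atan_penalty(x, a):
    """Arctangent penalty phi(x; a) (assumed standard form): continuous,
    symmetric, increasing and concave on (0, inf), with phi'(0+) = 1."""
    ax = a * np.abs(x)
    return (2.0 / (a * np.sqrt(3.0))) * (
        np.arctan((1.0 + 2.0 * ax) / np.sqrt(3.0)) - np.pi / 6.0
    )

a = 0.5
# assumption 5: unit slope at zero (phi(0) = 0 by construction)
assert abs(atan_penalty(1e-6, a) / 1e-6 - 1.0) < 1e-4
# assumption 3: symmetry
assert atan_penalty(2.0, a) == atan_penalty(-2.0, a)
# assumption 6: concavity on the positive axis (central differences)
xs = np.linspace(0.1, 10.0, 50)
d2 = (atan_penalty(xs + 1e-4, a) - 2 * atan_penalty(xs, a)
      + atan_penalty(xs - 1e-4, a)) / 1e-8
assert np.all(d2 < 0.0)
```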


C. Threshold Functions
The fact that the soft threshold function reduces large values by a constant amount is considered its deficiency. In the estimation of sparse signals in AWGN, this behavior results in a systematic underestimation (bias) of large magnitude signal values. Hence, threshold functions that are asymptotically unbiased are often preferred to the soft threshold function, and the penalty functions from which they are derived promote sparsity more strongly than the l1 norm.
The figure below plots threshold functions corresponding to several penalty functions. The threshold function corresponding to the absolute-value penalty function is called the soft threshold function. Notice that, except for the soft threshold function, the threshold functions approach the identity function asymptotically. The atan threshold function approaches the identity the fastest.
Fig. Threshold functions derived from the four penalty functions given in Section II-B, three of which are non-convex.
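To make the asymptotic behavior concrete, here is a small Python sketch comparing soft thresholding with a numerically computed atan threshold function. The atan penalty's derivative (closed form) and the bisection solver are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def soft_threshold(y, T):
    """Soft threshold: every surviving value is reduced by T (biased)."""
    return np.sign(y) * np.maximum(np.abs(y) - T, 0.0)

def atan_threshold(y, lam, a, iters=60):
    """Threshold function for the atan penalty: solves
    x + lam * phi'(x; a) = |y| by bisection, where
    phi'(x; a) = 1 / (1 + a*x + (a*x)^2). Valid when a <= 1/lam,
    which keeps the scalar cost strictly convex; the threshold is
    T = lam because phi'(0+) = 1."""
    ay = abs(y)
    if ay <= lam:
        return 0.0
    lo, hi = 0.0, ay
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + lam / (1.0 + a * mid + (a * mid) ** 2) < ay:
            lo = mid
        else:
            hi = mid
    return np.sign(y) * 0.5 * (lo + hi)

# Large input: soft thresholding is biased by T; the atan threshold
# is nearly unbiased (output close to the input).
print(soft_threshold(10.0, 1.0))        # 9.0
print(atan_threshold(10.0, 1.0, 1.0))   # close to 10
```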


III. OGS WITH NON-CONVEX REGULARIZATION

For denoising group-sparse signals in AWGN, we propose to minimize the cost function

F(x) = (1/2) ||y − x||_2^2 + λ Σ_i φ( ||x_{i,K}||_2 ; a ),

where φ is a (non-convex) sparsity-promoting penalty function satisfying the assumptions in Section II-B, and λ > 0. The notation x_{i,K} denotes the K-point group starting at index i. The group size K (a positive integer) should be selected based roughly on the size of the groups (clusters) arising in the data. This constitutes one's prior knowledge regarding the data to be denoised and may need to be set through some trial and error. We note that K does not impose any strict constraint on the size of groups, nor does it define the boundaries of groups. The current work addresses the case where K > 1 and φ is a non-convex regularizer, so as to promote group sparsity more strongly than convex regularization.

A. Overlapping Group Shrinkage (OGS)
Using the results above, we can find a condition on a that ensures F is strictly convex. The result permits the use of non-convex regularization to strongly promote group sparsity while preserving the strict convexity of the total cost function. The maximum allowable a decreases as the group size K grows; a small a, in turn, limits the non-convexity of the regularizer. Hence, it appears the benefit of the proposed non-convex regularization method is diminished for large K. However, two considerations offset this reasoning. First, for larger K, a smaller value of λ is needed to achieve a fixed level of noise suppression. Second, for larger K, there is greater overlap between adjacent groups because the groups are fully overlapping; so, the regularization may be more sensitive to a.
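For intuition, the scalar (K = 1) special case can be checked numerically: with the atan penalty (closed form assumed below), f(x) = (1/2)(y − x)^2 + λφ(x; a) has f''(x) = 1 + λφ''(x) and φ''(0+) = −a, so f stays convex exactly when a ≤ 1/λ. The sketch tests midpoint convexity on a grid:

```python
import numpy as np

def atan_penalty(x, a):
    # assumed closed form of the arctangent penalty
    ax = a * np.abs(x)
    return (2.0 / (a * np.sqrt(3.0))) * (
        np.arctan((1.0 + 2.0 * ax) / np.sqrt(3.0)) - np.pi / 6.0
    )

def midpoint_convex(f, grid):
    """Discrete midpoint-convexity test on a uniform grid."""
    v = f(grid)
    return bool(np.all(v[1:-1] <= 0.5 * (v[:-2] + v[2:]) + 1e-9))

lam, y = 2.0, 1.0
grid = np.linspace(-3.0, 3.0, 1201)
f = lambda x, a: 0.5 * (y - x) ** 2 + lam * atan_penalty(x, a)

print(midpoint_convex(lambda x: f(x, 0.4), grid))  # True  (a <= 1/lam)
print(midpoint_convex(lambda x: f(x, 2.0), grid))  # False (a >  1/lam)
```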

B. Minimization Algorithm
To derive an algorithm to minimize the strictly convex cost function F, we use the majorization-minimization (MM) procedure. The MM procedure replaces a single minimization problem by a sequence of (simpler) ones. Specifically, MM is based on the iteration

x^(k+1) = arg min_x G(x, x^(k)),

where G(x, v) is a majorizer of F, i.e., G(x, v) ≥ F(x) for all x, with equality at x = v.
Fig. Majorization of the non-convex penalty φ(x) by a quadratic q(x, v).
To specify a majorizer of the cost function F, we first specify a majorizer q(x, v) of the penalty function φ. To simplify notation, we suppress the dependence of φ on a.
The OGS algorithm proceeds by iteratively reducing x(i), i ∈ S, toward their optimal values (including zero). The attenuation is multiplicative, so a value never exactly equals zero, even though it may converge to zero. But if a value reaches machine epsilon, then a divide-by-zero error may subsequently occur in the implementation. Hence, to avoid possible divide-by-zero errors due to finite-precision arithmetic, the OGS algorithm updates S at the end of the loop. The small number ε may be set to machine epsilon, which for single-precision floating point is about 10^−7.
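A compact Python sketch of one plausible realization of this MM loop for the 1-D cost F(x) = (1/2)||y − x||^2 + λ Σ_i φ(||x_{i,K}||; a) is given below. The per-sample rescaling (each sample divided by a factor accumulated over the K overlapping groups containing it) follows the standard OGS derivation, the atan penalty derivative is an assumed closed form, and the eps guard stands in for the set S described above; treat it as an illustration rather than the authors' exact implementation:

```python
import numpy as np

def phi_deriv(r, a):
    """phi'(r; a) for the arctangent penalty (assumed form, unit slope at 0)."""
    return 1.0 / (1.0 + a * r + (a * r) ** 2)

def ogs_denoise(y, lam, K, a, n_iter=25, eps=1e-7):
    """MM sketch of OGS: each iteration computes all fully-overlapping
    group norms (zero-padded at the boundaries) and rescales every
    sample multiplicatively; the cost decreases monotonically."""
    y = np.asarray(y, dtype=float)
    x = y.copy()
    N = len(x)
    win = np.ones(K)
    for _ in range(n_iter):
        # r[k] = ||x_{k-K+1, K}||_2 for every group touching the signal
        x2 = np.concatenate([np.zeros(K - 1), x * x, np.zeros(K - 1)])
        r = np.sqrt(np.convolve(x2, win, mode="valid"))          # length N+K-1
        w = phi_deriv(r, a) / np.maximum(r, eps)                 # eps avoids /0
        x = y / (1.0 + lam * np.convolve(w, win, mode="valid"))  # length N
    return x

# Denoise a group-sparse test signal in AWGN; a is chosen at an
# assumed convexity bound of 1/(lam*K) for this illustration.
rng = np.random.default_rng(0)
clean = np.zeros(100)
clean[40:48] = 5.0                      # one cluster of large values
noisy = clean + 0.5 * rng.standard_normal(100)
denoised = ogs_denoise(noisy, lam=0.4, K=8, a=1.0 / (0.4 * 8))
```

Running this, the isolated noise samples are driven to (numerically) zero while the cluster is only mildly attenuated, which is exactly the behavior the non-convex penalty is meant to produce.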


IV. EXPERIMENTAL RESULTS

A. Example 1: One-Dimensional Signal Denoising
This example compares the proposed non-convex regularized OGS algorithm with the earlier (convex regularized) version of OGS and with scalar thresholding. The SNRs are summarized in the table below.
The noisy signal was obtained by adding white Gaussian noise (AWGN), giving an SNR of 10 dB.
For each of soft and hard thresholding, we used the threshold T that maximizes the SNR. The SNR values are summarized in the top row of the table.
We then selected T and λ for each method so as to reduce the noise standard deviation σ down to 0.01. The resulting SNRs, given in the second row of the table, are much lower. (This method does not maximize SNR, but it does ensure the residual noise is reduced to the specified level.) The low SNR in these cases is due to the attenuation (bias) of large-magnitude values. However, it can be observed that OGS, especially with non-convex regularization, significantly outperforms scalar thresholding.
Fig. Example 1: Group-sparse signal denoising. (a) Signal; (b) signal + noise (SNR = 10 dB); (c) OGS[abs] (SNR = 12.30 dB); (d) OGS[atan].

B. Example 2: Speech Denoising
This example evaluates the use of the proposed OGS algorithm for the problem of speech enhancement (denoising). We compare the OGS algorithm with several other algorithms. For the evaluation, we use female and male speakers, multiple sentences, two noise levels, and two sampling rates. Throughout this example, we use the non-convex arctangent penalty function with a set to its maximum value, a = 1/(K1K2). In all cases, we use a fixed number of 25 iterations within the OGS algorithm.
Fig. Spectrograms before and after denoising (male speaker). (a) Noisy signal. (b) OGS[atan] with group size K = (8, 2). The gray scale represents decibels.
Regularization Parameter: We have found empirically that setting λ to maximize the SNR yields speech with noticeable undesirable perceptual artifacts (musical noise). This known phenomenon is due to residual noise in the STFT domain. Therefore, we instead set the regularization parameter λ using the noise-suppression approach described above: we set λ so as to reduce the noise standard deviation σ to a specified level. We have selected this value so as to optimize the perceptual quality of the denoised speech according to informal listening tests. In particular, this value is effective at suppressing the musical-noise artifact. We also note that this approach leads to greater regularization (higher λ) than SNR optimization.
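The noise-suppression rule can be illustrated with a simple calibration procedure: choose the regularization strength so that applying the denoiser to pure noise of known standard deviation leaves a residual at the target level. The sketch below uses scalar soft thresholding as a stand-in denoiser (the paper calibrates λ for OGS in the same spirit):

```python
import numpy as np

def soft(y, T):
    return np.sign(y) * np.maximum(np.abs(y) - T, 0.0)

def calibrate(sigma, target_std, n=200_000, seed=0):
    """Bisect for the smallest threshold T such that soft-thresholding
    pure N(0, sigma^2) noise leaves a residual std <= target_std."""
    rng = np.random.default_rng(seed)
    noise = sigma * rng.standard_normal(n)
    lo, hi = 0.0, 10.0 * sigma
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if soft(noise, mid).std() > target_std:
            lo = mid   # too much residual noise: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)

T = calibrate(sigma=1.0, target_std=0.01)
# T comes out well above the SNR-optimal threshold: heavier
# regularization, at the cost of extra bias, suppresses the residual.
```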
Group Size: The perceptual quality of speech denoised using OGS depends on the specified group size. As we apply OGS to a timefrequency spectrogram, the size of the group with respect to both the temporal and spectral dimensions must be specified. We let K1 and K2 denote the number of spectral and temporal samples, respectively.
Fig. Denoised spectrograms (female speaker). (a) Noisy spectrogram with SNR = 10 dB. (b, c) Areas A and B, denoised with group size (8, 2). (d, e) Areas A and B, denoised with group size (2, 4).
Table V
AVERAGE SNR FOR SIX SPEECH ENHANCEMENT ALGORITHMS. (a) fs = 16 kHz (average of 30 sentences); (b) fs = 8 kHz (average of 15 sentences)
Algorithm Comparisons: In Table V we compare the OGS[atan] algorithm with several other speech enhancement algorithms. The table summarizes the output SNR for two sampling rates, male and female speakers, and two input SNR (noise) levels. Each SNR value is averaged over 30 or 15 sentences, depending on the sampling rate. It can be observed that the proposed algorithm, OGS[atan], achieves the highest SNR in each case. (We also note that in all cases, OGS is used not with the SNR-optimized λ, but with the larger λ set according to the noise-suppression method. The SNR of OGS could be further increased, but at the cost of perceptual quality.)
Fig. Example 1: Comparison of OGS[abs] and OGS[atan]. (a) Output versus input; (b) sorted error.
The proposed algorithm, OGS[atan], achieves the highest SNR for both noise levels and genders. For example, for the male speaker with an input SNR of 10 dB, OGS[atan] attains the highest output SNR of 16.58 dB. BT achieves the second highest, 15.61 dB. In terms of perceptual quality, SS and LMA have clearly audible artifacts; BT and PS have slight audible artifacts; OGS[atan], OGS[abs], and SUB have the least audible artifacts. However, SUB has a high computational complexity due to eigenvalue factorization. Compared to OGS[abs] and SUB, OGS[atan] better preserves the perceptual quality of high frequencies. Similar results can be observed for the other noise levels and the female speaker.
Empirical Wiener post-processing (EWP) improves the SNR for all methods at all noise levels, but least for OGS[atan]. EWP is effective for increasing SNR because it rescales large-amplitude STFT coefficients that are unnecessarily attenuated by these algorithms (whose estimates are biased toward zero). The fact that EWP yields the least improvement for OGS[atan] demonstrates that this algorithm inherently induces less bias than the other algorithms.
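The mechanism can be sketched directly: empirical Wiener post-processing rescales each noisy coefficient by a gain built from the first-pass estimate, so coefficients the first pass over-attenuated are restored toward their noisy values. This is an illustrative coefficient-domain sketch; the variable names are assumptions:

```python
import numpy as np

def empirical_wiener(y_coeffs, xhat_coeffs, sigma):
    """Rescale noisy coefficients y by a Wiener gain computed from a
    first-pass estimate xhat. The gain approaches 1 where |xhat| >> sigma
    (bias is undone) and 0 where xhat ~ 0 (noise stays suppressed)."""
    gain = xhat_coeffs ** 2 / (xhat_coeffs ** 2 + sigma ** 2)
    return gain * y_coeffs

y    = np.array([10.0, 0.3])   # noisy coefficients
xhat = np.array([9.0, 0.0])    # first pass over-shrank the large one
print(empirical_wiener(y, xhat, sigma=1.0))   # ~[9.88, 0.0]
```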
Fig. Frequency spectrum of denoised spectrograms at t = 0.79 seconds. (a) OGS[abs]; (b) OGS[atan]. The group size is K = (8, 2) in both cases. The noise-free spectrum is shown in gray.


V. REMARKS
Several aspects of the non-convex regularized OGS algorithm are sufficiently similar to those of the convex regularized OGS algorithm that earlier remarks regarding convergence behavior, implementation issues, and computational complexity apply also to the version of OGS presented here. The proximal framework has proven effective for convex optimization problems arising in sparse signal estimation and reconstruction. The proposed non-convex regularized OGS algorithm resembles a proximity operator; however, a proximity operator is defined in terms of a convex penalty function. Hence, the proposed approach appears to fall outside the proximal framework. Because the proximal framework is effective for solving inverse problems much more general than denoising (e.g., deconvolution), it will be of interest in future work to explore the extent to which the proposed method can be used for more general inverse problems via proximal-like techniques.
Fig. SNR comparison of speech enhancement algorithms (30 male sentences, input SNR of 10 dB). Each algorithm is used without EWP (a) and with EWP (b). The sentences are ordered according to the SNR of OGS[atan].

VI. CONCLUSION
This paper formulates group-sparse signal denoising as a convex optimization problem with a non-convex regularization term. The regularizer is based on overlapping groups so as to promote group sparsity. The regularizer, being concave on the positive real line, promotes sparsity more strongly than any convex regularizer can. For several non-convex penalty functions parameterized by a variable a, it has been shown how to constrain a to ensure the optimization problem is strictly convex. We also develop a fast iterative algorithm for the proposed approach. Numerical experiments demonstrate the effectiveness of the proposed method for speech enhancement.