The use of High-Order Sparse Linear Prediction for the Restoration of Archived Audio

Since the invention of the phonograph by Thomas Edison in 1877, vast amounts of cultural, entertainment, educational and historical audio have been recorded and stored throughout the world. Through natural aging and improper storage, the recorded signal degrades, losing quality and intelligibility. Degradation is any unwanted modification of the audio signal after it has been recorded. Several kinds of degradation affect signals recorded on analog storage media; those most often encountered are clicks, hiss, and "wow and flutter". Several studies have addressed the restoration of degraded audio recordings. Most of the methods rely on prior information about the underlying data and the degradation process, and their success depends heavily on the prior information available. When such information is not available, a model of the underlying undegraded data can be used to generate it. Linear prediction is one of the most widely used models for representing speech. However, linear prediction has limitations for voiced speech and music, so restoration approaches based on it have had limited success for those signals. This research applies recent findings in linear prediction modeling to the restoration of clicks and "wow and flutter". Recent developments in efficient algorithms and computational capability have led to significant investigation of the usefulness of ℓ1-norm and ℓ0-norm regularization in the solution of the least squares problem. Other researchers have investigated high-order sparse linear prediction as a way to overcome the limitations of conventional linear prediction. This research investigates the use of high-order sparse linear prediction for the detection and restoration of degraded archived audio signals.
A method is developed that uses the high-order sparse linear prediction model to estimate the underlying audio signal without prior information on the type of audio or the details of the degradation. The model is then used both to detect the degradations and to restore the degraded sample values. Its use is investigated for two of the most widely encountered degradations in archived audio. Simulations are conducted for a wide range of audio signals, including synthetic vowels, natural vowels, speech and music. Results show that modeling the underlying audio signal with high-order sparse linear prediction improves both detection and restoration. The method could be used without prior information on the type of audio and without pitch estimators. Performance was measured with respect to degradation characterization and restoration quality: the signal-to-noise ratio of the restored signal relative to the original undegraded signal, and perceptual evaluation of audio quality to assess the subjective quality of the restored signal. Both measures showed that the proposed framework achieves better quality for all types of audio signal. The computational time of the proposed framework was also investigated.
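To make the idea concrete, the following is a minimal sketch of click detection with a high-order sparse linear predictor. It is an illustration under stated assumptions, not the method developed in the thesis: the predictor is fitted by ℓ1-regularized least squares using a plain iterative soft-thresholding (ISTA) loop, and clicks are flagged where the prediction residual exceeds a robust threshold. The function names `sparse_lp` and `detect_clicks`, the parameters `order`, `gamma` and `k`, and the synthetic test signal are all hypothetical choices made for this example.

```python
import numpy as np

def sparse_lp(x, order, gamma, n_iter=500):
    """Fit a sparse predictor: min_a 0.5*||t - X a||_2^2 + gamma*||a||_1 (ISTA)."""
    # Row for sample n holds its `order` past samples x[n-1], ..., x[n-order].
    X = np.column_stack([x[order - k:len(x) - k] for k in range(1, order + 1)])
    t = x[order:]                                # samples to be predicted
    step = 1.0 / np.linalg.norm(X, 2) ** 2       # 1 / Lipschitz constant of the gradient
    a = np.zeros(order)
    for _ in range(n_iter):
        z = a + step * (X.T @ (t - X @ a))       # gradient step on the squared error
        a = np.sign(z) * np.maximum(np.abs(z) - step * gamma, 0.0)  # soft threshold
    return a, t - X @ a                          # coefficients, prediction residual

def detect_clicks(residual, order, k=5.0):
    """Flag samples whose residual exceeds k robust standard deviations (assumed rule)."""
    sigma = 1.4826 * np.median(np.abs(residual - np.median(residual)))
    return np.flatnonzero(np.abs(residual) > k * sigma) + order

# Synthetic "voiced" signal (assumption): five harmonics, pitch period of 80 samples.
rng = np.random.default_rng(0)
n = np.arange(1000)
x = sum(np.sin(2 * np.pi * h * n / 80) / h for h in range(1, 6))
x = x + 0.01 * rng.standard_normal(n.size)
x[600] += 2.0                                    # one artificial click

# The order (100) exceeds the pitch period, so no separate pitch estimator is needed.
a, residual = sparse_lp(x, order=100, gamma=1.0)
clicks = detect_clicks(residual, order=100)
print("flagged samples:", clicks)
```

Samples shortly after a click may also be flagged, because the click then sits in their prediction history; a practical detector would need to account for this before restoring sample values.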

File Type: pdf
File Size: 6 MB
Publication Year: 2020
Author: Dufera, Bisrat Derebssa
Supervisors: Toon van Waterschoot, Koen Eneman, Eneyew Adugna
Institution: School of Electrical and Computer Engineering, Addis Ababa Institute of Technology, Addis Ababa University
Keywords: Linear prediction, sparse linear prediction, high-order sparse linear prediction, audio restoration, archived audio, click degradation, wow degradation