Study on the texture of biomedical data: contributions from multiscale and multidimensional features based on entropy measures
Goals and motivation: This PhD aimed to propose new texture feature extraction algorithms based on information theory. More precisely, new entropy-based algorithms (EBAs) were proposed to study the texture of biomedical datasets and, through the use of artificial intelligence techniques, to help diagnose pulmonary pathologies.
Introduction: In 1948, Shannon introduced entropy in his information theory to quantify the information content of a signal. Shannon's entropy uses a logarithm-based metric to assess the quantity of information in a message within a communication system. This quantity is associated with disorder: the higher the entropy value, the more disordered the system. Shannon's entropy, or first-order entropy, can be easily determined from a histogram analysis of a signal. Subsequently, building on information theory, numerous EBAs were proposed to investigate the disorder or irregularity inherent in signals. These EBAs were developed either as Shannon-based entropy algorithms, which apply the logarithm metric to a set of probabilities, or as conditional-based entropy algorithms, which calculate the probability of pattern occurrence for a specific embedding size, m, and then for m+1. These EBAs were later extended to multidimensional data and multiscale analysis.
Materials and methods: Initially, a systematic review was conducted to examine the utility and potential of entropy as a texture feature in various biomedical fields. Existing entropy-based techniques for extracting texture features are therefore listed first. These include Shannon's entropy, Shannon-based entropy algorithms, conditional-based algorithms, and gray-level co-occurrence matrix (GLCM) entropy or second-order features, among others.
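As a minimal illustration of the two families of measures described in the introduction, the sketch below computes a first-order (Shannon) entropy from a histogram and a one-dimensional sample entropy, a standard conditional-based measure that compares template matches at embedding sizes m and m+1. It is not one of the two-dimensional algorithms developed in this PhD; the function names, the bin count, and the defaults m=2 and r=0.2 are assumptions chosen for illustration.

```python
import numpy as np


def shannon_entropy(values, bins=256):
    """First-order (Shannon) entropy in bits, estimated from a histogram."""
    counts, _ = np.histogram(np.asarray(values).ravel(), bins=bins)
    p = counts[counts > 0] / counts.sum()  # drop empty bins, normalise
    return float(-np.sum(p * np.log2(p)))


def sample_entropy(signal, m=2, r=0.2):
    """Sample entropy of a 1D signal: -ln(A / B).

    B counts pairs of similar templates of length m and A counts pairs of
    similar templates of length m + 1. The tolerance is r times the
    standard deviation of the signal (an illustrative default).
    """
    x = np.asarray(signal, dtype=float)
    n = x.size
    tol = r * x.std()

    def count_matches(length):
        # Same n - m starting points for both template lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        count = 0
        for i in range(len(templates) - 1):
            # Chebyshev distance to all later templates (no self-matches).
            dist = np.max(np.abs(templates[i + 1:] - templates[i]), axis=1)
            count += int(np.sum(dist <= tol))
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return float(-np.log(a / b))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    periodic = np.tile([0.0, 1.0], 50)
    noise = rng.normal(size=200)
    print("Shannon entropy (periodic):", shannon_entropy(periodic, bins=2))
    print("Sample entropy (periodic):", sample_entropy(periodic))
    print("Sample entropy (noise):", sample_entropy(noise))
```

A regular signal such as the alternating sequence yields a low sample entropy, whereas white noise yields a higher one, which is the sense in which these measures quantify irregularity.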
Subsequently, the research focused on the development of Shannon-based and conditional-based entropy algorithms, comparing their computational efficiency and performance in texture analysis of medical images. The Shannon-based entropy algorithms developed in this PhD are described first, namely two-dimensional approximate entropy (AAPE2D), two-dimensional ensemble entropy algorithms, and two-dimensional symbolic dynamic entropy (SDE2D). These algorithms were also compared with algorithms proposed in the literature. The developed EBAs were validated on synthetic data such as MIX2D(p) images and then tested on biomedical data. Similarly, the conditional-based algorithms are described in detail and subsequently validated and applied in a biomedical context. The conditional-based entropy algorithms developed in this PhD are ensemble conditional entropy techniques based on fuzzy and sample entropy. Three-dimensional fuzzy entropy (FuzEn3D), developed in a previous master's project, was also implemented to study the texture properties of COVID-19. Throughout this PhD, several pulmonary pathologies are studied, such as pneumonia, emphysema, tuberculosis, and COVID-19.
Main results: The results show that the Shannon-based algorithms are less computationally intensive and give promising results in detecting pneumonia, emphysema, and tuberculosis. In the study of SDE2D applied to tuberculosis chest X-rays, the accuracy for detecting tuberculosis reached 86.4% in the left lung and 85.2% in the right lung. The developed two-dimensional approximate permutation entropy leads to 75.7% accuracy in detecting pneumonia. The conditional-based algorithms, on the other hand, show greater stability and consistency than the other entropy algorithms and also give satisfactory outcomes in detecting emphysema and COVID-19.
The two-dimensional ensemble fuzzy entropy based on multiple embedding dimension values (EnsFuzEnM2D) is the best of the developed ensemble algorithms, with an accuracy of 93.7% in classifying healthy lung tissue and two different types of emphysema. Furthermore, the implemented multiscale FuzEn3D leads to 89.6% accuracy and 96% sensitivity when detecting COVID-19 in CT scans. Moreover, in a global analysis comparing Shannon-based and conditional-based algorithms, SDE2D proves more accurate (87.3%) in detecting emphysema patients than the other EBAs tested. Finally, when using the two-dimensional entropy features provided by the eight two-dimensional EBAs developed in this PhD, emphysema patients are detected with 89.1% accuracy and an area under the curve of 95%.
Conclusion: Overall, the developed EBAs prove to be effective in texture evaluation. In the future, they could be applied to various biomedical applications using different medical image sources.
