Digital Forensic Techniques for Splicing Detection in Multimedia Contents

Visual and audio contents always played a key role in communications, because of their immediacy and presumed objectivity. This has become even more true in the digital era, and today it is common to have multimedia contents stand as proof of events. Digital contents, however, are also very easy to manipulate, thus calling for analysis methods devoted to uncover their processing history. Multimedia forensics is the science trying to answer questions about the past of a given image, audio or video file, questions like “which was the recording device?", or “is the content authentic?". In particular, authenticity assessment is a crucial task in many contexts, and it usually consists in determining whether the investigated object has been artificially created by splicing together different contents. In this thesis we address the problem of splicing detection in the three main media: image, ...

Fontani, Marco — Dept. of Information Engineering and Mathematics, University of Siena


A Game-Theoretic Approach for Adversarial Information Fusion in Distributed Sensor Networks

Every day we share our personal information through digital systems which are constantly exposed to threats. For this reason, security-oriented disciplines of signal processing have received increasing attention in the last decades: multimedia forensics, digital watermarking, biometrics, network monitoring, steganography and steganalysis are just a few examples. Even though each of these elds has its own peculiarities, they all have to deal with a common problem: the presence of one or more adversaries aiming at making the system fail. Adversarial Signal Processing lays the basis of a general theory that takes into account the impact that the presence of an adversary has on the design of effective signal processing tools. By focusing on the application side of Adversarial Signal Processing, namely adversarial information fusion in distributed sensor networks, and adopting a game-theoretic approach, this thesis contributes to the above mission ...

Kallas, Kassem — University of Siena


Light Field Based Biometric Recognition and Presentation Attack Detection

In a world where security issues have been gaining explosive importance, face and ear recognition systems have attracted increasing attention in multiple application areas, ranging from forensics and surveillance to commerce and entertainment. While the recognition performance has been steadily improving, there are still challenging recognition scenarios and conditions, notably when facing large variations in the biometric data characteristics. Additionally, the widespread use of face and ear recognition solutions raises new security concerns, making the robustness against presentation attacks a very active field of research. Lenslet light field cameras have recently come into prominence as they are able to also capture the intensity of the light rays coming from multiple directions, thus offering a richer representation of the visual scene, notably spatio-angular information. To take benefit of this richer representation, light field cameras have recently been successfully applied, not only ...

Alireza Sepas-Moghaddam — Instituto Superior Técnico, University of Lisbon


Theoretical Foundations of Adversarial Detection and Applications to Multimedia Forensics

Every day we share our personal information with digital systems which are constantly exposed to threats. Security-oriented disciplines of signal processing have then received increasing attention in the last decades: multimedia forensics, digital watermarking, biometrics, network intrusion detection, steganography and steganalysis are just a few examples. Even though each of these fields has its own peculiarities, they all have to deal with a common problem: the presence of adversaries aiming at making the system fail. It is the purpose of Adversarial Signal Processing to lay the basis of a general theory that takes into account the impact of an adversary on the design of effective signal processing tools. By focusing on the most prominent problem of Adversarial Signal Processing, namely binary detection or Hypothesis Testing, we contribute to the above mission with a general theoretical framework for the binary detection ...

Tondi, Benedetta — University of Siena


Digital Processing Based Solutions for Life Science Engineering Recognition Problems

The field of Life Science Engineering (LSE) is rapidly expanding and predicted to grow strongly in the next decades. It covers areas of food and medical research, plant and pests’ research, and environmental research. In each research area, engineers try to find equations that model a certain life science problem. Once found, they research different numerical techniques to solve for the unknown variables of these equations. Afterwards, solution improvement is examined by adopting more accurate conventional techniques, or developing novel algorithms. In particular, signal and image processing techniques are widely used to solve those LSE problems require pattern recognition. However, due to the continuous evolution of the life science problems and their natures, these solution techniques can not cover all aspects, and therefore demanding further enhancement and improvement. The thesis presents numerical algorithms of digital signal and image processing to ...

Hussein, Walid — Technische Universität München


Contributions to Human Motion Modeling and Recognition using Non-intrusive Wearable Sensors

This thesis contributes to motion characterization through inertial and physiological signals captured by wearable devices and analyzed using signal processing and deep learning techniques. This research leverages the possibilities of motion analysis for three main applications: to know what physical activity a person is performing (Human Activity Recognition), to identify who is performing that motion (user identification) or know how the movement is being performed (motor anomaly detection). Most previous research has addressed human motion modeling using invasive sensors in contact with the user or intrusive sensors that modify the user’s behavior while performing an action (cameras or microphones). In this sense, wearable devices such as smartphones and smartwatches can collect motion signals from users during their daily lives in a less invasive or intrusive way. Recently, there has been an exponential increase in research focused on inertial-signal processing to ...

Gil-Martín, Manuel — Universidad Politécnica de Madrid


Voice biometric system security: Design and analysis of countermeasures for replay attacks

Voice biometric systems use automatic speaker verification (ASV) technology for user authentication. Even if it is among the most convenient means of biometric authentication, the robustness and security of ASV in the face of spoofing attacks (or presentation attacks) is of growing concern and is now well acknowledged by the research community. A spoofing attack involves illegitimate access to personal data of a targeted user. Replay is among the simplest attacks to mount - yet difficult to detect reliably and is the focus of this thesis. This research focuses on the analysis and design of existing and novel countermeasures for replay attack detection in ASV, organised in two major parts. The first part of the thesis investigates existing methods for spoofing detection from several perspectives. I first study the generalisability of hand-crafted features for replay detection that show promising results ...

Bhusan Chettri — Queen Mary University of London


Active and Passive Approaches for Image Authentication

The generation and manipulation of digital images is made simple by widely available digital cameras and image processing software. As a consequence, we can no longer take the authenticity of a digital image for granted. This thesis investigates the problem of protecting the trustworthiness of digital images. Image authentication aims to verify the authenticity of a digital image. General solution of image authentication is based on digital signature or watermarking. A lot of studies have been conducted for image authentication, but thus far there has been no solution that could be robust enough to transmission errors during images transmission over lossy channels. On the other hand, digital image forensics is an emerging topic for passively assessing image authenticity, which works in the absence of any digital watermark or signature. This thesis focuses on how to assess the authenticity images when ...

Ye, Shuiming — National University of Singapore


Interpretable Machine Learning for Machine Listening

Recent years have witnessed a significant interest in interpretable machine learning (IML) research that develops techniques to analyse machine learning (ML) models. Understanding ML models is essential to gain trust in their predictions and to improve datasets, model architectures and training techniques. The majority of effort in IML research has been in analysing models that classify images or structured data and comparatively less work exists that analyses models for other domains. This research focuses on developing novel IML methods and on extending existing methods to understand machine listening models that analyse audio. In particular, this thesis reports the results of three studies that apply three different IML methods to analyse five singing voice detection (SVD) models that predict singing voice activity in musical audio excerpts. The first study introduces SoundLIME (SLIME), a method to generate temporal, spectral or time-frequency explanations ...

Mishra, Saumitra — Queen Mary University of London


Audio-visual processing and content management techniques, for the study of (human) bioacoustics phenomena

The present doctoral thesis aims towards the development of new long-term, multi-channel, audio-visual processing techniques for the analysis of bioacoustics phenomena. The effort is focused on the study of the physiology of the gastrointestinal system, aiming at the support of medical research for the discovery of gastrointestinal motility patterns and the diagnosis of functional disorders. The term "processing" in this case is quite broad, incorporating the procedures of signal processing, content description, manipulation and analysis, that are applied to all the recorded bioacoustics signals, the auxiliary audio-visual surveillance information (for the monitoring of experiments and the subjects' status), and the extracted audio-video sequences describing the abdominal sound-field alterations. The thesis outline is as follows. The main objective of the thesis, which is the technological support of medical research, is presented in the first chapter. A quick problem definition is initially ...

Dimoulas, Charalampos — Department of Electrical and Computer Engineering, Faculty of Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece


Deep Learning for Event Detection, Sequence Labelling and Similarity Estimation in Music Signals

When listening to music, some humans can easily recognize which instruments play at what time or when a new musical segment starts, but cannot describe exactly how they do this. To automatically describe particular aspects of a music piece – be it for an academic interest in emulating human perception, or for practical applications –, we can thus not directly replicate the steps taken by a human. We can, however, exploit that humans can easily annotate examples, and optimize a generic function to reproduce these annotations. In this thesis, I explore solving different music perception tasks with deep learning, a recent branch of machine learning that optimizes functions of many stacked nonlinear operations – referred to as deep neural networks – and promises to obtain better results or require less domain knowledge than more traditional techniques. In particular, I employ ...

Schlüter, Jan — Department of Computational Perception, Johannes Kepler University Linz


Robust Signal Processing with Applications to Positioning and Imaging

This dissertation investigates robust signal processing and machine learning techniques, with the objective of improving the robustness of two applications against various threats, namely Global Navigation Satellite System (GNSS) based positioning and satellite imaging. GNSS technology is widely used in different fields, such as autonomous navigation, asset tracking, or smartphone positioning, while the satellite imaging plays a central role in monitoring, detecting and estimating the intensity of key natural phenomena, such as flooding prediction and earthquake detection. Considering the use of both GNSS positioning and satellite imaging in critical and safety-of-life applications, it is necessary to protect those two technologies from either intentional or unintentional threats. In the real world, the common threats to GNSS technology include multipath propagation and intentional/unintentional interferences. This thesis investigates methods to mitigate the influence of such sources of error, with the final objective of ...

Li, Haoqing — Northeastern University


Sound Event Detection by Exploring Audio Sequence Modelling

Everyday sounds in real-world environments are a powerful source of information by which humans can interact with their environments. Humans can infer what is happening around them by listening to everyday sounds. At the same time, it is a challenging task for a computer algorithm in a smart device to automatically recognise, understand, and interpret everyday sounds. Sound event detection (SED) is the process of transcribing an audio recording into sound event tags with onset and offset time values. This involves classification and segmentation of sound events in the given audio recording. SED has numerous applications in everyday life which include security and surveillance, automation, healthcare monitoring, multimedia information retrieval, and assisted living technologies. SED is to everyday sounds what automatic speech recognition (ASR) is to speech and automatic music transcription (AMT) is to music. The fundamental questions in designing ...

[Pankajakshan], [Arjun] — Queen Mary University of London


Multi-channel EMG pattern classification based on deep learning

In recent years, a huge body of data generated by various applications in domains like social networks and healthcare have paved the way for the development of high performance models. Deep learning has transformed the field of data analysis by dramatically improving the state of the art in various classification and prediction tasks. Combined with advancements in electromyography it has given rise to new hand gesture recognition applications, such as human computer interfaces, sign language recognition, robotics control and rehabilitation games. The purpose of this thesis is to develop novel methods for electromyography signal analysis based on deep learning for the problem of hand gesture recognition. Specifically, we focus on methods for data preparation and developing accurate models even when few data are available. Electromyography signals are in general one-dimensional time-series with a rich frequency content. Various feature sets have ...

Tsinganos, Panagiotis — University of Patras, Greece - Vrije Universiteit Brussel, Belgium


Study and application of acoustic information for the detection of harmful content, and fusion with visual information

This thesis aims at investigating and developing techniques for content-based segmentation and classification of multimedia files, based on audio information. Emphasis has been given to analyzing the content of films based on audio information. In addition, part of the thesis is focused on the detection of audio classes related to violent content (e.g., gunshots, screams, etc).

Giannakopoulos, Theodoros — University of Athens

The current layout is optimized for mobile phones. Page previews, thumbnails, and full abstracts will remain hidden until the browser window grows in width.

The current layout is optimized for tablet devices. Page previews and some thumbnails will remain hidden until the browser window grows in width.