The TM3270 Media-processor (2006)
An Energy Aware Framework for Mobile Computing
Since their inception, energy dissipation has been a critical issue for mobile computing systems. Although a large research investment in low-energy circuit design and hardware-level energy management has led to more energy-efficient architectures, there is a growing realization that the contribution to energy conservation should be more rigorously considered at higher levels of the system, such as operating systems and applications. This dissertation puts forth the claim that energy-aware compilation to improve application quality, both in terms of execution time and energy consumption, is essential for a high-performance mobile computing embedded system design. Our work is a design paradigm shift from the logic gate being the basic silicon computation unit to an instruction running on an embedded processor. Multimedia DSP processors are the most lucrative choice to a mobile computing system design for their ...
Azeemi, N. Zafar — Vienna University of Technology
The present doctoral thesis aims towards the development of new long-term, multi-channel, audio-visual processing techniques for the analysis of bioacoustics phenomena. The effort is focused on the study of the physiology of the gastrointestinal system, aiming at the support of medical research for the discovery of gastrointestinal motility patterns and the diagnosis of functional disorders. The term "processing" in this case is quite broad, incorporating the procedures of signal processing, content description, manipulation and analysis, that are applied to all the recorded bioacoustics signals, the auxiliary audio-visual surveillance information (for the monitoring of experiments and the subjects' status), and the extracted audio-video sequences describing the abdominal sound-field alterations. The thesis outline is as follows. The main objective of the thesis, which is the technological support of medical research, is presented in the first chapter. A quick problem definition is initially ...
Dimoulas, Charalampos — Department of Electrical and Computer Engineering, Faculty of Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece
Mixed structural models for 3D audio in virtual environments
In the world of Information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflecting e.g. on developing complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, multimodal realistic models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality are required. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction and teleconferencing systems, to name but ...
Geronazzo, Michele — University of Padova
Motion detection and human recognition in video sequences
This thesis is concerned with the design of a complete framework that allows the real-time recognition of humans in a video stream acquired by a static camera. For each stage of the processing chain, which takes as input the raw images of the stream and eventually outputs the identity of the persons, we propose an original algorithm. The first algorithm is a background subtraction technique named ViBe. The purpose of ViBe is to detect the parts of the images that contain moving objects. The second algorithm determines which moving objects correspond to individuals. The third algorithm allows the recognition of the detected individuals from their gait. Our background subtraction algorithm, ViBe, uses a collection of samples to model the history of each pixel. The current value of a pixel is classified by comparison with the closest samples that belong to ...
Olivier, Barnich — University of Liege
Adaptive media streaming over multipath networks
With the latest developments in video coding technology and fast deployment of end-user broadband internet connections, real-time media applications become increasingly interesting for both private users and businesses. However, the internet remains a best-effort service network unable to guarantee the stringent requirements of the media application, in terms of high, constant bandwidth, low packet loss rate and transmission delay. Therefore, efficient adaptation mechanisms must be derived in order to bridge the application requirements with the transport medium characteristics. Lately, different network architectures, e.g., peer-to-peer networks, content distribution networks, parallel wireless services, emerge as potential solutions for reducing the cost of communication or infrastructure, and possibly improve the application performance. In this thesis, we start from the path diversity characteristic of these architectures, in order to build a new framework, specific for media streaming in multipath networks. Within this framework we ...
Jurca, Dan — EPFL/ITS, Lausanne, Switzerland
Deep Learning Techniques for Visual Counting
The explosion of Deep Learning (DL) added a boost to the already rapidly developing field of Computer Vision to such a point that vision-based tasks are now parts of our everyday lives. Applications such as image classification, photo stylization, or face recognition are nowadays pervasive, as evidenced by the advent of modern systems trivially integrated into mobile applications. In this thesis, we investigated and enhanced the visual counting task, which automatically estimates the number of objects in still images or video frames. Recently, due to the growing interest in it, several Convolutional Neural Network (CNN)-based solutions have been suggested by the scientific community. These artificial neural networks, inspired by the organization of the animal visual cortex, provide a way to automatically learn effective representations from raw visual data and can be successfully employed to address typical challenges characterizing this task, ...
Ciampi, Luca — University of Pisa
Spatio-Temporal Speech Enhancement in Adverse Acoustic Conditions
Never before has speech been captured as often by electronic devices equipped with one or multiple microphones, serving a variety of applications. It is the key aspect in digital telephony, hearing devices, and voice-driven human-to-machine interaction. When speech is recorded, the microphones also capture a variety of further, undesired sound components due to adverse acoustic conditions. Interfering speech, background noise and reverberation, i.e. the persistence of sound in a room after excitation caused by a multitude of reflections on the room enclosure, are detrimental to the quality and intelligibility of target speech as well as the performance of automatic speech recognition. Hence, speech enhancement aiming at estimating the early target-speech component, which contains the direct component and early reflections, is crucial to nearly all speech-related applications presently available. In this thesis, we compare, propose and evaluate existing and novel approaches ...
Dietzen, Thomas — KU Leuven
Compressed Domain Video Understanding Methods for Traffic Surveillance Applications
In the realm of traffic monitoring, efficient video analysis is paramount yet challenging due to intensive computational demands. This thesis addresses this issue by introducing novel methods that operate in the compressed domain. Four methods are proposed for image reconstruction from High Efficiency Video Coding (HEVC) Intra bitstreams, namely, the Block Partition Based Method (Mbp), the Prediction Unit Based Method (Mpu), the Random Perturbation Based Method (Mrp), and the Luma Based Method (My). These methods aim to provide a compact representation of the original image while retaining relevant information for video understanding tasks. Our methods substantially reduce data transmission requirements and memory footprint. Specifically, images created via Mbp and Mpu require 1/1,536 and 1/192 of the memory needed by pixel domain images, respectively. Moreover, these methods offer a computational speedup of 1.25 to 4 times, yielding efficiencies in video analysis. The ...
Beratoğlu, Muhammet Sebul — Istanbul Technical University
Performance Improvement of Multichannel Audio by Graphics Processing Units
Multichannel acoustic signal processing has undergone major development in recent years due to the increased complexity of current audio processing applications. People want to collaborate through communication with the feeling of being together and sharing the same environment, which is known as Immersive Audio Schemes. Several acoustic effects are involved in this phenomenon: 3D spatial sound, room compensation, crosstalk cancelation, and sound source localization, among others. However, high computing capacity is required to achieve any of these effects in a real large-scale system, which represents a considerable limitation for real-time applications. The increase in computational capacity has historically been linked to the number of transistors in a chip. Nowadays, however, improvements in computational capacity come mainly from increasing the number of processing units, i.e., expanding parallelism in computing. This is the case of the Graphics Processing Units ...
Belloch, Jose A. — Universitat Politècnica de València
GNSS Array-based Acquisition: Theory and Implementation
This dissertation addresses the signal acquisition problem using antenna arrays in the general framework of Global Navigation Satellite Systems (GNSS) receivers. GNSSs provide the necessary infrastructure for a myriad of applications and services that demand a robust and accurate positioning service. GNSS ranging signals are received with a very low signal-to-noise ratio. Although the GNSS CDMA modulation offers limited protection against Radio Frequency Interference (RFI), an interference that exceeds the processing gain can easily degrade receivers' performance or even completely deny the GNSS service. Concern about this problem has grown in recent times. A single-antenna receiver can make use of time and frequency diversity to mitigate interference, even though the performance of these techniques is compromised in the presence of wideband interference. Antenna-array receivers can benefit from spatial-domain processing, and thus mitigate the effects of interfering signals. ...
Arribas, Javier — Universitat Politecnica de Catalunya
Adaptive Algorithms and Variable Structures for Distributed Estimation
The analysis and design of new non-centralized learning algorithms for potential application in distributed adaptive estimation is the focus of this thesis. Such algorithms should be designed to have low processing requirements and to need minimal communication between the nodes which would form a distributed network. They ought, moreover, to have acceptable performance when the nodal input measurements are coloured and the environment is dynamic. Least mean square (LMS) and recursive least squares (RLS) type incremental distributed adaptive learning algorithms are first introduced on the basis of a Hamiltonian cycle through all of the nodes of a distributed network. These schemes require each node to communicate only with one of its neighbours during the learning process. An original steady-state performance analysis of the incremental LMS algorithm is performed by exploiting a weighted spatial-temporal energy conservation formulation. This analysis confirms that ...
Li, Leilei — Loughborough University
A statistical approach to motion estimation
Digital video technology has been characterized by a steady growth in the last decade. New applications like video e-mail, third generation mobile phone video communications, videoconferencing, video streaming on the web continuously push for further evolution of research in digital video coding. In order to be sent over the internet or even wireless networks, video information clearly needs compression to meet bandwidth requirements. Compression is mainly realized by exploiting the redundancy present in the data. A sequence of images contains an intrinsic, intuitive and simple idea of redundancy: two successive images are very similar. This simple concept is called temporal redundancy. The research of a proper scheme to exploit the temporal redundancy completely changes the scenario between compression of still pictures and sequence of images. It also represents the key for very high performances in image sequence coding when compared ...
Moschetti, Fulvio — Swiss Federal Institute of Technology
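The idea of temporal redundancy in the abstract above can be illustrated with a minimal Python sketch. It uses plain frame differencing on tiny hand-made grayscale frames; this is an illustrative assumption, not the statistical motion estimation method of the thesis, and the names `frame_residual` and `nonzero_count` are hypothetical.

```python
def frame_residual(prev, curr):
    """Pixel-wise difference between two equally sized grayscale frames.

    Frames are lists of rows of integer intensities (an assumed toy format).
    """
    return [[c - p for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def nonzero_count(frame):
    """Number of entries that differ from zero, a crude measure of new information."""
    return sum(1 for row in frame for value in row if value != 0)


# Two successive frames that differ in a single pixel (made-up data).
prev = [[10, 10, 10, 10],
        [10, 50, 50, 10],
        [10, 50, 50, 10],
        [10, 10, 10, 10]]
curr = [row[:] for row in prev]
curr[1][1] = 52

residual = frame_residual(prev, curr)
print(nonzero_count(curr), nonzero_count(residual))
```

When successive frames are similar, the residual is mostly zeros, so it carries far less information than the frame itself; a coder can exploit this by transmitting the residual instead of the full frame.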
Orthonormal Bases for Adaptive Filtering
In the field of adaptive filtering the most commonly applied filter structure is the transversal filter, also referred to as the tapped-delay line (TDL). The TDL is composed of a cascade of unit delay elements that are tapped, weighted and then summed. Thus, the output of a TDL is formed by a linear combination of its input signal at various delays. The weights in this linear combination are called the tap weights. The number of delay elements, or equivalently the number of tap weights, determines the duration of the impulse response of the TDL. For this reason, one often speaks of a finite impulse response (FIR) filter. In a general adaptive filtering scheme the adaptive filter aims to minimize a certain measure of error between its output and a desired signal. Usually, a quadratic cost criterion is taken: the so-called ...
Belt, Harm — Eindhoven University of Technology
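The tapped-delay line described in the abstract above can be sketched in a few lines of Python. This is a minimal illustration with fixed tap weights and a zero initial state (both assumptions); it does not include the adaptation of the weights, and the function name `tapped_delay_line` is hypothetical.

```python
def tapped_delay_line(x, weights):
    """Filter the sequence x with a tapped-delay line (FIR filter).

    Each output is the linear combination y[n] = sum_k weights[k] * x[n - k],
    with samples before the start of x assumed to be zero.
    """
    # taps[0] holds the current input, taps[k] the input delayed by k samples.
    taps = [0.0] * len(weights)
    y = []
    for sample in x:
        taps = [sample] + taps[:-1]          # shift the delay line by one step
        y.append(sum(w * t for w, t in zip(weights, taps)))
    return y


# Feeding a unit impulse returns the tap weights followed by zeros.
print(tapped_delay_line([1.0, 0.0, 0.0, 0.0], [0.5, 0.25]))
```

Because the impulse response is just the list of tap weights, its duration is set by the number of taps, which is why the structure is called a finite impulse response filter.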
An analysis of the ergonomic quality of the current standards for the visual display quality leads to a number of recommendations for the development of new international standards: (1) separation for different types of users, especially display designers, purchasers, and end users; (2) independence of display technology to allow comparison; (3) modular construction with several quality grades to allow benchmarking for different types of applications; (4) a test method for the end user standard that can be performed at the place of work, to take into account the effects of wear and drift of components and to be able to correct suboptimal configurations. The separate parameters that exert influence on the image quality of a broad category of images in the context of use, and their mutual coherence within the cycle of evaluation and adaptation of image quality are presented in the "Image ...
Besuijen, Jacobus — Delft University of Technology
Tracking and Planning for Surveillance Applications
Vision and infrared sensors are very common in surveillance and security applications, and there are numerous examples where a critical infrastructure, e.g. a harbor, an airport, or a military camp, is monitored by video surveillance systems. There is a need for automatic processing of sensor data and intelligent control of the sensor in order to obtain efficient and high performance solutions that can support a human operator. This thesis considers two subparts of the complex sensor fusion system, namely target tracking and sensor control. The multiple target tracking problem using particle filtering is studied. In particular, applications where road constrained targets are tracked with an airborne video or infrared camera are considered. By utilizing the information about the road network map it is possible to enhance the target tracking and prediction performance. A dynamic model suitable for on-road target tracking with ...
Skoglar, Per — Linköping University, Department of Electrical Engineering