QoE Analysis for Interactive Internet Applications in the Presence of Delay (2015)
Quality of Experience Evaluation Methodology via Crowdsourcing
Provisioning of digital video services is a difficult task as it is hard to estimate optimal settings of video parameters, given transmission constraints, while maximizing the overall end-user quality. With Internet streaming services becoming part of our everyday life, end-to-end optimization of such systems is important. On one hand, huge effort is put into subjective or objective evaluation of the end-user perception. High quality audiovisual perception with respect to the minimized costs of the provided service is one of the main interests for the network providers. On the other hand, subjective evaluations to determine the best video and audio configurations are often carried out in controlled test laboratory environments, which have little to do with the real environments in which consumers enjoy such content. Unfortunately, no serious attempts have been made to take into account interactions between quality of the content and ...
Gardlo, Bruno — University of Zilina
Quality Aspects of Packet-Based Interactive Speech Communication
Voice-over-Internet Protocol (VoIP) technology provides the transmission of speech over packet-based networks. The transition from circuit-switched to packet-switched networks introduces two major quality impairments: packet loss and end-to-end delay. This thesis shows that the incorporation of packets that were damaged by bit errors reduces the effective packet loss rate, and thus improves the speech quality as perceived by the user. Moreover, this thesis addresses the impact of transmission delay on conversational interactivity and on the perceived speech quality. In order to study the structure and interactivity of conversations, the framework of Parametric Conversation Analysis (P-CA) is introduced and three metrics for conversational interactivity are defined. The investigation of five conversation scenarios based on subjective quality tests has shown that only highly structured scenarios result in high conversational interactivity. The speaker alternation rate has turned out to represent a simple and ...
Hammer, Florian — Graz University of Technology
Interaction in Social eXtended Reality: A Quality of Experience Approach
The rise of immersive technologies has led to an increase in the number of use cases that adapt this type of technology within the telecommunications area. Some examples are: industrial training, multimedia content consumption and tele-training. Among all the immersive technologies, eXtended Reality through the use of Head-Mounted Displays (HMD) is the one that is the focus of the majority of current developments. Specifically, the Social XR paradigm frames the use of immersive technologies in a multi-user or social context. Among the decisive factors for using immersive technology in communications use cases, two stand out: the possibility of making the user believe that they have been transported to another place (sensation of presence) and the possibility of increasing interactions by allowing displacements through space (6 degrees of freedom) as well as the possibility of interacting in a more natural way. Such improvements are ...
Cortés, Carlos — Universidad Politécnica de Madrid
Analysis of quality of experience in 3D video systems
This thesis presents a comprehensive study of the evaluation of the Quality of Experience (QoE) perceived by the users of 3D video systems, analyzing the impact of effects introduced by all the elements of the 3D video processing chain. Therefore, various subjective assessment tests are presented, particularly designed to evaluate the systems under consideration, and taking into account all the perceptual factors related to the 3D visual experience, such as depth perception and visual discomfort. In particular, a subjective test is presented, based on evaluating typical degradations that may appear during the content creation, for instance due to incorrect camera calibration or video processing algorithms (e.g., 2D to 3D conversion). Moreover, the process of generation of a high-quality dataset of 3D stereoscopic videos is described, which is freely available for the research community, and has been already widely used in ...
Gutiérrez, Jesús — Universidad Politécnica de Madrid
Understanding and Assessing Quality of Experience in Immersive Communications
eXtended Reality (XR) technology, also called Mixed Reality (MR), is in constant development and improvement in terms of hardware and software to offer relevant experiences to users. One of the advances in XR has been the introduction of real visual information in the virtual environment, offering a more natural interaction with the scene and a greater acceptance of technology. Another advance has been achieved with the representation of the scene through a video that covers the entire environment, called 360-degree or omnidirectional video. These videos are acquired by cameras with omnidirectional lenses that cover the 360-degrees of the scene and are generally viewed by users through a head-tracked Head Mounted Display (HMD). Users only visualize a subset of the 360-degree scene, called viewport, which changes with the variations of the viewing direction of the users, determined by the movements of ...
Orduna, Marta — Universidad Politécnica de Madrid
Vision models and quality metrics for image processing applications
Optimizing the performance of digital imaging systems with respect to the capture, display, storage and transmission of visual information represents one of the biggest challenges in the field of image and video processing. Taking into account the way humans perceive visual information can be greatly beneficial for this task. To achieve this, it is necessary to understand and model the human visual system, which is also the principal goal of this thesis. Computational models for different aspects of the visual system are developed, which can be used in a wide variety of image and video processing applications. The proposed models and metrics are shown to be consistent with human perception. The focus of this work is visual quality assessment. A perceptual distortion metric (PDM) for the evaluation of video quality is presented. It is based on a model of the ...
Winkler, Stefan — Swiss Federal Institute of Technology
Dialogue Enhancement and Personalization - Contributions to Quality Assessment and Control
The production and delivery of audio for television involve many creative and technical challenges. One of them is concerned with the level balance between the foreground speech (also referred to as dialogue) and the background elements, e.g., music, sound effects, and ambient sounds. Background elements are fundamental for the narrative and for creating an engaging atmosphere, but they can mask the dialogue, which the audience wishes to follow in a comfortable way. Very different individual factors of the people in the audience clash with the creative freedom of the content creators. As a result, service providers receive regular complaints about difficulties in understanding the dialogue because of too loud background sounds. While this has been a known issue for at least three decades, works analyzing the problem and up-to-date statistics were scarce before the contributions in this work. Enabling the ...
Torcoli, Matteo — Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU)
Artificial Bandwidth Extension of Telephone Speech Signals Using Phonetic A Priori Knowledge
The narrowband frequency range of telephone speech signals originally caused by former analog transmission techniques still leads to frequent acoustical limitations in today’s digital telephony systems. It provokes muffled sounding phone calls with reduced speech intelligibility and quality. By means of artificial speech bandwidth extension approaches, missing frequency components can be estimated and reconstructed. However, the artificially extended speech bandwidth typically suffers from annoying artifacts. Particularly susceptible to this are the fricatives /s/ and /z/. They can hardly be estimated based on the narrowband spectrum and are therefore easily confusable with other phonemes as well as speech pauses. This work takes advantage of phonetic a priori knowledge to optimize the performance of artificial bandwidth extension. Both the offline training part conducted in advance and the main processing part performed later on shall be thereby provided with important phoneme information. As ...
Bauer, Patrick Marcel — Institute for Communications Technology, Technical University Braunschweig
System-Level Modeling and Optimization of MIMO HSDPA Networks
Interaction between the Medium Access Control (MAC)-layer and the physical-layer routines is one of the basic concepts of modern wireless networks. Physical-layer dependent resource allocation and scheduling guarantee efficient network utilization. Accordingly, classical link-level analyses, focusing only on the physical-layer are not sufficient anymore for optimum transceiver structure and algorithm development. This thesis presents the development and application of a system-level description suitable for the downlink of Multiple-Input Multiple-Output (MIMO) enhanced High-Speed Downlink Packet Access (HSDPA), with particular focus on the Double Transmit Antenna Array (D-TxAA) transmission mode. The system-level model allows for investigating and evaluating transmission systems and algorithms in the context of cellular networks. Two separate models are proposed to obtain a complete system-level description: (i) a link-quality model, analytically describing the MIMO HSDPA link quality in a so-called equivalent fading parameter structure, and (ii) a link-performance model, ...
Wrulich, Martin — Vienna University of Technology
Mixed structural models for 3D audio in virtual environments
In the world of Information and communications technology (ICT), strategies for innovation and development are increasingly focusing on applications that require spatial representation and real-time interaction with and within 3D-media environments. One of the major challenges that such applications have to address is user-centricity, reflecting e.g. on developing complexity-hiding services so that people can personalize their own delivery of services. In these terms, multimodal interfaces represent a key factor for enabling an inclusive use of new technologies by everyone. In order to achieve this, multimodal realistic models that describe our environment are needed, and in particular models that accurately describe the acoustics of the environment and communication through the auditory modality are required. Examples of currently active research directions and application areas include 3DTV and future internet, 3D visual-sound scene coding, transmission and reconstruction and teleconferencing systems, to name but ...
Geronazzo, Michele — University of Padova
Feedback Delay Networks in Artificial Reverberation and Reverberation Enhancement
In today's audio production and reproduction as well as in music performance practices it has become common practice to alter reverberation artificially through electronics or electro-acoustics. For music productions, radio plays, and movie soundtracks, the sound is often captured in small studio spaces with little to no reverberation to save real estate and to ensure a controlled environment such that the artistically intended spatial impression can be added during post-production. Spatial sound reproduction systems require flexible adjustment of artificial reverberation to the diffuse sound portion to help the reconstruction of the spatial impression. Many modern performance spaces are multi-purpose, and the reverberation needs to be adjustable to the desired performance style. Employing electro-acoustic feedback, also known as Reverberation Enhancement Systems (RESs), it is possible to extend the physical to the desired reverberation. These examples demonstrate a wide range of applications ...
Schlecht, Sebastian Jiro — Friedrich-Alexander-Universität Erlangen-Nürnberg
Measurement and Modelling of Internet Traffic over 2.5 and 3G Cellular Core Networks
The task of modeling data traffic in networks is as old as the first commercial telephony systems. In the recent past, the focus in mobile telephone networks has moved from voice to packet-switched services. The new cellular mobile networks of the third generation (UMTS) and the evolved second generation (GPRS) offer the subscriber the possibility of staying online everywhere and at any time. The design and dimensioning is well known for circuit-switched voice systems, but not for mobile packet-switched systems. The terms user expectation, grade of service and so on need to be defined. To find these parameters it is important to have an accurate traffic model that delivers good traffic estimates. In this thesis we carried out measurements in a live 3G core network of an Austrian operator, in order to find appropriate models that can serve as ...
Svoboda, Philipp — Vienna University of Technology
Modeling Perceived Quality for Imaging Applications
People of all generations are making more and more use of digital imaging systems in their daily lives. The image content rendered by these digital imaging systems largely differs in perceived quality depending on the system and its applications. To be able to optimize the experience of viewers of this content, understanding and modeling perceived image quality is essential. Research on modeling image quality in a full-reference framework (where the original content can be used as a reference) is well established in the literature. In many current applications, however, the perceived image quality needs to be modeled in a no-reference framework in real time. As a consequence, the model needs to quantitatively predict perceived quality of a degraded image without being able to compare it to its original version, and has to achieve this with limited computational complexity in order ...
Liu, Hantao — Delft University of Technology
Multiple Objective Optimization for Video Streaming
In this thesis, we propose Multiple Objective Optimization (MOO) frameworks for efficient video streaming. Firstly, we introduce pre-roll delay-distortion optimization (DDO) for uninterrupted content-adaptive video streaming over low capacity, constant bitrate (CBR) channels using MOO. Content analysis is used to divide the input video into shots with assigned relevance levels. The video is adaptively encoded and streamed, aiming at minimum pre-roll delay and distortion with the optimal spatial and temporal resolutions and quantization parameters for each shot. With buffer and distortion constraints, the bitrate of unimportant shots is reduced to achieve an acceptable quality in important shots. Secondly, we introduce a cross-layer optimized video rate adaptation and scheduling scheme to achieve maximum "application layer" Quality-of-Service (QoS), maximum video throughput (video seconds per transmission slot), and QoS fairness for wireless video streaming. Using the MOO framework, these objectives are jointly optimized such ...
Ozcelebi, Tanir — Koc University
Dynamic Scheme Selection in Image Coding
This thesis deals with the coding of images with multiple coding schemes and their dynamic selection. In our society of information highways, electronic communication is taking a bigger place in our lives every day. The number of transmitted images is also increasing every day. Therefore, research on image compression is still an active area. However, the current trend is to add several functionalities to the compression scheme such as progressiveness for more comfortable browsing of web-sites or databases. Classical image coding schemes have a rigid structure. They usually process an image as a whole and treat the pixels as a simple signal with no particular characteristics. Second generation schemes use the concept of objects in an image, and introduce a model of the human visual system in the design of the coding scheme. Dynamic coding schemes, as their name tells us, make ...
Fleury, Pascal — Swiss Federal Institute of Technology