Tutorials will take place on Monday, September 3, 2018, at the "Centro Congressi di Confindustria - Auditorium della Tecnica", Via dell'Astronomia 30, 00146, Rome, Italy.

Morning Tutorials

T1. Signal Processing and Optimization for Ultra Reliable Low Latency Communications

September 3, 2018, Room Q, 9:30 - 13:00

Eduard Axel Jorswieck, TU Dresden, Germany

Muhammad Ali Imran, University of Glasgow, United Kingdom

Muhammad Majid Butt, University of Glasgow, United Kingdom


In this tutorial, methods and signal processing tools for the URLLC service scenario in 5G and beyond are explained and applied to enable a tactile network. We provide fundamental concepts of the latency-energy, reliability-energy and latency-reliability trade-offs, as well as link- and system-level techniques to enable URLLC networks in next-generation cellular systems. Analytical approaches to identify the potential of these trade-offs when using error control mechanisms (e.g., Hybrid ARQ, intentional packet dropping) will be explored, and a perspective will be given on how these approaches can be efficiently implemented at the system level. We focus on mathematical tools to solve network optimization problems with multiple objective functions. Multi-objective programming and matching theory are two key tools to model and systematically solve the resource allocation and assignment problems that arise in heterogeneous interference networks. Throughout the tutorial, numerical results are illustrated and discussed, showing the gains in terms of reliability, resilience, and efficiency.

T2. Light Field Image Processing

September 3, 2018, Room G, 9:30 - 13:00

Benoit Vandame, Technicolor, France

Christine Guillemot, Institut National de Recherche en Informatique et Automatique, France


Light field imaging is becoming increasingly popular thanks to recent advances in light field capture of real scenes. By capturing light rays emitted along different directions, light fields yield a much richer description of the scene, enabling post-capture processing that can be appealing for a variety of applications, e.g., photorealistic rendering, computational photography and computer vision. Many acquisition devices have been designed to capture light fields, ranging from camera arrays, to single cameras mounted on moving gantries, to plenoptic cameras. These acquisition devices offer different trade-offs between angular and spatial resolution. Plenoptic cameras use an array of microlenses placed in front of the sensor to capture multiple low-resolution views in one 2D sensor image, and hence reduce the spatial resolution by orders of magnitude compared to the raw sensor image. Recent research efforts have been directed towards overcoming the spatio-angular trade-off inherent to plenoptic cameras. As the capture of 4D light fields from real scenes is gaining in popularity, the need for efficient compression and processing tools is expected to rise as well. However, the very large volume of data which they represent, as well as the need to maintain consistency across views, raise challenging complexity issues in these processing tasks. In the case of video light fields, the amount of data to process becomes even more critical, and besides angular consistency, temporal consistency must also be taken into account. For all these reasons, new algorithms for static and dynamic light fields need to be designed. Furthermore, the angular and spatial sampling of existing light fields varies significantly depending on the acquisition system (e.g., lenslet cameras vs. camera arrays). Therefore, specifically tailored algorithms that take the plenoptic sampling into account need to be considered.
This tutorial will review fundamentals in light field imaging, the related plenoptic function, the main capturing devices and will present fundamental problems in light field image processing.

T3. Wild Patterns: Ten Years after the rise of Adversarial Machine Learning

September 3, 2018, Room A, 9:30 - 13:00

Battista Biggio, University of Cagliari, Italy

Fabio Roli, University of Cagliari, Italy


The pervasive use of machine learning and data-driven AI technologies exposes them to the threat of adversarial examples, namely, slightly-perturbed input samples capable of misleading the learning algorithm. This tutorial aims to introduce the fundamentals of adversarial machine learning to the signal processing community, presenting a well-structured review of recently-proposed techniques to assess the vulnerability of machine-learning algorithms to adversarial attacks, both at training (poisoning) and test (evasion) time, and some of the most effective countermeasures against adversarial examples proposed to date. We report clear application examples including object recognition in images, biometric identity recognition, spam and malware detection.
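
As a rough illustration of a test-time (evasion) attack, the sketch below perturbs an input to a toy linear classifier by a small, bounded amount so that the predicted label flips. The weights, input, and perturbation budget `eps` are made-up values chosen for illustration, and the sign-of-gradient step is only one simple attack strategy, not necessarily a method covered in the tutorial.

```python
def score(w, b, x):
    """Linear decision function w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Return the class label, +1 or -1."""
    return 1 if score(w, b, x) >= 0 else -1

def sign(v):
    return (v > 0) - (v < 0)

def evade(w, b, x, eps):
    """Shift every feature by at most eps in the direction that lowers
    the score of the currently predicted class. For a linear model the
    gradient of the score with respect to x is simply w."""
    y = predict(w, b, x)
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]

# Hypothetical classifier and input (illustrative values only).
w = [2.0, -1.0, 0.5]
b = -0.5
x = [0.6, 0.2, 0.4]
eps = 0.25

x_adv = evade(w, b, x, eps)
label_clean = predict(w, b, x)
label_adv = predict(w, b, x_adv)
```

Here each feature moves by at most `eps`, yet the score drops by `eps` times the sum of the absolute weights, which is enough to cross the decision boundary in this toy setting.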

T4. Deep Learning for Image and Video Processing

September 3, 2018, Room Pininfarina, 9:30 - 13:00

A. Murat Tekalp, Koc University, Turkey


Deep neural networks and the term "deep learning" have become popular after the now classic 2012 paper by Krizhevsky, Sutskever, and Hinton on object recognition on the ImageNet dataset. Since then, deep networks and deep learning have been applied to many image and video processing tasks, including object recognition, object tracking, video frame prediction, image restoration, single-image super-resolution, and image/video compression. The goal of this tutorial is twofold: i) to briefly review the fundamentals of deep learning, types of deep networks, popular architectures, and approaches for training them, and ii) to introduce applications of deep learning in image and video processing, i.e., learned image and video processing. Course Content: This short course is organized into two 80-minute sessions. The first session will introduce the fundamentals of deep networks and learning, including:

  • the universal approximation theorem - function approximation using single- or multi-layer neural networks
  • convolutional neural networks (CNN or ConvNet) and the classic backpropagation+gradient descent method for training them via the supervised learning paradigm
  • auto-encoders and variational auto-encoders
  • adversarial nets, adversarial learning and applications
  • recurrent neural networks and variations for modeling temporal dynamics, and the backpropagation-through-time method for training them
  • popular deep learning frameworks, including PyTorch and TensorFlow, and interfaces

The second session will cover learned image and video processing, including:

  • learned image restoration and single-image super-resolution (SISR): Traditional human-engineered filters perform inversion of a linear model of a blur/down-sampling function using an assumed signal model, such as sparsity, to compute a regularized solution to an ill-posed inverse problem. Using deep learning, we now learn both the degradation (possibly nonlinear) and signal model directly from a large number of "original and degraded image pairs," without linear modelling assumptions. Deep learned models are more general and more accurate than traditional blind restoration methods that estimate a linear degradation model from a "single" blurred image. Furthermore, results show that learned image restoration produces artifact-free inverse solutions.
  • learned image and video compression: Learned compression methods may replace transforms such as the discrete cosine transform (DCT) and discrete wavelet transform (DWT) with auto-encoders that are trained to learn image-dependent transforms or generative codecs that are trained to synthesize decoded images without artifacts. Results show that learned transforms and generative models can be effective for image/video compression. Alternatively, deep learning can be used to tune the parameters of well-known encoders and/or for post-processing using learned compression artifact reduction.
  • learned object tracking by recognition: The standard pipeline for image/video analysis or target tracking has been to compute one or more human-engineered features, such as histogram of oriented gradients, create a Fisher vector, and then perform classification. Now, these features are replaced by deep features and one or more fully-connected layer(s) perform image classification and object tracking by recognition.

T5. Signal Processing for Autonomous and Self-Aware systems

September 3, 2018, Room H, 9:30 - 13:00

Carlo Regazzoni, University of Genova, Italy

Lucio Marcenaro, University of Genova, Italy


The tutorial aims at providing an overview of new insights into introducing dynamic self-awareness structured models and inference in artificial autonomous systems. Over the last decade, researchers have been proposing and investigating computing systems with advanced levels of autonomy in order to manage the ever-increasing requirements in complexity. Cognitive Dynamic Systems (CDS) are one particular approach to tackle these challenges. CDS aim at building up rules of behavior over time through learning from continuous experiential interactions with the environment. By exploiting these rules, CDS can deal with environmental dynamics and uncertainties and have therefore leveraged the automation of tasks with complex perception-action cycles including surveillance, cognitive radio, traffic control and robot-mediated industrial and domestic applications. However, autonomous systems, and in particular CDS, lack adaptability to internal and external non-stationary conditions. Many real-world systems frequently experience non-stationary conditions (i.e., unknown situations) due to uncertain interactions with the environment (incl. human agents) and users, failures or structural changes. This tutorial describes recent advancements in latest-generation autonomous systems, where self-awareness methods can be introduced that are based on the fusion of multimodal signals into dynamic behavioral models. Self-awareness is a broad concept which describes the property of a system that has knowledge of "itself", based on its own senses and internal models. This knowledge may take different forms, is based on perceptions of both internal and external phenomena, and is essential for being able to anticipate and adapt to unknown situations. Self-awareness (in a computational context) is founded on advanced methods and algorithms from different disciplines including signal processing, machine learning, control engineering and decision making.
Self-awareness models can be learned from data about experiences where a teacher has shown an entity how to perform a task, as humans do. Learned self-awareness models can be used for different purposes, such as predicting the evolution of the agent's own and external situations, detecting non-stationary conditions, and selecting the best way to adapt agent behavior to current conditions based on the set of learned behaviors. The tutorial comprehensively addresses self-awareness in autonomous systems along multiple fundamental and practical dimensions. First, we investigate approaches for integrating self-awareness into CDS and show how to represent, maintain and exploit knowledge about internal states and environmental interactions with incremental learning techniques and autobiographical memories. Coupled Dynamic Bayesian Networks are one of the basic techniques for representing, modeling and automatically interpreting and managing the complex first-person interaction situations that are at the basis of the self-awareness models described here. Such networks, and the probabilistic sub-models linking the discrete and continuous random variables that they include, can be learned automatically by analyzing an agent's multisensorial data while the agent is performing complex tasks in cognitive environments. Methods like Gaussian Processes, Generative Adversarial Networks and Variational Auto-Encoders will be discussed as examples of possible tools to learn such models. Observations acquired as multidimensional signals can describe the context and body of the agent and of surrounding entities. It will be shown how self-awareness models related to private variables of the agent, like its perceptions and the representation of its own actions, can be interrelated with shared self-awareness models that can also be observed by external entities. A uniform representation for such self-awareness models is discussed that can also be used to improve the decision capabilities of an agent over time.
Second, the impact of self-awareness in environments where humans and networks of autonomous systems interact in order to achieve common goals is investigated. Such collaborative systems will be shown to acquire improved robustness and capability to satisfy safety requirements when provided with a distributed collective self-awareness layer. Human-focused, task-dependent design of behaviors as well as timely and intuitive communication mechanisms will be shown to be necessary to this end. Concepts developed and discussed throughout the tutorial will be applied and demonstrated in the intelligent transportation system application domain, and in particular on simple networks of drones and autonomous terrestrial vehicles.

Afternoon Tutorials

T6. Vertical-Oriented End-to-End Orchestration in 5G Networks: Modeling, Optimization, Implementation, and Verification

September 3, 2018, Room G, 14:30 - 17:30

Vincenzo Sciancalepore, NEC Laboratories Europe, Germany

Merouane Debbah, CentraleSupélec, France

Alessio Zappone, University of Cassino and Southern Lazio, Italy

Marco Di Renzo, CentraleSupélec, France


The massive deployment of smart devices and heterogeneous verticals like broadband and mission-critical services, massive IoT communications, and V2X communications, along with a huge variety of scenarios ranging from smart cities to broadband media, requires a novel and disruptive 5G communication network that will enable massive capacity, zero delay, faster service development, flexibility, elasticity and optimal deployment, less energy consumption, enhanced security, privacy by design, and connectivity to billions of devices with less predictable traffic patterns. Accordingly, next-generation networks need to be capable of handling a complex context of operations and supporting an increasingly diverse set of new and yet unforeseen services, whose extremely diverging requirements will significantly boost mobile network performance and capabilities. In order to enable this vision, the most promising approach is that of software wireless networks, where a virtualized control plane schedules the available physical resources among multiple vertical applications, enabling an end-to-end design and flexibly adapting to the needs and requirements of the different classes of users. This tutorial aims at providing a solid understanding of the most recent modeling, optimization, and implementation approaches for 5G software networks, describing in detail the enabling techniques that allow multiple vertical-oriented services to co-exist by sharing the same physical infrastructure.

T7. Signal Processing Methods for Light Field 3D Displays

September 3, 2018, Room A, 14:30 - 17:30

Robert Bregovic, Tampere University of Technology, Finland

Erdem Sahin, Tampere University of Technology, Finland

Atanas Gotchev, Tampere University of Technology, Finland


The tutorial discusses the topic of emerging light field displays from a signal processing perspective. Light field displays are defined as devices that deliver continuous parallax along with the focus and binocular visual cues acting together in a rivalry-free manner. The design of such ultimately immersive displays goes through the formalization of the light field as a lower-dimensional approximation of the plenoptic function, including its adequate parameterization, sampling and reconstruction. In the first part of the tutorial, the light field basics and the corresponding display technologies are overviewed. This prepares the ground for subsequently discussing two fundamental topics: how to analyze and characterize light field displays as signal processing channels, and how to capture and represent visual content for driving such displays. In the second part of the tutorial, display profiling and characterization are addressed through spectral analysis of multidimensional sampling operators, and the display performance is quantified by the notion of display bandwidth. In the third part, modern sparsification approaches working in the directional transform domain are presented, as they form the core of high-quality light field reconstruction and rendering methods.

T8. Random Matrix Advances in Machine Learning and Neural Nets

September 3, 2018, Room Q, 14:30 - 17:30

Romain Couillet, CentraleSupélec, France

Zhenyu Liao, CentraleSupélec, France

Xiaoyi Mai, CentraleSupélec, France


The advent of the Big Data era has triggered a renewed interest in machine learning and (deep) neural networks. These methods however suffer from a double plague: (i) as they involve non-linear operators, they are difficult to fathom and offer few guarantees, limits, and little hyperparameter control, and (ii) they were often developed from small-dimensional intuitions and tend to be inefficient at dealing with large-dimensional datasets. Recent advances in random matrix theory manage to deal with both problems simultaneously: by assuming both the dimension and the size of the datasets to be simultaneously large, concentration phenomena arise that allow for a renewed understanding and the possibility to control and improve machine learning approaches, sometimes opening the door to completely new paradigms. The objective of the tutorial is twofold. It will first provide a simple and didactic introduction to the basic notions of random matrix theory for the audience to get accustomed to the insights and necessary tools of the domain (∼1h). In a second, longer part (∼2h), recent advances in applied random matrix theory for machine learning (kernel methods, classification and clustering, semi-supervised learning, etc.) as well as for neural networks (random features and extreme learning machines, backpropagation dynamics) will be investigated. In the end, the audience will get a good grasp of the non-trivial phenomena arising when dealing with large-dimensional datasets and of the solutions and methods offered by random matrix theory to embrace large-dimensional machine learning.
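
To give a flavor of the large-dimensional effects mentioned above, the following sketch draws white Gaussian data (true covariance equal to the identity) and computes the normalized trace of the squared sample covariance matrix. For an identity matrix this quantity equals 1, but when the dimension p is comparable to the number of samples n it concentrates near 1 + p/n instead, which is the second moment of the Marchenko-Pastur law. The function name, sizes, and seed are arbitrary choices for illustration and are not taken from the tutorial.

```python
import random

def second_spectral_moment(p, n, seed=0):
    """Return (1/p) * trace(S^2), where S = X X^T / n is the sample
    covariance of n i.i.d. standard Gaussian samples of dimension p.
    trace(S^2) equals the sum of the squared entries of S."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
    total = 0.0
    for i in range(p):
        for j in range(p):
            s_ij = sum(a * b for a, b in zip(X[i], X[j])) / n
            total += s_ij * s_ij
    return total / p

# Classical regime: few dimensions, many samples (p/n is tiny).
small_ratio = second_spectral_moment(p=5, n=2000)
# Large-dimensional regime: p/n = 0.5.
large_ratio = second_spectral_moment(p=60, n=120)
```

In the first setting the result should stay close to 1, while in the second it should land near 1.5 even though every sample has identity covariance, showing that the sample covariance is a poor stand-in for the true one when p and n are of the same order.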

T9. Robust Covariance and Subspace Learning: dealing with high-dimensionality and small sample support

September 3, 2018, Room H, 14:30 - 17:30

Arnaud Breloy, University Paris Nanterre, France

Guillaume Ginolhac, Université Savoie Mont Blanc, France

Frédéric Pascal, University Paris-Saclay, France


Dealing with heterogeneous and/or corrupted data with small sample support is now common in statistical data analysis. Indeed, modern data sets are complex and hard to model due to non-Gaussianity, heterogeneity, non-stationarity, heteroscedasticity, corruption by outliers, etc. This raises the problem of robustness to these non-standard conditions. Moreover, high-dimensional data brought by high resolution (imaging, radar), multiplication of sensors (MIMO, large antenna arrays), and/or the large number of variables/entries (social networks, genomics or financial engineering, among others) leads to the issue of insufficient sample support for parameter learning. These two challenging issues are at the heart of this tutorial on statistical data analysis. More precisely, recent advances in the development of statistical learning tools will be discussed. The main focus will be on second-order statistics (covariances) and principal components (subspaces) learning problems, since these parameters are a cornerstone of numerous signal processing and machine learning algorithms. For the first issue ("robustness to non-standard conditions"), the framework of robust estimation theory offers an elegant solution to deal with impulsiveness, heterogeneity and outliers. The design of robust learning cost functions can be put into perspective with the statistical model of Complex Elliptically Symmetric (CES) distributions. This provides practical tools for both process design and statistical analysis (i.e., performance study). The second issue ("small sample support") can be tackled using regularization and penalization techniques. Another approach consists in using a priori information on the data structure (e.g., lying in a low-rank subspace) to shrink the dimensionality of the problem.
Merging these solutions to learn statistical parameters (covariances and principal components) is currently leading to a wide variety of research activities, involving tools from multivariate statistical analysis, random matrix theory, and optimization theory. These advances have brought numerous improvements in modern signal processing. This tutorial will present an overview of the recent key results on these topics: design of robust cost functions related to covariances and subspaces, properties of robust estimators, recent shrinkage and structured estimation methods for low sample support configurations, as well as an introduction to the Majorization-Minimization framework, which allows the resulting optimization problems to be solved. Several examples will illustrate the interest of these techniques for different signal processing applications such as array processing (radar detection and radio interferometer calibration), Ground Penetrating Radar (GPR), and robust sparse principal component analysis (PCA).
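
As one concrete example of a robust covariance estimator, the sketch below implements the fixed-point iteration for Tyler's scatter estimator on zero-mean 2-D data and compares it with the ordinary sample covariance when a few samples are scaled up to mimic impulsive outliers. Tyler's estimator is a standard choice in the robust estimation literature, but whether and how the tutorial presents it is an assumption here; the data model, sizes, and function names are illustrative only.

```python
import random

def sample_cov_2d(samples):
    """Sample covariance of zero-mean 2-D data as (s_xx, s_xy, s_yy)."""
    n = len(samples)
    return (sum(x * x for x, y in samples) / n,
            sum(x * y for x, y in samples) / n,
            sum(y * y for x, y in samples) / n)

def tyler_2d(samples, iters=50):
    """Fixed-point iteration for Tyler's scatter estimate, normalized so
    that its trace equals the dimension (2). Each sample is weighted by
    the inverse of its quadratic form, which cancels its magnitude."""
    a, b, d = 1.0, 0.0, 1.0  # symmetric matrix [[a, b], [b, d]], identity start
    for _ in range(iters):
        det = a * d - b * b
        ia, ib, idd = d / det, -b / det, a / det  # inverse of the 2x2 matrix
        na = nb = nd = 0.0
        for x, y in samples:
            q = ia * x * x + 2.0 * ib * x * y + idd * y * y
            na += x * x / q
            nb += x * y / q
            nd += y * y / q
        scale = 2.0 / (na + nd)  # trace normalization
        a, b, d = na * scale, nb * scale, nd * scale
    return a, b, d

rng = random.Random(1)
clean = []
for _ in range(200):
    g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    clean.append((g1, 0.8 * g1 + 0.6 * g2))  # correlated pair, unit variances

# Scale the first 10 samples by 100 to mimic impulsive outliers.
corrupted = [(100.0 * x, 100.0 * y) for x, y in clean[:10]] + clean[10:]

scm_clean, scm_corrupted = sample_cov_2d(clean), sample_cov_2d(corrupted)
tyler_clean, tyler_corrupted = tyler_2d(clean), tyler_2d(corrupted)
```

Because every sample is divided by its own quadratic form, rescaling individual samples leaves the Tyler estimate unchanged, whereas the sample covariance is dominated by the few inflated samples.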

T10. Deep learning for Multimedia Forensics

September 3, 2018, Room Pininfarina, 14:30 - 17:30

Luisa Verdoliva, University Federico II of Naples, Italy


With the widespread diffusion of powerful media editing tools, falsifying images and videos has become easier and easier in the last few years. Fake multimedia, often used to support fake news, represents a growing menace in many fields of life, notably in politics, journalism, and the judiciary. In response to this threat, the signal processing community has produced a major research effort. A large number of methods have been proposed for source identification, forgery detection and localization, relying on typical signal processing tools. The advent of deep learning, however, is changing the rules of the game. On the one hand, new sophisticated methods based on deep learning have been proposed to accomplish manipulations that were previously unthinkable. On the other hand, deep learning also provides the analyst with new powerful forensic tools. Given a suitably large training set, deep learning architectures usually ensure a significant performance gain with respect to conventional methods, and a much higher robustness to post-processing and evasion. Generative adversarial networks, instead, are natural candidates to perform counter-forensic tasks. In this tutorial, after reviewing the main approaches proposed in the literature to ensure media authenticity, the most promising solutions relying on Convolutional Neural Networks are explored, with special focus on realistic scenarios, such as the spreading of manipulated images and videos over social networks. In addition, the robustness of such methods to adversarial attacks is analyzed.