1st DEBS Summer School

1st Summer School in Conjunction with DEBS Doctoral Symposium

Topic: Machine Learning for Real-time Analytics
Monday June 24, 2019

The DEBS 2019 Summer School welcomes graduate students, researchers and practitioners who are starting out or already active in the field of Machine Learning for Real-time Analytics. There will be keynotes by distinguished speakers, as well as plenty of opportunities to exchange ideas and discuss the topics with other participants and speakers. The summer school will be followed by the traditional doctoral symposium, to which Ph.D. candidates are invited to submit and discuss their research work.

Scope

Machine learning (ML) is a key discipline and set of tools for modern data analytics. It allows engineers to build systems that learn from data by themselves. The topics of the summer school include (but are not limited to) machine learning techniques for CEP, deep learning, real-time analytics, complex-event detection, data stream processing, big/fast data analysis, event processing for AI/ML, and AI/ML for event processing.

Program

Monday 24/06/2019
Time | Subject | Room
9:00 - 10:30 | Carsten Binnig, TU Darmstadt | 2.07 aurum, Darmstadtium
10:30 - 11:00 | Coffee Break | 0.03 copernicium, Darmstadtium
11:00 - 12:30 | Pedro Casas, AIT Austria | 2.07 aurum, Darmstadtium
12:30 - 13:30 | Lunch | 0.03 copernicium, Darmstadtium
13:30 - 14:15 | Maja Rudolph, Bosch AI | 2.07 aurum, Darmstadtium
14:15 - 15:00 | Lars Dannecker and Andreas Roth, SAP, Walldorf | 2.07 aurum, Darmstadtium
15:00 - 15:30 | Coffee Break | 0.03 copernicium, Darmstadtium
The DEBS doctoral symposium will be held right after the summer school. Please check our call for doctoral symposium papers. More details will follow.

Speakers

Speaker | Affiliation | Title
Carsten Binnig | Technical University of Darmstadt, Germany | Towards Interactive Data Analytics
Pedro Casas | AIT Austrian Institute of Technology GmbH, Austria | Continual Learning over Network Streaming Data: What to Remember and What to Forget?
Maja Rudolph | Bosch Center for Artificial Intelligence (BCAI) | Density Estimation for Time Series Data
Lars Dannecker and Andreas Roth | SAP, Walldorf | Scalable Data Processing and Machine Learning Using Modern Container Technologies

Carsten Binnig

Professor

Technical University of Darmstadt, Germany

Keynote Title: Towards Interactive Data Analytics
Abstract:

Technology has been the key enabler of the current Big Data movement. Without open-source tools like R and Spark, as well as the advent of cheap, abundant computing and storage in the cloud, the trend toward datafication of almost every field in research and industry could never have happened. However, the current Big Data tool set is ill-suited for interactive data analytics that better involves the human in the loop, which makes knowledge discovery a major bottleneck in our data-driven society. In this talk, I will present an overview of our current research efforts to revisit the current Big Data stack, from the user interface to the underlying hardware, to enable interactive data analytics and machine learning on large data sets.

Biography:

Carsten Binnig is a Full Professor in the Computer Science department at TU Darmstadt and an Adjunct Associate Professor in the Computer Science department at Brown University. Carsten received his PhD at the University of Heidelberg in 2008. Afterwards, he spent time as a postdoctoral researcher in the Systems Group at ETH Zurich and at SAP working on in-memory databases. Currently, his research focus is on the design of data management systems for modern hardware as well as modern workloads such as interactive data exploration and machine learning. His research has been recognized with a Google Faculty Award, as well as multiple best paper and best demo awards.

Pedro Casas

Scientist

AIT Austrian Institute of Technology GmbH, Austria

Keynote Title: Continual Learning over Network Streaming Data: What to Remember and What to Forget?
Abstract:

Continuous, dynamic and short-term learning is an effective learning strategy when operating in very fast and dynamic environments, where concept drift constantly occurs. In an online stream learning model, data arrives as a stream of sequentially ordered samples, and older data is no longer available to revise earlier suboptimal modeling decisions as fresh data arrives. Learning takes place by processing one sample at a time, inspecting it only once and thus using a limited amount of memory; stream approaches work in a limited amount of time and have the advantage of being able to make a prediction at any point during the stream. In this talk I focus on a particularly challenging problem: continually learning detection models capable of recognizing cyber-attacks and system intrusions in a highly dynamic environment such as the Internet. I consider adaptive learning algorithms for the analysis of continuously evolving network data streams, using a dynamic, variable-length system memory that automatically adapts to concept drifts in the underlying data. By continuously learning and detecting concept drifts to adapt the memory length, I show that adaptive learning algorithms can sustain high detection accuracy over dynamic network data streams. I will additionally present other approaches leading to the continual acquisition of knowledge, which extend traditional active learning schemes to the streaming domain through the application of reinforcement learning concepts. Taking a broader look at the application of AI for Networking (Ai4NETS), I will conclude my talk by elaborating on some of the major showstoppers hindering a natural application of machine learning in the applied networking field.
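
For readers new to stream learning, the idea of a variable-length memory that shrinks when concept drift is suspected can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the method presented in the talk: the class name `AdaptiveMemoryClassifier`, the majority-vote predictor, the window sizes and the error-rate drift test are all hypothetical choices made for this sketch.

```python
from collections import deque


class AdaptiveMemoryClassifier:
    """Majority-vote stream classifier for binary labels (0/1).

    Hypothetical illustration: the drift test and all thresholds are
    assumptions made for this sketch, not the speaker's algorithm.
    """

    def __init__(self, max_memory=200, recent=20, drift_threshold=0.5):
        self.memory = deque(maxlen=max_memory)  # labels currently remembered
        self.errors = deque(maxlen=recent)      # recent mistakes (1 = wrong)
        self.recent = recent
        self.drift_threshold = drift_threshold

    def predict(self):
        """Predict the majority label in memory (0 if memory is empty)."""
        if not self.memory:
            return 0
        return int(2 * sum(self.memory) >= len(self.memory))

    def learn_one(self, label):
        """Test-then-train on one sample; return True if drift was flagged."""
        # Each sample is inspected once: first score the prediction, then store it.
        self.errors.append(int(self.predict() != label))
        self.memory.append(label)
        window_full = len(self.errors) == self.recent
        error_rate = sum(self.errors) / len(self.errors)
        if window_full and error_rate > self.drift_threshold:
            # Drift suspected: forget everything except the most recent samples.
            kept = list(self.memory)[-self.recent:]
            self.memory.clear()
            self.memory.extend(kept)
            self.errors.clear()
            return True
        return False


if __name__ == "__main__":
    clf = AdaptiveMemoryClassifier()
    stream = [0] * 100 + [1] * 100  # abrupt concept change halfway through
    drifts = sum(clf.learn_one(y) for y in stream)
    print(drifts, clf.predict(), len(clf.memory))
```

On the toy stream in the `__main__` block, the sketch flags drift once, shortly after the labels change, and then predicts the new label from a shortened memory. Real drift detectors use statistically grounded tests rather than a fixed error-rate threshold.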

Biography:

Dr. Pedro Casas is a Scientist in ICT Security and Information Management at the AIT Austrian Institute of Technology in Vienna. He received an Electrical Engineering degree from Universidad de la República, Uruguay in 2005, and a Ph.D. degree in Computer Science from Institut Mines-Télécom, Télécom Bretagne, France in 2010. He was a Postdoctoral Research Fellow at the LAAS-CNRS lab in Toulouse between 2010 and 2011, and a Senior Researcher at the Telecommunications Research Center Vienna (FTW) between 2011 and 2015. He works as project manager and technical work leader in different networking-related initiatives, including research projects and commercial solutions. His work focuses on machine learning and data mining approaches for networking, big data analytics and platforms, Internet network measurements, network security and anomaly detection, as well as QoE modeling, assessment and monitoring. He has published more than 135 networking research papers in major international conferences and journals and received 12 awards for his work, including 7 best paper awards. He is general chair for different conferences, workshops and leading actions in network measurement and analysis, including the IEEE ComSoc ITC Special Interest Group on Network Measurements and Analytics. He is leading the Big-DAMA project on Big Data analytics for network traffic monitoring and analysis.

Maja Rudolph

Bosch Center for Artificial Intelligence

Keynote Title: Density Estimation for Time Series Data
Abstract:

Machine learning lets us uncover useful patterns in data. Here we consider signals measured over time, i.e. time series. Time series data is ubiquitous, and understanding it with machine learning can have an impact on many applications, including in engineering. This tutorial is about density estimation of time series. Time series data is high-dimensional and highly structured, and the challenge is to develop methods that have the flexibility to capture the complexities of the data while still being trainable. We will survey auto-regressive models and models based on normalizing flows. Good density estimates are useful for data-efficient learning in downstream tasks.
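
As a brief reminder of the two model families named in the abstract, their standard textbook forms are given below. The notation is an assumption chosen for illustration and is not taken from the tutorial: $x_{1:T}$ is a time series of length $T$, and $f$ is an invertible, differentiable map from the data $x$ to a latent variable $z = f(x)$ with a simple base density $p_Z$.

```latex
% Auto-regressive factorization of the joint density of a time series
p(x_{1:T}) = \prod_{t=1}^{T} p\left(x_t \mid x_{1:t-1}\right)

% Normalizing flow: change of variables through an invertible map f
p_X(x) = p_Z\bigl(f(x)\bigr) \left| \det \frac{\partial f(x)}{\partial x} \right|
```

Both forms yield an exact density that can be trained by maximum likelihood. In the auto-regressive form, flexibility comes from how each conditional is modeled; in a flow, it comes from the choice of $f$.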

Biography:

Maja is a Research Scientist at the Bosch Center for Artificial Intelligence where she works on reliable probabilistic models of time series data. She received a PhD in Machine Learning from Columbia University for her work with David Blei on probabilistic embedding models – methods for learning interpretable representations from data. In 2013, she graduated from MIT with a BS in Mathematics.

Lars Dannecker & Andreas Roth

SAP, Walldorf

Keynote Title: Scalable Data Processing and Machine Learning Using Modern Container Technologies
Abstract:

The continuing acceleration of innovation in the digital economy dramatically changes the way modern companies are working. Today, enterprises are competing in a race for the best technology to gain better insights into their companies, deals and processes, with the goal of obtaining competitive advantages in the market. The winners of this race will be defined more than ever by their ability to adopt newly arising technologies such as Cloud Computing, Artificial Intelligence, Machine Learning, Autonomous Systems and Advanced Analytics. However, in today's fast-moving economy, new technological trends arise with a velocity and variety that make it hard even for modern companies to keep up. One of those ground-breaking technological trends gaining more and more attention in research and industry is machine learning, which is best described as a set of intelligent algorithms and statistical models that give systems the ability to automatically gain knowledge from experience. Companies utilize machine learning to make more intelligent and proactive decisions and to automate repetitive tasks, ultimately allowing them to transition from classical businesses to intelligent enterprises. However, despite the fact that machine learning has the potential to revolutionize the industry, its comprehensive adoption is still relatively low. The reason is that machine learning requires a complex process before it can be used productively. This process ranges from extracting data from various heterogeneous sources, through unifying and transforming the data into a joint data set, to running potentially expensive training jobs. Realizing and optimizing this process efficiently end-to-end, in a way that is compliant with data processing regulations, is a key element for a successful adoption of machine learning.
Hence, companies are investing huge amounts of resources in building machine learning systems, including the orchestration of a large zoo of individual, heterogeneous tools that each realize only partial aspects of the full machine learning process. In this talk, we will present SAP's approach of using cloud-native, containerized development to build an efficient, distributed platform for enterprise-scale end-to-end data science and machine learning.

Biography of Lars Dannecker:

Lars is a Big Data Architect and Development Manager in SAP’s Technology and Innovation department. Currently, he is working on SAP’s novel landscape orchestration and intelligent data processing solution called SAP Data Intelligence, where he is responsible for realizing the managed cloud service and the hybrid cloud deployment.

Lars received his master’s degree in media computer science from the Technische Universität Dresden in 2009. Afterwards, he joined the collaborative PhD program of SAP Research and the Technische Universität Dresden (Database Technology Group, Prof. Dr.-Ing. Wolfgang Lehner). In 2014 he completed his doctorate with a thesis on "Efficient and Accurate Forecasting of Evolving Time Series from the Energy Domain". Lars has published multiple papers at top-ranked international conferences such as VLDB, CIKM and SSDBM. He has further served as a program committee member or external reviewer for several international conferences (SSDBM, CIKM, ICDE, ER, ADBIS, DOLAP, ENDM) and journals (ACM CSUR, Springer NCAA).

Lars is an experienced speaker giving technical presentations at various internal and external occasions such as VLDB, CIKM, Cisco Live, SAP Sapphire, SAP TechEd, AWS Re:Invent, Tech Field Day, Techwise TV etc.

Biography of Andreas Roth:

Dr. Andreas Roth is a development architect in the machine learning organization at SAP and responsible for the architecture of machine learning tools and machine learning application integration for SAP’s machine learning and data science platform. Before joining the machine learning department, he was working in various areas of SAP’s technology group, like HANA, SAP Cloud Platform, and SAP Research, and leading various European and German research projects in the area of software engineering. He received his PhD in 2006 from Karlsruhe University, Germany, on the topic of formal software engineering methods.

Important Dates

Abstract submission for research track: March 8th, 2019 (originally February 19th)
Research and Industry paper submission: March 8th, 2019 (originally February 26th)
Tutorial proposal submission: April 5th, 2019 (originally March 22nd)
Grand challenge solution submission: April 22nd, 2019 (originally April 7th)
Author notification research and industry track: April 19th, 2019 (originally April 9th)
Poster, demo & doctoral symposium submission: May 3rd, 2019 (originally April 22nd)
Early registration May 31st, 2019