Distributed Data Management (WT 2019/20) - tele-TASK

27 episodes

The free lunch is over! Until the turn of the century, computer systems became steadily faster without any particular effort, simply because the hardware they ran on increased its clock speed with every new release. This trend has ended, and today's CPUs stall at around 3 GHz. The size of modern computer systems in terms of processing elements (cores in CPUs/GPUs, CPUs/GPUs in compute nodes, compute nodes in clusters), however, still increases constantly. This caused a paradigm shift in writing software: instead of optimizing code for a single thread, applications now need to solve their tasks in parallel to achieve noticeable performance gains. Distributed computing, i.e., the distribution of work across (potentially) physically isolated compute nodes, is the most extreme form of parallelization.

Big Data Analytics is a multi-million dollar market that grows constantly! The ability to control and use data is the most valuable capability of today's computer systems. Because data volumes grow so rapidly, and with them the complexity of the questions they should answer, data analytics, i.e., the ability to extract any kind of information from the data, becomes increasingly difficult. As data analytics systems cannot rely on faster hardware to solve their performance problems, they need to embrace new software trends that let their performance scale with the still increasing number of processing elements.

In this lecture, we take a look at various technologies involved in building distributed, data-intensive systems. We discuss theoretical concepts (data models, encoding, replication, ...) as well as some of their practical implementations (Akka, MapReduce, Spark, ...). Because workload distribution is useful for many applications, we focus in particular on data analytics.

Podcasts

Exam Preparation

Published: Feb. 4, 2020, 3:15 a.m.
Duration: 1 hour 31 minutes 1 second

Listed in: Education

Distributed DBMSs & Distributed Query Optimization

Published: Feb. 3, 2020, 1:30 p.m.
Duration: 1 hour 29 minutes 20 seconds

Listed in: Education

Distributed DBMSs

Published: Jan. 28, 2020, 3:15 a.m.
Duration: 1 hour 27 minutes 24 seconds

Listed in: Education

Stream Processing & Distributed DBMSs

Published: Jan. 27, 2020, midnight
Duration: 1 hour 28 minutes 22 seconds

Listed in: Education

Stream Processing

Published: Jan. 21, 2020, 3:15 p.m.
Duration: 1 hour 26 minutes 2 seconds

Listed in: Education

Spark Batch Processing & Stream Processing

Published: Jan. 20, 2020, 1:30 p.m.
Duration: 1 hour 27 minutes 6 seconds

Listed in: Education

Spark Batch Processing 2

Published: Jan. 13, 2020, 1:30 p.m.
Duration: 1 hour 30 minutes 5 seconds

Listed in: Education

Spark Batch Processing

Published: Jan. 7, 2020, 3:15 p.m.
Duration: 1 hour 28 minutes 12 seconds

Listed in: Education

Exercise Evaluation Assignment 1-3

Published: Jan. 6, 2020, 1:30 p.m.
Duration: 1 hour 25 minutes 53 seconds

Listed in: Education

Beyond MapReduce

Published: Dec. 17, 2019, 3:15 p.m.
Duration: 1 hour 29 minutes 4 seconds

Listed in: Education

Batch Processing: Distributed File Systems and MapReduce

Published: Dec. 16, 2019, 1:30 p.m.
Duration: 1 hour 31 minutes 30 seconds

Listed in: Education

Transactions & Batch Processing

Published: Dec. 9, 2019, 1:30 p.m.
Duration: 1 hour 11 minutes 9 seconds

Listed in: Education

Consistency and Consensus & Transactions

Published: Dec. 3, 2019, 3:15 p.m.
Duration: 1 hour 26 minutes 42 seconds

Listed in: Education

Distributed Systems & Consistency and Consensus

Published: Dec. 2, 2019, midnight
Duration: 1 hour 29 minutes 52 seconds

Listed in: Education

Distributed Systems

Published: Nov. 26, 2019, 3:15 p.m.
Duration: 1 hour 32 minutes 13 seconds

Listed in: Education

Replication & Partitioning

Published: Nov. 25, 2019, 1:30 p.m.
Duration: 1 hour 28 minutes 37 seconds

Listed in: Education

Replication 2

Published: Nov. 20, 2019, 3:15 p.m.
Duration: 1 hour 24 minutes 21 seconds

Listed in: Education

Storage and Retrieval & Replication

Published: Nov. 18, 2019, 1:30 p.m.
Duration: 1 hour 20 minutes 49 seconds

Listed in: Education

The Graph Data Model

Published: Nov. 12, 2019, 3:15 p.m.
Duration: 1 hour 29 minutes 5 seconds

Listed in: Education

Data Models and Query Languages

Published: Nov. 11, 2019, 1:30 p.m.
Duration: 1 hour 28 minutes 29 seconds

Listed in: Education

Akka Actor Programming 3 - Patterns

Published: Nov. 5, 2019, 3:15 p.m.
Duration: 1 hour 31 minutes 16 seconds

Listed in: Education

Akka Actor Programming 2

Published: Nov. 4, 2019, 1:30 p.m.
Duration: 1 hour 30 minutes 36 seconds

Listed in: Education

Akka Actor Programming

Published: Oct. 28, 2019, 1:30 p.m.
Duration: 1 hour 31 minutes 24 seconds

Listed in: Education

Encoding and Communication 2

Published: Oct. 22, 2019, 3:15 p.m.
Duration: 1 hour 31 minutes 35 seconds

Listed in: Education

Encoding and Communication

Published: Oct. 21, 2019, 1:30 a.m.
Duration: 1 hour 29 minutes 36 seconds

Listed in: Education

Foundations

Published: Oct. 15, 2019, 3:15 p.m.
Duration: 1 hour 21 minutes 37 seconds

Listed in: Education

Introduction

Published: Oct. 14, 2019, midnight
Duration: 1 hour 15 minutes 38 seconds

Listed in: Education