Distributed Data Streams in Dynamic Environments

Staff:

Prof. Dr. Friedhelm Meyer auf der Heide, Universität Paderborn

Web page:

www.hni.uni-paderborn.de/alg/projekte/bigdata/

Description:

We currently observe rapidly growing interest in large systems of devices, each of which continuously observes data that has to be aggregated, often in real time, into useful information. Examples of such systems are: (1) Information gathered by the smartphones of users distributed world-wide can be (and in some cases already is) used for aggregating information. Besides the monitoring done by the providers, position-based information such as nearby restaurants, traffic ahead, nearby friends, and many further kinds of information can be requested. (2) Cars generate sensor data (for example, about their position, their speed, their environment, and other cars nearby) that is used to realise self-organised management of driving at intersections or of passing on freeways. (3) Nodes in the network of an Internet Service Provider observe local usage of links and work together to keep the network in a healthy state. (4) Sensors or robots are deployed in a field with the aim of aggregating useful information from observed data. Examples of aggregation are the (weighted) average, minimum, and maximum of measured temperature.

In this project we plan to lay the foundations for the design and analysis of distributed algorithms that continuously compute aggregated information from streams of data observed by a multitude of devices. These devices may be mobile, i.e., capable of moving in the plane or in space, and contain both (wireless) communication devices and sensors for observing their environment. The major challenge is to cope with the huge amount of data generated by the devices. Typically, the data streams are too big and arrive too fast to be completely stored, sent to a central server through a network, or processed in real time. Thus we have to find ways to extract useful information from the streams using restricted resources such as memory, communication volume, and computation time. We plan to develop continuous distributed algorithms in dynamic environments, taking into account the mobility of both the devices and the observed events. This reflects the scenario of moving people with smartphones who observe their environments. The models for the dynamics of the devices and the generation of the data streams will in part be motivated by our theoretical models. Moreover, we expect interesting cooperation on such models with projects within the Priority Programme Algorithms for Big Data (SPP 1736) that deal with social networks.
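
As a purely illustrative sketch (not one of the project's algorithms), the simple aggregates named in the description, the (weighted) average, minimum, and maximum, can be maintained by each device in constant memory and combined by exchanging only a small summary rather than the raw stream. The class name `Summary` and its methods are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
import math


@dataclass
class Summary:
    """Constant-size summary of a stream of weighted measurements (hypothetical helper)."""
    weight: float = 0.0              # total weight observed so far
    weighted_sum: float = 0.0        # sum of weight * value
    minimum: float = math.inf        # smallest value observed
    maximum: float = -math.inf       # largest value observed

    def observe(self, value: float, weight: float = 1.0) -> None:
        """Fold one measurement into the summary without storing it."""
        self.weight += weight
        self.weighted_sum += weight * value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "Summary") -> "Summary":
        """Combine the summaries of two devices into a new summary."""
        return Summary(
            self.weight + other.weight,
            self.weighted_sum + other.weighted_sum,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    def mean(self) -> float:
        """Weighted average of all measurements folded in so far."""
        if self.weight == 0:
            raise ValueError("no measurements observed")
        return self.weighted_sum / self.weight


# Two devices observe temperatures locally, then exchange only their summaries.
device_a = Summary()
for temperature in [20.0, 22.0]:
    device_a.observe(temperature)

device_b = Summary()
device_b.observe(30.0, weight=2.0)

combined = device_a.merge(device_b)
print(combined.mean(), combined.minimum, combined.maximum)
```

Because merging is associative and commutative, summaries can be combined in any order along whatever communication paths happen to be available. This sketch ignores the harder questions the project targets, such as device mobility and bounding communication volume over time.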

Algorithms for BIG DATA | DFG Schwerpunktprogramm 1736