Model-based Auto-Scaling of Distributed Data Stream Processing Applications
G. Russo Russo
Proc. of Middleware '20 Doctoral Symposium
Data Stream Processing (DSP) enables near real-time analysis of fast data streams, such as those produced by Internet-of-Things devices. Distributed DSP systems exploit distributed computing infrastructures, possibly spanning both Cloud and Fog/Edge platforms, to scale their execution and cope with high-volume streams. To avoid both resource under-provisioning and waste in the face of highly variable workloads, DSP applications should elastically acquire and release resources as needed. In this doctoral work, we investigate mechanisms and policies to auto-scale DSP applications. Unlike most previous work, we treat resource heterogeneity and model uncertainty as primary challenges. Moreover, we devise a hierarchical control scheme to avoid the scalability issues that may affect centralized solutions.
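To make the notion of elastic scaling concrete, the following is a minimal sketch of a capacity-based scaling rule for a single DSP operator. It is not the policy developed in this work: it assumes homogeneous replicas and a perfectly known per-replica capacity, which are exactly the simplifications (no resource heterogeneity, no model uncertainty) that the work sets out to remove. The function name `target_replicas` and all of its parameters are hypothetical and chosen only for illustration.

```python
import math


def target_replicas(input_rate, per_replica_capacity,
                    target_utilization=0.7,
                    min_replicas=1, max_replicas=32):
    """Return the replica count for one operator (illustrative only).

    Assumptions (not from the source): all replicas are identical,
    capacity is known exactly, and load is split evenly among replicas.
    Rates are in tuples per second.
    """
    if per_replica_capacity <= 0:
        raise ValueError("per_replica_capacity must be positive")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if min_replicas > max_replicas:
        raise ValueError("min_replicas must not exceed max_replicas")

    # Each replica should run at no more than target_utilization of its
    # capacity, leaving headroom for short bursts.
    usable_capacity = per_replica_capacity * target_utilization
    needed = math.ceil(input_rate / usable_capacity)

    # Clamp to the allowed range: scale out under load, scale in when idle.
    return max(min_replicas, min(max_replicas, needed))
```

A controller would call such a function periodically with the measured input rate and then acquire or release replicas to match the result. With heterogeneous resources or an uncertain capacity estimate, a fixed rule of this kind no longer suffices, which motivates the model-based policies investigated here.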