Auto-scaling in Data Stream Processing: a Model Based Reinforcement Learning Approach

V. Cardellini, F. Lo Presti, M. Nardelli, G. Russo Russo

InfQ 2017 - New Frontiers in Quantitative Methods in Informatics, Communications in Computer and Information Science, Vol. 825, Springer, 2018.

By exploiting on-the-fly computation, Data Stream Processing (DSP) applications can process huge volumes of data in near real time. Adapting the application parallelism at run-time is critical to guarantee a proper level of QoS in the face of varying workloads. In this paper, we consider Reinforcement Learning based techniques to self-configure the number of parallel instances for a single DSP operator. Specifically, we propose two model-based approaches and compare them to the baseline Q-learning algorithm. Our numerical investigations show that the proposed solutions provide better performance and faster convergence than the baseline.
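
For readers unfamiliar with the baseline, the following is a minimal sketch of tabular Q-learning applied to a toy scaling problem. It is not the formulation used in the paper: the state (instance count and load level), the three actions, the cost function in `toy_cost`, the load levels, and all parameter values are hypothetical choices made only for illustration.

```python
import random

# Hypothetical action set: remove one instance, do nothing, add one instance.
ACTIONS = (-1, 0, 1)


def q_learning_step(q, state, action, cost, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update (cost-minimizing form).

    Unseen (state, action) pairs are treated as having value 0.0.
    """
    best_next = min(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (cost + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy selection: explore with probability epsilon."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return min(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def toy_cost(instances, load, capacity_per_instance=10.0):
    """Assumed cost: 1.0 if the operator is overloaded, plus 0.05 per instance."""
    violation = 1.0 if load > instances * capacity_per_instance else 0.0
    return violation + 0.05 * instances


def train(steps=20000, max_instances=5, seed=0):
    """Run Q-learning against a toy workload that switches between two load levels."""
    rng = random.Random(seed)
    q = {}
    load_levels = (5.0, 25.0)  # assumed low and high input rates
    instances, load = 1, load_levels[0]
    for _ in range(steps):
        state = (instances, load)
        action = choose_action(q, state, 0.1, rng)
        # Clamp the parallelism to the allowed range after applying the action.
        instances = min(max(instances + action, 1), max_instances)
        cost = toy_cost(instances, load)
        load = rng.choice(load_levels)
        q_learning_step(q, state, action, cost, (instances, load))
    return q
```

Calling `train()` returns the learned table, keyed by `((instances, load), action)`. The update in `q_learning_step` uses only the observed cost and next state, with no model of the system; this is the model-free property that distinguishes the baseline from the model-based approaches the abstract proposes.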