
Data-driven application offloading in the continuum


March 2023
By UPC

Cloud Computing (CC) is growing into a vast and complex ecosystem that provides a wide range of services for all types of businesses. In 2021, 72% of large companies in Europe used some type of cloud service, an increase of 7% with respect to the previous year [1]. At the same time, the Internet of Things (IoT) is becoming another revolutionary trend, feeding data-hungry applications with sensed data to shape dynamic smart environments. Although most data are collected from the IoT, 80% of their processing is performed in the cloud, which not only overloads network facilities and causes high energy consumption but also introduces significant communication latency [2].

This observation fuels a disruptive idea, paving the way to novel business models that benefit from highly distributed computing capabilities. We envisage a scenario flooded with devices, from the edge up to the cloud, some of which are available to take on offloaded computation. In this scenario, smart applications could be executed either close to the data sources, thus reducing data transfer latency, or in the cloud, which provides almost unlimited computing power but at a communication penalty. However, for complex and data-intensive applications with geographically distributed multi-resource constraints, the solution space becomes particularly challenging to explore. For instance, some basic applications will simply be executed locally; alternatively, for many compute-intensive applications the straightforward solution will be to move the computation to the cloud. But where the application data is highly distributed, generated in real time in remote locations, or streamed from distributed sources, and where several geographically distributed computing nodes are available, solutions that efficiently exploit data locality can dramatically improve execution performance.
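The trade-off between running close to the data and running in the cloud can be sketched with a back-of-the-envelope cost model. The Python snippet below is only an illustration under simplifying assumptions: completion time is taken as link latency plus transfer time plus compute time, and the `Node` fields, the `EDGE` and `CLOUD` instances and all numeric values are hypothetical, not measurements from ICOS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A candidate execution node (all fields are illustrative assumptions)."""
    name: str
    gflops: float          # sustained compute speed, GFLOP/s
    bandwidth_mbps: float  # link from the data source to this node, Mbit/s
    rtt_s: float           # latency from the data source to this node, seconds


def completion_time(node: Node, data_mb: float, work_gflop: float) -> float:
    """Estimated seconds to ship the input data to the node and process it."""
    transfer_s = node.rtt_s + (data_mb * 8) / node.bandwidth_mbps
    compute_s = work_gflop / node.gflops
    return transfer_s + compute_s


def best_node(nodes, data_mb: float, work_gflop: float) -> Node:
    """Pick the node with the lowest estimated completion time."""
    return min(nodes, key=lambda n: completion_time(n, data_mb, work_gflop))


# Hypothetical nodes: a slow edge box with a fast local link,
# and a fast cloud instance behind a slow wide-area link.
EDGE = Node("edge", gflops=20.0, bandwidth_mbps=1000.0, rtt_s=0.002)
CLOUD = Node("cloud", gflops=2000.0, bandwidth_mbps=50.0, rtt_s=0.08)

# Data-heavy, compute-light job: locality wins.
print(best_node([EDGE, CLOUD], data_mb=500, work_gflop=10).name)
# Compute-heavy, data-light job: raw compute power wins.
print(best_node([EDGE, CLOUD], data_mb=1, work_gflop=5000).name)
```

With these assumed figures, the data-heavy job is placed on the edge node and the compute-heavy job on the cloud node. A real placement decision would also have to account for multi-resource constraints and for data arriving from several sources.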

In the ICOS project we aim to design a holistic approach to seamlessly managing resources across the continuum, including the cloud layer, the near and far edge, and the IoT layer, and to implement the continuum as a dynamic, intelligent and secure meta operating system. In such a meta OS, one of the most interesting goals is, given a service consisting of a set of interdependent tasks with multi-resource requirements, to find the most appropriate selection of nodes to which the execution of this service can be offloaded. Note that this problem becomes far more challenging in an open context with a large number of heterogeneous devices distributed across the continuum, where the physical location of the nodes, the distances between them, the geographical distribution of data sources and their connectivity to the system must all be considered. Mapping tasks onto computing nodes in heterogeneous systems is an active research problem. In cloud computing, existing container scheduling products cannot efficiently manage concurrent multi-resource requests when systems are heterogeneous [3].

The literature already shows some research in this area. For example, [4] proposes a prototype data-flow-driven technology for optimally mapping tasks to resources in global heterogeneous management systems. This pioneering contribution considers, for the first time, the heterogeneity and distribution of the computing nodes, the geographical scattering of data sources, and the effects of the run-time data flow between nodes for the selected parallelization strategy, all together in the mapping decision process. Two optimization strategies have been designed, both based on mixed integer linear programming: a staged model that obtains locally optimal solutions stage by stage, and a globally optimal model used as a reference. The staged model finds an optimal solution in 76% of the evaluated scenarios and, on average, its near-optimal solutions are very close to the optimal ones, with a performance difference below 6%. Despite not always being optimal, the staged model is fast enough to be used in real system implementations.
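The difference between deciding stage by stage and deciding globally can be illustrated on a toy instance. The sketch below is not the model from [4]: it replaces mixed integer linear programming with a greedy per-task choice (the "staged" variant) and exhaustive enumeration (the "global" reference), and the task names, node names and all costs are hypothetical. The instance is deliberately built so that the two disagree, and says nothing about the gap sizes reported in [4].

```python
from itertools import product

# Hypothetical 3-task pipeline and two candidate nodes.
NODES = ("edge", "cloud")
TASKS = ("t1", "t2", "t3")

# Assumed seconds to run each task on each node.
COMPUTE_S = {
    ("t1", "edge"): 3, ("t1", "cloud"): 1,
    ("t2", "edge"): 6, ("t2", "cloud"): 1,
    ("t3", "edge"): 2, ("t3", "cloud"): 1,
}


def transfer_s(src: str, dst: str) -> int:
    """Assumed seconds to move the data flow from src to dst."""
    if src == dst:
        return 0
    if src == "source":          # the data originates next to the edge
        return 1 if dst == "edge" else 4
    return 4                     # edge <-> cloud


def total_cost(assignment) -> int:
    """End-to-end time of the pipeline for a given task-to-node mapping."""
    cost, location = 0, "source"
    for task, node in zip(TASKS, assignment):
        cost += transfer_s(location, node) + COMPUTE_S[(task, node)]
        location = node
    return cost


def staged_mapping():
    """Pick the cheapest node for each task in turn, given the previous choice."""
    assignment, location = [], "source"
    for task in TASKS:
        node = min(NODES, key=lambda n: transfer_s(location, n) + COMPUTE_S[(task, n)])
        assignment.append(node)
        location = node
    return tuple(assignment)


def global_mapping():
    """Enumerate every mapping and keep the cheapest (reference solution)."""
    return min(product(NODES, repeat=len(TASKS)), key=total_cost)


staged, optimal = staged_mapping(), global_mapping()
print(staged, total_cost(staged))
print(optimal, total_cost(optimal))
```

On this instance the staged choice starts at the edge because the first hop is cheap, then pays to move to the cloud, ending at a total cost of 10, whereas the global search places everything in the cloud at a cost of 7. Exhaustive enumeration grows exponentially with the number of tasks, which is the usual motivation for a faster staged approach.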



[1] https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Cloud_computing_-_statistics_on_the_use_by_enterprises


[2] A. Botta, W. de Donato, V. Persico, A. Pescapé. On the Integration of Cloud Computing and Internet of Things. In 2nd International Conference on Future Internet of Things and Cloud, Barcelona, Spain, August 2014.


[3] Y. Hu, H. Zhou, C. de Laat, Z. Zhao. Concurrent container scheduling on heterogeneous clusters with multi-resource constraints. In Future Generation Computer Systems, Vol. 102, January 2020.


[4] J. Garcia, F. Aguiló, A. Asensio, E. Simó, M. Zaragozá, X. Masip-Bruin. Data-Flow Driven Optimal Tasks Distribution for Global Heterogeneous Systems. In Future Generation Computer Systems, Vol. 125, December 2021.

