Almeida, Francisco and Assunção, Marcos D. and Barbosa, Jorge and Blanco, Vicente and Brandic, Ivona and Da Costa, Georges and Dolz, Manuel F. and Elster, Anne C. and Jarus, Mateusz and Karatza, Helen D. and Lefèvre, Laurent and Mavridis, Ilias and Oleksiak, Ariel and Orgerie, Anne-Cécile and Pierson, Jean-Marc
Energy monitoring as an essential building block towards sustainable ultrascale systems.
(2018)
Sustainable Computing, 17. 27-42. ISSN 2210-5379
(Document in English)
PDF (Author's version), 610kB
Official URL: https://doi.org/10.1016/j.suscom.2017.10.013
Abstract
An ultrascale system (USS) joins parallel and distributed computing systems that will be two to three orders of magnitude larger than today's infrastructure regarding scale, performance, the number of components and their complexity. For such systems to become a reality, however, advances must be made in high performance computing (HPC), large-scale distributed systems, and big data solutions, also tackling challenges such as improving the energy efficiency of the IT infrastructure. Monitoring the power consumed by underlying IT resources is essential towards optimising the manner in which IT resources are used and hence improving the sustainability of such systems. Nevertheless, monitoring the energy consumed by USSs is a challenging endeavour, as the system can comprise thousands of heterogeneous server resources spanning multiple data centres. Moreover, the amount of monitoring data, and its gathering and processing, should never become a bottleneck nor profoundly impact the energy efficiency of the overall system. This work surveys the state of the art on energy monitoring of large-scale systems and methodologies for monitoring the power consumed by large systems, and discusses some of the challenges to be addressed towards monitoring and improving the energy efficiency of USSs. Next, we present efforts made on designing monitoring solutions. Finally, we discuss potential gaps in existing solutions when tackling emerging large-scale monitoring scenarios and present some directions for future research on the topic.