Research Theme
Software consists of logic (a set of combined instructions) that automates the functions of hardware components. While there have been significant innovations in energy-efficient hardware, the software ultimately determines how computing resources are used, thereby impacting the entire system's energy footprint (Castor-2024). Software plays a crucial role in lowering the GHG emissions of ICT systems, as it directly affects how a system consumes energy during operation.
My research activities reflect my passion for software and sustainability, focusing on methods to improve software performance and energy efficiency. Software performance and energy efficiency result from several variables that can be controlled or reliably studied at different stages of the software development process, such as design, implementation, and execution. I study what these variables are and how to quantify their influence on software performance and energy efficiency. In other words, my goal is to identify performance and energy hotspots and define approaches to resolve them. This philosophy fits the concept of tactics, namely "design decisions that influence the achievement of a quality attribute response" (Procaccianti-2014). Tactics have been widely studied during the past decade. However, more research is needed to define and quantify the extent to which tactics impact energy efficiency. For example, the adoption of knowledge distillation on machine learning models (AwesomeAndDarkTactics) is a design decision that can improve the performance and the energy efficiency of the software systems integrating such models. In Yuan et al. (Yuan-2024), we quantify the benefits of knowledge distillation on performance and energy efficiency using BERT and GPT2, two well-known machine learning models.
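As an illustration of what quantifying a tactic can look like in practice, the sketch below integrates sampled power readings into energy and compares a baseline run with a run where a tactic was applied. It is a minimal example under stated assumptions: the function names, the sampling scheme, and all numbers are hypothetical and are not taken from Yuan-2024 or from any GreenLab experiment.

```python
def energy_joules(timestamps_s, power_w):
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule."""
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    return sum(
        (t1 - t0) * (p0 + p1) / 2.0
        for t0, t1, p0, p1 in zip(
            timestamps_s, timestamps_s[1:], power_w, power_w[1:]
        )
    )


def relative_saving(baseline_j, treatment_j):
    """Fraction of the baseline energy saved by the treatment (the tactic)."""
    return (baseline_j - treatment_j) / baseline_j


# Hypothetical power traces: one run without the tactic, one run with it.
baseline_j = energy_joules([0.0, 1.0, 2.0, 3.0], [10.0, 12.0, 12.0, 10.0])
treatment_j = energy_joules([0.0, 1.0, 2.0], [8.0, 9.0, 8.0])
saving = relative_saving(baseline_j, treatment_j)
print(f"baseline={baseline_j} J, treatment={treatment_j} J, saving={saving:.0%}")
```

In a real experiment each condition would be repeated many times and compared statistically; a single pair of runs, as here, only shows the arithmetic.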
I mainly use experimentation and Model-Driven Engineering (MDE) to isolate a set of factors and control the variables that can have an impact on software performance and energy efficiency. Most of my research is carried out "in vitro", namely in a controlled environment built for software performance and energy efficiency research. My approach can therefore be considered "inductive", as knowledge is built on the observations made in the laboratory (Procaccianti-2016). At the Vrije Universiteit Amsterdam, I share and maintain the GreenLab, a laboratory created for such experiments (Malavolta-2024). The activities conducted in the lab include deploying software systems for data collection and recommending solutions to improve them. Additionally, I am developing a surrogate system, called the mirror, that approximates the energy consumption and performance of complex software systems to facilitate "what-if" scenario analysis in the lab. My current research can thus be summarized into two primary tracks: Data Collection and Recommendation, and Mirroring Complex Software Systems for In Vitro Experimentation. I intend to keep researching these topics in the future.
Data Collection and Recommendation
The data collection and recommendation research line involves three stakeholders: a service provider, which is the organization that owns and contributes to the software (such as a company or a group of developers), researchers, and students. The service provider supplies the software, or the logs collected during production, to the researchers and students, who possess expertise in software energy efficiency and performance. Researchers and students conduct experiments in the GreenLab, where they observe software performance and energy usage under various scenarios, such as different workloads. They identify performance and energy hotspots and subsequently develop tactics to address them. This analysis can also be performed offline by examining the logs supplied by the service provider. The results, along with the suggested tactics, are shared with the service provider, who can implement them in a production environment. In addition, we study the interplay between software performance and energy efficiency, such as the relationship between execution time, memory usage, and cache usage on the one hand and software energy usage on the other. In this research line, we use an academic lab as a bridge between academia and industry, examining free and open-source software in depth (e.g., source code and architecture) while conducting experiments in our lab. Data collection and recommendation has produced results and new collaborations in the area of scientific software. We collaborated with the Bonvin Lab at Utrecht University, which develops HADDOCK, a software package that models the interactions of molecules. We deployed HADDOCK in the GreenLab and derived eight tactics.
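One simple way to start exploring the interplay between performance metrics and energy usage is to correlate each metric with the measured energy across experiment runs. The sketch below computes Pearson's correlation coefficient by hand. The record layout, the metric names, and all values are assumptions made for illustration; they are not data from the HADDOCK study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical per-run measurements (one dict per experiment run).
runs = [
    {"exec_time_s": 10.0, "peak_mem_mb": 210.0, "cache_misses": 1.2e6, "energy_j": 100.0},
    {"exec_time_s": 20.0, "peak_mem_mb": 190.0, "cache_misses": 2.9e6, "energy_j": 200.0},
    {"exec_time_s": 30.0, "peak_mem_mb": 205.0, "cache_misses": 3.1e6, "energy_j": 300.0},
    {"exec_time_s": 40.0, "peak_mem_mb": 200.0, "cache_misses": 4.4e6, "energy_j": 400.0},
]

energy = [run["energy_j"] for run in runs]
correlations = {
    metric: pearson([run[metric] for run in runs], energy)
    for metric in ("exec_time_s", "peak_mem_mb", "cache_misses")
}
for metric, r in correlations.items():
    print(f"{metric}: r = {r:.3f}")
```

Correlation alone does not establish which metric drives energy usage; it is only a first screening step before a controlled experiment.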
Mirroring Complex Software Systems for In Vitro Experimentation
This research line aims to develop a system that mimics the resource usage of complex software systems, enabling in vitro analysis that would otherwise be infeasible due to limited resources. We call this system the mirror. As with data collection and recommendation, this project usually involves a service provider, researchers, and students. The service provider supplies the research lab with execution statistics from its software running in production. Researchers and students build the mirror using the artifacts received from the service provider. The mirror can be a new software system that reflects the resource usage (e.g., utilization and power distribution) of the source software system, up to a factor δ that captures the differences between the production environment and the research lab. The mirror can also be implemented using models. We built the first example of the mirror using Layered Queuing Networks, a type of performance model (Stoico-2023), with which we modeled microservices and embedded software. This work stems from the collaboration between the University of L'Aquila (Italy) and Vrije Universiteit Amsterdam. We presented a poster describing the mirror system at the ICT for Sustainability (ICT4S) conference (Stoico-2024). Currently, two master's theses on the topic are ongoing.
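To make the role of δ concrete, the sketch below treats it as a single multiplicative scaling factor between measurements taken on the mirror in the lab and the corresponding production measurements. This is only one possible reading, adopted here as an assumption: the text does not specify the form of δ, and the function names and power values are hypothetical.

```python
def estimate_delta(production_values, lab_values):
    """Estimate a multiplicative factor delta from paired observations.

    Assumption: production ≈ delta * lab. The estimate is the ratio of sums,
    which is the least-squares-free, scale-preserving choice for this sketch.
    """
    if len(production_values) != len(lab_values):
        raise ValueError("observations must be paired")
    return sum(production_values) / sum(lab_values)


def to_production_scale(lab_value, delta):
    """Project a measurement taken on the mirror onto the production scale."""
    return lab_value * delta


# Hypothetical average power (watts) for the same three workloads,
# observed in production and reproduced on the mirror in the lab.
production_power_w = [120.0, 150.0, 180.0]
mirror_power_w = [80.0, 100.0, 120.0]

delta = estimate_delta(production_power_w, mirror_power_w)

# "What-if" scenario: a new workload is run only on the mirror.
what_if_mirror_w = 90.0
what_if_production_w = to_production_scale(what_if_mirror_w, delta)
print(f"delta={delta}, projected production power={what_if_production_w} W")
```

A single scalar is the simplest choice; in practice δ might differ per resource or per workload, in which case it would have to be estimated separately for each.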