Laboratory for Analysis, Testing, and Publishing: Energy Efficiency in All Layers of Computing

In the CeecS laboratory, dedicated test equipment will be designed and applied to different computing systems, together with the corresponding software tools for different applications, as shown in the figure below. The measurement results will be made available to the community on a continuous basis and will also help highlight specific aspects that could be improved in new designs. In addition, there will be continuing analysis of newer applications and AI methods as they are developed in the ecosystem and released to the public, similar to the efforts undertaken by standards committees.

Figure: Architecture/Hardware-Algorithm-Application configurations to be studied.

The CeecS, expected to be established jointly by SLAC and Stanford University (in collaboration with other laboratories and industrial partners), will measure and publish estimates of the energy used in computing, from microprocessors to large datacenters. CeecS will also develop benchmarks in collaboration with the community to estimate energy across all computing layers, and is targeted as a Center that will monitor the energy of AI/ML (Artificial Intelligence/Machine Learning) systems and applications. In addition, CeecS intends to demonstrate prototypes of systems that are 1000X or more energy efficient, using a combination of hardware/software innovations in one of the four computing domains identified earlier, as required by the Department of Energy's EES2 efforts. In the previous years (2023 & 2024), we began preliminary benchmark measurements of six Machine Learning algorithms on multiple hardware platforms for different applications.
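Cross-platform benchmarking of this kind can be organized around one record per (algorithm, platform) pair, combining measured latency with an average power figure. The sketch below is illustrative only: it assumes an average power value (in watts) obtained from an external power meter, and all names and structure are hypothetical, not the actual CeecS benchmark tooling.

```python
# Hypothetical sketch of an energy-per-inference benchmark record.
# `avg_power_w` is assumed to come from an external power meter;
# energy is approximated as average power times measured latency.
import time
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    algorithm: str
    platform: str
    latency_s: float   # mean wall-clock time per run
    energy_j: float    # avg_power_w * latency_s


def benchmark(algorithm, platform, run_once, avg_power_w, iterations=100):
    """Time `run_once` over several iterations and estimate energy per run."""
    t0 = time.monotonic()
    for _ in range(iterations):
        run_once()
    latency = (time.monotonic() - t0) / iterations
    return BenchmarkResult(algorithm, platform, latency, avg_power_w * latency)
```

Collecting such records for each of the six algorithms on each platform yields a table of energy-per-inference figures that can be published and tracked over time.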

In addition, we are developing new tools built on existing ones (e.g., CompJoules) for estimating the energy of different hardware-software platforms. These are being developed into a multi-platform energy estimation tool designed to measure the energy cost and performance of custom machine learning algorithms across various hardware architectures, including CPUs, GPUs, ASICs, and FPGAs.
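On Linux x86 servers, one common source of such per-workload energy readings is the RAPL counters exposed through the powercap sysfs interface. The following is a minimal sketch assuming that interface is available; it is not the CompJoules API, the counter path varies by platform, and counter wraparound is not handled here.

```python
# Minimal sketch: per-workload CPU package energy via the Linux RAPL
# powercap sysfs counter. The path below is an assumption; it is
# absent on many systems, and the cumulative counter eventually
# wraps around, which this sketch does not handle.
import time

RAPL_PATH = "/sys/class/powercap/intel-rapl:0/energy_uj"


def read_energy_uj(path=RAPL_PATH):
    """Return the cumulative package energy counter in microjoules."""
    with open(path) as f:
        return int(f.read())


def measure_energy(workload, read_fn=read_energy_uj):
    """Run a zero-argument `workload`; return (result, joules, seconds)."""
    start_uj = read_fn()
    t0 = time.monotonic()
    result = workload()
    elapsed = time.monotonic() - t0
    joules = (read_fn() - start_uj) / 1e6  # microjoules -> joules
    return result, joules, elapsed
```

GPU measurements would use a different source (e.g., vendor management libraries), which is why a multi-platform tool must abstract the counter-reading step per architecture, as the `read_fn` parameter suggests.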