Every year, companies may include what is known as the "ESG report" in the non-financial information section of their annual report. In the European Union, large companies are required to do so, and SMEs will be required to starting in 2027. ESG stands for "Environmental, Social, and Governance," the three sets of criteria on which the company reports.
This report includes all corporate social and environmental responsibility activities: the environmental impact of their operations, the decarbonization measures they are implementing, the activities they carry out with various associations and non-governmental organizations, etc. There are various standards for preparing this report, such as the Global Reporting Initiative [1] or specific ISO standards, but companies have considerable freedom in drafting it.
A significant part of the data included in ESG reports relates to the impact of the company's operations: electricity consumption and its sources (thermal, nuclear, or renewable such as hydro and wind), fuel consumption, water usage and the condition in which the water is returned to the environment, among others. Some of this data is readily available, but other figures, such as the energy consumption of a specific fleet of computers, are harder to obtain.
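As an illustration of how a hard-to-measure figure such as fleet energy consumption might be approximated, the following is a minimal sketch in Python. The device classes, power draws, usage hours, and number of working days are all hypothetical assumptions chosen for the example, not measured values or figures from any reporting standard.

```python
from dataclasses import dataclass


@dataclass
class DeviceClass:
    """A group of similar devices (all fields are illustrative assumptions)."""
    name: str
    count: int
    avg_power_watts: float    # assumed average draw while in use
    hours_per_day: float      # assumed daily usage
    days_per_year: int = 230  # assumed number of working days


def annual_energy_kwh(device: DeviceClass) -> float:
    """Estimate yearly energy use of one device class in kWh."""
    watt_hours = (
        device.count
        * device.avg_power_watts
        * device.hours_per_day
        * device.days_per_year
    )
    return watt_hours / 1000.0  # convert Wh to kWh


# Hypothetical fleet used only to exercise the estimate.
fleet = [
    DeviceClass("laptop", count=100, avg_power_watts=30.0, hours_per_day=8.0),
    DeviceClass("desktop", count=20, avg_power_watts=100.0, hours_per_day=8.0),
]

total_kwh = sum(annual_energy_kwh(d) for d in fleet)
print(f"Estimated fleet consumption: {total_kwh:.0f} kWh/year")
```

An estimate of this kind is only as good as its assumed inputs, so in practice it would need to be calibrated against whatever metered data is available.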
This project aims to develop a system that aggregates data from various sources and, potentially using artificial intelligence techniques, extrapolates missing data and incorporates it into the ESG report. In addition, language models (LLMs, SLMs), complemented by Retrieval-Augmented Generation (RAG) techniques, will be used to align the report with the aforementioned reporting standards, minimizing the human effort required.
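The idea of estimating missing data can be sketched with a very simple baseline. The example below uses plain linear interpolation as a stand-in for the artificial intelligence techniques mentioned above. It is not the project's method, and the function name `fill_gaps` and the monthly readings are hypothetical.

```python
from typing import Optional


def fill_gaps(series: list[Optional[float]]) -> list[float]:
    """Fill None entries by linear interpolation between known neighbours.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    if not known:
        raise ValueError("at least one known value is required")

    filled: list[float] = []
    for i, value in enumerate(series):
        if value is not None:
            filled.append(float(value))
            continue
        # Nearest known indices on each side of the gap, if any.
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            filled.append(float(series[nxt]))
        elif nxt is None:
            filled.append(float(series[prev]))
        else:
            weight = (i - prev) / (nxt - prev)
            filled.append(series[prev] + weight * (series[nxt] - series[prev]))
    return filled


# Hypothetical monthly electricity readings in kWh, with missing months.
monthly_kwh = [1200.0, None, 1000.0, None, None, 700.0]
estimated = fill_gaps(monthly_kwh)
print(estimated)
```

With these sample readings, the three missing months are filled with approximately 1100, 900, and 800 kWh. A baseline like this also gives a reference point against which a more sophisticated model could be compared.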
The solution may be presented as a set of scripts, a Power BI-style report visualization platform, or a web application, depending on the project's progress.
[1] Global Reporting Initiative, https://www.globalreporting.org/