
Martín-Domingo, L., Fernandez, J. B., Efthymiou, M., & Ali, M. I. (2025). Extracting airline emission KPIs from sustainability reports using large language models (LLMs). Transportation Research Interdisciplinary Perspectives, 33, 101599. https://doi.org/10.1016/j.trip.2025.101599
Luis Martín-Domingo¹,²; Jaime B. Fernandez³; Marina Efthymiou¹; and Muhammad Intizar Ali³
¹ Business School, Dublin City University; ² Ozyegin University; ³ Insight Research Ireland Centre for Data Analytics, Dublin City University
Abstract
The extraction of environmental Key Performance Indicators (KPIs) from airline sustainability reports is essential for assessing environmental sustainability metrics and regulatory compliance within the European aviation sector. Manual extraction from extensive, unstructured documents is laborious and often inconsistent. This study systematically investigates the potential of advanced Large Language Models (LLMs), specifically GPT-4.0, o3-mini, and DeepSeek R1, to automate the extraction of emissions-related KPIs from the 2023 sustainability reports of 16 publicly traded European airline groups. Using the Perplexity platform, the research contrasts manual expert extraction with automated approaches, exploring various models, prompt strategies, and data formats. Results indicate that the accuracy of LLM extraction depends significantly on prompt specificity. Attempts to extract data from unstructured documents without guidance yielded low accuracy. However, incorporating explicit KPI terms into prompts increased accuracy from below 30% to above 70%. The format of the data source was also influential, with HTML sources producing better extraction results than PDFs. Despite ongoing challenges in standardizing data and extracting precise KPI metrics, the findings demonstrate that LLMs can substantially streamline environmental, social, and governance (ESG) data collection when prompt engineering and source standardization are prioritized. This study represents a novel, interdisciplinary approach that combines advances in LLMs with expertise in ESG analysis within the aviation sector, offering empirical benchmarking of LLM performance in real-world regulatory contexts. Recommendations for LLM integration into ESG analysis workflows are provided, and future research directions for advancing automation in sustainability reporting are discussed.
Keywords
Airlines, LLMs, ESG, KPIs, GHG Emissions
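The comparison described in the abstract (generic versus KPI-specific prompts, scored against manual expert extraction) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the function names `build_prompt` and `accuracy`, the prompt wording, the 1% relative tolerance, and the KPI values are hypothetical and are not taken from the study or from any airline report.

```python
from typing import Optional


def build_prompt(airline: str, kpi_terms: Optional[list[str]] = None) -> str:
    """Build a generic prompt, or a KPI-specific one when explicit terms are given."""
    base = f"From the 2023 sustainability report of {airline}, "
    if not kpi_terms:
        # Unguided variant: the model must decide which KPIs to look for.
        return base + "list the emissions-related KPIs and their values."
    # Guided variant: the explicit KPI terms are named in the prompt.
    return base + "report the value and unit of each of these KPIs: " + "; ".join(kpi_terms) + "."


def accuracy(
    manual: dict[str, float],
    extracted: dict[str, Optional[float]],
    rel_tol: float = 0.01,  # assumed tolerance, not a value from the study
) -> float:
    """Share of manually extracted KPIs that the model reproduced within rel_tol."""
    if not manual:
        return 0.0
    hits = 0
    for kpi, truth in manual.items():
        value = extracted.get(kpi)
        if value is not None and abs(value - truth) <= rel_tol * abs(truth):
            hits += 1
    return hits / len(manual)


# Hypothetical reference values from a manual extraction (made-up numbers).
manual_values = {
    "Scope 1 emissions (tCO2e)": 1000.0,
    "Scope 2 emissions (tCO2e)": 50.0,
    "Scope 3 emissions (tCO2e)": 300.0,
    "CO2 per RPK (g)": 80.0,
}

# Hypothetical model output: one exact match, one missing, one close, one wrong.
llm_values = {
    "Scope 1 emissions (tCO2e)": 1000.0,
    "Scope 2 emissions (tCO2e)": None,
    "Scope 3 emissions (tCO2e)": 301.0,
    "CO2 per RPK (g)": 95.0,
}

generic_prompt = build_prompt("Example Airline Group")
specific_prompt = build_prompt("Example Airline Group", list(manual_values))
score = accuracy(manual_values, llm_values)
```

Scoring each prompt variant with the same `accuracy` function against the same manual reference is one simple way to make the variants comparable; the scoring rule used in the study itself may differ.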