Research Proposal

AI-Enabled Financial and ESG Reporting

Effects on Corporate Accountability and Disclosure Quality in the EU and China

Background and Motivation

Artificial intelligence is rapidly becoming embedded in corporate reporting processes. Companies increasingly rely on AI tools to assist with drafting, analyzing, and improving financial and ESG disclosures. These technologies promise improved efficiency, enhanced analytics, and more timely communication with stakeholders. At the same time, they raise an important concern: if AI can generate well-structured and persuasive reports at scale, how can stakeholders determine whether disclosure quality has genuinely improved rather than simply becoming more sophisticated?

This tension creates a new challenge for regulators, auditors, and researchers. AI has the potential to reduce information asymmetry and increase transparency, yet it may also enable new forms of automated greenwashing, impression management, or overly optimistic financial narratives. The key question is therefore no longer whether companies will use AI, but how its use will reshape corporate accountability and trust.

The European Union and China provide an especially valuable comparative setting. The EU has adopted strict regulatory frameworks, including the Corporate Sustainability Reporting Directive (CSRD) and the AI Act, whose requirements are now being phased in, while China represents a large and rapidly evolving reporting environment shaped by different institutional pressures. Studying both contexts offers an opportunity to understand how regulatory and institutional environments shape the role of AI in corporate disclosure.

AI as an Information Intermediation Layer

From a computational perspective, AI is not merely a writing assistant in corporate reporting; it acts as an information intermediation layer between internal corporate data and external stakeholders.

Traditionally: Corporate performance → Human interpretation → Corporate report

Now: Corporate data → AI models → Generated narrative → Stakeholders

This introduces a new algorithmic agency layer, where language models and automated analytics influence how corporate reality is represented. From an information economics viewpoint, this creates a new form of information asymmetry:

Stakeholders cannot easily observe how the narrative was generated.
The reporting process becomes partially algorithmically opaque.

This shift motivates the need for computational auditing methods capable of analysing AI-mediated disclosures.

Research Aim

The aim of this PhD is to investigate how the adoption of artificial intelligence in financial and non-financial reporting influences disclosure quality, corporate accountability, and stakeholder trust across different institutional contexts.

Formal Problem Framing

The research can be formalised as a mapping problem. Let:

R_i^t = Corporate report text of firm i at time t
A_i^t = AI adoption level
X_i^t = Firm fundamentals (financial + ESG performance)
S_i^t = External signals (news, NGOs, public sentiment)

Goal: Model the relationship:

DisclosureQuality_i^t = f(A_i^t, X_i^t, C_i)

where C_i denotes the institutional context of firm i (EU or China).

And measure divergence:

NarrativeGap_i^t = Distance(R_i^t, S_i^t)

This allows the project to empirically quantify whether AI improves alignment with reality or increases narrative manipulation.
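As a minimal sketch of how the narrative gap defined above could be computed, the following assumes that both the report and the pooled external signals are reduced to simple word-count vectors and compared by cosine distance. The function names (`bag_of_words`, `cosine_distance`, `narrative_gap`) are hypothetical, and the bag-of-words representation is only a stand-in for whatever text representation the project eventually adopts.

```python
import math
from collections import Counter
from typing import Iterable


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a crude stand-in for a learned text representation."""
    return Counter(text.lower().split())


def cosine_distance(a: Counter, b: Counter) -> float:
    """1 - cosine similarity; returns 1.0 when either vector is empty."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def narrative_gap(report_text: str, external_texts: Iterable[str]) -> float:
    """Distance between one report and the pooled external signals for the same firm-year."""
    report_vec = bag_of_words(report_text)
    signal_vec: Counter = Counter()
    for text in external_texts:
        signal_vec.update(bag_of_words(text))
    return cosine_distance(report_vec, signal_vec)
```

A gap near 0 means the report and the external coverage use similar vocabulary; a gap near 1 means they share almost none. Whether a large gap reflects manipulation or simply a difference in topic is an empirical question that this sketch does not settle.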

Research Questions

The project will address four research questions:

RQ1

How does the adoption of AI tools influence the quality, readability, and tone of financial and ESG disclosures?

RQ2

Does AI-assisted reporting reduce or increase the risk of greenwashing and impression management?

RQ3

How do regulatory and institutional differences between the EU and China shape the impact of AI on corporate reporting practices?

RQ4

How can AI-based analytical methods be used by researchers, auditors, and regulators to evaluate the credibility of corporate disclosures?

Operationalisation of Key Constructs

Each research question will be translated into measurable variables ensuring empirical testability:

Concept	Operationalisation
Readability	Gunning Fog index, Flesch Reading Ease, BERT-based clarity score
Tone	Transformer-based sentiment scores
Greenwashing risk	Narrative–performance divergence
AI adoption	Disclosed AI use plus digital-intensity proxies
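The two classical readability measures, Gunning Fog and Flesch Reading Ease, can be sketched as follows. The formulas are the standard published ones; the syllable counter is a rough vowel-group heuristic chosen for illustration, the function names are hypothetical, and the approach applies to English-language text only.

```python
import re


def count_syllables(word: str) -> int:
    """Heuristic: one syllable per run of consecutive vowels, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def readability(text: str) -> dict:
    """Return Gunning Fog and Flesch Reading Ease scores for an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    complex_words = sum(1 for n in syllables if n >= 3)  # Fog counts words of 3+ syllables
    words_per_sentence = len(words) / len(sentences)

    fog = 0.4 * (words_per_sentence + 100 * complex_words / len(words))
    flesch = 206.835 - 1.015 * words_per_sentence - 84.6 * sum(syllables) / len(words)
    return {"fog": fog, "flesch": flesch}
```

Higher Fog scores and lower Flesch scores both indicate harder text, so the two measures should move in opposite directions across reports.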

Theoretical Framework

The project draws on four complementary theoretical perspectives, widely used in accounting and disclosure research, to analyse both the technological and the organisational aspects of AI-enabled reporting:

Agency Theory

To examine how AI affects information asymmetry between managers and stakeholders.

Stakeholder Theory

To analyze how AI-driven reporting responds to stakeholder expectations and pressures.

Legitimacy Theory

To explore whether AI enables new forms of legitimacy-seeking and impression management.

Institutional Theory

To understand how regulatory environments shape AI adoption and reporting behavior.

Linking Theory with Computation

Three of these lenses map directly onto measurable constructs, bridging qualitative theory and quantitative measurement:

Agency theory → Information asymmetry metrics: Narrative vs performance divergence.
Legitimacy theory → Impression management detection: Excessive positivity vs external signals.
Institutional theory → Cross-country modelling: EU vs China regulatory interaction terms.

Methodology & Data Strategy

This project will adopt a mixed-methods research design combining large-scale quantitative analysis with qualitative and comparative insights.

Data Collection

A large dataset of corporate disclosures will be constructed from publicly listed companies in the EU and China, including:

Annual reports and financial filings
Sustainability and ESG reports
Corporate press releases and disclosures

To enable cross-validation of corporate claims, the dataset will be complemented with alternative data sources such as:

News media coverage
NGO and third-party sustainability assessments
Public discussion and sentiment data

Quantitative Analysis

Natural Language Processing (NLP) techniques will be used to analyze large volumes of textual disclosure data, focusing on:

Readability and linguistic complexity metrics
Sentiment and tone analysis
Detection of impression-management language
Identification of ESG-related narratives and themes
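For the tone component, one possible aggregation step is sketched below. It assumes a sentence-level classifier that returns one `{"label": ..., "score": ...}` dictionary per passage, which is the shape produced by Hugging Face text-classification pipelines, and that the labels are `positive`, `negative`, and `neutral`. The function name `net_tone` and the weighting scheme are illustrative assumptions, not a settled design.

```python
from typing import Iterable, Mapping


def net_tone(predictions: Iterable[Mapping[str, object]]) -> float:
    """Confidence-weighted net tone in [-1, 1] for one document.

    Each prediction contributes its score to its label's total; the result is
    (positive - negative) divided by the total across all three labels.
    """
    totals = {"positive": 0.0, "negative": 0.0, "neutral": 0.0}
    for pred in predictions:
        label = str(pred["label"]).lower()
        if label in totals:  # silently skip labels outside the assumed set
            totals[label] += float(pred["score"])

    total = sum(totals.values())
    if total == 0:
        return 0.0  # no usable predictions: treat the document as neutral
    return (totals["positive"] - totals["negative"]) / total
```

Including neutral passages in the denominator means that a long, mostly factual report with a few upbeat sentences scores closer to zero than a short, uniformly positive one.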

Econometric methods, particularly panel data analysis, will be used to examine the relationship between AI adoption, disclosure characteristics, firm performance, and regulatory context.

Data Pipeline & Econometric Modelling

The project will build an automated research pipeline spanning Data Acquisition, NLP Processing, and Feature Engineering.

pipeline.py

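import pandas as pd            # used in Step 2 to hold the firm-year panel
import statsmodels.api as sm   # used in Step 2 for the panel regressions
from transformers import AutoTokenizer, pipeline

# Step 1: Initialize the NLP extraction pipeline (FinBERT).
# FinBERT labels each passage as positive, negative, or neutral.
tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
nlp_model = pipeline("text-classification", model="ProsusAI/finbert", tokenizer=tokenizer)
# Filings exceed BERT's 512-token input limit, so pass text in chunks
# or call nlp_model(text, truncation=True).

# Step 2: Panel regression modelling (specification follows)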

Panel regression framework:

DisclosureQuality_it = β₀ + β₁ AI_it + β₂ Regulation_t + β₃ (AI_it × Regulation_t) + ε_it

Here β₁ captures the effect of AI adoption when Regulation is at its baseline value of zero, and β₃ captures how regulation moderates that effect.
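The specification above can be estimated along the following lines. This is a sketch on synthetic data only: the column names, the coefficient values used to generate the data, and the coding of regulation as a 0/1 indicator that switches on in later years are all assumptions made for illustration. Standard errors are clustered by firm, a common choice for firm-year panels; firm or year fixed effects are not included here.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf


def simulate_panel(n_firms: int = 200, n_years: int = 5, seed: int = 0) -> pd.DataFrame:
    """Synthetic firm-year panel; every coefficient below is made up for illustration."""
    rng = np.random.default_rng(seed)
    firm = np.repeat(np.arange(n_firms), n_years)
    year = np.tile(np.arange(n_years), n_firms)
    ai = rng.uniform(0.0, 1.0, size=firm.size)   # AI adoption intensity in [0, 1]
    regulation = (year >= 3).astype(float)        # 1 = stricter regime in force (hypothetical)
    noise = rng.normal(0.0, 0.1, size=firm.size)
    quality = 1.0 + 0.5 * ai + 0.3 * regulation - 0.4 * ai * regulation + noise
    return pd.DataFrame(
        {"firm": firm, "year": year, "ai": ai, "regulation": regulation, "quality": quality}
    )


def fit_moderation_model(df: pd.DataFrame):
    """OLS with an AI x regulation interaction and firm-clustered standard errors."""
    model = smf.ols("quality ~ ai * regulation", data=df)
    return model.fit(cov_type="cluster", cov_kwds={"groups": df["firm"]})


if __name__ == "__main__":
    result = fit_moderation_model(simulate_panel())
    print(result.params)  # "ai" estimates β₁, "ai:regulation" estimates β₃
```

In the formula, `ai * regulation` expands to both main effects plus their interaction, so the fitted parameters line up one-to-one with β₀ through β₃. In the cross-country design the regulation indicator could instead vary by jurisdiction, which would change its subscript but not the structure of the model.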

Expected Outcomes & Hypotheses

Based on existing literature and initial observations, the project expects to find that:

AI adoption is associated with improved readability and consistency of corporate disclosures.
At the same time, AI may increase the use of impression-management language and strategic narrative framing.
Stronger regulatory environments (such as the EU) are likely to moderate these effects and reduce the risk of AI-enabled greenwashing.

Expected Empirical Patterns

The hypothesised patterns below will be tested with robustness checks and alternative model specifications:

[1] AI → higher readability & consistency
[2] AI → stronger narrative positivity
[3] Weak regulation → larger narrative gap
[4] Strong regulation → reduced greenwashing

Expected Contributions

The research aims to make three contributions:

Academic Contribution: Provide new empirical evidence on the role of AI in corporate reporting and contribute to accounting and ESG literature by integrating computational text analysis with disclosure research.
Methodological Contribution: Demonstrate how NLP and large-scale text analysis can be applied to accounting and corporate governance research.
Practical Contribution: Offer findings relevant to regulators, auditors, and policymakers seeking to ensure that AI adoption strengthens rather than weakens corporate transparency and accountability.

Computational Integration Value

Academic Contribution: Introduce a computational accounting approach that combines NLP, econometrics, and ESG research.

Practical Contribution: Provide early warning indicators of AI-driven greenwashing and deliver empirical tools for regulators and auditors.

Fit with the PhD Project

This research aligns closely with the project's focus on AI, financial and non-financial reporting, and ethical challenges. My background in mathematics, optimization, and NLP provides a strong technical foundation for conducting large-scale data analysis and developing new approaches to studying corporate disclosure.

Personal Technical Alignment

This technical background enables execution of the full research pipeline, from data collection to empirical modelling. Relevant skills:

NLP pipelines (Transformers: BERT/RoBERTa)
Large-scale data processing & web scraping
Statistical modelling in R/Python
Optimization and scalable computing