It is likely that in seeking a quick understanding of new research you’ve turned to the paper abstract or, in this case, to this Brief. Such distillations of otherwise lengthy and complicated texts are essential for making practical use of the information they contain. Similarly, investors seeking to predict companies’ financial trajectories must cull their insights from dense, drawn-out documents. In this paper, the authors test whether ChatGPT can summarize corporate disclosures effectively.

The authors collect narrative discussions of performance (the management discussion and analysis sections, or MD&As) from the annual reports of all public firms in the United States between 2009 and 2020, along with transcripts of earnings conference calls. They use GPT (the large language model underlying ChatGPT) to summarize a random sample of 20% of the documents and find the following:

  • Summarized MD&As and conference call transcripts are typically less than a fifth of their original length, suggesting that GPT can deliver large gains in information-processing efficiency.
  • Sentiment becomes amplified when documents are summarized: positive documents yield more positive summaries, and vice versa. This may occur because companies “hedge” their views with precautionary statements that contain little information or are largely boilerplate, and it suggests that summaries offer a cleaner measure of the true sentiment in corporate disclosures.
  • Perhaps unsurprisingly given the above, the sentiment of disclosure summaries is more predictive of companies’ stock market returns around the time of disclosure than the sentiment of the original documents. This holds even for MD&A summaries, which are more subject to concerns about obfuscation and boilerplate language.

These results point to considerable informational “bloat” in corporate disclosures. Motivated by this, the authors next ask whether companies differ in how bloated their disclosures are and whether bloat is associated with adverse capital market consequences. They use the relative amount by which a document’s length is reduced as a measure of the degree of redundant or irrelevant information, which they refer to as bloat. The authors document the following concerning bloat:

  • Bloat varies substantially across firms and over time. For both MD&A and conference call samples, time, industry, and the interaction between time and industry explain around 30-40% of the variation in bloat, whereas 60-70% of the variation is firm-specific.  
  • Bloat tends to be higher when a firm reports losses, has negative sentiment, and experiences negative stock market reactions. These patterns are consistent with firms including irrelevant language in their disclosures to obfuscate negative information.
  • Bloated disclosures are also associated with adverse capital market consequences: measures of stock price efficiency and information asymmetry deteriorate in the presence of bloated reporting. These results continue to hold when the authors control for conventional proxies of readability, indicating that their measure captures a different construct, one that directly gauges the relevance and redundancy of disclosed information.
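The bloat measure described above lends itself to a simple computation. Below is a minimal sketch, assuming word counts as the unit of document length (the authors’ exact tokenization and implementation may differ):

```python
def bloat(original: str, summary: str) -> float:
    """Fraction of the original document's length removed by summarization.

    A document whose summary is one-fifth of its length has bloat of 0.8.
    """
    orig_len = len(original.split())  # length in words (assumed unit)
    summ_len = len(summary.split())
    return (orig_len - summ_len) / orig_len


# Illustrative example: a 10-word disclosure condensed to a 2-word summary
doc = "results were strong but risks remain subject to various uncertainties"
summ = "strong results"
print(round(bloat(doc, summ), 2))  # → 0.8
```

Under this definition, a higher value means more of the document was judged redundant or irrelevant by the summarizer, which is what allows bloat to be compared across firms and over time.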

Finally, the authors consider whether GPT is useful for generating more targeted summaries that highlight information that may be relevant to themes of interest, such as environmental impact or regulatory uncertainty. They find the following:

  • More recent MD&A and conference call communications contain more environmental, social, and governance (ESG) related language than earlier communications in the authors’ sample. The documents also tend to become more informative over time.
  • The sentiment of ESG-specific summaries has become increasingly important in determining stock market reactions over time. This finding is consistent with prior studies showing that, in recent years, stocks have become more responsive to ESG risks.

This paper makes a case for the usefulness of generative AI in analyzing and summarizing unstructured textual data. The authors provide preliminary evidence that summarization dramatically reduces the length of disclosed information while maintaining, and even enhancing, its information content. The findings are relevant not only to academics but also to regulators and investors, who often face significant costs in processing complex financial disclosures.

This Brief was written by a human. Or was it?
