Textual Factors: A Scalable, Interpretable, and Data-driven Approach to Analyzing Unstructured Information

Speaker

Will Cong

The University of Chicago Booth School of Business

Abstract

We introduce a general framework for analyzing large-scale text-based data, combining the strengths of neural network language models and generative statistical modeling. Our methodology generates textual factors by (i) representing texts using vector word embeddings, (ii) clustering words using locality-sensitive hashing, and (iii) identifying spanning vector clusters through topic modeling. Our data-driven approach captures complex linguistic structures while ensuring computational scalability and economic interpretability. We also discuss applications of textual factors in (i) prediction and inference, (ii) interpreting existing models and variables, and (iii) constructing new metrics and explanatory variables, with illustrations using topics in finance and economics such as macroeconomic forecasting and factor asset pricing. The talk will also discuss a related paper that uses textual factors to measure corporate governance.

Type: Research Seminar
Programme: Finance & Accounting
Date: Tue. 10 Sep. 2019
Time: 15:30 - 16:45
Location: Polak 2-22

Mancy Luo

Assistant Professor of Finance

Rotterdam School of Management (RSM), Erasmus University Rotterdam
