Posts by Collection

publications

Fast Conformal Prediction using Conditional Interquantile Intervals

We introduce Conformal Interquantile Regression (CIR), a conformal regression method that efficiently constructs near-minimal prediction intervals with guaranteed coverage. CIR leverages black-box machine learning models to estimate outcome distributions through interquantile ranges, transforming these estimates into compact prediction intervals while achieving approximate conditional coverage. We further propose CIR+ (Conditional Interquantile Regression with More Comparison), which enhances CIR by incorporating a width-based selection rule for interquantile intervals. This refinement yields narrower prediction intervals while maintaining comparable coverage, though at the cost of slightly increased computational time. Both methods address key limitations of existing distributional conformal prediction approaches: they handle skewed distributions more effectively than Conformalized Quantile Regression, and they achieve substantially higher computational efficiency than Conformal Histogram Regression by eliminating the need for histogram construction. Extensive experiments on synthetic and real-world datasets demonstrate that our methods optimally balance predictive accuracy and computational efficiency compared to existing approaches.

Download Paper

Can News Predict Firm Bankruptcy?

with Siyu Bie, Guanhao Feng, Naixin Guo, Jingyu He. (2025). "Can News Predict Firm Bankruptcy?" Journal of Financial Markets.

We examine whether real-time business news predicts firm bankruptcy. Using full-text daily articles from the Dow Jones Newswires database, we generate firm-level predictors with ChatGPT and benchmark against FinBERT and dictionary-based models. ChatGPT-based variables outperform alternatives, with sentiment scores showing predictive power across horizons. Full-text news significantly enhances predictive accuracy over headlines. News-based measures add explanatory power beyond financial variables. Finally, we show that news captures timely information on macroeconomic conditions relevant to bankruptcy prediction, such as VIX, real GDP growth, and recession probability.

Download Paper

research

Can News Predict Firm Bankruptcy?

with Siyu Bie, Guanhao Feng, Jingyu He

Published in Journal of Financial Markets, 2025

We examine whether real-time business news predicts firm bankruptcy. Using full-text daily articles from the Dow Jones Newswires database, we generate firm-level predictors with ChatGPT and benchmark against FinBERT and dictionary-based models. ChatGPT-based variables outperform alternatives, with sentiment scores showing predictive power across horizons. Full-text news significantly enhances predictive accuracy over headlines. News-based measures add explanatory power beyond financial variables. Finally, we show that news captures timely information on macroeconomic conditions relevant to bankruptcy prediction, such as VIX, real GDP growth, and recession probability.

Download Paper

Spectral Group Lasso for Selecting Factors Hidden in Plain Sight

with Arash A. Amini, Zhixin Zhou, Guanhao Feng

This paper introduces the Group Lasso Approach (GL) for variable selection, encompassing two innovative methods: Directed Group Lasso (DGL) and Spectral Group Lasso (SGL). DGL is designed to identify factors that span a linear subspace closely resembling that of all candidate factors. In contrast, SGL begins with Principal Component Analysis (PCA) and extends further by incorporating a secondary estimation stage. The variable selection process in SGL utilizes a loss function with an $\ell_2$ to $\ell_\infty$ norm penalty, similar to the traditional group lasso methodology. We demonstrate the consistency of variable selection for both methods under high-dimensional scaling through rigorous theoretical analysis and empirical validation. Our findings highlight the effectiveness of the Group Lasso Approach in accurately identifying true factors within complex, high-dimensional datasets.

One News, Two Markets: LLM-Derived Sentiment and Trading Volume

with Siyu Bie, Guanhao Feng, Jingyu He

(Under Review)

We examine how firm news drives trading activity and information asymmetry across equity and corporate bond markets. Leveraging generative AI through a Large Language Model (DeepSeek) to extract granular, asset-specific sentiment intensity from Dow Jones Newswires (2002–2023), we establish that sentiment intensity is a robust predictor of abnormal volume. A one-standard-deviation increase in news magnitude raises next-day volume shocks by 17% for bonds and 29% for equities. We document a significant cross-market asymmetry: bond-specific shocks trigger immediate but transient equity volume, while equity-specific shocks induce persistent activity in the less liquid bond market. We rationalize these findings through a multi-asset microstructure framework that provides a structural explanation for the observed asymmetric information flow. This study positions LLM-derived sentiment intensity as a novel, text-based measure of market-specific information risk, bridging textual analysis with market microstructure theory.

Download Paper

Fast Conformal Prediction using Conditional Interquantile Intervals

with Rui Luo, Zhixin Zhou

Published in Proceedings of the AAAI Conference on Artificial Intelligence, 2026

We introduce Conformal Interquantile Regression (CIR), a conformal regression method that efficiently constructs near-minimal prediction intervals with guaranteed coverage. CIR leverages black-box machine learning models to estimate outcome distributions through interquantile ranges, transforming these estimates into compact prediction intervals while achieving approximate conditional coverage. We further propose CIR+ (Conditional Interquantile Regression with More Comparison), which enhances CIR by incorporating a width-based selection rule for interquantile intervals. This refinement yields narrower prediction intervals while maintaining comparable coverage, though at the cost of slightly increased computational time. Both methods address key limitations of existing distributional conformal prediction approaches: they handle skewed distributions more effectively than Conformalized Quantile Regression, and they achieve substantially higher computational efficiency than Conformal Histogram Regression by eliminating the need for histogram construction. Extensive experiments on synthetic and real-world datasets demonstrate that our methods optimally balance predictive accuracy and computational efficiency compared to existing approaches.

Download Paper
