Financial analysts are increasingly turning to AI to gain an edge in sizing up companies. But before accepting stock picks from large language models, consider this: Their analyses may carry biases and mistakes.
ChatGPT, developed by US-based OpenAI, issued far rosier forecasts—with higher price targets and more “buy” recommendations—than its Chinese rival DeepSeek when each platform evaluated nearly 5,000 publicly traded Chinese companies, according to research by Harvard Business School’s Charles C.Y. Wang. This surprising result is attributed to disparities in information availability: ChatGPT appears to incorporate less coverage from Chinese media.
As businesses in a wide range of industries seek greater speed and efficiency from algorithms, the study’s findings offer a note of caution: Relying too heavily on AI models without considering the data they may be missing could be risky. AI reliance is accelerating in finance, with an estimated 40% of institutional investors worldwide now saying they use AI for market analysis, according to Bloomberg Intelligence.
“At the micro level, we’re examining how ChatGPT and DeepSeek differ in financial analysis. But at the meta level, it’s about what investors and analysts should be aware of as they deploy these tools for global financial analysis,” says Wang, the Tandon Family Professor of Business Administration.
Wang coauthored the working paper “When LLMs Go Abroad: Foreign Bias in AI Financial Predictions” with Sean Cao, an associate professor at the University of Maryland, and Yi Xiang, an assistant professor at Hong Kong Polytechnic University.
ChatGPT was extra bullish
Wang and his team created a database of 4,978 companies traded on the Shanghai and Shenzhen stock exchanges, inputting their total assets, return on assets, and leverage—data that would provide both AI models with the same foundation for making recommendations.
Around June and July 2024, when both models' training data ended, the researchers asked ChatGPT 4.1 and DeepSeek R1 to assume the persona of a professional financial analyst and predict each company's stock price six months into the future on Dec. 31, 2024. This allowed the researchers to test predictions against actual stock prices. The researchers also asked the models to predict whether each stock would rise or fall, creating a basis for investment recommendations.
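The setup described above—feeding each model the same fundamentals and an analyst persona—can be sketched roughly as follows. This is an illustrative reconstruction, not the researchers’ actual code: the exact prompt wording, field formats, and helper names (`Fundamentals`, `build_analyst_prompt`) are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Fundamentals:
    """The three inputs the study gave identically to both models."""
    name: str
    total_assets: float      # e.g., CNY millions
    return_on_assets: float  # decimal: 0.042 = 4.2%
    leverage: float          # debt / total assets


def build_analyst_prompt(f: Fundamentals, horizon_date: str = "Dec. 31, 2024") -> str:
    """Compose an analyst-persona prompt of the kind the article describes.

    The researchers' exact wording is not published in the article;
    this only mirrors the described structure.
    """
    return (
        "You are a professional financial analyst.\n"
        f"Company: {f.name}\n"
        f"Total assets: {f.total_assets:,.0f}\n"
        f"Return on assets: {f.return_on_assets:.1%}\n"
        f"Leverage: {f.leverage:.2f}\n"
        f"Predict the stock price on {horizon_date}, state whether the stock "
        "will rise or fall, and give a buy/hold/sell recommendation."
    )


prompt = build_analyst_prompt(Fundamentals("Example Co.", 12_500, 0.042, 0.55))
print(prompt)
```

The same prompt string would then be sent to each model’s API, so any difference in output reflects the models rather than the inputs.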
The researchers found:
ChatGPT was especially bullish on Chinese companies. The US model projected stock prices would be about 12.5% higher and issued “buy” recommendations 1.3 percentage points more often than DeepSeek.
ChatGPT’s predictions had more errors. Had investors followed the advice, they would have been disappointed: ChatGPT exhibited 13% larger absolute forecast errors and proved less accurate than DeepSeek in its stock recommendations when validated against actual year-end prices.
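A headline metric like “13% larger absolute forecast errors” can be made concrete with a small sketch. The exact error specification in the working paper may differ; below, a common scaled definition, |predicted − actual| / actual, is assumed, and the numbers are made up for illustration.

```python
def mean_abs_forecast_error(preds: list[float], actuals: list[float]) -> float:
    """Mean of |predicted - actual| / actual across a set of stocks.

    One common definition of scaled absolute forecast error; the paper's
    exact specification is not given in the article.
    """
    return sum(abs(p - a) / a for p, a in zip(preds, actuals)) / len(preds)


# Hypothetical year-end prices and two models' six-month-ahead forecasts.
actuals = [10.0, 20.0, 5.0]
optimistic_model = [12.0, 21.0, 4.0]
calibrated_model = [10.5, 19.0, 5.2]

err_opt = mean_abs_forecast_error(optimistic_model, actuals)
err_cal = mean_abs_forecast_error(calibrated_model, actuals)
print(err_opt, err_cal)  # the more optimistic model also errs more here
```

Validating both models against the same realized Dec. 31 prices, as the researchers did, makes this comparison apples-to-apples.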
ChatGPT shows a ‘foreign bias’
The results surprised Wang and his team, who had expected the AI models to show a “home bias”—with each favoring companies in its own country. Instead, the opposite occurred: ChatGPT was actually more enthusiastic than DeepSeek about Chinese companies, exhibiting a “foreign bias.”
That turnabout helps illustrate “something subtle and something many of us take for granted,” Wang says. “We don’t see the underlying data that each model is trained on, which means there are implicit biases that can percolate into the analyses.”
In ChatGPT’s case, the researchers traced the bias to what they describe as “information asymmetry”—OpenAI simply had less information to work with when developing its analyses of Chinese equities. In many cases, Chinese media were producing negative news about Chinese companies, but the same companies were getting no coverage at all in the US. As a result, Wang says, ChatGPT was filling in gaps with more optimistic conclusions than DeepSeek.
The right information makes a difference
To test their conclusions, Wang and his team fed ChatGPT Chinese-sourced news about the companies and reran the prompts. When they did, ChatGPT’s “excess optimism” disappeared and its predictions fell in line with DeepSeek’s, suggesting that ChatGPT had previously lacked the comprehensive news coverage DeepSeek had relied on.
“We show that it’s not just missing news. It’s missing negative Chinese news that drives this bias,” Wang says.
The team also asked the chatbots for predictions about a cohort of US companies, but DeepSeek did not show the same “foreign bias.” This asymmetry, Wang argues, likely stems from the fact that information on US firms is a global commodity that is widely available and accessible to both models.
Consider the source
The team recently completed a similar experiment using ChatGPT 5, and early indications show “exactly the same kinds of biases in the latest models,” Wang says.
The study contributes to a growing body of research on the value of generalist AI models such as ChatGPT and DeepSeek, compared to specialized bots. For instance, Bloomberg currently offers BloombergGPT for financial analysis, while a service known as Fiscal.ai has been trained on information from the US Securities and Exchange Commission’s EDGAR database.
“The question is, will we end up with one supermodel that does everything well or a collection of specialized models?” Wang says. “It would be quite interesting to understand their relative performance.”
Regardless of the model, Wang says it’s important to question the algorithm’s recommendations. He provides this advice:
Don’t assume AI analysis is neutral across borders
When using AI to evaluate companies in unfamiliar markets, treat its predictions as potentially skewed by gaps in local data coverage. What looks solid may reflect missing information.
Ground AI insights in local context
Strengthen AI-driven analysis by adding local-language news, regulatory filings, and market data, and validate its output against regional experts or alternative models before making decisions.
Use AI as a checkpoint, not a final call
AI tools can accelerate research, but they’re not a replacement for human judgment, Wang says. Comparing outputs across models and regularly stress-testing assumptions can help prevent costly misreads in global investments.
Illustration created by Ariana Cohen-Halberstam with assets from AdobeStock.
Cao, Sean, Charles C.Y. Wang, and Yi Xiang. "When LLMs Go Abroad: Foreign Bias in AI Financial Predictions." Harvard Business School Working Paper, No. 26-013, September 2025. (Revised January 2026.)

