Discover how FinBERT (Bidirectional Encoder Representations from Transformers for Finance) automatically scores the sentiment of news and earnings transcripts.
FinBERT is a specialized version of BERT (Bidirectional Encoder Representations from Transformers), a revolutionary natural-language-processing model released by Google in 2018. To understand FinBERT, we first need to understand why BERT was transformative. Before BERT, NLP systems relied on simpler approaches: bag-of-words (counting word frequencies), n-grams (sequences of 2–3 words), or rule-based heuristics (keyword matching). These methods were fast but context-blind. The sentences "The stock rallied on optimism" and "The stock's optimism was unfounded" both contain "optimism" and "stock," but they have opposite meanings. Older NLP systems could not reliably distinguish them.

BERT changed this with a neural-network architecture called the Transformer. A Transformer processes an entire sequence of text at once and learns contextual embeddings: each word is represented not by a single fixed vector, but by a learned representation that encodes its relationship to every other word in the sentence. In "The stock rallied on optimism," the embedding for "optimism" reflects its positive context; the same word in "The stock's optimism was unfounded" produces a different embedding that reflects the negative context. This is bidirectional context: BERT looks both left and right to infer meaning.

BERT is pre-trained on a massive corpus of unlabeled text (Wikipedia, BookCorpus), learning general language structure. It can then be fine-tuned on smaller labeled datasets (e.g., financial news with human-assigned sentiment labels) to specialize for a specific domain. FinBERT is BERT fine-tuned on financial news and earnings transcripts, with additional domain-specific vocabulary.
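The context-blindness of pre-BERT methods is easy to demonstrate in a few lines. This toy bag-of-words scorer (the word list and scores are illustrative, not any real lexicon) gives "optimism" the same positive weight regardless of surrounding words, so it misreads the second sentence:

```javascript
// Toy bag-of-words sentiment scorer. The word list is an illustrative
// fragment, not a real lexicon: note it has no entry for "unfounded".
const WORD_SCORES = { rallied: 1, optimism: 1, miss: -1 };

function bagOfWordsScore(sentence) {
  // Tokenize naively: lowercase, keep only alphabetic runs.
  const tokens = sentence.toLowerCase().match(/[a-z]+/g) ?? [];
  // Sum per-word scores; words outside the list contribute 0.
  return tokens.reduce((sum, t) => sum + (WORD_SCORES[t] ?? 0), 0);
}

bagOfWordsScore("The stock rallied on optimism");      // → 2 (positive, correct)
bagOfWordsScore("The stock's optimism was unfounded"); // → 1 (positive, wrong)
```

Because only word counts matter, no lexicon entry can recover the fact that "optimism" is being negated in the second sentence; that requires the contextual embeddings BERT provides.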
FinBERT achieves 95%+ accuracy on financial sentiment benchmarks (the FinancialPhraseBank dataset of ~4,800 human-labeled sentences). It correctly handles negations ("not a good quarter"), contrasts ("better than expected but below guidance"), and domain-specific jargon ("beat on EPS but guided lower"). This is a large improvement over keyword-based methods.

FinBERT has drawbacks, however. First, it is computationally expensive: a 12-layer Transformer requires significant CPU or GPU resources, and scoring 1,000 headlines with a naive FinBERT implementation might take 10–30 seconds, too slow for real-time applications. Second, its latency is highly variable: inputs of different sequence lengths take different amounts of time, making it unpredictable for latency-sensitive systems. Third, it can be overconfident: a rare or domain-shifted headline might receive high confidence (0.90 Positive) when the true uncertainty is higher. Fourth, out-of-domain examples (headlines about market structure or regulatory changes) can fool FinBERT because it was trained primarily on company-specific news.

StoQuant uses a hybrid two-tier approach to balance accuracy, speed, and robustness. Tier 1 is a Loughran-McDonald financial keyword lexicon: a curated list of ~4,500 positive and negative terms specific to finance. StoQuant pre-processes every headline through this lexicon in O(n) time (milliseconds). The lexicon is simple: "beat," "rally," and "surge" are positive; "miss," "crash," and "bankrupt" are negative. If the headline has clear positive or negative keywords and is unambiguous, StoQuant assigns sentiment and returns (e.g., "Tesla beats EPS estimates" → Positive). For headlines the keyword lexicon marks as ambiguous (mixed keywords, or low confidence), StoQuant escalates to Tier 2: FinBERT inference. Because StoQuant uses ONNX Runtime (a lightweight inference engine), it loads a quantized FinBERT model (~110MB) into the Node process on first boot, avoiding the overhead of a separate Python sidecar.
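The Tier-1 routing logic can be sketched as follows. The keyword sets here are a tiny illustrative subset (the real Loughran-McDonald lexicon has ~4,500 terms), and the escalation rule is an assumption for illustration, not StoQuant's actual code:

```javascript
// Tier-1 sketch: tiny illustrative subsets of a Loughran-McDonald-style
// lexicon. Real lexicons are far larger and include inflected forms.
const POSITIVE = new Set(["beat", "beats", "rally", "surge"]);
const NEGATIVE = new Set(["miss", "misses", "crash", "bankrupt"]);

function tier1(headline) {
  const tokens = headline.toLowerCase().match(/[a-z]+/g) ?? [];
  const pos = tokens.filter((t) => POSITIVE.has(t)).length;
  const neg = tokens.filter((t) => NEGATIVE.has(t)).length;

  // Unambiguous: keywords from only one side → assign and return.
  if (pos > 0 && neg === 0) return { label: "Positive", escalate: false };
  if (neg > 0 && pos === 0) return { label: "Negative", escalate: false };

  // Mixed keywords or no keywords: ambiguous → escalate to Tier 2 (FinBERT).
  return { label: "Neutral", escalate: true };
}

tier1("Tesla beats EPS estimates");           // Positive, no escalation
tier1("Beats estimates but misses on revenue"); // ambiguous, escalate
```

The O(n) cost follows from a single pass over the tokens with constant-time set lookups, which is why this tier runs in milliseconds.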
The quantized model is fast: 10–50ms per headline on modern CPUs. For ambiguous cases (roughly 20–30% of headlines), this latency is acceptable. The result is a two-tier pipeline that achieves >0.70 accuracy on financial sentiment (validated on FinancialPhraseBank), processes headlines in milliseconds (keyword tier) or 10–100ms (FinBERT tier), and avoids the overhead of running a separate Python service. StoQuant embeds FinBERT inside the main Node process using the Xenova/transformers library, which runs Transformer models in JavaScript and WebAssembly for browser and server use.

Historically, StoQuant experimented with a Python sidecar running the full ProsusAI/finbert model. It worked, but it added infrastructure cost on Railway (an extra service), it was flaky (periodic out-of-memory crashes), and it was slow (every request required IPC between Node and Python). The current in-process ONNX approach is simpler, more reliable, and faster.

One limitation of the FinBERT approach is that sentiment alone is not predictive of returns. Academic research shows that news sentiment has weak correlation with next-day or next-week returns (IC ≈ 0.02–0.05). This is partly because sentiment is already partially priced in by the time a headline appears (market efficiency), and partly because sentiment is noisy (headlines can be misleading, or market-moving for only temporary reasons). StoQuant uses sentiment as one of nine dimensions in the Q-Score, weighted by regime: in Bull markets, sentiment gets lower weight (because trend is more important); in Bear markets, sentiment gets higher weight (because risk aversion is paramount).
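The regime-dependent weighting can be sketched like this. The weight values below are hypothetical placeholders, not StoQuant's actual Q-Score weights; only their ordering (lower sentiment weight in Bull, higher in Bear) follows from the description above:

```javascript
// Hypothetical per-regime weights for the sentiment dimension.
// Placeholder values; only the Bull < Neutral < Bear ordering is
// taken from the text.
const SENTIMENT_WEIGHT = { Bull: 0.05, Neutral: 0.1, Bear: 0.15 };

function sentimentContribution(sentimentScore, regime) {
  // sentimentScore in [-1, 1]; returns this dimension's contribution
  // to the composite Q-Score under the current market regime.
  const w = SENTIMENT_WEIGHT[regime] ?? SENTIMENT_WEIGHT.Neutral;
  return w * sentimentScore;
}
```

The same positive headline therefore moves the composite score more in a Bear regime, where risk-aversion signals matter most, than in a Bull regime.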
See sentiment in action: Best Stock Screener (stoquant.com/best-stock-screener) uses the Sentiment dimension. Learn the full framework: Q-Score Methodology (stoquant.com/learn/q-score-methodology).
BERT stands for Bidirectional Encoder Representations from Transformers. "Bidirectional" means it reads text left-to-right and right-to-left to understand context. "Encoder" means it converts text to numerical embeddings (not generating new text). "Transformers" refers to the underlying neural-network architecture.
BERT is trained on general English text (Wikipedia, books). FinBERT is BERT that has been additionally fine-tuned on financial documents (earnings transcripts, news headlines, SEC filings) with human-labeled sentiment. This domain specialization improves accuracy on financial news from ~92% (BERT) to ~95% (FinBERT).
FinBERT can classify sentiment accurately (95% accuracy on labeled benchmarks), but sentiment alone has weak predictive power for returns (IC ≈ 0.02–0.05). Market efficiency and behavioral noise mean sentiment changes are partially priced in by the time they appear in headlines. StoQuant combines sentiment with 8 other dimensions for stronger predictive power.
Keywords (Loughran-McDonald lexicon) are fast (milliseconds) and accurate for unambiguous headlines. FinBERT is more accurate but slower (~50ms per headline). By routing clear-cut headlines to keywords and ambiguous headlines to FinBERT, StoQuant achieves 95%+ accuracy with sub-100ms latency per headline.
Yes. The original ProsusAI/finbert model is open-source on HuggingFace. StoQuant uses the Xenova/finbert port compiled to ONNX for in-process inference. Quantized models (~110MB) are much faster and can run on CPU without GPU.
FinBERT works for both. Headlines are shorter (easier to classify). Earnings transcripts are longer (hundreds of sentences), so StoQuant chunks them into 512-token windows and classifies each chunk, then aggregates the sentiments.
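A minimal sketch of that chunk-then-aggregate flow. The 512-token window is approximated by a word count here, and `classifyChunk` is a stub standing in for the FinBERT call; both are simplifications for illustration, not StoQuant's implementation:

```javascript
// Split a long transcript into windows of at most `maxTokens` words.
// (Real tokenizers count subword tokens, not words; words are a rough proxy.)
function chunkByTokens(text, maxTokens = 512) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(" "));
  }
  return chunks;
}

// Score each chunk with a classifier returning a value in [-1, 1],
// then aggregate by simple averaging into one transcript-level score.
function scoreTranscript(text, classifyChunk) {
  const scores = chunkByTokens(text).map(classifyChunk);
  return scores.reduce((a, b) => a + b, 0) / scores.length;
}
```

Averaging is the simplest aggregation; alternatives like length-weighting each chunk or taking the most extreme chunk score are equally easy to plug in via `classifyChunk` and the reduce step.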