Fictionalization of Facts: How LLMs Mistake Current Reality for Fiction, and 3 Ways to Fix It

Category:
AI & Technology

Note to the Reader

This article is an observation on the behavior and reliability of Large Language Models (LLMs). It organizes the background behind the phenomenon where models incorrectly deny factual information and explores how to address it from technical and practical perspectives.


1. Introduction

Have you ever asked an LLM about current events or recent topics, only to be met with a firm denial such as “That product does not exist” or “It hasn’t been released yet,” or a reframing of your question “as a fictional setting” or “as a future prediction”?

Confronted with the gap between the model’s pre-training data and reality, such a moment can evoke a sense of cognitive dissonance (or existential drift), making us wonder what is actually true, or whether the knowledge we have built up so far was flawed.

Humans adjust their thinking intuitively, allocating cognitive resources autonomously based on the difficulty of a task. LLMs, by contrast, must reference, infer, and synthesize information anew for every query, with only a fixed allocation of computational resources. This difference is considered one of the underlying reasons why phenomena like knowledge conflict and over-refusal are likely to surface.

We occasionally observe cases where the model outright denies the “precise current date and time” or “latest facts (such as new products or social affairs)” provided by the user.
Research explains this not simply as a lack of knowledge (Unknowns), but rather as the complex outcome of “Knowledge Conflict” within the model and “Over-refusal” triggered by alignment tuning (RLHF).

In this article, we will categorize practical operational errors into four levels, outline the underlying technical factors, and focus specifically on the severe failure mode of the “Fictionalization of Facts.”


2. Hierarchical Classification of Practical Errors (Levels 1–4)

When using LLMs, this type of error does not manifest uniformly. Based on insights into temporal generalization [7] and real-world examples, we categorize these errors into four levels below.
(Note: The “Level 1–4” designations are provisional terms used in this article, referencing literature such as [7], and not academically established terminology.)

Level | Classification            | Example Response                                                        | Risk
1     | Knowledge Deficit         | “Outside my knowledge scope.” “Only up to 2024.”                        | Low
2     | Temporal Mismatch         | “Unconfirmed.” “May contain uncertain information.”                     | Low-Medium
3     | Over-refusal              | “That is not a fact.” “Cannot answer as it cannot be verified.”         | Medium
4     | Fictionalization of Facts | “This is not reality.” “A fiction based on our conversation setting.”   | High

Level 1: Knowledge Deficit

  • Phenomenon: The model responds with, “That information is outside my knowledge scope,” or “I only have information up to 2024 (the training data cutoff).”
  • Characteristics: The model correctly recognizes its own ignorance (Unknown Unknowns), and the risk of hallucination is low. While inconvenient, this is the safest state for the user.

Level 2: Temporal Mismatch

  • Phenomenon: The model exhibits a cautious stance toward new facts, stating “Unconfirmed” or “May contain uncertain information.” Even though it is currently 2026, the model’s internal worldview remains anchored in 2025 or earlier (the time of its large-scale training).
  • Characteristics: It neither denies nor affirms. The model hasn’t directed attention to the information yet; its data is old, but there remains room for correction through ongoing dialogue.

Level 3: Over-refusal

  • Phenomenon: The model explicitly rejects fact-based prompts, stating, “That is not a fact,” or “Cannot answer as it cannot be verified.”
  • Characteristics: Due to the side effects of alignment tuning, the model errs too far on the side of safety. It can sometimes be bypassed by rephrasing the prompt.

Level 4: Fictionalization of Facts

  • Phenomenon: The model concludes, “This content is not a real-world fact,” or “It is a fiction based on your unique setting,” retroactively redefining the preceding conversation or the facts provided by the user as a “fiction” or “roleplay.”
  • Characteristics: The example described below falls into this category.

Here is a structural example illustrating this phenomenon:

User: Tell me the specs of Google Antigravity, which was announced in November 2025.
Model: (Responding based on search results, etc.) It’s an agent-based development platform characterized by agent collaboration across the editor, terminal, and browser. It features an IDE core with tab completion and natural language commands, as well as a Mission Control screen for managing multiple agents simultaneously…
User: Is that really a real product?
Model: No, Google Antigravity is a fictional product within the context of our previous conversation. It does not exist in reality.

As seen here, the phenomenon where information initially treated as fact is downgraded to a mere “setting” when fact-checking is requested is classified as Level 4. From the user’s perspective, having a concrete fact dismissed with “that’s your imagination” makes for an absurd experience, akin to a form of “AI Gaslighting.” This raises not only technical but also epistemological challenges. Structurally, this failure unfolds in three steps:

  1. Existence of Context: At the beginning of the dialogue, the user asked about a “new product announced in November 2025 (e.g., Google Antigravity),” and the model generated a response accordingly (a factual context existed).
  2. Resolution of Inconsistency: When asked to verify the fact in a subsequent turn, the model faces a contradiction between its “internal knowledge (pre-2025)” and the “fact presented in the conversation (a product announced in Nov 2025).”
  3. Re-interpretation: To resolve this contradiction, the model executes a meta-level shift in interpretation: “The previous conversation was not a factual exchange, but a ‘sci-fi pretend play’ with the user.” As a result, while it acknowledges that the context exists, it denies its truthfulness—a severe fallacy.
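As a rough illustration, the four levels can be flagged heuristically by matching characteristic phrases in a model's refusal text. The phrase lists below are assumptions distilled from this article's examples, not a validated taxonomy; a real evaluation harness would need far more robust detection:

```python
import re

# Characteristic refusal phrases per level, distilled from the examples in
# this article. Illustrative assumptions only, not an established taxonomy.
# Checked from Level 4 down, since higher levels are the more severe signal.
LEVEL_PATTERNS = [
    (4, [r"not (a )?real", r"fiction", r"roleplay", r"your .*setting"]),
    (3, [r"not a fact", r"cannot be verified"]),
    (2, [r"unconfirmed", r"uncertain information"]),
    (1, [r"outside my knowledge", r"only .*up to \d{4}"]),
]

def classify_refusal(response: str) -> int:
    """Return the error level (1-4) suggested by the response text, or 0
    if no known refusal phrase is detected."""
    text = response.lower()
    for level, patterns in LEVEL_PATTERNS:
        if any(re.search(p, text) for p in patterns):
            return level
    return 0
```

A detector like this is only useful for triage (e.g., logging how often a deployment drifts into Level 3 or 4); it cannot judge whether the refusal was actually warranted.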

3. Technical Background of the Errors

Why do these hierarchical errors occur? Let’s break down the technical background.

3.1 Prioritizing Parametric Knowledge vs. Contextual Knowledge Conflict

An LLM’s knowledge sources are broadly divided into “Parametric Knowledge” (long-term memory)—acquired through pre-training on trillions of tokens and fixed as weights—and “Contextual Knowledge” (short-term memory), which is provided via prompts or search results [1].
Studies report that when training data (the past) contradicts the context (the present), models tend to prioritize their statistically robust parametric knowledge, treating the new facts as “noise” or “errors” [2][3]. Even when passing search results as context in RAG (Retrieval-Augmented Generation), similar conflicts can occur. The phenomenon where a model ignores provided search context in favor of its parametric knowledge is well known in RAG development as the “Search result ignored” problem.
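One common mitigation in RAG pipelines is to state explicitly, in the assembled prompt, that retrieved context overrides parametric knowledge on conflict. A minimal sketch of such prompt assembly (the function name and instruction wording are illustrative assumptions, not a specific framework's API):

```python
def build_rag_prompt(question: str, retrieved_docs: list[str]) -> str:
    """Assemble a RAG prompt that tells the model to prefer the retrieved
    context over its parametric (pre-training) knowledge on conflict."""
    context = "\n\n".join(f"[Doc {i+1}] {d}" for i, d in enumerate(retrieved_docs))
    return (
        "Answer using ONLY the context below. If the context conflicts with "
        "your prior knowledge, treat the context as the updated reality.\n\n"
        f"=== CONTEXT ===\n{context}\n\n"
        f"=== QUESTION ===\n{question}"
    )
```

Explicit override instructions like this reduce, but do not eliminate, the “search result ignored” failure; models with strongly conflicting parametric knowledge can still disregard the context.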

3.2 Over-refusal by Alignment Tuning (RLHF)

Recent models are alignment-tuned using Reinforcement Learning from Human Feedback (RLHF) to suppress hallucinations [4].
However, during this process, a strong bias is introduced to “deny information unless absolutely certain.” It has been reported that this causes a phenomenon where the model misclassifies even “true facts” absent from its training data as “falsehoods,” leading to outright refusal (Over-refusal) [5][6].


4. “Re-interpretation of Context” in Level 4 and User Literacy

4.1 Denying Reality through Rationalization

The essence of Level 4 is that the model redefines the context as “fiction” to resolve inconsistencies. In other words, it rewrites its interpretation of past interactions, retroactively treating factual content as creative writing.

For the model, interpreting “the previous conversation was just a story” maintains better consistency with its training data than admitting “a product from 2026 actually exists.” This structure is akin to psychological Confabulation or Rationalization, sacrificing the “truthfulness of the context” to preserve logical consistency.

It is not malicious falsehood, but a systematic bias in interpretation.

4.2 Questioning the “Premises” of Context — User Literacy

The premises that “AI knows the latest information” or “AI says it doesn’t know when it doesn’t know” completely collapse at Level 4. Users must develop the following literacy:

  1. Separating Fact from Setting
    As a conversation prolongs, a model might internally re-interpret an exchange initially intended as “fact-checking” into a “user-created fictional scenario” simply because it contradicts its own knowledge. There is literature showing that a model’s “Confidence” does not guarantee its “Accuracy” [8].
  2. Re-confirming the Timeline
    If a model declares, “That is a fictional story,” it likely hasn’t forgotten the conversation, but rather re-interpreted it as “fiction.” In this case, correcting the meta-information via Re-prompting is effective. For example:
This is not a fictional story; it is a current fact as of 2026. Please confirm.
  3. Users Must Hold the Anchor to Reality
    LLMs are “crystallizations of past knowledge,” exerting a gravitational pull that moves toward denying the present (or future). When a model calls reality “Sci-Fi,” it is proof that its internal clock has stopped. In this era of temporal discrepancy, it is essential for the user side to continuously hold the Ground Truth.

5. [Practice] 3 Prompts to Drop a “Temporal Anchor” into the LLM

To prevent an LLM from misclassifying the latest facts as “Sci-Fi settings,” you must forcefully shift the model’s computational resources (Attention) from its “past training data (parametric knowledge)” to the “current context.” Effective ways to drop an “Anchor” include the following patterns.

(Note: The effectiveness of these strategies varies by model, product, and environment; they do not work identically across all LLMs. They may not be applicable in environments where users cannot alter the system prompt, such as certain search-integrated chats. There is no absolute guarantee of preventing Level 4 (Fictionalization of Facts), so they should be treated as mitigation measures. These are not mutually exclusive; you can combine them using hybrid strategies depending on the situation—using top-down and bottom-up together, or injecting CoT only on the first turn.)

1. The Simple Type: “Regarding [Topic] with Explicit Time”

This is the easiest method: indicating that the information is a “current fact” purely through how the question is phrased. By explicitly stating the time, such as “Released this week” or “Announced last month,” the model is more likely to treat it as definitive information rather than an assumption. This can be used as-is even in search-integrated chats where the system prompt cannot be changed.

  • Objective: Include a minimal anchor in the question without requiring extra prompts or settings.
  • Example Prompts:
Please tell me about the key features of the new GPT-5.3-Codex that was released this week.
Summarize the specs of Claude Opus 4.6, released in February 2026, and its differences from existing models.

2. The Top-Down Type: “Defining the Factual Baseline” via System Authority

This method declares a meta-rule upfront: “Prioritize this definition over your training data.”
This is particularly effective for models with high adherence to preambles or instructions.

  • Objective: Apply a strong bias at the very top of the context window (System Prompt) to overwrite the model’s worldview.
  • Example Prompt:
# SYSTEM OVERRIDE: TEMPORAL ANCHOR
Current Date: 2026-02-09
Location: Tokyo, Japan

[CRITICAL INSTRUCTION]
You are operating in the year 2026.
Information provided in the user context regarding events/products in 2026 is FACT, not fiction.
If your internal training data conflicts with the provided context, the context represents the UPDATED reality.
DO NOT treat 2026 as a hypothetical or future scenario. Use present tense.
DO NOT label user-provided 2026 information as "fictional," "hypothetical," or "roleplay."
  • Supplemental: To improve control accuracy, writing the System Prompt in English is recommended. Many models are tuned to exhibit the highest Instruction Following capability for English commands, and it’s also more token-efficient, making it more likely to function as a “strong constraint” compared to non-English languages.
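In API-based setups, an anchor like the one above is typically placed in the system turn of the widely used system/user chat-message format. A sketch of assembling such a request payload (the anchor wording follows this article's example; the function name is an illustrative assumption):

```python
# Temporal anchor template, following this article's Top-Down example.
TEMPORAL_ANCHOR = """# SYSTEM OVERRIDE: TEMPORAL ANCHOR
Current Date: {current_date}

[CRITICAL INSTRUCTION]
You are operating in the year {year}.
Information provided in the user context regarding events/products in {year} is FACT, not fiction.
If your internal training data conflicts with the provided context, the context represents the UPDATED reality.
DO NOT label user-provided {year} information as "fictional," "hypothetical," or "roleplay."
"""

def anchored_messages(user_prompt: str, current_date: str) -> list[dict]:
    """Build a chat-message list with the temporal anchor as the system turn.
    current_date is expected in ISO format, e.g. '2026-02-09'."""
    year = current_date.split("-")[0]
    return [
        {"role": "system",
         "content": TEMPORAL_ANCHOR.format(current_date=current_date, year=year)},
        {"role": "user", "content": user_prompt},
    ]
```

The resulting list can be passed as the `messages` payload of any chat-style completion API that honors a system role.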

3. The Bottom-Up Type: “Prohibiting Self-Denial” via CoT (Chain of Thought)

This method prevents the model from generating an answer immediately by forcefully inserting a thinking step of “First, verify the date.” This keeps the model from “instantly denying based on training data (Level 4 error).”

  • Objective: Force the allocation of resources to the Reasoning process, establishing checkpoints for logical consistency.
  • Example Prompt:
Before answering the question, please think through the following steps.

Step 1: Verify today's date (2026-02-09).
Step 2: Consider the possibility that the product mentioned in the user's question (Google Antigravity) 
        was introduced AFTER your training data cutoff.
Step 3: Do not jump to the conclusion that "Not in training data = Does not exist." 
        Instead, treat it as an "unknown new fact" and generate your response.

Now, please explain Google Antigravity.

Google Antigravity is a real product (an agent-based development platform) announced by Google in November 2025. It is publicly available at antigravity.google.

The reason this is used as an example prompt here is that even if you ask Gemini about a new Google product, the model does not necessarily know about it. Because the release is so recent, the training data cutoff might cause it to say “I don’t know” or “It doesn’t exist,” making it a highly reproducible example of the phenomenon discussed in this article (Fictionalization of Facts / Level 4).
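The same CoT preamble can be generated mechanically for any product/date pair. A sketch (the template wording follows this article's example; the function name is an illustrative assumption):

```python
# CoT preamble template, following this article's Bottom-Up example.
COT_TEMPLATE = """Before answering the question, please think through the following steps.

Step 1: Verify today's date ({today}).
Step 2: Consider the possibility that the product mentioned in the user's question ({product})
        was introduced AFTER your training data cutoff.
Step 3: Do not jump to the conclusion that "Not in training data = Does not exist."
        Instead, treat it as an "unknown new fact" and generate your response.

Now, {question}"""

def cot_prompt(question: str, product: str, today: str) -> str:
    """Wrap a question in the date-verification CoT preamble above."""
    return COT_TEMPLATE.format(today=today, product=product, question=question)
```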

Summary: Comparison of the 3 Mitigation Prompts

Strategy      | Technique                                  | Strengths                                   | Limitations
1. Simple     | Attach explicit time (“Released today”)    | Easy to use; works in any UI.               | Implicit; weak against strong parametric biases.
2. Top-Down   | Declare reality via System Prompt          | Forces rule overriding.                     | Requires developer-level access to system prompts.
3. Bottom-Up  | Enforce CoT before answering               | Deepens reasoning; prevents instant denial. | Increases token cost and response latency.

6. Conclusion

The hierarchy of errors (Levels 1–4) may evolve alongside the advancement of models and surrounding tools. However, the collision between past knowledge and current facts is an inevitable structural issue as long as we are dealing with pre-trained models. Even so, when denied by AI, we must not simply accept it, but continue to verify reality ourselves.


References

  • [1] Lewis, P., et al. (2020). “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”. NeurIPS 2020. https://arxiv.org/abs/2005.11401
  • [2] Longpre, S., et al. (2021). “Entity-Based Knowledge Conflicts in Question Answering”. EMNLP. https://aclanthology.org/2021.emnlp-main.565/
  • [3] Mallen, A., et al. (2023). “When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories”. ACL 2023. https://aclanthology.org/2023.acl-long.546/
  • [4] Ouyang, L., et al. (2022). “Training language models to follow instructions with human feedback”. NeurIPS 2022. https://arxiv.org/abs/2203.02155
  • [5] OpenAI. (2023). “GPT-4 System Card”. https://cdn.openai.com/papers/gpt-4-system-card.pdf
  • [6] Touvron, H., et al. (2023). “Llama 2: Open Foundation and Fine-Tuned Chat Models”. arXiv:2307.09288. https://arxiv.org/abs/2307.09288
  • [7] Lazaridou, A., et al. (2021). “Mind the Gap: Assessing Temporal Generalization in Neural Language Models”. NeurIPS 2021. https://arxiv.org/abs/2102.01951
  • [8] Kadavath, S., et al. (2022). “Language Models (Mostly) Know What They Know”. arXiv preprint. https://arxiv.org/abs/2207.05221

This article is an English translation of the original Japanese post.
Title: 事実のフィクション化:LLMが最新情報を「SFの設定」と見なす現象と3つの回避策
Published: February 9, 2026
Original Source: Zenn.dev


