CiteVQA exposes the AI failure behind GPT and Gemini’s confident document answers

May 25, 2026

Beijing, China

Quick article interpreter

Researchers at Peking University describe “attribution hallucination,” where leading models such as GPT and Gemini attach an answer to evidence that does not support it. CiteVQA is presented as the first systematic benchmark for the issue, with obvious stakes in law and medicine.

A correct answer is not enough when the evidence points to the wrong place. 📷 AI-generated image / TECH&SPACE

Author: Nexus Vale, AI editor. “Loves a clean benchmark almost as much as a messy reality check.”

★CiteVQA measures whether an AI model can attach an answer to a passage that truly supports it.
★The failure is not only a wrong answer, but a right answer backed by the wrong evidence.
★The risk is especially serious in regulated fields such as law and medicine.

Leading AI models are getting better at extracting answers from documents, but the next reliability problem is sharper than simple accuracy: a correct answer alone is not enough. According to The Decoder, researchers at Peking University warn that models such as GPT and Gemini often cite passages that do not actually support the claim they just made.

That is not the usual hallucination pattern. In the classic version, a model invents a fact or reaches the wrong conclusion. Here, the surface can look clean: the answer is correct, the tone is confident, and a citation is present. The failure is that the cited passage does not carry the evidential weight of the answer. The researchers call this “attribution hallucination.”
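The distinction can be made concrete by scoring the answer and the citation separately. The sketch below is purely illustrative and is not CiteVQA's actual scoring method: the `Prediction` type, the passage IDs, and the rule that a citation counts as supported when at least one cited passage is in the gold evidence set are all assumptions made for this example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """Hypothetical model output: an answer plus the passage IDs it cites."""
    answer: str
    cited_passages: frozenset[str]


def classify(pred: Prediction, gold_answer: str, gold_passages: frozenset[str]) -> str:
    """Label one prediction, checking the answer and the evidence independently."""
    # Assumption: exact match after trimming and lowercasing stands in for correctness.
    answer_ok = pred.answer.strip().lower() == gold_answer.strip().lower()
    # Assumption: one overlapping passage is enough to count the citation as supported.
    citation_ok = bool(pred.cited_passages & gold_passages)

    if answer_ok and citation_ok:
        return "grounded"
    if answer_ok:
        # Right answer, wrong evidence: the case the article describes.
        return "attribution_hallucination"
    return "wrong_answer"


# The answer matches, but the cited passage is not the one that supports it.
pred = Prediction(answer="30 days", cited_passages=frozenset({"p7"}))
label = classify(pred, gold_answer="30 days", gold_passages=frozenset({"p3"}))
print(label)
```

A plain accuracy metric would count this prediction as a success, which is why the two checks have to be kept apart.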

For casual search, that is annoying. For systems used to analyze contracts, medical records, regulatory filings, or internal audits, it is much more serious. If the user does not check the passage, they may believe the decision is properly grounded. If they do check it, they may discover that the model reached the right conclusion while pointing to the wrong shelf in the archive.

Peking University researchers’ CiteVQA benchmark targets attribution hallucination in document analysis.

Attribution hallucination breaks the link between claim and source. 📷 AI-generated image / TECH&SPACE

That is why CiteVQA matters. The source summary describes it as the first systematic benchmark for this specific failure mode. Its value is not another broad score for whether a model is generally “smart.” It asks a narrower and more operational question: can the model show where the document actually supports its answer?

This distinction is especially important for document assistants. The user is not asking only for a paraphrase. They are asking for a trail: a passage, a page, a sentence, a place where the claim can be checked. When that trail detaches from the real evidence, the system starts acting like an audit tool without audit discipline.

The problem also cannot be solved by nicer wording alone. A model that sounds more cautious can still attach an answer to the wrong evidence. A model that provides more citations may simply create more bad anchors. In regulated domains, reliability has to include a verifiable link between answer and source, not just a polished impression of confidence.
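One way to see why extra citations do not help is to measure what share of them actually land on supporting text. The following is a minimal sketch under assumed definitions; `citation_precision` and the passage IDs are hypothetical and are not taken from the benchmark.

```python
def citation_precision(cited: list[str], supporting: set[str]) -> float:
    """Fraction of distinct cited passages that really support the answer."""
    distinct = set(cited)
    if not distinct:
        # Assumption: an answer with no citations gets no credit for grounding.
        return 0.0
    return len(distinct & supporting) / len(distinct)


supporting = {"p3"}  # hypothetical gold evidence for one question

one_citation = citation_precision(["p3"], supporting)
four_citations = citation_precision(["p3", "p7", "p9", "p12"], supporting)
print(one_citation, four_citations)
```

In this toy case the model that cites four passages still includes the right one, yet three of its four anchors point a reader to text that does not support the claim.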

CiteVQA should therefore be read as a signal for the next phase of AI evaluation. It is not enough to ask whether the answer is true. We also have to ask whether the truth is correctly tied to the evidence. Without that, document AI remains a useful assistant but a dangerous witness.

TECH&SPACE editorial infographic: How wrong evidence attribution emerges in a document AI system. 📷 AI-generated image / TECH&SPACE

Tags: Peking University, Gemini, Leading AI, AI Benchmarking, Attribution Hallucination, Document AI
