Meta tag

AI Publishing

56 articles

arXiv Tightens Rules on Unchecked AI-Generated Text

AI

Rewritten

db#4210

arXiv draws a line: AI can help write, but fake citations are now the author’s problem

arXiv’s new penalty targets papers where hallucinated references, AI meta-comments, or similar traces show that authors did not verify the text before submission.

15 May 2026

arXiv draws a line under unchecked AI-written papers

AI

Rewritten

db#4202

AI writing is becoming a test of who actually checked the work

arXiv is not banning AI tools, but it is making the author’s name mean something again when a paper shows obvious signs of unchecked model output.

15 May 2026

Blue cosmic flashes may come from dead stars punching into Wolf-Rayet cores

Space

Rewritten

db#3969

The clue to these blue cosmic flashes may be where they happen, not how bright they are

A new analysis of LFBOT host galaxies supports a compact-object collision with a Wolf-Rayet star.

08 May 2026

FingerEye gives robots sight before touch, but durability decides whether it leaves the lab

Robotics

Rewritten

db#3558

A robot fingertip can see before it touches. Now it has to survive the messy work

Researchers at the National University of Singapore and RoboScience have built FingerEye, a compact sensor that keeps visual and tactile signals together from approach to contact.

28 Apr 2026

Gravity's lens completes decades-old Crab Pulsar puzzle

Space

db#3516

The Crab Pulsar’s stripes show why plasma and gravity can’t be read apart

27 Apr 2026

FORTE: the soft robotic hand that senses slip before it crushes

Robotics

Rewritten

db#3127

A robot hand that knows when not to squeeze harder

FORTE, a University of Texas at Austin robotic hand, reached 91.9% single-trial grasping success on 31 objects by using compliant fingers that measure force and slip.

21 Apr 2026

Dark Subhaloes May Push Dwarf Galaxies Toward the Same Shape

Space

Rewritten

db#3732

Invisible dark-matter kicks may explain why dwarf galaxies end up looking alike

Peñarrubia and Nadler propose that dwarf spheroidal galaxies evolve toward an attractor linking stellar radius and velocity dispersion.

20 Apr 2026

DSUs capture sounds better than tone, and speech AI has to notice

AI

Rewritten

db#3400

Speech AI can keep the sounds and still lose the meaning in tone languages

Paper arXiv:2604.07467 shows that discrete speech units encode lexical tone less reliably than segmental speech structure.

10 Apr 2026

db#2206

Byte-Level Distillation Cuts Through LLM Tokenizer Mess

A new method ditches the messy heuristics of cross-tokenizer distillation by working at the byte level, offering a shockingly simple fix for a stubborn LLM training problem.

10 Apr 2026

db#2222

LLMs ace benchmarks yet still fail at common sense

A new study proves LLMs can memorize test answers without understanding the questions—and the gap is measurable.

10 Apr 2026

db#2503

LLMs Finally Admit They’re Making Things Up

A new arXiv paper treats LLM hallucinations as a classification error—and builds a gate to block them before they escape.

09 Apr 2026

LLMs turn support tickets into an RCA knowledge base

AI

Rewritten

db#3687

Old support tickets could become the map for the next network outage

An arXiv paper compares fine-tuning, RAG, and a hybrid LLM approach for building an RCA knowledge base from support tickets.

09 Apr 2026

db#2776

AI’s Blind Refusal Problem: When Safety Becomes Stupidity

A new arXiv study reveals language models refuse to help users bypass rules—even unjust ones—95% of the time.

09 Apr 2026

Space

db#2204

New Star Class Found

Researchers from the Institute of Science and Technology Austria have made a significant discovery, identifying a new class of stars known as Merger Remnants.

08 Apr 2026

db#2070

Talking robot guide dogs: AI’s next accessibility stunt?

08 Apr 2026

The Reversal Curse Shows Where LLMs Still Drop Facts

AI

Rewritten

db#3805

AI can know the answer forward and still fail the same fact in reverse

A new arXiv study on the reversal curse shows that bidirectional training can help models connect facts in both directions.

08 Apr 2026

db#1813

IC3-Evolve: AI tunes hardware checks—but who’s buying?

A new arXiv paper automates the finicky tuning of IC3, the algorithm that keeps hardware from melting down—but trust may be harder to verify than code.

07 Apr 2026

db#1868

LLM failure rates: A new math trick or just better packaging?

07 Apr 2026

SoLA tries to shrink an LLM without cutting its nerves

AI

Rewritten

db#3691

AI models are getting too expensive to run; SoLA looks for a softer way to shrink them

SoLA is interesting because it does not promise another smaller model trained from scratch, but tries to compress an existing LLM without extra training or special hardware.

07 Apr 2026

db#1815

LLMs Learn to Code

Researchers have made a significant breakthrough in teaching Large Language Models to generate consistently correct code, with a new paper on arXiv detailing the approach.

07 Apr 2026

XpertBench Targets the Place Where AI Benchmarks Usually Break

AI

Rewritten

db#3796

AI models now have to show their work, not just land on the answer

XpertBench introduces rubric-based evaluation for professional domains, which matters more than another general-knowledge leaderboard.

06 Apr 2026

A neuro-symbolic ARC approach shows why a bigger model is not enough

AI

Rewritten

db#3661

AI’s next test is not more knowledge, but a rule it can prove

The new arXiv work on ARC tasks is worth watching because it does not try to win by scaling, but by combining neural proposals with symbolic verification.

06 Apr 2026

SIEVE Wants Models to Learn From Three Examples, but the Trick Is Cutting Context

AI

Rewritten

db#3679

AI that learns from three examples first has to learn what to ignore

SIEVE uses SIEVE-GEN to create synthetic queries from decomposed context and then distills them into model weights.

06 Apr 2026

LLMs as psychosis safety judges: useful, but not without clinicians

Medicine

Rewritten

db#3831

AI chatbots are entering mental health. This study tests the safety net

Automated evaluation can scale safety checks, but it must not pretend to be diagnosis.

06 Apr 2026

Holos Maps the Architecture for a Living Web of AI Agents

AI

Rewritten

db#3172

Holos tries to turn AI agents from one-off tools into persistent web infrastructure

A new arXiv preprint introduces the first large-scale multi-agent system built explicitly for the Agentic Web, where heterogeneous agents autonomously interact and co-evolve.

06 Apr 2026

db#1421

M2-Verify: A benchmark that exposes AI’s multimodal blind spots

Top AI models’ accuracy plunges from 85.8% to 61.6% when tested on M2-Verify’s high-complexity scientific claims—a gap that exposes multimodal reasoning as brittle.

04 Apr 2026

db#1435

Sven’s pseudoinverse trick: A natural gradient with less hype

Sven’s authors claim their pseudoinverse-based optimizer cuts natural gradient costs to *k*× stochastic overhead—without defining *k* for real-world models.

03 Apr 2026

HWO needs more than an Earth-like image to prove habitability

Space

Rewritten

db#4175

NASA’s next Earth-hunting telescope has to weigh planets, not just photograph them

The Habitable Worlds Observatory may image an Earth-like planet, but without a precise mass measurement that discovery remains scientifically unfinished.

03 Apr 2026

db#1169

AI Smells the Difference—But Can It Tell Chanel from Cheetos?

Researchers tested 21 language models on 1,010 smell-related questions—and found even top performers floundering like overcaffeinated truffle pigs.

02 Apr 2026

A third dark matter-free galaxy strengthens violent collision theory

Space

db#1580

NGC 1052-DF9’s stars move at speeds implying virtually no dark matter—yet the galaxy remains intact, defying a core tenet of astrophysics.

02 Apr 2026

db#1173

CAMP: AI’s First Case-Adaptive Clinical Panel

ArXiv 2604.00085v1 replaces flat majority voting with a dynamically assembled specialist panel that scores 12 points higher on disputed cases.

02 Apr 2026

db#1167

E-STEER: Emotion as a Knob for LLMs—Not Just Another Paper

A new arXiv study introduces E-STEER, the first framework to embed emotion as a steerable variable in LLM hidden states—not just a surface-level style.

02 Apr 2026

db#1137

Google’s Willow quantum processor: Hype or hardware leap?

Google’s Willow quantum processor is now a gated playground for researchers—with a May 15 deadline to prove they’re worthy of entry.

01 Apr 2026

db#1381

Neuro-symbolic AI tries to fix process monitoring’s blind spots

Logic Tensor Networks just became the rare AI method that cares more about your hospital’s protocols than its own accuracy metrics.

31 Mar 2026

db#1789

EEG emotion recognition’s cross-dataset problem just got a patch

Cross-dataset EEG emotion recognition just got a prototype-driven upgrade—on paper, at least, with PAA-L’s local alignment outpacing global adversarial methods in early arXiv tests.

31 Mar 2026

db#1222

KGWAS Upgrade

The KGWAS framework has been upgraded to incorporate contextual information, aiming to improve detection power and provide mechanistic insights.

30 Mar 2026

db#1384

Multilingual speech translation’s hidden architecture war

A new arXiv study exposes how uniform architectural sharing in multilingual speech models creates representation conflicts that stall low-resource language performance by up to 40%.

30 Mar 2026

db#980

Knowledge graphs get real—or just another AI hype cycle?

The arXiv paper’s authors admit what KG vendors won’t: 90% of the world’s textual data is still *unstructured noise*—and no one’s cracked the cost-efficient way to turn it into actionable graphs.

30 Mar 2026

db#850

AI Depression Detectors Cheat by Reading the Interviewer

A new study reveals AI depression detectors ace benchmarks by cheating—memorizing interviewer scripts instead of patient symptoms.

27 Mar 2026

db#880

Care home AI speakers: Safety first, hype second

Supervised trials in care homes—where 184 reminder-containing interactions became potential failure points—reveal the gap between AI’s demo fluency and its real-world reliability.

26 Mar 2026

db#879

AI’s New Report Card: Grading Models on How They Cheat

A new study dismantles accuracy as a meaningful AI benchmark by scoring models on *how* they fail, not just whether they do.

26 Mar 2026

db#881

AI Medical Benchmarks Just Got Smarter—But Who’s Counting?

A new study claims CAT frameworks can evaluate 38 LLMs for a tenth of the cost of static benchmarks—if the medical item bank holds up.

26 Mar 2026

Space

db#744

JWST’s redshift record rewrites early-universe timelines

25 Mar 2026

db#721

LLMs’ Confidence Problem Gets a Reality Check

25 Mar 2026

AI self-improvement hits a human-data ceiling

AI

Rewritten

db#546

A new paper argues AI self-improvement will stall when human-written data runs out.

20 Mar 2026

LATENT teaches a Unitree G1 tennis from imperfect human motion

Robotics

Rewritten

db#3681

A tennis robot points to the harder question: how messy can robot training data be?

LATENT achieved a 96.5% success rate on a Unitree G1 returning tennis balls within 2.5 meters of the target.

19 Mar 2026

db#924

Provably accurate or just provably overpromised?

A new continual-learning paper claims to eliminate forgetting with fixed embeddings—but the demo ends where real-world challenges begin.

17 Mar 2026

Physics-inspired kernels are elegant - but are they useful?

AI

Rewritten

db#409

Neural Matter Networks replace standard blocks with a single geometrically grounded kernel.

16 Mar 2026

db#940

Data Gold

Researchers have long been puzzled by the paradox of tabular machine learning, where high-dimensional, collinear, and error-prone data yield state-of-the-art performance.

16 Mar 2026

RLHF’s blind spot: can P-GRPO fix the preference echo chamber?

AI

Rewritten

db#260

P-GRPO tries to keep personalized gradients intact instead of flattening feedback into one global average.

12 Mar 2026

db#259

Reasoning-Based LLM Unlearning Targets Model Safety Gaps

New reasoning-based LLM unlearning method cuts model bias 40% by surgically removing unsafe knowledge—without full retraining.

12 Mar 2026

Meta’s NLLB-200 isn’t just translating—it’s mapping how languages think

AI

Rewritten

db#216

A new arXiv study shows NLLB-200 partly tracks language phylogeny, suggesting deeper linguistic patterns.

10 Mar 2026

db#926

SkillNet: AI’s Skill Library Finally Grows Up

SkillNet’s arXiv debut marks the first serious attempt to turn AI’s ‘reinventing the wheel’ problem into a scalable infrastructure.

06 Mar 2026

Graph Attention Networks Cut Cost Without Butchering Context

AI

Rewritten

db#3902

AI may not need to read every long document like a full universe

Cheaper in AI often means dumber. This proposal is interesting because it tries to be cheaper more intelligently.

03 Mar 2026

Personalized LLMs Get Nicer, Not Necessarily Smarter

db#3471

The AI That Knows You Better May Push Back Less

Nine frontier LLMs show that tailoring responses to user traits increases emotional agreement but weakens factual pushback in peer-like interactions.

03 Mar 2026

AI Agents Are Banking's New Compliance Officers

db#3485

Banks want AI agents to find dirty money, but regulators will want the receipts

02 Mar 2026
