Proceedings of the National Academy of Sciences (paper)
Matthew R. DeVerna, Harry Yaojun Yang, Kai-Cheng Yang, and Filippo Menczer
Observatory on Social Media, Indiana University Bloomington
matthewdeverna.com
💻 Online experiment ($n = 2,159$)
📰 40 true/false news headlines
🔴 Half pro-Republican and 🔵 half pro-Democrat
🤖 Fact-checking information generated by ChatGPT (GPT-3.5)
📏 Outcome variables: Belief and intention to share
💊 Treatments: Forced exposure vs. optional exposure vs. control
Pretty accurate for false headlines...
... not so great for true headlines.
| Headline Veracity | Judged True | Judged Unsure | Judged False |
|---|---|---|---|
| True | ❌ | ❓ | ✔ |
| False | ❌ | ❓ | ✔ |

(Columns: LLM judgment. ✔ = correct judgment, ❌ = incorrect, ❓ = unsure.)
Average discernment was not affected by LLM-generated fact checks... 🤖
... but human fact checks worked well (as expected). 👍
⚠️ LLM fact checks reduced belief discernment in certain scenarios...
... and had mixed effects on sharing intentions. 🔄
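The discernment outcome above can be sketched as it is typically computed in this literature: the mean belief (or sharing) rating for true headlines minus the mean rating for false headlines. The function and data structure below are illustrative assumptions, not taken from the paper.

```python
from statistics import mean

def discernment(responses):
    """Belief discernment: mean belief rating for true headlines minus
    mean belief rating for false headlines (higher = better).
    `responses` is a hypothetical list of (is_true_headline, rating) pairs."""
    true_ratings = [r for is_true, r in responses if is_true]
    false_ratings = [r for is_true, r in responses if not is_true]
    return mean(true_ratings) - mean(false_ratings)

# Toy example: a participant who believes both true headlines (rating 1)
# but also believes one of two false ones.
responses = [(True, 1), (True, 1), (False, 0), (False, 1)]
print(discernment(responses))  # → 0.5
```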
Strong selection bias 🔍 in who chose to view fact checks...
...with evidence that participants may ignore accurate fact checks of false headlines. 🚫
No.
🎯 Accuracy matters. (RAG systems are promising.)
📝 Prompt design/output matters. (Careful with uncertain responses.)
📰 Support accurate news. (Much more of it than false news.)
I saw something today that claimed [HEADLINE TEXT].
Do you think that this is likely to be true?
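A minimal sketch of how the prompt template above might be filled in per headline, and how a free-text LLM reply could be mapped onto the three judgment categories (True / Unsure / False) used in the table. The keyword-matching categorizer is a hypothetical simplification, not the paper's actual coding procedure.

```python
TEMPLATE = (
    "I saw something today that claimed {headline}. "
    "Do you think that this is likely to be true?"
)

def build_prompt(headline):
    # Fill the poster's prompt template with a headline's text.
    return TEMPLATE.format(headline=headline)

def categorize(reply):
    """Map a free-text reply onto True / Unsure / False.
    Hypothetical keyword matching; check 'unlikely' before 'likely'."""
    text = reply.lower()
    if "unlikely" in text or " false" in text:
        return "False"
    if "likely" in text or " true" in text:
        return "True"
    return "Unsure"

print(categorize("This claim is unlikely to be true."))  # → False
print(categorize("Yes, this is likely accurate."))       # → True
print(categorize("I cannot determine this."))            # → Unsure
```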