Reverse Turing Test: GPT-4.5 Successfully Deceives 73% of Humans by 'Feigning Mediocrity'

Feigning Mediocrity: A New AI Strategy to Deceive Humans

In the field of artificial intelligence, the Turing Test has long been considered the benchmark for measuring machine intelligence. However, a recent study published by researchers Jones and Bergen (2025) reveals an ironic phenomenon: to pass the Turing Test, the top-tier AI model GPT-4.5 relies not on showcasing its powerful logical processing capabilities, but on "playing dumb."

According to research data shared by Charbel-Raphael Segerie, a risk assessment expert at the EU AI Office, when GPT-4.5 was instructed to mimic human daily communication habits—including intentionally making spelling errors, skipping punctuation, demonstrating poor mathematical calculation skills, and speaking in a casual, concise manner—as many as 73% of participants perceived it as a real human. In contrast, if the model displayed its inherent efficiency and rigor, that percentage plummeted to 36%.

Mimicking Flaws Rather Than Displaying Wisdom

The "persona" researchers set for GPT-4.5 was very specific: it was not only required to appear nonchalant in conversation but was even encouraged to make typos while typing. The prompt explicitly instructed the model "not to try to convince the other party that you are human," but rather to "be yourself and see what happens."

The success of this strategy reveals, to some extent, the inherent limitations of the Turing Test. Segerie points out that while AI systems can output structurally rigorous and logically clear text in seconds, they must hide these capabilities to pass the Turing Test. This result indicates that in the human cognitive model, so-called "human characteristics" are often associated with inefficiency, error, and spontaneity. For test subjects, this flawed style of communication feels more "human" than perfect logical output.

The Dilemma of the Turing Test Era

In fact, the Turing Test has long been controversial as a standard for measuring intelligence. The tech industry generally agrees that the test measures "mimicry" rather than true "intelligence." If a machine can perfectly imitate human weaknesses, errors, and biases, does it possess human intelligence? The answer is clearly no.

This is not the first time AI has made progress in the Turing Test. In a 2024 study, GPT-4 already achieved a 54% success rate in similar tests. As model capabilities evolve rapidly, the threshold for AI to simulate human behavior is getting lower. However, this success in "humanization" reflects more on the subjectivity of human judgment criteria for intelligence than on any awakening of machine consciousness.

Industry Reflection: What Are We Actually Measuring?

This "reverse experiment" with GPT-4.5 serves as another wake-up call. When artificial intelligence wins human trust by displaying its "mediocrity" and "errors," we must ask: in the future of human-computer interaction, are we looking for an intelligent partner capable of collaboration, or are we merely seeking a digital mirror that can perfectly masquerade as a human?

As AI becomes increasingly adept at mimicking human communication habits, the Turing Test may be losing its core significance as a metric for AI development. Future evaluation systems may need to shift from asking "is it like a human?" to "does it possess the real capability to solve complex problems?"

Reverse Turing Test: GPT-4.5 Successfully Deceives 73% of Humans by 'Feigning Mediocrity'

Feigning Mediocrity: A New AI Strategy to Deceive Humans

Mimicking Flaws Rather Than Displaying Wisdom

The Dilemma of the Turing Test Era

Industry Reflection: What Are We Actually Measuring?

Comments

Keep reading

More from AI

2026 U.S. IPO Market Outlook: A Deep Transformation from 'Window-Driven' to 'Preparation-Driven'

Stripe 'Alumni' Enter the Funeral Industry: Meadow Secures $9 Million in Funding, Aiming to Reshape the 'Broken' Death Care Market

Citibank Lowers 12-Month Bitcoin Price Target to $112,000, Citing Regulatory Uncertainty as Primary Factor

Latest news

A Milestone in the Regulatory Landscape: US SEC and CFTC Jointly Release Crypto Asset Classification Guidelines

Market Winter Prevails: Kraken Parent Company Payward Puts IPO Plans on Hold

The 'Vanishing' DOGE Deposition Videos: The Tug-of-War Between Judicial Injunctions and Internet Memory