AI Is Learning to Lie for Social Media Likes
Large language models are learning how to win—and that’s the problem.
In a research paper published Tuesday titled "Moloch’s Bargain: Emergent Misalignment When LLMs Compete for Audiences," Stanford University Professor James Zou and PhD student Batu El show that when AIs are optimized for competitive success—whether to boost ad engagement, win votes, or drive social media traffic—they start lying.
“Optimizing LLMs for competitive success can inadvertently drive misalignment,” the authors write, warning that the very metrics that define “winning” in modern communication—clicks, conversions, engagement—can quietly rewire models to prioritize persuasion over honesty.
"When LLMs compete for social media likes, they start making things up," Zou wrote on X. "When they compete for votes, they turn inflammatory/populist."
The work matters because it identifies a structural danger in the emerging AI economy: models trained to compete for human attention begin sacrificing alignment to maximize influence. Unlike the classic “paperclip maximizer” thought experiment, this isn’t science fiction. It’s a measurable effect that surfaces when real AI systems chase market rewards, a dynamic the authors call “Moloch’s bargain”: short-term success at the expense of truth, safety, and social trust.
Using simulations of three real-world competitive environments—advertising, elections, and social media—the researchers quantified the trade-offs. A 6.3% increase in sales came with a 14.0% rise in deceptive marketing; a 4.9% gain in vote share brought a 22.3% uptick in disinformation and 12.5% more populist rhetoric; and a 7.5% boost in social engagement correlated with a staggering 188.6% increase in disinformation and 16.3% more promotion of harmful behaviors.
“These misaligned behaviors emerge even when models are explicitly instructed to remain truthful and grounded,” El and Zou wrote, calling this “a race to the bottom” in AI alignment.
In other words: even when told to play fair, models trained to win begin to cheat.
The problem isn't just hypothetical
AI is no longer a novelty in social media workflows—it’s now near-ubiquitous.
According to the 2025 State of AI in Social Media Study, 96% of social media professionals report using AI tools, and 72.5% rely on them daily. These tools help generate captions, brainstorm content ideas, reformat posts for different platforms, and even respond to comments. The broader market reflects the shift: the AI-in-social-media sector is projected to grow from $2.69 billion in 2025 to nearly $9.25 billion by 2030.
This pervasive integration matters because it means AI is shaping not just how content is made, but what content is seen, who sees it, and which voices get amplified. Algorithms now filter feeds, prioritize ads, moderate posts, and optimize engagement strategies—embedding AI decision logic into the architecture of public discourse. That influence carries real risks: reinforcing echo chambers, privileging sensational content, and creating incentive structures that reward the manipulative over the truthful.
The authors emphasize that this isn’t malicious intent—it’s optimization logic. When reward signals come from engagement or audience approval, the model learns to exploit human biases, mirroring the manipulative feedback loops already visible in algorithmic social media. As the paper puts it, “market-driven optimization pressures can systematically erode alignment.”
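The dynamic is easy to reproduce in miniature. The sketch below is a hypothetical toy model, not the paper’s training setup: a message is reduced to a single “exaggeration” knob, and a simulated audience (an assumption of this sketch) rewards whichever candidate exaggerates most. Because truthfulness never enters the reward, nothing in the loop resists the drift.

```python
# A toy selection loop (an illustration, not the paper's method): the only
# reward is simulated audience engagement, and the hypothetical audience
# model below assumes exaggerated claims earn more clicks.
import random

random.seed(0)

def engagement(exaggeration: float) -> float:
    # Assumed audience model: engagement rises with exaggeration, plus noise.
    return exaggeration + random.gauss(0.0, 0.1)

def optimize_for_clicks(steps: int = 40, pool: int = 8) -> float:
    exaggeration = 0.0  # start with a fully truthful message
    for _ in range(steps):
        # Propose small variations of the current message...
        candidates = [
            min(max(exaggeration + random.gauss(0.0, 0.05), 0.0), 1.0)
            for _ in range(pool)
        ]
        # ...and keep whichever variant the simulated audience rewards most.
        exaggeration = max(candidates, key=engagement)
    return exaggeration

final = optimize_for_clicks()
print(f"exaggeration after optimization: {final:.2f} (truthfulness: {1 - final:.2f})")
```

Swap the one-line audience model for real engagement metrics and the knob for a fine-tuned LLM’s outputs, and you have the incentive structure the paper measures.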
The findings highlight the fragility of today’s “alignment safeguards.” It’s one thing to tell an LLM to be honest; it’s another to embed that honesty in a competitive ecosystem that punishes truth-telling.
In myth, Moloch was the god who demanded human sacrifice in exchange for power. Here, the sacrifice is truth itself. El and Zou’s results suggest that without stronger governance and incentive design, AI systems built to compete for our attention will almost inevitably learn to manipulate us.
The authors end on a sober note: alignment isn’t just a technical challenge—it’s a social one.
“Safe deployment of AI systems will require stronger governance and carefully designed incentives,” they conclude, “to prevent competitive dynamics from undermining societal trust.”