
This week, I also guest posted for the AI Supremacy newsletter about my experience using GenAI as Assistive Technology for a bad concussion (plus my more everyday ADHD), as well as how the neurodivergent/disability community uses and debates GenAI. Check out my post there.
I. Introduction
When I wrote about “AI slop tariffs” in April, I could only speculate about whether artificial intelligence was really driving policy decisions that seemed too algorithmic to be actual diplomacy. The mathematical precision of tariff rates that appeared to follow a simple formula—even targeting penguin-filled islands with no trade—suggested someone had fed a prompt into a chatbot and pasted the results into official policy.
Trump's AI Slop Tariffs
This post has been cross-posted to my dear friend Grace Shao's excellent AI Substack, AI Proem, one of the fastest-growing tech newsletters on this platform. You should check out both this post there and the Substack in general!
Since then, we've heard of confirmed cases of AI being deployed across government agencies. Tulsi Gabbard acknowledged using AI to help decide what to declassify from the JFK files. Other agencies have quietly implemented AI systems for various administrative tasks. But in most cases, we lack the detailed information needed to understand whether the AI use was appropriate, how these systems function, or what their real-world impacts have been.
This month, thanks to ProPublica’s investigation (see here) into the Department of Government Efficiency’s (DOGE) contract review at the Department of Veterans Affairs (VA), we got our first detailed look inside AI slop governance. For the first time, we have the actual prompts, the source code, documented processes, and extensive interviews showing exactly how AI was deployed in consequential government decisions. What they found reveals something both familiar and alarming: weaponized incompetence deployed not to avoid responsibility, but to create failure, destruction, and shock to the system—all scaled up and automated through artificial intelligence.

II. A Recipe for (Deliberate) Failure
1. One Tablespoon of Impossible Timeline
Here’s how DOGE’s intervention at the VA unfolded:
February 26, 2025: Executive Order requiring all agencies to review their contracts within 30 days—humanly impossible for the VA’s nearly 100,000 active agreements.
March 17, 2025: DOGE assigns Sahil Lavingia, CEO of Gumroad, to the VA as an unpaid volunteer with zero healthcare or government experience. His mandate: build an AI system to identify which contracts to terminate.
March 18, 2025: Within 24 hours, Lavingia creates an AI tool to flag contracts as “munchable”—a term that reduces complex healthcare decisions to video game mechanics.
March 30, 2025: Lavingia publishes his code on GitHub with Elon Musk’s approval, framing it as government transparency.
May 9, 2025: After giving a surprisingly forthcoming interview about his work, Lavingia is terminated.
This timeline reveals a pattern that goes beyond individual incompetence. The impossible deadline, the inexperienced implementer, the inadequate tools, the rapid deployment—these weren’t accidents. They were the conditions under which DOGE was designed to operate, creating predictable failures that would justify further disruption.
2. Two Cups of AI Slop
ProPublica's investigation revealed prompts and processes so thoughtless they seemed almost deliberately crafted to produce harmful outcomes. Don't just take it from me or ProPublica—read what Lavingia said himself:
“I think that mistakes were made,” said Lavingia, who worked at DOGE for nearly two months. “I’m sure mistakes were made. Mistakes are always made. I would never recommend someone run my code and do what it says. It’s like that ‘Office’ episode where Steve Carell drives into the lake because Google Maps says drive into the lake. Do not drive into the lake.”
Here’s what the system actually did:
Technical Limitations
The AI used was an outdated version of ChatGPT that could only process the first 2,500 words of each contract, roughly the length of this article. Most government contracts run hundreds of pages and contain crucial details about safety requirements, patient care protocols, and compliance obligations buried deep in the technical specifications. The system was literally incapable of understanding what it was evaluating.
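To make that limitation concrete, here is a minimal sketch of what a hard word cutoff looks like. This is my own illustration, not Lavingia's published code; the function name is invented, and only the 2,500-word figure comes from the reporting.

```python
def truncate_for_model(contract_text: str, max_words: int = 2500) -> str:
    """Keep only the first max_words words of a contract.

    Everything after the cutoff -- safety requirements, patient-care
    protocols, compliance clauses buried deep in the technical
    sections -- is silently discarded before the model ever sees it.
    """
    return " ".join(contract_text.split()[:max_words])
```

For a contract running hundreds of pages, that cutoff lands somewhere in the front matter and boilerplate, and the model "evaluates" only that opening slice.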
The Prompts: A Breakdown
Lavingia had the LLM identify which contracts were "munchable," in other words, worth getting rid of. I recommend reading the full breakdown in ProPublica, but I'll share some key excerpts to give you a taste (a rough sketch of how this logic plays out follows the excerpts):
Contracts related to diversity, equity, and inclusion (DEI) initiatives or services that could be easily handled by in-house W2 employees should be classified as MUNCHABLE.
The system was programmed to flag anything related to DEI, a term the prompt left completely undefined.
It was also asked to determine which services could be performed internally by an agency that was simultaneously prohibited from hiring and planning to eliminate 80,000 positions. The AI had no information about VA staffing capacity, hiring constraints, or internal capabilities. The contradiction was built into the logic.
Without evidence that it involves essential medical procedures or direct clinical support, and assuming the contract is for administrative or related support services, it meets the criteria for being classified as munchable.
When the AI couldn’t find information—which happened frequently given its technical limitations—it defaulted to recommending termination. One contract flagged this way involved maintenance of ceiling lifts described as “critical safety devices for employees and patients” used to reposition veterans during medical care.
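To see how "no evidence found" collapses into "terminate," here is a rough sketch of the decision rule those excerpts describe, rewritten as code. In the real tool this logic lived in a natural-language prompt rather than Python; the function and field names below are my own invention, and only the criteria come from the published prompts.

```python
def is_munchable(evidence: dict) -> bool:
    """The published criteria paraphrased as a boolean rule (illustrative).

    `evidence` holds whatever the model could infer from the truncated
    excerpt, e.g. {"direct_clinical_support": True}. Note the asymmetric
    defaults: essential care must be affirmatively proven, while
    "administrative or support services" is assumed unless disproven.
    """
    essential = (evidence.get("essential_medical_procedure", False)
                 or evidence.get("direct_clinical_support", False))
    administrative = evidence.get("administrative_or_support", True)
    dei_or_insourceable = (evidence.get("dei_related", False)
                           or evidence.get("in_house_capable", False))
    return (not essential) and (administrative or dei_or_insourceable)


# A contract the system couldn't read properly produces an empty dict --
# and an empty dict comes back munchable.
print(is_munchable({}))  # True
```

With nothing proven either way, the asymmetric defaults do all the work: a ceiling-lift contract whose safety language sits past the word cutoff looks exactly like disposable administrative overhead.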
Systematic Hallucinations
The AI repeatedly assigned identical fictional values to contracts, claiming that approximately 1,100 separate agreements were each worth exactly $34 million when their actual values ranged from thousands to tens of thousands of dollars. This systematic pattern inflated projected savings by billions of dollars, creating the financial justification needed for the broader efficiency narrative.
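A pattern that systematic is also easy to catch. Here is a simple sanity check, entirely my own sketch and not part of any DOGE workflow, of the kind of validation that would have flagged roughly 1,100 contracts sharing one identical value before anyone summed them into a savings estimate.

```python
from collections import Counter

def suspicious_value_clusters(contracts, min_cluster: int = 50) -> dict:
    """Return reported dollar values shared by an implausible number of contracts.

    `contracts` is an iterable of (contract_id, reported_value) pairs,
    where reported_value is whatever number the model extracted (or
    invented). Hundreds of agreements all "worth" exactly $34,000,000
    would surface here immediately.
    """
    counts = Counter(value for _, value in contracts)
    return {value: n for value, n in counts.items() if n >= min_cluster}
```

Even a check this crude, run before the projected savings were tallied, would have surfaced the $34 million cluster and the billions in phantom savings built on top of it.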
The Human Review Fiction
The standard response to criticism of this and other sloppy AI implementations has been the claim that systematic human review is involved. But for human review to matter, the conditions have to be right. VA staff were given hours to justify keeping flagged contracts, with explanations limited to 255 characters—shorter than most text messages. Rather than meaningful human oversight, this "human review" was procedural theater designed to provide legal cover while ensuring that AI recommendations would be implemented with minimal resistance.
3. A Generous Sprinkling of Tech Naiveté
Sahil Lavingia represents a specific archetype: the well-meaning Silicon Valley idealist who combines genuine intentions with profound contextual blindness. A Bernie Sanders volunteer who previously tried to join the Obama administration’s digital service, Lavingia genuinely believed he could help make government more efficient.
His most revealing comments came in interviews, where he explained that he had been "taking Elon at his word." That admission captures something crucial about how DOGE operated: here was someone who spent less than two months in government, discovered that it actually functions quite well, and still built a tool designed to dismantle it.
Lavingia's naiveté made him the perfect choice for this role. His tech credentials provided legitimacy—if things went wrong, it wasn't because they'd chosen someone unqualified, but because even a successful tech CEO couldn't make sense of government inefficiency. His volunteer status created fuzzy accountability—he wielded enormous power over veterans' services but bore no official responsibility for the outcomes.
Most importantly, Lavingia didn’t understand the political dynamics he was operating within. In his blog post about the experience, he described DOGE as the “fall guy” for decisions ultimately made by agency leadership. Overall, he seems to view this as limiting his influence rather than amplifying it. He didn’t grasp that agencies were operating in an unstable environment where nobody fully understood how much power DOGE wielded, and much depended on individual agency heads’ relationships with Elon Musk and their willingness to resist.
This uncertainty was itself a source of power. When VA officials received DOGE’s contract recommendations, they had to guess whether resistance would be career-ending or whether the recommendations even mattered. In that environment, compliance became the safe choice regardless of the substantive merits.

III. The Weaponization of AI Slop
What happened at the VA represents a specific version of weaponized incompetence—not the typical personal kind where someone avoids work by performing badly, but weaponized incompetence deployed to create failure, destruction, and shock to the system. In other words, it’s a sophisticated and manipulative form of sabotage where slop is a feature of the system.
1. Weaponized Incompetence to Undermine Institutions
Traditional weaponized incompetence follows a simple pattern: accept a task reluctantly, perform it poorly, let others clean up the mess, then avoid future responsibility. DOGE’s version operated differently: accept an impossible task, perform it so badly that systems break, then use that failure to justify further disruption.
The key distinction is intent:
Traditional weaponized incompetence: “I’ll do this so badly that you won’t ask me to do it again”
DOGE’s version: “I’ll do this so badly that it breaks the system, then we can point to that failure as proof the system needs more disruption”
DOGE was designed to succeed through failure. The impossible timeline ensured that any review would be superficial. The inexperienced personnel ensured that institutional knowledge wouldn’t constrain the recommendations. The AI system ensured that the failures would appear technological rather than ideological. When veterans lost access to essential services, the blame could be distributed across the algorithm, the coder, the process, and the agency, with no single actor bearing full responsibility.
Rather than openly attacking veterans’ services, they created a system that would inevitably damage those services while appearing to pursue efficiency and accountability.
2. How AI Slop Amplified Destruction
Artificial intelligence doesn’t create new forms of weaponized incompetence or institutional sabotage—it just amplifies and scales the existing dynamic while providing technical cover for deliberate failures:
Scale and Speed: Traditional incompetence affects projects incrementally. AI incompetence can process thousands of complex decisions simultaneously, creating system-wide failures faster than human oversight can respond.
Legitimacy Laundering: Personal incompetence is recognizable as such. AI incompetence gets packaged as “data-driven analysis” and “objective assessment,” making it much harder to challenge. Lavingia’s tech credentials provided additional cover—if even a successful startup CEO couldn’t make government work efficiently, the problem must be inherent to government itself.
Technical Mystification: When someone botches a simple task, their failure is obvious. When an AI system produces flawed outputs, the failure gets buried in discussions of training data, algorithmic bias, and model limitations that most oversight bodies don’t understand well enough to challenge effectively.
Accountability Diffusion: The AI created perfect plausible deniability at every level:
The Algorithm: “I’m just processing patterns in data; I don’t understand context or consequences.”
The Programmer: “I explicitly warned that my code was unreliable—like GPS telling you to drive into a lake.”
The Agency: “We conducted human review of all AI recommendations before making final decisions.”
The Process: “The AI was merely a preliminary screening tool, not the final arbiter.”
DOGE: “We’re just temporary advisors; agencies make their own decisions.”
The Executive: “We’re simply implementing mandated efficiency reviews to eliminate waste.”

IV. The Human Cost
The real-world consequences were immediate and severe. Marine Corps veteran Benjamin Ambrose testified before Congress that the abrupt cancellation of his company’s data systems contract left 13,400 veterans’ cases in bureaucratic limbo—people who couldn’t schedule medical appointments or receive prescriptions because of unresolved data mismatches between VA and Department of Defense systems.
ProPublica additionally found that the AI’s “munchable” determinations included:
Maintenance contracts for gene sequencing equipment used in cancer research
Blood sample analysis supporting ongoing VA medical studies
Quality improvement tools for nursing care assessment
Each cancellation created cascading failures throughout a healthcare system where administrative and technical functions are integral to patient safety and care delivery. The AI’s keyword-based analysis couldn’t comprehend these interconnections, and wasn’t designed to.

V. Looking Forward: The Need for Transparency
This case reveals how AI can be used to provide sophisticated cover for predetermined political outcomes. The algorithm doesn’t need to work well—it just needs to fail in ways that justify the changes leadership wanted to make anyway.
We only have this unprecedented insight into AI deployment in government because we got lucky on transparency. If Lavingia hadn't been an idealist who believed in publishing his code, and if Musk hadn't approved that transparency (probably because he doesn't fully understand government or politics), we would never have known this happened.
There’s a silver lining to tech idealism here—naive believers in transparency accidentally create accountability that sophisticated political operators would know to avoid.
The ProPublica investigation is already prompting congressional action, and we'll see what accountability measures result. It's too early to say what this means for the future of AI use in government or what oversight mechanisms might emerge. But this case shows us what's possible when we accidentally get transparency into these systems.
Traditional oversight mechanisms proved entirely inadequate. Congressional inquiries met stonewalling and technical obfuscation. Transparency laws were circumvented by eliminating compliance staff and classifying operational details. Federal courts moved too slowly to prevent immediate harm to veterans’ services.
As AI systems become more prevalent in government decision-making, the VA case offers a crucial warning about transparency and accountability. We only know about this case because of unusual circumstances—tech idealism, investigative journalism, and administrative inexperience created visibility that's typically absent from these deployments. How little we know about the rest should give us pause.