Lawyers Continue Using AI Despite Sanctions

U.S. law firms keep relying on generative AI for legal research and drafting despite rising disciplinary actions and court scrutiny. In a notable November 2025 ruling, a federal judge in Oregon declined to impose formal sanctions on Buchalter after the firm filed pleadings containing two AI-generated, inaccurate case citations; the firm agreed to remedial steps including a $5,000 donation, internal policy reviews, and offers to reimburse fees. That outcome sits alongside multiple disciplinary penalties and fines in other jurisdictions for attorneys who submitted fabricated citations or AI-produced briefs. For AI/ML practitioners, the trend highlights persistent model hallucinations, operational risk in downstream workflows, and an urgent need for guardrails, provenance, and verification tooling when models are used in regulated professional contexts.
What happened
A federal judge in Oregon, U.S. District Judge Michael Simon, declined to impose formal sanctions on attorneys at Buchalter after a filing included two non-existent or misleading case citations generated by a generative AI system. The judge characterized one citation as "totally fake" and the other as "almost real," and accepted the firm's remedial package: a $5,000 donation to a legal-aid organization, an internal review of safeguards, and offers to reimburse legal fees caused by the error (Reuters, Nov. 13, 2025). Buchalter called the episode a clear violation of its strict AI-use policy. The outcome follows a string of disciplinary actions (fines and sanctions in Utah, Massachusetts, California, and New York) arising from lawyers who submitted AI-produced fabrications or relied on AI outputs without adequate verification. Courts have also questioned whether AI-generated documents are protected by privilege or work-product doctrines in discovery contexts (Paul Weiss, BakerLaw, DLA Piper summaries).

Why this matters technically
These incidents expose three persistent technical failure modes of current generative models: hallucination (fabricating sources and citations), opaque provenance (no reliable trace of how an assertion was produced), and brittle human-in-the-loop controls (professionals failing to validate outputs). For ML teams and platform owners, the legal sector functions as a stress test for model risk management: hallucinations cause reputational, ethical, and monetary harm; unverifiable outputs create compliance and discovery exposure; and naive tool integration undermines professional duties of competence.

Key details practitioners should note
Courts are splitting on remedies: some judges and bar authorities impose fines or sanctions, while others accept remediation without formal penalties. Firms are responding with tighter policies, training, and limited remediation (donations, fee reimbursements). Separate rulings have raised questions about privilege for client-produced AI content, increasing discovery risk where defendants used public LLMs.

What to do
Product teams, MLOps engineers, and legal-tech vendors must prioritize:
- citation verification layers and retrieval-augmented generation with auditable provenance
- guardrails that block unsupported citations and surface confidence signals
- operational controls and training for end users in regulated roles
- audit trails useful in litigation or disciplinary review (see the sketch after this list)
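To make the first, second, and fourth items concrete, here is a minimal Python sketch of a fail-closed citation guardrail that writes an append-only audit trail. Everything in it is illustrative, not a description of any firm's actual system: the `CITATION_RE` pattern, the `KNOWN_CITATIONS` set, and all function names are hypothetical stand-ins, and a production system would verify against an authoritative citator or retrieval index rather than a hard-coded set.

```python
"""Minimal sketch of a fail-closed citation guardrail with an audit trail.

Hypothetical throughout: the regex, the KNOWN_CITATIONS index, and all
names are illustrative. A real system would verify against an
authoritative citator or a retrieval index, not a hard-coded set.
"""
import json
import re
import time
from dataclasses import asdict, dataclass

# Very rough pattern for U.S. reporter citations such as "410 U.S. 113".
CITATION_RE = re.compile(r"\b\d{1,4}\s+[A-Z][A-Za-z.\s]{1,20}?\s+\d{1,5}\b")

# Stand-in for a verified citation index (an assumption, not a real service).
KNOWN_CITATIONS = {"410 U.S. 113", "347 U.S. 483"}

@dataclass
class CitationCheck:
    citation: str
    verified: bool

def verify_citations(draft: str) -> list[CitationCheck]:
    """Extract citation-like strings and check each one against the index."""
    return [
        CitationCheck(m.group(0), m.group(0) in KNOWN_CITATIONS)
        for m in CITATION_RE.finditer(draft)
    ]

def guarded_output(draft: str, audit_log_path: str = "audit.jsonl") -> str:
    """Fail closed: log every check, then block drafts with unverified cites."""
    checks = verify_citations(draft)
    record = {
        "timestamp": time.time(),
        "checks": [asdict(c) for c in checks],
        "blocked": any(not c.verified for c in checks),
    }
    with open(audit_log_path, "a") as f:  # append-only audit trail
        f.write(json.dumps(record) + "\n")
    if record["blocked"]:
        bad = ", ".join(c.citation for c in checks if not c.verified)
        raise ValueError(f"Unverified citations blocked: {bad}")
    return draft

if __name__ == "__main__":
    try:
        # "999 F. Supp. 999" is a deliberately fabricated citation.
        guarded_output("Compare 410 U.S. 113 with 999 F. Supp. 999.")
    except ValueError as err:
        print(err)
```

The key design choice in this sketch is fail-closed behavior: any citation that cannot be verified blocks the draft instead of merely flagging it, and every check is logged to JSONL before the decision surfaces, so the record exists even if a user overrides the block downstream.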
What to watch
Upcoming bar decisions and appeals that create formal standards for AI use in legal practice; regulatory guidance on disclosures when parties use generative models; and technical advances in provenance, retrieval, and hallucination mitigation that could change liability calculations.

Sources: Reuters, The Guardian, CalMatters, MSBA, Paul Weiss, DLA Piper

Scoring Rationale
The story is highly relevant to AI/ML practitioners because it highlights real-world harms from model hallucinations and governance gaps (relevance = 2.0). Credibility is high, resting on court orders and reputable coverage (credibility = 2.0). Scope and novelty are moderate: the issue is widespread but not new (scope ≈ 1.5, novelty ≈ 0.8). Actionability is moderate: the story points to concrete engineering and compliance steps (actionability = 1.0). A 1.0 freshness penalty applies because the reporting dates span recent months.