Sullivan & Cromwell Apologizes for AI-Generated Citations

Sullivan & Cromwell, a leading Wall Street law firm, apologized after a bankruptcy court filing contained numerous inaccurate citations and fabricated passages generated by AI. Partner Andrew Dietderich informed Judge Martin Glenn that the motion included around three dozen errors, many described as AI "hallucinations," and filed a corrected version. The mistakes were identified by opposing counsel Boies Schiller Flexner. The firm said its AI policies and secondary review procedures were not followed in this instance. The episode underscores operational risk when attorneys use LLM tools for legal research and drafting without robust verification, and it is likely to accelerate demand for stronger citation-checking controls and audit trails in legal workflows.
What happened
Sullivan & Cromwell, one of the nation's most prominent law firms, acknowledged that a court filing in a U.S. Bankruptcy Court in Manhattan contained numerous inaccurate citations and fabricated legal text produced by AI. Partner Andrew Dietderich wrote to Judge Martin Glenn that the filing included roughly three dozen errors and apologized, saying the firm filed a corrected motion after opposing counsel Boies Schiller Flexner flagged the problems. "We deeply regret that this has occurred," Dietderich wrote. The firm said established AI policies and secondary review procedures were not followed in this instance.
Technical details
The errors are characteristic hallucinations of generative LLM tools: fabricated case citations, misquoted authorities, and invented, non-existent legal sources. The firm did not disclose which AI product was used. Practitioners should note these operational failure modes:
- Fabricated citations: plausible-looking case names or page references that do not exist.
- Misquotations: paraphrases presented as verbatim language from judicial opinions.
- Attribution errors: assigning holdings to incorrect cases or courts.
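The first failure mode, fabricated citations, is the most mechanically checkable: extract reporter citations from a draft and flag any that cannot be found in a verified source. A minimal sketch, assuming a hypothetical in-memory set of verified citations (in practice this would be a lookup against a canonical database such as Westlaw, LexisNexis, or CourtListener):

```python
import re

# Hypothetical verified-citation set; a stand-in for a canonical legal database.
VERIFIED_CITATIONS = {
    "550 U.S. 544",   # Bell Atlantic Corp. v. Twombly
    "556 U.S. 662",   # Ashcroft v. Iqbal
}

# Matches reporter citations of the form "550 U.S. 544" or "123 F.3d 456".
CITATION_PATTERN = re.compile(r"\b\d{1,4}\s+(?:U\.S\.|F\.\d?d|S\. Ct\.)\s+\d{1,4}\b")

def flag_unverified_citations(text: str) -> list[str]:
    """Return citations found in `text` that are absent from the verified set."""
    found = CITATION_PATTERN.findall(text)
    return [c for c in found if c not in VERIFIED_CITATIONS]

draft = "As held in 550 U.S. 544 and reaffirmed in 999 U.S. 123, ..."
print(flag_unverified_citations(draft))  # ['999 U.S. 123']
```

A regex over reporter formats only catches citations that fail to resolve; it cannot catch a real citation attributed to the wrong holding, which is why human signoff remains necessary.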
Mitigation strategies that matter for practitioners
- Enforce deterministic review gates, including automated citation-verification tools and mandatory human signoff on every legal citation.
- Integrate provenance and audit logs for any AI-assisted drafting step so editors can trace which text segments derived from models.
- Use specialized legal retrieval models or retrieval-augmented pipelines with verified sources rather than free-form LLM completions.
Context and significance
This is not the first high-profile case of AI hallucinations entering formal filings, but the involvement of Sullivan & Cromwell raises the stakes because of the firm's elite client base and public profile. The incident highlights a recurring theme in legal tech: the productivity gains of generative AI collide with strict professional and ethical duties to verify authority and accuracy. Courts and bar regulators have previously issued guidance that lawyers remain ethically responsible for work product even if AI tools are used. Operational lapses at a top firm will accelerate scrutiny across firms, vendors, and courts and will likely drive adoption of specialist models and defenses designed for legal accuracy.
Why it matters for ML practitioners and legal engineers
Legal workflows expose a critical product requirement that many generic LLM offerings do not meet out of the box: verifiable, auditable retrieval with guarantees on citations. Building production-grade legal assistants requires:
- Tight integration of retrieval systems with canonical legal databases.
- Citation resolvers that return structured, machine-verifiable references.
- Human-in-the-loop validation steps instrumented in the UI and logs.
What to watch
Expect rapid uptake of tools that combine retrieval-augmented generation with deterministic citation resolvers, vendor contracts that require provenance guarantees, and possibly formal judicial guidance or firm-level controls specifying acceptable uses of generative models. Firms that fail to operationalize verification will face reputational and ethical risk.
Bottom line
The S&C episode is a practical reminder that LLM outputs are probabilistic text, not authoritative legal research. For production use in high-stakes domains, prioritize source-verified pipelines, provenance, and explicit human verification checkpoints.
Scoring Rationale
High-profile operational failure at a leading firm makes this a notable cautionary case for AI deployment in regulated, high-stakes workflows. It is not a technical breakthrough but materially raises adoption and governance priorities for practitioners.