What happened
According to the filing, Sony Music moved on May 22, 2026, in the US District Court for the Southern District of New York for permission to expand its copyright complaint against AI music generator Udio. In the proposed Second Amended Complaint, the plaintiffs assert 30,442 copyrighted sound recordings that Udio "copied and ingested into its generative AI models," per the motion as quoted by Music Business Worldwide. The motion says the plaintiffs identified these additional works after receiving access to Udio's training data during discovery.
Technical details
Editorial analysis - technical context: Public reporting ties the discovery review of Udio's training material to a list of tens of thousands of specific recordings, which underscores the role of dataset forensics in modern copyright litigation. For practitioners, robust metadata, file hashing, and provenance tooling are central to proving, or defending against, claims that specific copyrighted files were used in model training.
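As a rough illustration of the hashing and provenance tooling mentioned above, the sketch below builds a manifest of SHA-256 digests for a directory of training files and checks it against a set of digests for asserted works. It is a minimal sketch under stated assumptions, not a description of how either party actually matched recordings: the function names (`sha256_of_file`, `build_manifest`, `match_asserted`) and the directory layout are hypothetical. Exact hashing only detects byte-identical files; a re-encoded or trimmed recording would need acoustic fingerprinting, which is not shown here.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read until handle.read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root, POSIX style) to its digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def match_asserted(manifest: dict[str, str], asserted: set[str]) -> list[str]:
    """Return manifest paths whose digest appears in the asserted set.

    `asserted` is a hypothetical set of digests for works named in a claim.
    Only byte-identical copies will match.
    """
    return sorted(p for p, d in manifest.items() if d in asserted)
```

A manifest like this, recorded at ingestion time alongside license and source metadata, gives a dataset owner a fixed reference to check enumerated works against, rather than reconstructing provenance after the fact.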
Context and significance
Editorial analysis: The motion frames the amendment as non-prejudicial to Udio. According to the filing, it would at most require extending the document production deadline, currently set to close on June 26, 2026; alternatively, ownership-related discovery on the new works could be stayed until after summary-judgment briefing on fair use. Music Business Worldwide also reports that in August 2024 Udio and Suno largely admitted using copyrighted recordings to train their models while asserting that the use was protected by fair use.
What to watch
Editorial analysis: Observers should track whether the court permits the large amendment and how the judge treats discovery tied to training data. Also worth watching are any rulings that clarify whether ingesting copyrighted audio into model training constitutes direct infringement or is protected by fair use, because those rulings would influence dataset acquisition, documentation, and risk management for model builders.
Key Points
- Sony seeks to add 30,442 recordings to its complaint, a mass identification tied to discovery of Udio's training data.
- For practitioners, precise dataset provenance and forensic matching matter more for audio than before; claims can enumerate thousands of works.
- Court decisions on discovery scope and fair-use motions will shape dataset acquisition, labeling, and legal risk for generative-audio models.
Scoring Rationale
This is a notable legal escalation that uses discovery of training data to enumerate 30,442 works, increasing legal risk for audio-model training and raising dataset-provenance requirements for practitioners.
