When public debate frames generative AI as piracy, distinct legal questions get smashed into one. Policymakers, courts, and commentators react to the same general fear without distinguishing issues that copyright law has long known how to evaluate separately. The label "piracy" imports an old distribution-era analogy (copying and distributing fixed files) into a technology that mostly does not work that way. The analogy obscures more than it reveals.
Copyright in generative AI actually operates across four separate questions. Each has its own doctrine. Each has its own evidence requirements. Treating them as one is the confusion.
First, acquisition: how did the training data get into the model's pipeline? Unauthorized downloads or pirated sources can violate the reproduction right regardless of what the model is later used for. This question concerns the source of the files, not the capabilities of the trained system.
Second: if a model produces verbatim or near-verbatim passages from a copyrighted work, that is standard copyright infringement analysis: substantial similarity, market harm, the usual factors. Whether a human or a machine produced the copy is irrelevant to the test.
Third, training: is the act of training on copyrighted material itself an infringement? Recent federal decisions evaluate this question within fair use doctrine. Training can qualify as transformative when sufficiently distinct from the original use, but transformativeness has limits, particularly when the secondary use occupies the same commercial market as the original.
Fourth, output: when the model produces something, does that output unlawfully reproduce protected expression? This is the decisive infringement question, and it should be evaluated tech-neutrally: the same way courts have always evaluated whether one work infringes another, regardless of the tool that produced it.
When all four layers blur into "AI is piracy," regulatory responses drift toward broad restrictions on the technology itself rather than toward the layer where the alleged wrong actually occurred.
Stop using "piracy" as the umbrella term for AI copyright disputes. Identify which layer is actually in play in any given case, then apply the doctrine built for that layer. Acquisition disputes go to acquisition law. Output disputes get evaluated as expressive works. Training disputes proceed through fair use analysis. Each gets the framework it was designed for.
The companion concept, the Wizard of AI Curtain Test, provides a heuristic for one specific part of this: how to evaluate output infringement without letting the existence of AI itself bias the analysis.
"Copyright-Piracy" because that is the conflation: copyright claims made through piracy framing. "Confusion" because it is an analytical error, not a legal doctrine. The label is the diagnosis: the conversation is confused, and the confusion is the problem.
Founder of Cinderpoint Systems LLC. M.S. Artificial Intelligence (MSAI), M.S. Management (MSM). Researches how systems fail under speed, opacity, and scale.