Section 230a: The Freedom for AI Innovation and Research Use Act (the “FAIR Use Act”)

In its lawsuit against OpenAI, the New York Times has requested that any AI model trained on its data – which includes GPT-3/4, Claude, Mistral, Llama/Llama 2 and, per Dan Jeffries, “any other model in existence” – be destroyed. Literally, from the pleadings, “ordering destruction under 17 USC § 503(b) of all GPT or other LLM models and training sets that incorporate Times Works.”

Old Media wants to kill AI because AI will put Old Media out of business. The industry’s response to this should be like the Internet industry’s response to Stratton Oakmont v. Prodigy. Namely, we should legalize training and using AI models, forever.

We would do this by amending the provisions of the U.S. Code added by the Communications Decency Act in 1996 with a new law, the Freedom for AI Innovation and Research Use Act (or “FAIR Use Act”), which would protect AI in substantially the same manner that the CDA wound up protecting the consumer Internet. That protection enabled (as we all know) a multi-decatrillion-dollar, three-decade economic boom in the United States and cemented our position as the richest country on Earth.

So it’s something we, as a society, might wish to repeat.

To achieve that, we need a new law which ensures that the misuse of AI is punishable by the civil court system but also ensures the mere bringing into existence of, or facilitating access to, an AI is not so punishable. Users of an AI model who use the model to infringe someone’s copyright or violate the rights of others should suffer penalties, but developers and hosts of AI models who do not specifically conspire to bring about that misuse should not be made vicariously liable for that misuse.

This law must therefore address two very specific legal problems: (a) copyright infringement liability for training an AI model and (b) copyright infringement and other liabilities arising out of hosting and using an AI model.

First, because AI models don’t store copies of the data on which they’re trained during training runs, we need a law that deems training of AI models which does not involve retaining a copy of the training data to be, presumptively, fair use for the purposes of the Copyright Act. This is dealt with in draft Section 230a(c), below.

Second, there is the issue of publisher liability in relation to copyrighted and other information (e.g. defamation) arising from normal use of an AI model, where users provide prompts and other information as inputs and the model automatically publishes its responses to those prompts as outputs. Although there is an argument that this kind of activity is covered by existing Section 230 (except for copyright, which is expressly carved out of Section 230’s application and is generally handled in the consumer Internet context by the notice-and-takedown regime under the DMCA), there are good arguments the other way, too. The issue is that, to the extent Section 230 is unclear as to whether the output of an AI model is user-generated content for the purposes of that section, AI model providers might find themselves defending enormous and numerous lawsuits for the torts of their users, in addition to copyright problems arising from the fact that the DMCA is outdated and no longer fit for purpose. This is an unacceptable situation for any mass-use consumer application. It is the same kind of problem, so huge and so potentially threatening to the development of the web, that led to the enactment of Section 230(c)(1) of the Communications Decency Act in 1996 to protect the early Internet from destruction-by-litigation.

Regarding liability for this user input and output data, I therefore propose to handle the publishing infringement problem with draft 230a(d)(1), below, which mirrors existing Section 230(c)(1) save that IP law is not carved out of 230a(d)(1). Because outdated copyright law is the issue with AI, the appropriate response to facilitate development of the industry is to immunize it from copyright and frankly all other IP concerns completely and push the liability for any infringements back on the user, much as Section 230 did for the non-IP content of speech on the Web.

Here’s how this law might look in the statute books:

47 U.S.C. § 230a

a. Findings.

This Congress finds the following:

The rapidly developing array of artificial intelligence (“AI”) services available to individual Americans represents today, much as the developing array of Internet services did thirty years ago, an extraordinary advance in the productive capacity and availability of educational and informational resources to our citizens.
AI offers the chance for individual self-fulfillment. Like the Internet, the flourishing of AI services will benefit all Americans most with the minimum of government regulation.

b. Policy.

It is the policy of the United States to permit, promote, and encourage the development of AI Models to the maximum possible extent, with a view towards the United States becoming the unchallenged AI leader in the world. Achieving this requires that AI be unhindered by Federal, State, or foreign regulation, or by copyright infringement lawsuits with regard to the training data which are necessary prerequisites to fuel these new, digital engines of scientific advancement and economic development.

c. Protection for use of copyrighted material as training data.

Use of a copyrighted work to train an AI Model, where that model or associated databases do not store a substantially complete copy of the copyrighted work after completion of the training, shall be “fair use” of that copyrighted work for the purposes of 17 U.S.C. § 107.

d. Protection for AI Model providers.

(1) No provider or user of an AI Model shall be treated as the publisher or speaker of (i) any input to any AI Model provided by another information content provider, and (ii) any output of an AI Model resulting, in whole or in part, from any input provided by another information content provider.
(2) “Information content provider” has the meaning given to it in 47 U.S.C. § 230(f)(3).
(3) “AI Model” means any service or system which contains, provides or enables access to a computer program which utilizes AI and which is made accessible to any user or computer, regardless of whether such users or computers access such computer programs via downloadable program, remotely via a computer server or over the Internet, or otherwise.

e. No effect on Federal criminal law or consistent State law.

Nothing in this section shall be construed to impair the enforcement of any Federal criminal statute.
Nothing in this section shall be construed to prevent any State from enforcing any State law that is consistent with this section. No cause of action may be brought and no liability may be imposed under any State or local law that is inconsistent with this section.

f. Application of foreign law or regulation contrary to U.S. law barred.

A domestic court shall not recognize or enforce any foreign judgment, ruling, fine, or other legal, administrative, or other official penalty which that domestic court, applying the provisions of this section, would itself be barred from applying by this section or any other provision of U.S. Federal or State law.
