US Representatives Anna Eshoo and Don Beyer have proposed the AI Foundation Model Transparency Act to enhance transparency in the training data used for AI systems and thus help prevent copyright infringement. The bill proposes stringent measures for creators of foundation models, compelling them to disclose the sources of their training data. If enacted, it would direct regulatory bodies such as the Federal Trade Commission (FTC) and the National Institute of Standards and Technology (NIST) to establish rules for the transparent reporting of training data.
Companies developing foundation models would be required to provide comprehensive reports covering the sources of their training data, how data is retained during inference, the model's limitations and associated risks, alignment with NIST's AI Risk Management Framework, and adherence to any forthcoming federal standards. Additionally, the bill would require AI developers to prevent the generation of 'inaccurate or harmful information' in areas including medical information, cybersecurity, elections, and services impacting vulnerable populations.
Essentially, the bill comes in response to the increase in lawsuits and public apprehension related to copyright infringement. It emphasizes that the use of foundation models in public settings has resulted in instances where the public is exposed to inaccurate, imprecise, or biased information. The bill is now awaiting assignment to a committee for further discussion, and its timing remains uncertain amid the upcoming election campaign season.
As reported by The Verge, the proposed legislation aligns with the Biden administration's AI executive order, offering a complementary regulatory framework. While the executive order serves as a guideline, the AI Foundation Model Transparency Act, if passed, would elevate transparency requirements for training data to the status of federal law, addressing pressing concerns surrounding AI-generated content and its legal implications.