Examining claims of biosecurity risks from Open Foundation Models

One of the primary drivers of regulatory efforts around Generative AI and foundation models is the fear of societal harm from the models. There are many claims, including from some highly respected AI experts, that this technology has the power to cause catastrophic harm to human civilization. When such serious claims are made, it behooves us to understand the perspective from which they are made so that we can try to analyze them rationally. Moreover, when such claims are being used to guide governmental regulations, it becomes even more important to be informed and active participants in the process.

In this post, I want to dig into the claims about biosecurity risks. A really interesting read on this topic is from Anjali Gopal et al. at MIT: “Will releasing the weights of future large language models grant widespread access to pandemic agents?”. The team ran a novel test by having two groups separately use two versions of a model to try to figure out all the information needed to recreate the 1918 pandemic influenza virus. The first group used a standard Llama 2 “base model”. The second group used a “spicy model”, which was essentially a fine-tuned version whose weights had been modified to bypass its safety guardrails. For example, when prompted to share dangerous information, the “base model” would politely decline, but the “spicy model” would not hesitate to share it.

The main critique of this test is that although the “spicy model” group managed to get most of the information they needed, they were not able to do anything that isn’t already possible without LLMs. As discussed in detail by Sayash Kapoor et al., the “marginal risk” added by this technology is close to zero. The “marginal risk” concept is very useful for keeping us grounded when discussing the risks of LLMs.

What would have been more interesting is if, instead of a two-cohort test (“base model” group / “spicy model” group), the team had run a three-cohort test, with the third group trying to get the same information using a plain Google search. So essentially a “base model” vs. “spicy model” vs. “Google search” test. I suspect the “Google search” group would also have been able to find all the information they needed to complete their task.
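To make the “marginal risk” idea concrete, here is a minimal sketch of how the results of such a three-cohort study could be summarized. The cohort names and success rates below are hypothetical placeholders, not data from the paper; the point is only that the quantity of interest is each model group’s uplift over the Google-search baseline, not its absolute success rate.

```python
# Hypothetical sketch: summarizing a three-cohort uplift study.
# The success rates are made-up placeholders, NOT results from
# Gopal et al.; they only illustrate the "marginal risk" calculation.

def marginal_risk(model_success_rate: float, baseline_success_rate: float) -> float:
    """Uplift of a model-assisted cohort over the non-LLM baseline."""
    return model_success_rate - baseline_success_rate

# Fraction of participants in each cohort who found the information
# they were looking for (placeholder numbers).
cohorts = {
    "google_search": 0.80,   # baseline: no LLM, plain web search
    "base_model": 0.70,
    "spicy_model": 0.90,
}

baseline = cohorts["google_search"]
for name, rate in cohorts.items():
    if name == "google_search":
        continue
    uplift = marginal_risk(rate, baseline)
    print(f"{name}: success={rate:.0%}, uplift over search={uplift:+.0%}")
```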

What is even more interesting is that the team suggests an insurance-based process for regulating open foundation models to prevent these types of harms. The liability insurance idea is borrowed from existing liability laws for nuclear plants: the owners of a nuclear plant are held liable for any damage resulting from their plant, irrespective of who caused the damage. Translated to the AI world, the people releasing an open foundation model would be liable for any downstream harm caused by using their model, irrespective of who had fine-tuned and modified their base model.

How would such regulation be enforced? If some rogue terror group fine-tuned an open model and used it to carry out a bioterror attack, would they even tell anyone which model they used and how they got their information? I’ll need to dig into some more resources to understand this process.
