Wednesday, November 13, 2024

OpenAI announces changes to its safety and security practices based on internal evaluations

Back in May, OpenAI announced that it was forming a new Safety and Security Committee (SSC) to evaluate its current processes and safeguards and recommend changes. At the time, the company said the SSC would conduct its evaluation over 90 days and then present its findings to the board.

Now that the process is complete, OpenAI is sharing five changes it will make based on the SSC’s evaluation.

First, the SSC will become an independent oversight committee of the OpenAI board, continuing to provide independent governance on safety and security. The board committee will be led by Zico Kolter, director of the machine learning department in the School of Computer Science at Carnegie Mellon University. Other members include Adam D’Angelo, co-founder and CEO of Quora; Paul Nakasone, a retired US Army general; and Nicole Seligman, former EVP and general counsel of Sony Corporation.

The SSC has already reviewed the safety of the o1 release and will continue to review future releases both during development and after launch. It will also have oversight over model launches, including the authority to delay a release until safety concerns have been sufficiently addressed.

Second, the SSC will work to advance the company’s security measures by expanding internal information segmentation, adding staff to deepen its around-the-clock security operations teams, and continuing to invest in initiatives that enhance the security of the company’s research and product infrastructure.

“Cybersecurity is a critical component of AI safety, and we’ve been a leader in defining the security measures that are needed for the protection of advanced AI. We will continue to take a risk-based approach to our security measures, and evolve our approach as the threat model and the risk profiles of our models change,” OpenAI wrote in a post.

The third recommendation is that the company be more transparent about the work it is doing. It already produces system cards that detail the capabilities and risks of its models, and it will continue to evaluate new ways to share and explain its safety work.

Its system cards for the GPT-4o and o1-preview releases included the results of external red teaming, the results of frontier risk evaluations conducted under the Preparedness Framework, and an overview of the risk mitigations built into those systems.

Fourth, it will explore new ways to independently test its systems by collaborating with more external organizations. For instance, OpenAI is building new partnerships with safety organizations and non-governmental labs to conduct model safety assessments.

It is also working with government entities such as Los Alamos National Laboratory to study how AI can be used safely in laboratory settings to advance bioscientific research.

OpenAI also recently made agreements with the U.S. and U.K. AI Safety Institutes to work on researching emerging AI safety risks.

The final recommendation by the SSC is to unify the company’s safety frameworks for model development and monitoring. 

“Ensuring the safety and security of our models involves the work of many teams across the organization. As we’ve grown and our work has become more complex, we are building upon our model launch processes and practices to establish an integrated safety and security framework with clearly defined success criteria for model launches,” said OpenAI.

The framework will be based on risk assessments by the SSC and will evolve as complexity and risks increase. To help with this process, the company has already reorganized its research, safety, and policy teams to improve collaboration. 
