GuruFocus

OpenAI Unveils New AI Models to Improve Online Safety

OpenAI has released two new AI models built for online safety work: gpt-oss-safeguard-120b and gpt-oss-safeguard-20b. The Microsoft (MSFT)-backed company said the models help developers detect online safety risks, such as fake reviews or harmful content, while offering greater transparency into how moderation decisions are made.

Unlike fully open-source models, these are open-weight: developers can download and inspect the models' trained parameters, but the underlying training code and data are not released. That approach gives users more control and insight without compromising security. Each model applies a safety policy that the developer supplies, and it surfaces its reasoning process, helping developers understand why specific content was flagged.
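The announcement itself includes no code, but the described workflow, a developer-written policy plus the content to check, maps naturally onto a standard chat-style call. Below is a minimal sketch assuming the 20B model is published as `openai/gpt-oss-safeguard-20b` on Hugging Face and accepts the policy as a system message via the transformers chat pipeline; the model ID, prompt structure, and labels are illustrative, so consult the actual model card before relying on them.

```python
# Minimal sketch: classifying content against a developer-supplied safety
# policy with gpt-oss-safeguard-20b via Hugging Face transformers.
# Model ID and prompt format are assumptions based on the announcement.
from transformers import pipeline

MODEL_ID = "openai/gpt-oss-safeguard-20b"  # assumed Hugging Face repo name

# The developer writes the policy; the model classifies content against it
# and explains its reasoning, which is the transparency angle of the release.
policy = (
    "Policy: flag product reviews that appear fabricated, e.g. generic "
    "praise with no concrete product details, or text repeated across "
    "listings. Answer VIOLATES or ALLOWED, then explain your reasoning."
)
content = "Best product ever!!! Five stars!!! Buy now!!!"

# device_map='auto' requires the accelerate package; even so, the 20B model
# needs substantial GPU memory to run locally.
classifier = pipeline(
    "text-generation", model=MODEL_ID, torch_dtype="auto", device_map="auto"
)
messages = [
    {"role": "system", "content": policy},
    {"role": "user", "content": content},
]
result = classifier(messages, max_new_tokens=256)
# For chat inputs, the pipeline returns the full conversation; the last
# message is the model's verdict and explanation.
print(result[0]["generated_text"][-1]["content"])
```

The notable design choice, as described in the announcement, is that the policy is provided at inference time rather than baked into training, so the same model can enforce different rules for different platforms.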

The models were developed in partnership with ROOST (Robust Open Online Safety Tools) and have been tested with Discord and SafetyKit. They are now available for research on Hugging Face, and OpenAI is inviting feedback from the AI safety community as it works to build trust and accountability into its tools.