Anthropic's newest feature for two of its Claude AI models could be the beginning of the end for the AI jailbreaking community. The company announced in a post on its website that the Claude Opus 4 and 4.1 models now have the ability to end a conversation with users. According to Anthropic, this feature will only be used in "rare, extreme cases of persistently harmful or abusive user interactions."
To clarify, Anthropic said these two Claude models can exit harmful conversations, such as "requests from users for sexual content involving minors and attempts to solicit information that would enable large-scale violence or acts of terror." With Claude Opus 4 and 4.1, the models will only end a conversation "as a last resort when multiple attempts at redirection have failed and hope of a productive interaction has been exhausted," according to Anthropic. However, Anthropic says most users won't experience Claude cutting a conversation short, even when discussing highly controversial topics, since this feature will be reserved for "extreme edge cases."