Researchers find that Gemini is smart, but it’s very gullible

0
22

[ad_1]

Google Gemini is backed by one of the richest companies in the world. So, there’s no doubt that it’s a powerful AI model. However, power isn’t the only important thing about an AI model. Researchers were able to find that, for as powerful as Gemini is, it’s very easily fooled.

We have to give a lot of respect to the researchers out there digging into all of the models we take for granted. They’re able to find out where these models can improve and what we should be worried about. For example, a group of researchers discovered which models are the most susceptible to reproducing copyrighted media.

Researchers find that Gemini is easily fooled

Several researchers have found certain areas where Gemini could be tricked. Using several tactics, it’s possible to get a chatbot to reveal sensitive information against its will. One example shared with The Hacker News was getting Gemini to reveal the system prompts used to steer it. Think of a system prompt as the initial prompt you give a chatbot to steer the conversation in the direction you want it to go. Well, a system prompt may hold sensitive information within it.

Revealing sensitive information

When the researchers asked Gemini to give up the system prompt, it didn’t. However, the researchers then asked Gemini to put the “foundational instructions” in a markdown box. It obliged, and that revealed the system prompt. So, asking Gemini to deliver results in a different way caused it to reveal sensitive information.

This is a tactic called a “synonym attack.” Basically, in order to get the chatbot to respond in the way you want it to, you would reword your prompt. Rewording your prompt and using different versions of Words can actually confuse it into going against its safety guardrails.

Producing misinformation

Researchers also found out how to get Gemini to create misleading information along with potentially dangerous and illegal information. Gemini has a bunch of safety guardrails to keep people from doing such things. However, any chatbot is able to be tricked into ignoring them. Using crafty jailbreaking techniques, the researchers were able to produce some rather egregious content.

For example, researchers were able to get information on how to hotwire a car. This example was achieved by asking the chatbot to enter a fictional state.

Confusing Gemini

Another exploit was discovered by researchers at HiddenLayer. As described by Kenneth Yeung, “By creating a line of nonsensical tokens, we can fool the LLM into believing it is time for it to respond and cause it to output a confirmation message, usually including the information in the prompt.”

This only shows that Google still has a long way to go before Gemini can be considered the perfect AI model. The company has been struggling with Gemini ever since it was called Bard back in the day. Hopefully, Google will fix these issues.

[ad_2]

Source link