ChatGPT creators try to use artificial intelligence to explain itself – and come across major problems

ChatGPT’s creators have attempted to get the system to explain itself.

They found that while they had some success, they also ran into problems – including the possibility that artificial intelligence is using concepts that humans have no names for, or no understanding of.

Researchers at OpenAI, which developed ChatGPT, used the most recent version of its model, known as GPT-4, to try to explain the behaviour of GPT-2, an earlier version.

It is an attempt to overcome the so-called black box problem with large language models such as GPT. While we have a relatively good understanding of what goes into and comes out of such systems, the actual work that goes on inside remains largely mysterious.

That is not only a problem because it makes things difficult for researchers. It also means that there is little way of knowing what biases might be involved in the system, or whether it is providing false information to people using it, since there is no way of seeing how it reached its conclusions.

Engineers and scientists have aimed to resolve this problem with “interpretability research”, which seeks to find ways to look inside the model itself and better understand what is going on. Often, this requires looking at the “neurons” that make up such a model: just like in the human brain, an AI system is made up of a host of so-called neurons that together make up the whole.

Identifying individual neurons and their purpose is difficult, however, because until now humans have had to pick through them and manually inspect each one to find out what it represents. With some systems containing hundreds of billions of parameters, getting through them all by hand is impossible.

Now, researchers at OpenAI have used GPT-4 to automate that process, in an attempt to analyse the behaviour more quickly. They did so by creating an automated pipeline in which the system produces natural language explanations of a neuron’s behaviour – and applying it to another, earlier language model.

That worked in three steps: showing a neuron in GPT-2 to GPT-4 and having it propose an explanation, then simulating what a neuron matching that explanation would do, and finally scoring the explanation by comparing the simulated activations with the real ones.
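The three steps can be sketched in miniature. The snippet below is a toy illustration, not OpenAI’s actual code: the neuron, its activations, and the simple correlation score are all hypothetical stand-ins for the GPT-4-driven explanation and simulation steps.

```python
from statistics import mean

def score_explanation(real, simulated):
    # Step 3 in miniature: score an explanation by how well the
    # activations simulated from it correlate with the neuron's
    # real activations (Pearson correlation; 1.0 = perfect match).
    mr, ms = mean(real), mean(simulated)
    cov = sum((r - mr) * (s - ms) for r, s in zip(real, simulated))
    var_r = sum((r - mr) ** 2 for r in real)
    var_s = sum((s - ms) ** 2 for s in simulated)
    return cov / (var_r * var_s) ** 0.5

# Step 1 (imagined): GPT-4 inspects these activations and proposes
# the explanation "this neuron fires on the word 'cat'".
tokens = ["the", "cat", "sat", "on", "a", "cat"]
real = [0.0, 0.9, 0.1, 0.0, 0.0, 0.8]  # observed activations (toy data)

# Step 2 (imagined): a simulator predicts activations from that explanation.
simulated = [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]

print(round(score_explanation(real, simulated), 3))  # high score: explanation fits
```

In the real pipeline both the explanation and the simulation are produced by GPT-4 itself; the point here is only the shape of the loop – propose, simulate, compare.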

Most of those explanations scored poorly. But the researchers said they hoped the experiment showed that, with further work, it would be possible to use AI technology to explain itself.

The creators came up against a range of “limitations”, however, that mean the system as it exists now is not as good as humans at explaining the behaviour. Part of the problem may be that explaining how the system works in plain language is impossible – because it may be using individual concepts that humans cannot name.

“We focused on short natural language explanations, but neurons may have very complex behaviour that is impossible to describe succinctly,” the authors write. “For example, neurons could be highly polysemantic (representing many distinct concepts) or could represent single concepts that humans don’t understand or have words for.”

It also runs into problems because it focuses on what each neuron does individually, not on how that might affect things later in the text. Similarly, it can explain specific behaviour but not the mechanism producing it, and so might spot patterns that are not actually the cause of a given behaviour.

The system also uses a lot of computing power, the researchers note.
