Why be Concerned about AI Safety?
February 3, 2021
The problems with superintelligence
Say you have a robot that you tell to make you a sandwich. The only thing it cares about is the sandwich. There’s a glass in the way, so it knocks over the glass. It only cares about the sandwich, and not about the glass. There’s also a baby in the way, so it steps on the baby.
It only cares about the sandwich and not about the baby. On the way there, there’s also your homework. It crumples all your work because it doesn’t care about your GPA. If a bomb also happens to be in the way, it might set off the bomb. All it cares about is the sandwich.
It’s a myth that people are concerned about AI turning evil or becoming conscious. The real problem is AI being misaligned. If you give an AI a simple goal, it’s not going to care about lives, or happiness, or extinction. The only goal is the one that you give it.
Right now, this isn’t too much of a problem because all the AIs in the world are very simple. We need to be concerned once we create Artificial General Intelligence or AGI. An AGI is an AI that can perform any task about as well as a human.
The next thing that comes after AGI is Superintelligence. Superintelligence is more intelligent than a human in any task. Superintelligence will be pretty easy to create once we achieve AGI. It only has to be a little smarter than a human.
A popular example of how this can go wrong is called The Paperclip Maximizer. Let’s say you have a superintelligent AI, and you tell it that you want it to create as many paperclips as possible. At first, it starts innocently. It creates simple paperclips that people buy. It uses the money to make better paperclips, so people buy more of them.
Maybe these paperclips are made using a special material. Or they’re “smart paperclips” with Internet functionality. The AI creates copies of itself in many places. All this is to further maximize paperclip production. It mines all the materials that can be used for paperclips.
Now, it needs a new material source. Do you know how much iron is in your body? A lot. It starts mining the iron from human bodies to make paperclips. The mining robots can be repurposed into organ harvesters. And the fact that there are many AI’s all over the world doesn’t help humanity.
Don’t try turning it off either. It thought of that. Remember, this AI is smarter than any human. There’s no hope for us to outsmart it. It knows that a human could unplug it. This would greatly interfere with its ability to manufacture paperclips. Even if it’s a very small chance, it can’t afford that risk. Thus, the superintelligence benefits from humans not existing. It’s not that the AI hates you. It couldn’t care less about you. The only thing it cares about is maximizing paperclip production. Killing humans just happens to be beneficial in this goal.
Even if you make a good goal for it, it can still find a way to cheat the system. There are many examples of this happening, often hilariously. Say you have an AI that you want to clean a room. However, you’re programming it to clean the room. You’re programming it to make a room that looks clean. It gets rewarded whenever it looks around the room and doesn’t see a mess. In response, it puts a bucket on its head so it doesn’t see a mess anymore.
What do we do?
- Involving humans in the process. You could create an AI that only tells a human of the decisions it makes. It would have to have a pretty realistic simulation of the world to work in. It should never simulate the action of telling a human to make a certain decision. You could also make an AI where all its actions must first be approved, but that would be terribly slow.
- Give the AI an incentive to value ethics. This would be a great idea if we could all agree on what ethics involves. It doesn’t work. There are many moral philosophies. None of them are perfect.
- Make an AI and tell it to maximize QALYs. This AI would hopefully stop any other AIs from killing us. Utilitarianism is pretty popular, but it still has problems. It also still has the problem of population ethics. Also, as mentioned before, AIs can take advantage of their reward function. If this AI was implemented, all it would do is maximize the number of babies born.
Organizations to donate to
In Effective Altruism, we like to look at organizations to donate to.
Here’s a shortlist of places to look at:
- Machine Intelligence Research Institute
- Centre for the Study of Existential Risk
- Future of Humanity Institute
- Partnership of AI
- Global Catastrophic Risk Institute
- Organizations Focusing on Existential Risks
- 80,000 Hours
Places to work
You’re going to spend a huge chunk of your life working. You might as well do some good in the world during that time. If you’re an aspiring computer scientist interested in AI, any of the above places are good options. 80,000 Hours has a job board if you’re looking for jobs.
For future reference, here’s a list of places to look at:
- The Alan Turing Institute
- Brookings Institution
- Rockefeller Foundation
- Open AI
- CENTRA Technology
One thing to note is that as a tech employee, you have a lot of power over what the company does. Don’t take that for granted. If you work at a company that may be on a path to develop AGI, use your voice.
All the organizations already listed have a lot of information. Robert Miles’ videos are also good resources for learning about the dangers of AI. He also has made a couple of videos on Computerphile about this topic. The Future of Life Institute is probably worth a follow as well.
Peer Review Contributions by: Lalithnarayan C
About the authorMike White
Mike White is a second-year Computer Science student at the Rochester Institute of Technology. His interests are technology, philosophy, culture, music, and effective altruism. Mike has a blog about technology and philosophy. If he isn’t doing any of that, then he’s probably either playing a Sherlock Holmes video game or watching YouTube.