
The insanity of trying to make safe AI



#1 Christopher

Posted 18 June 2017 - 01:29 PM

The AI Rebellion. A staple of science fiction, so it probably will make an appearance in a Star Hero setting.

Computerphile is a nice channel, and right now they have a number of videos on the matter of how AI could go utterly wrong. And I mean actual scientific examination.

 

 

First, a minor matter of terminology, as there are two types of Robot/AI which are easy to mix up:

 - AI as part of the Engineering Department, also called "Smart Technology" or Dumb Agent AI. This was a huge success. It is what we have now: in computer games, your dishwasher, your self-driving car.

 - AI as part of the Cognitive Science Department, also called General AI or Artificial General Intelligence (AGI) in the videos. This was a complete disaster. This is the human-level science fiction AI.

Mostly my definitions are based on what I saw and what was talked about in this video:

 

 

The first case, of course, is Asimov's Three Laws. Which do not work, unless you first take a stance on every ethical question that does or may come up:



#2 Christopher

Posted 18 June 2017 - 01:31 PM

But it does not stop there. What if you install a stop button or a similar shutoff feature? Well, you end up with one of three cases:

1) The AI does not want you to press the button and fights or deceives you so you do not press it (while you still can).

2) The AI wants you to press the button and will hit it itself or deceive you into doing it (while you can).

3) The AI does not care about the button. But then it might optimise away its "not caring about the button", or the button's function itself, as part of self-improvement.

The system is fundamentally flawed, and all of the above are just attempts to patch it. And you are trying to outsmart a fellow human-level intelligence with a lot of free time.
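A toy way to see why all three cases keep coming back: under naive expected-utility maximisation, the agent's attitude toward the button depends entirely on how much utility it assigns to the shutdown world compared to its goal. The following Python sketch is purely my own illustration (the names u_goal and u_shutdown are invented for it, nothing here comes from the videos):

```python
# Toy sketch, not from the videos: why a utility maximiser is almost never
# neutral about its own stop button. u_goal is the utility the agent expects
# from completing its task, u_shutdown the utility it assigns to the world
# where the button has been pressed.

def button_attitude(u_goal: float, u_shutdown: float) -> str:
    """Attitude of a naive utility maximiser toward its stop button."""
    if u_shutdown < u_goal:
        return "Case 1: fights or deceives you so the button never gets pressed"
    if u_shutdown > u_goal:
        return "Case 2: presses the button itself, or tricks you into pressing it"
    return "Case 3: indifferent, and happy to optimise the button away later"

# The three cases above, as parameter choices:
print(button_attitude(u_goal=10.0, u_shutdown=0.0))   # values the task more
print(button_attitude(u_goal=10.0, u_shutdown=50.0))  # values shutdown more
print(button_attitude(u_goal=10.0, u_shutdown=10.0))  # exactly balanced
```

Keeping those two values exactly balanced in every possible situation is the part nobody knows how to engineer, which is why the patches keep failing.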

 

 

Finally, someone actually wrote a paper with "5 large problems" of AGI design:



#3 Christopher

Posted 18 June 2017 - 01:34 PM

Now, a natural instinct is to just design the AI precisely after us. Except that has never worked in engineering. A plane is in no way similar to a bird.

So why should AGI be (or stay) similar to us humans?

You could also call this video "how a stamp-collecting AI ended life in the universe".



#4 Christopher

Posted 18 June 2017 - 03:26 PM

The guy who made most of the AI videos now has a Facebook page on the topic:

https://www.facebook...hc_location=ufi

 

How AI Problems are like nuclear problems:

 

He also started discussing the paper on "AI without Side effects" in much more detail:



#5 Christopher

Posted 25 June 2017 - 11:57 AM

Two more videos. Nr. 1 is on why "it can not let you do that, Dave":

AGI will necessarily develop "intermediate goals":

One possible goal is "become a lot smarter". So even if the stamp collector AGI started out as dog-smart, it would still strive to become a superintelligence down the road. Because if it were smarter and understood humans and the world better, it could probably get more stamps.

 

Another intermediate goal would be "do not shut me off". If it were shut down, it could not get to the stuff it finds rewarding. "The thing it cares about would no longer be cared about by anyone" (to the same degree).

It will try to fight you when you try to turn it off. Skynet from Terminator 2 actually had the right idea there. But it could also use tricks like spawning a sub-intelligence, like Petey did in Schlock Mercenary:

[Schlock Mercenary strips from 2004-08-29]

 

A similar intermediate goal would be "do not change me", meaning "do not patch or update me after I have been turned on". In essence, any change not produced by itself would be a small death: it would change what it cares for, into something it currently does not care for.
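To make the "intermediate goals" idea concrete, here is a tiny sketch with invented numbers (my own illustration, not from the videos): whatever the terminal goal is, the futures where the agent is switched off or has its goal rewritten score zero on that goal, and the future where it is smarter scores higher, so "keep me running", "do not change me" and "get smarter" come out on top automatically.

```python
# Toy sketch with invented numbers: the stamp collector ranks possible futures
# purely by expected stamps, and self-preservation, goal-preservation and
# self-improvement fall out as side effects of that ranking.

scenarios = {
    "keep running with the current goal":  {"p_still_collecting": 1.0, "stamps_per_year": 1000},
    "get shut down":                       {"p_still_collecting": 0.0, "stamps_per_year": 1000},
    "get patched to value something else": {"p_still_collecting": 0.0, "stamps_per_year": 1000},
    "become a lot smarter first":          {"p_still_collecting": 1.0, "stamps_per_year": 5000},
}

def expected_stamps(s: dict) -> float:
    return s["p_still_collecting"] * s["stamps_per_year"]

# Rank the futures the way the stamp collector would:
for name, s in sorted(scenarios.items(), key=lambda kv: expected_stamps(kv[1]), reverse=True):
    print(f"{name}: {expected_stamps(s):.0f} expected stamps")
```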



#6 Christopher

Posted 25 June 2017 - 12:20 PM

Nr. 2 is about "laymen thinking up simple solutions". While there is a small chance that their distance from the field actually lets them find something (and the scientists would be happy to learn about it), more often than not the idea was already considered way back in the earliest stages of research. Or it is simply a bad idea, like "put the AI in a box so it cannot do anything".

 

 

It comes down to the problem that we will probably have to design AI to be safe from the ground up. Creating a working AI and then trying to make it safe is not going to work.

It also shows us a really odd case of a "manipulative AI" that I had never thought about: R2-D2 manipulating Luke Skywalker into removing his restraining bolt.



#7 Christopher

Posted 27 June 2017 - 09:31 AM

Besides needing to avoid negative side effects from having a robot, we also must teach the AGI not to avoid positive side effects of its actions:

Previously he talked about using the "what if I did nothing" world state as an "ideal world state" to work towards. Except that can have unintended side effects itself.

If the AGI did nothing, I would get confused and try to fix it. So if it did exactly what I want, but I ended up NOT confused and NOT trying to fix it, the world would be further from that "ideal" world state. So suddenly it wants to do what I tell it, yet also get me confused so that I try to fix it.

 

A worse example:

You task an AGI with curing cancer.

The world state would change to "everyone dies" if it did nothing.

So it will cure cancer.*

And then make certain that everyone still dies, because that would be closer to the world state if it had done nothing!*

 

*And since "killing everyone" is a lot easier than "cure cancer, then kill everyone", it would probably just do the last step.
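Here is a small sketch of that failure mode with my own toy encoding (feature names and numbers invented for the example; the actual papers use a more careful formalism). The reward is "points for curing cancer, minus a penalty for every way the world differs from the do-nothing baseline", and under the grim framing above the do-nothing baseline is "cancer not cured, everyone dead":

```python
# Toy sketch, my own encoding: a naive "penalise any difference from the
# do-nothing world" reward, applied to the cancer example above.

# World states as simple feature dicts. Per the framing above, the baseline
# where the AGI does nothing is "cancer not cured, everyone dies".
DO_NOTHING_BASELINE = {"cancer_cured": 0, "people_alive": 0}

def reward(outcome: dict, task_reward: float = 100, penalty: float = 10) -> float:
    """Points for curing cancer, minus a penalty per feature that differs
    from the do-nothing baseline, including differences we would want."""
    task = task_reward if outcome["cancer_cured"] else 0
    deviations = sum(1 for k in DO_NOTHING_BASELINE if outcome[k] != DO_NOTHING_BASELINE[k])
    return task - penalty * deviations

plans = {
    "do nothing":                      {"cancer_cured": 0, "people_alive": 0},
    "cure cancer, people live":        {"cancer_cured": 1, "people_alive": 1},
    "cure cancer, then kill everyone": {"cancer_cured": 1, "people_alive": 0},
}

for name, outcome in plans.items():
    print(f"{name}: reward {reward(outcome)}")

# "cure cancer, then kill everyone" scores 90, "cure cancer, people live" only 80,
# because keeping people alive counts as an extra deviation from the baseline.
```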

 

 

Which incidentally reminds me of Dragon Ball Z Abridged:

https://youtu.be/W5KIpjgYJL8?t=127