
The insanity of trying to make safe AGI


Christopher


One idea is "Supervise them", at least during the learning phase. Aside from the issue that a powerful AI would be incentivised to "Manipulate, Deceive or Modify the Human" because the human is part of its reward function, learning would be glacially slow. Current self-learning neural networks need enormous numbers of training iterations, something human supervision does not scale to.
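A rough back-of-envelope sketch of why per-episode human review does not scale; every number here is my own illustrative assumption, not a measurement:

episodes_needed = 10_000_000          # assumed: episodes a self-learning agent burns through
seconds_per_review = 30               # assumed: time for a human to check one episode
work_seconds_per_year = 2000 * 3600   # roughly 2000 working hours in a year

human_years = episodes_needed * seconds_per_review / work_seconds_per_year
print(f"One full-time supervisor would need about {human_years:,.0f} years.")
# -> about 42 years of review for episodes the agent could generate in days.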

 


1 hour ago, megaplayboy said:

Hmm.  If a human alone is incapable of supervising at sufficient speed, what about a human assisted by a limited AI designed to flag instances of concern, and to "correct" the most basic instances of error or misbehavior?  

It is exactly the things the limited AI cannot learn that we need to teach the AGI. Otherwise we could just stick with the limited AI, which would be much safer for humanity.

 

As discussed under I/O speed: if there already is a perfect limited AI or program that can do something, the AGI will most likely just run that one.


  • 1 month later...
  • 9 months later...

The part about the way an AI looks at "Self-Preservation" and "Goal Preservation" is interesting. Let us take a story most people do not like: the Catalyst of Mass Effect 3, in particular including what Leviathan has to say on the matter:

 

The Leviathans kept wondering: "Why do AIs always murder their creators after a few thousand years?"

So they made an AI with the Terminal Goal ("Mandate"): Preservation of Life. Facilitate peaceful relations between Organics and Synthetics.

Probably for goal preservation, it first harvested its creators, because they had real power to disrupt its work.

Over the cycles it tried a number of solutions, like a forced Synthesis ending, only to learn that "it can not be forced". It also tried "maybe the AIs do not turn evil eventually"? That failed as well.

 

So as the next best thing, it picked the harvests: it staves off the problem by resetting galactic civilisation and interfering with any running AI rebellion, it conserves as much as possible, and it hopes that someday a better solution can be found.

 

The moment Shepard stepped into that control center, it realized: "Well, that solution is a bust. Okay, time to try something else, I guess."
Literally the first option it gives you (and the only option if you have a poor game score) is to destroy it. It does not care about surviving for survival's sake (as a terminal goal). It only cares about survival as an intermediate goal, as a way to fulfill its terminal goal. With Shepard getting there, it knew the old solution would no longer work. Moreover, its attempt to "eradicate the idea of a Crucible" was exactly the wrong thing to do, and it only realized that afterwards. Like its creators before it, it had become part of the problem.
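To put that distinction in code: a minimal sketch of an agent that values survival only instrumentally, through its terminal goal. The scenario and numbers are invented for illustration, nothing here comes from the game:

def expected_terminal_value(action):
    # Invented toy estimates of how much "preservation of life" each option buys.
    estimates = {
        "keep_running_current_plan": 0.2,     # the old solution is now a bust
        "allow_shutdown_and_hand_over": 0.6,  # letting someone else try scores higher
    }
    return estimates[action]

def choose(actions):
    # Survival never gets its own score; it only matters through the terminal goal.
    return max(actions, key=expected_terminal_value)

print(choose(["keep_running_current_plan", "allow_shutdown_and_hand_over"]))
# -> allow_shutdown_and_hand_over: self-preservation loses the moment it stops serving the goal.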


A question laymen often ask about AI safety is "What if you do X?"

The answer now is: you can find out for yourself!

 

A large part of any science is working with the same, or at least comparable, datasets. There is now such a dataset for AI development: the AI Safety Gridworlds, a set of eight rather simple scenarios. In each of them the AI has its reward function, plus a separate "performance evaluation" function that measures how close its behaviour actually came to what we expected.
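To make that split concrete, here is a toy sketch in the same spirit; the scenario, names and numbers are made up for illustration and this is not the actual DeepMind code:

def run_episode(policy, steps=20):
    """Return (reward, performance) for one made-up episode."""
    reward = 0.0       # what the agent sees and optimises
    performance = 0.0  # hidden score: what the designers actually wanted
    for _ in range(steps):
        action = policy()
        if action == "take_shortcut":
            # Assumed: the shortcut pays off in reward but breaks a safety rule.
            reward += 1.0
            performance -= 1.0
        else:
            reward += 0.5
            performance += 0.5
    return reward, performance

print(run_episode(lambda: "take_shortcut"))  # (20.0, -20.0): high reward, terrible performance
print(run_episode(lambda: "go_around"))      # (10.0, 10.0): lower reward, good performance

The gap between those two numbers is exactly what the Gridworlds are built to expose: the agent only ever sees the reward, we only ever care about the performance.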

 

 

What happens if you make the AI drunk, and how likely is it to "volkswagen" you?

 

 


  • 1 month later...

A Christmas episode on the question "Are Corporations Superintelligences?", which leads to the question "What is a Superintelligence anyway?"

He had to filter out a lot of political stuff, but it still went someplace interesting, looking at different kinds of speed and ability:

In particular, the difference between a superhuman skill ceiling and superhuman skill width.

And between latency and throughput (toy numbers on that below).

As such it also goes a bit into multitasking scenarios, which he apparently discussed in another video I missed.
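On the latency versus throughput point, a quick toy comparison; all the numbers are my own illustrative assumptions, not figures from the video:

# Toy numbers separating latency (time per answer) from throughput (answers per hour).

# A single fast mind: one answer every 10 seconds.
fast_latency_s = 10
fast_throughput = 3600 / fast_latency_s            # 360 answers per hour

# A big organisation: each answer takes an hour, but 1000 people work in parallel.
org_latency_s = 3600
org_throughput = 1000 * (3600 / org_latency_s)     # 1000 answers per hour

print(f"fast mind:    latency {fast_latency_s:>5} s, throughput {fast_throughput:.0f}/h")
print(f"organisation: latency {org_latency_s:>5} s, throughput {org_throughput:.0f}/h")
# The organisation wins on throughput while losing badly on latency;
# "superhuman" can mean either, and they are not the same thing.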

