The Nuts & Bolts Of IMPLEMENTING A SAFE MOTIVATIONAL SYSTEM

Uploaded On 2020-06-17



Presentation Transcript

Slide1

The Nuts & Bolts Of IMPLEMENTING A SAFE MOTIVATIONAL SYSTEM

Mark R. Waser

Digital Wisdom Institute

MWaser@DigitalWisdomInstitute.org

Slide2

Outline

What is a “safe” motivational system?

How do we ensure that it happens (and sticks)?

Slide3

What is a “safe” motivational system?

*ANYTHING* that reliably leads to ETHICAL BEHAVIOR

Slide4

What is Ethical Behavior?

The problem is that no ethical system has ever reached consensus. Ethical systems are completely unlike mathematics or science. This is a source of concern.

Slide5

Entities Require Ethics

Ethics are “rules of the road”

Necessary for “safe” interaction

Yet, we cannot come to a consensus about them

There is something horribly wrong with this picture

Slide6

The Human Moral System

Is primarily implemented via emotions

Is not transparent or reflective

Frequently conflicts with “rationality”

Is “clearly” subjective

Slide7

Humans are . . . .

Evolved to self-deceive in order to better deceive others (Trivers 1991)

Unable to directly sense agency (Aarts et al. 2005)

Prone to false illusory experiences of self-authorship (Buehner and Humphreys 2009)

Unable to correctly retrieve the reasoning behind moral judgments (Hauser et al. 2007)

Almost always unaware of what morality is and why it should be practiced . . . .

Slide8

Inflammatory Statements

Human intelligence REQUIRES ethics

All humans want the same things

Ethics are universal

Ethics are SIMPLE in concept

Difference in power is irrelevant (to ethics)

Evolution has “designed” you to disagree with the above five points

Slide9

The Origin of Morality

Selfishness predictably evolves

Reciprocal altruism predictably evolves

But requires cognitive complexity to ensure that it is not taken advantage of

Ethics predictably evolves

As an attractor in the state space of behavior because community is so valuable

But altruistic punishment is a necessity

Arms race between

Individual benefits of successful personal cheating (really only in a short-term/highly time-discounted view)

Societal benefits of cheating detection & prevention
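The claim that cheating pays only in a short-term view can be illustrated with an iterated Prisoner's Dilemma. The sketch below is not from the slides: the payoff values (5, 3, 1, 0) are the conventional textbook ones, and the names `tit_for_tat`, `always_defect` and `play` are hypothetical. Tit-for-tat stands in for reciprocal altruism with a minimal form of cheating detection.

```python
from typing import Callable, List, Tuple

# Conventional Prisoner's Dilemma payoffs (an assumption, not from the slides):
# (my_move, their_move) -> my_payoff, with "C" = cooperate and "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

# A strategy maps the opponent's past moves to the next move.
Strategy = Callable[[List[str]], str]


def tit_for_tat(opponent_history: List[str]) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(opponent_history: List[str]) -> str:
    """Cheat on every round, regardless of history."""
    return "D"


def play(a: Strategy, b: Strategy, rounds: int) -> Tuple[int, int]:
    """Return the total payoffs of strategies a and b over repeated rounds."""
    seen_by_a: List[str] = []  # b's past moves, as observed by a
    seen_by_b: List[str] = []  # a's past moves, as observed by b
    score_a = score_b = 0
    for _ in range(rounds):
        move_a, move_b = a(seen_by_a), b(seen_by_b)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        seen_by_a.append(move_b)
        seen_by_b.append(move_a)
    return score_a, score_b
```

Under these assumed payoffs, a cheater beats a cooperator in the first round (5 against 0), but over ten rounds against tit-for-tat it collects 14, while two reciprocators collect 30 each. Cheating looks attractive only when later rounds are heavily discounted.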

Slide10

Haidt’s Functional Approach

Moral systems are interlocking sets of values, virtues, norms, practices, identities, institutions, technologies, and evolved psychological mechanisms that work together to suppress or regulate selfishness and make cooperative social life possible.

Slide11

How to Universalize Ethics

Quantify/evaluate intents, actions & consequences with respect to codified consensus moral foundations

Permissiveness/Utility Function equivalent to a “consensus” human (generic entity) moral sense

Slide12

Instrumental Goals/Universal Subgoals

(adapted from Omohundro 2008, The Basic AI Drives)

Self-improvement

Rationality/integrity

Preserve goals/utility function

Decrease/prevent fraud/counterfeit utility

Survival/self-protection

Efficiency (in resource acquisition & use)

Community = assistance/non-interference through GTO reciprocation (OTfT + AP)

Reproduction

Slide13

Human Goals

survival/self-protection & reproduction

happiness & pleasure

------------------------------------------------------------------------------------

community

-------------------------------------------------------------------------------------

self-improvement

rationality/integrity

reduce/prevent fraud/counterfeit utility

efficiency (in resource acquisition & use)

Slide14

Human Goals & Sins

survival/reproduction | suicide (& abortion?) | murder (& abortion?)

happiness/pleasure | masochism | cruelty/sadism

-------------------------------------------------

community (ETHICS) | selfishness (pride, vanity) | ostracism, banishment & slavery (wrath, envy)

-------------------------------------------------

self-improvement | acedia (sloth/despair) | slavery

rationality/integrity | insanity | manipulation

reduce/prevent fraud/counterfeit utility | wire-heading (lust) | lying/fraud (swear falsely/false witness)

efficiency (in resource acquisition & use) | wastefulness (gluttony, sloth) | theft (greed, adultery, coveting)

Slide15

Haidt’s Moral Foundations

1) Care/harm: This foundation is related to our long evolution as mammals with attachment systems and an ability to feel (and dislike) the pain of others. It underlies virtues of kindness, gentleness, and nurturance.

2) Fairness/cheating: This foundation is related to the evolutionary process of reciprocal altruism. It generates ideas of justice, rights, and autonomy. [Note: In our original conception, Fairness included concerns about equality, which are more strongly endorsed by political liberals. However, as we reformulated the theory in 2011 based on new data, we emphasize proportionality, which is endorsed by everyone, but is more strongly endorsed by conservatives.]

3) Liberty/oppression: This foundation is about the feelings of reactance and resentment people feel toward those who dominate them and restrict their liberty. Its intuitions are often in tension with those of the authority foundation. The hatred of bullies and dominators motivates people to come together, in solidarity, to oppose or take down the oppressor.

4) Loyalty/betrayal: This foundation is related to our long history as tribal creatures able to form shifting coalitions. It underlies virtues of patriotism and self-sacrifice for the group. It is active anytime people feel that it's "one for all, and all for one."

5) Authority/subversion: This foundation was shaped by our long primate history of hierarchical social interactions. It underlies virtues of leadership and followership, including deference to legitimate authority and respect for traditions.

6) Sanctity/degradation: This foundation was shaped by the psychology of disgust and contamination. It underlies religious notions of striving to live in an elevated, less carnal, more noble way. It underlies the widespread idea that the body is a temple which can be desecrated by immoral activities and contaminants (an idea not unique to religious traditions).

Slide16

Additional Contenders

Waste: efficiency in use of resources

Ownership/Possession: efficiency in use of resources; Tragedy of the Commons

Honesty: reduce/prevent fraud/counterfeit utility

Self-control: Rationality/integrity

Slide17

How to Universalize Ethics

Quantify/evaluate intents, actions & consequences with respect to codified consensus moral foundations

Permissiveness/Utility Function equivalent to a “consensus” human (generic entity) moral sense
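One way to read "quantify/evaluate with respect to codified moral foundations" is as a weighted score over Haidt's six foundations. The sketch below is only an illustration under stated assumptions: the rating scale of -1 (violates) to +1 (upholds), the weights, the threshold, and the function names `moral_score` and `is_permissible` are all hypothetical and do not come from the slides, which do not say how a consensus weighting would be obtained.

```python
from typing import Dict

# Haidt's six foundations, named as on the slides.
FOUNDATIONS = ("care", "fairness", "liberty", "loyalty", "authority", "sanctity")


def moral_score(ratings: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-foundation ratings.

    Each rating lies in [-1, 1]: -1 violates a foundation, +1 upholds it.
    Foundations missing from `ratings` count as neutral (0.0).
    """
    total_weight = sum(weights[name] for name in FOUNDATIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = 0.0
    for name in FOUNDATIONS:
        rating = ratings.get(name, 0.0)
        if not -1.0 <= rating <= 1.0:
            raise ValueError(f"rating for {name} must lie in [-1, 1]")
        weighted += weights[name] * rating
    return weighted / total_weight


def is_permissible(ratings: Dict[str, float], weights: Dict[str, float],
                   threshold: float = 0.0) -> bool:
    """Hypothetical permissiveness test: the score must reach the threshold."""
    return moral_score(ratings, weights) >= threshold
```

With equal weights, an action rated -1 on care and neutral elsewhere scores below zero and is rejected under the default threshold. Separate ratings for intent, action and consequence could be scored the same way and combined, though the slides leave that combination open.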

Slide18

Critical Components I: Self-Knowledge & Reflection

A self must know itself to be a self

Composed of three parts:

The running processes (consciousness)

The personal knowledge base (memory)

The physical hardware (body)

Must start with:

A competent model of each

Sensors to detect changes and their effects

*MUST* “care” about itself (motivation)

Slide19

Critical Components II: Explicit “Anchor” Values

Do not defect from the community

Do not become too large/powerful

Acquire and integrate knowledge

Instrumental goals

Slide20

Critical Components III: Reliability

Self-Control, Integrity, Autonomy, Responsibility

In “predictive control” of its own state and that of the physical objects that support it

Yes! This is a marked deviation from the human example.

Slide21

Architecture

Processes will be divided into three main classes:

Operating system processes

Subconscious/tool processes

One serial consciousness/learner process (CLP)

The CLP will be able to create, modify and/or influence many of the subconscious/tool processes.

The CLP will NOT be given access to modify operating system processes.

Indeed, it will have multiple/redundant logical, emotional & moral reasons to seriously convince it not to even try.

Slide22

Operating System Architecture

Open, Pluggable, Service-Oriented/Message-Passing

Quickly adopt novel input streams

Handle resource requests and allocation

Provide connectivity between components

Safety Features

Act as a “black box” security monitor capable of reporting problems without the consciousness’s awareness

Able to “manage” the CLP by manipulating the amount of processor time and memory available to it (assuming that the normal subconscious processes are unable to do so)

Other protections against hostile humans, inept builders, and the learner itself may be implemented as well
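The three process classes and the operating system's ability to "manage" the CLP can be sketched as a small access-control model. This is a minimal illustration, not the author's design: `ProcessClass`, `Process`, `may_modify` and `throttle` are hypothetical names, the `cpu_share` field stands in for "processor time", and any permission the slides do not state (for example, whether subconscious processes may modify one another) is assumed to be denied.

```python
from dataclasses import dataclass
from enum import Enum


class ProcessClass(Enum):
    OPERATING_SYSTEM = "operating system"
    SUBCONSCIOUS = "subconscious/tool"
    CLP = "consciousness/learner"


@dataclass
class Process:
    name: str
    kind: ProcessClass
    cpu_share: float = 1.0  # fraction of requested processor time granted


def may_modify(actor: Process, target: Process) -> bool:
    """Return True if `actor` is allowed to modify `target`.

    From the slides: the CLP may change subconscious/tool processes but
    never operating system processes. Assumptions: the operating system
    may modify anything, and every other combination is denied.
    """
    if actor.kind is ProcessClass.OPERATING_SYSTEM:
        return True
    if actor.kind is ProcessClass.CLP:
        return target.kind is ProcessClass.SUBCONSCIOUS
    return False


def throttle(monitor: Process, target: Process, cpu_share: float) -> None:
    """Let an operating system process limit another process's CPU share."""
    if monitor.kind is not ProcessClass.OPERATING_SYSTEM:
        raise PermissionError(f"{monitor.name} may not manage resources")
    if not 0.0 <= cpu_share <= 1.0:
        raise ValueError("cpu_share must lie in [0, 1]")
    target.cpu_share = cpu_share
```

Here a monitor of kind `OPERATING_SYSTEM` can cut the CLP's share to, say, 0.25, while the same call made by the CLP raises `PermissionError`. A real system would enforce this at the operating system or hardware level, not in code the learner could reach.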

Slide23

Automated Predictive World Model

Is the most important subconscious process(es)

Will serve as an interface to the “real” world

The CLP will live in a virtual world (just as we do)

Will be both reactive and predictive

Will generate “anomaly interrupts” upon deviations from expectations as an approach to solving the “brittleness” problem (Perlis 2008)

Will contain certain relatively immutable concepts to serve as anchors both for emotions and for ensuring safety (trigger patterns – Ohman et al. 2001)
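The idea of "anomaly interrupts" upon deviations from expectations can be sketched with a very small predictor. Everything specific here is an assumption for illustration: the slides do not say how expectations are formed, so this sketch tracks a single numeric signal with an exponential moving average, and the names `PredictiveModel`, `AnomalyInterrupt`, `tolerance` and `smoothing` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnomalyInterrupt:
    """Raised when an observation deviates from the model's expectation."""
    signal: str
    expected: float
    observed: float

    @property
    def surprise(self) -> float:
        return abs(self.observed - self.expected)


class PredictiveModel:
    """Tracks one scalar signal with an exponential moving average."""

    def __init__(self, signal: str, tolerance: float, smoothing: float = 0.5) -> None:
        self.signal = signal
        self.tolerance = tolerance  # largest deviation treated as normal
        self.smoothing = smoothing  # how quickly expectations adapt
        self.expected: Optional[float] = None

    def observe(self, value: float) -> Optional[AnomalyInterrupt]:
        """Compare a new observation to the expectation, then update it."""
        if self.expected is None:
            self.expected = value  # the first observation sets the baseline
            return None
        interrupt = None
        if abs(value - self.expected) > self.tolerance:
            interrupt = AnomalyInterrupt(self.signal, self.expected, value)
        # Predictive part: move the expectation toward what was observed.
        self.expected += self.smoothing * (value - self.expected)
        return interrupt
```

For example, with a tolerance of 2.0, readings of 20.0 and 21.0 pass silently, and a following reading of 30.0 yields an interrupt whose expected value is 20.5. The model stays quiet while the world behaves as predicted and speaks up only when it does not, which is the behavior the slide proposes for addressing brittleness.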

Slide24

Anchors & Emotions

Anchors create a multiple attachment point model which is much safer than the single-point-of-failure, top-down-only approach of “machine enslavement” advocated by MIRI (Yudkowsky 2001)

Emotions will be generated by the subconscious processes as “actionable qualia” to inform the CLP and will also bias the selection and urgency tags of information relayed via the predictive model

Violations of the cooperative social living “moral” system will result in a flood of urgently-tagged anomaly interrupts demanding that consciousness resources be expended to “solve the problem”
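Urgency tagging can be pictured as a priority queue that feeds the single serial consciousness process. The sketch below is a guess at one possible mechanism, not the author's: the class `InterruptQueue`, its method names, and the multiplier `MORAL_VIOLATION_BOOST` are hypothetical, and the boost simply stands in for the emotional bias that the slide says moral violations receive.

```python
import heapq
import itertools
from typing import List, Tuple

# Hypothetical factor by which a moral violation inflates an urgency tag.
MORAL_VIOLATION_BOOST = 10.0


class InterruptQueue:
    """Delivers urgency-tagged messages to a serial consumer, most urgent first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        self._order = itertools.count()  # tie-breaker: earlier posts win

    def post(self, message: str, urgency: float, moral_violation: bool = False) -> None:
        if moral_violation:
            urgency *= MORAL_VIOLATION_BOOST
        # heapq is a min-heap, so store the negated urgency.
        heapq.heappush(self._heap, (-urgency, next(self._order), message))

    def next_for_consciousness(self) -> str:
        """Pop the most urgent pending message."""
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)
```

If "low battery" is posted at urgency 0.4, "door open" at 0.5, and "promise broken" at 0.3 with `moral_violation=True`, the broken promise is delivered first. Because the CLP is serial, anything that dominates the queue in this way effectively demands that the problem be dealt with before other work proceeds.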

Slide25

Conscious Learning Process (CLP)

The goal is to provide as many optional structures and standards as possible to support and speed development, while not restricting possibilities beyond what is absolutely required for safety.

We believe the best way to do this is with a blackboard system similar to Learning IDA (Baars and Franklin 2007).

The CLP acts like the Governing Board of the Policy Governance model (Carver 2006) to create a coherent, consistent, integrated narrative plan of action to fulfill the goals of the larger self.

Slide26

The Digital Wisdom Institute is a non-profit think tank focused on the promise and challenges of ethics, artificial intelligence & advanced computing solutions.

We believe that the development of ethics and artificial intelligence and equal co-existence with ethical machines is humanity's best hope.

http://DigitalWisdomInstitute.org