Center for Human-Compatible AI, UC Berkeley

Michael K. Cohen

Solutions In Theory

Different posts discuss different AI proposals and why I do or don't consider them to be solutions in theory.

Criteria for solutions in theory

Could do superhuman long-term planning
Ongoing receptiveness to feedback about its objectives
No reason to escape human control to accomplish its objectives
No impossible demands on human designers/operators
No TODOs when defining how we set up the AI’s setting
No TODOs when defining any programs that are involved, except how to modify them to be tractable

philosophical problem → computer science problem

Surely Human-Like Optimization

A randomly sampled human cannot be trusted to prescribe medicine, fix a pipe, or write important code. But they can be trusted not to...

Michael Cohen

Jun 27, 2024 · 8 min read

Pessimism

Me: Is there a quote from The Art of War about how if each side can predict the result of a battle, they won't fight it? Claude 3 Opus:...

Michael Cohen

Jun 20, 2024 · 6 min read

Boxed Myopic AI

We place an episodic agent in a box with an operator. When the operator leaves, the episode ends. No incentive to affect the outside world.

Michael Cohen

Jun 13, 2024 · 5 min read
