Andrew Critch lists several research areas that seem important to AI existential safety, and evaluates them for direct helpfulness, educational value, and neglect. Along the way, he argues that the main way he sees present-day technical research helping is by anticipating, legitimizing and fulfilling governance demands for AI technology that will arise later.
I think more people should say what they actually believe about AI dangers, loudly and often. Even (and perhaps especially) if you work in AI policy.
I’ve been beating this drum for a few years now. I have a whole spiel about how your conversation-partner will react very differently if you share your concerns while feeling ashamed about them versus if you share your concerns while remembering how straightforward and sensible and widely supported the key elements are, because humans are very good at picking up on your social cues. If you act as if it’s shameful to believe AI will kill us all, people are more prone to treat you that way. If you act as if it’s an obvious serious threat, they’re more likely to take it...
(see e.g. here)
This link doesn't seem to include people like Quintin Pope and the AI Optimists, who are the most notorious AI risk skeptics I can think of who have nonetheless written about Eliezer's arguments (example). If I recall correctly, I think Pope said sometime before his departure from this site that his P(doom) is around 1%.
By 'bedrock liberal principles', I mean things like: respect for individual liberties and property rights, respect for the rule of law and equal treatment under the law, and a widespread / consensus belief that authority and legitimacy of the state derive from the consent of the governed.
Note that "consent of the governed" is distinct from simple democracy / majoritarianism: a 90% majority that uses state power to take all the stuff of the other 10% might be democratic but isn't particularly liberal or legitimate according to the principle of consent of the governed.
I believe a healthy liberal society of humans will usually tend towards some form of democracy, egalitarianism, and (traditional) social justice, but these are all secondary to the more foundational kind of thing I'm getting...
Alright so there's an acknowledgement that at the very least, the people who originally occupied that nice area are losing out.
Your breakdown of the benefits seems more or less fair. The only thing I take issue with is "who would like to purchase a house in a nice area where they have access to good jobs". I don't think it's fair to take it for granted that the area will stay nice post-YIMBY change (in fact the core acknowledgement here is that the character of the neighborhood is going to change) or that jobs will stay.
Would you consider that a "load bear...
Claim: if moral realism is true, then the Orthogonality Thesis is false, and superintelligent agents are very likely to be moral.
I'm arguing against Armstrong's version of Orthogonality[1]:
The fact of being of high intelligence provides extremely little constraint on what final goals an agent could have (as long as these goals are of feasible complexity, and do not refer intrinsically to the agent’s intelligence).
1. Assume moral realism; there are true facts about morality.
2. Intelligence is causally[2] correlated with having true beliefs.[3]
3. Intelligence is causally correlated with having true moral beliefs.[4]
4. Moral beliefs constrain final goals; believing “X is morally wrong” is a very good reason and motivator for not doing X.[5]
5. Superintelligent agents will likely have final goals that cohere with their (likely true) moral beliefs.
6. Superintelligent agents are likely to be moral.
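To make the inferential structure explicit, here is a minimal Lean sketch of the argument's logical skeleton. This is my own formalization, not the original post's; the predicate names (Superintelligent, TrueMoralBeliefs, GoalsCohereWithMoralBeliefs) are placeholders introduced purely for illustration. It only shows that the conclusion follows from the premises; all the philosophical weight sits in the premises themselves.

```lean
-- A minimal sketch of the argument's logical skeleton (my formalization,
-- not the post author's). Predicate names are illustrative placeholders.
theorem superintelligent_agents_are_moral
    (Agent : Type)
    (Superintelligent TrueMoralBeliefs GoalsCohereWithMoralBeliefs : Agent → Prop)
    -- Premises 1-3: given moral realism, intelligence tends toward true
    -- moral beliefs, since moral facts are part of reality.
    (premise3 : ∀ a, Superintelligent a → TrueMoralBeliefs a)
    -- Premise 4: moral beliefs constrain final goals (moral internalism).
    (premise4 : ∀ a, TrueMoralBeliefs a → GoalsCohereWithMoralBeliefs a) :
    -- Premises 5-6: superintelligent agents' final goals cohere with
    -- their (likely true) moral beliefs, i.e. such agents are moral.
    ∀ a, Superintelligent a → GoalsCohereWithMoralBeliefs a :=
  fun a hS => premise4 a (premise3 a hS)
```

The sketch itself is trivially valid; the action is entirely in whether premises 3 and 4 hold, which is exactly where the footnoted objections (e.g. strong moral externalism) push back.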
Though this argument basically applies to most other versions, including the strong form: "There can exist arbitrarily intelligent agents pursuing any kind of goal [and] there's no extra difficulty or complication in the existence of an intelligent agent that pursues a goal."
This argument as is does not work against Yudkowsky's weak form: "Since the goal of making paperclips is tractable, somewhere in the design space is an agent that optimizes that goal."
Correction: there was a typo in the original post here. Instead of 'causally', it read 'casually'.
A different way of saying this: Intelligent agents tend towards a comprehensive understanding of reality; towards having true beliefs. As intelligence increases, agents will (in general, on average) be less wrong.
A different way of saying this: Intelligent agents tend toward having true moral beliefs.
To spell this out a bit more:
A. Intelligent agents tend toward having true beliefs.
B. Moral facts (under moral realism) are (an important!) part of reality.
C. Intelligent agents tend toward having true moral beliefs.
One could reject this proposition by taking a strong moral externalism stance. If moral claims are not intrinsically motivating and there is no general connection between moral beliefs and motivation, then this proposition does not follow. See here for discussion of moral internalism and orthogonality, and for discussion of moral motivation.
I think I agree with Bostrom's 2012 position here (on how it's still a problem even if moral realism is true; and though I think moral realism is false, believing it's true doesn't quite help people design friendly AI, as other commenters have highlighted):
The Orthogonality Thesis
Intelligence and final goals are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final goal[...]
...The orthogonality thesis, as formulated here, makes
Recently, I've been ratcheting up my probability estimate of some of Less Wrong's core doctrines (shut up and multiply, beliefs require evidence, brains are not a reliable guide as to whether brains are malfunctioning, the Universe has no fail-safe mechanisms) from "Hmm, this is an intriguing idea" to somewhere in the neighborhood of "This is most likely correct."
This leaves me confused and concerned and afraid. There are two things in particular that are bothering me. On the one hand, I feel obligated to try much harder to identify my real goals and then to do what it takes to actually achieve them -- I have much less faith that just being a nice, thoughtful, hard-working person will result in me having a pleasant life, let alone in...
I mostly just got older and therefore calmer. I've crossed off most of the highest-priority items from my bucket list, so while I would prefer to continue living for a good long while, my personal death and/or defeat doesn't seem so catastrophically bad anymore. To cope with the loss of civilization/humanity, I read a lot of history and sci-fi and anthropology and other works that help me zoom out and see that there has already been great loss, and that while I do want to spend my resources fighting to reduce the risk of that loss, it's not something I need to spend a lot of time or energy personally suffering over, especially not in advance. Worry is interest paid on trouble before it's due.
How do you feel about conciseness in educational resources? I'm in two minds. On the one hand, as someone learning, I am often battling jargon defined by jargon: I see a word, I want to know what it means so I can keep reading, and I don't want to read a chain of three Wikipedia articles first. Filler and fluff can also distract and make the lesson or knowledge harder to retain; the reader/student is more likely to misunderstand or to miss core information as they are assaulted by tangents.
On the other hand, I do accept that repetition can improve reten...
I'm a bad prompt engineer myself, and I get quite envious when people announce these amazing results they're getting from the LLM; getting anything similar feels like pulling teeth from a chicken to me.
The reason I ask is that, if it's more something you come across incidentally while researching other things, I was wondering whether there's a way of reverse-engineering that process of recognizing it after the fact?
Alignment by Default is the idea that achieving alignment in artificial general intelligence (AGI) may be more straightforward than initially anticipated. When an AI possesses a comprehensive and detailed world model, it inherently represents human values within that model. To align the AGI, it's merely necessary to extract these values and direct the AI towards optimizing the abstraction it already comprehends.
In a summary of this concept, John Wentworth estimates a 10% chance of this strategy being successful, a perspective I generally agree with.
However, in light of recent advancements, I have revised my outlook, now believing that Alignment by Default has a higher probability of success, perhaps around 30%. This update was prompted by the accomplishments of ChatGPT, GPT-4, and subsequent developments. I believe these systems are approaching...
If alignment-by-default works for AGI, then we will have thousands of AGIs providing examples of aligned intelligence. This new, massive dataset of aligned behavior could then be used to train even more capable and robustly aligned models, each of which would then add to the training data, until we have data for aligned superintelligence.
If alignment-by-default doesn't work for AGI, then we will probably die before ASI.
Hello everyone! 👋
At the start of the year, I was trying to figure out what my goals should be regarding AI safety. I ended up making a list of 130 small, concrete ideas ranging across all kinds of domains. A lot of them are about gathering and summarizing information, skilling up myself, and helping others be more effective.
I needed a rough way to prioritize them, so I built a prioritization framework inspired by 80,000 Hours. I won't go into the details here, but...