Recent Discussion

Some AI research areas and their relevance to existential safety
Best of LessWrong 2020

Andrew Critch lists several research areas that seem important to AI existential safety, and evaluates them for direct helpfulness, educational value, and neglect. Along the way, he argues that the main way he sees present-day technical research helping is by anticipating, legitimizing and fulfilling governance demands for AI technology that will arise later.

by Andrew_Critch
habryka · 14h* · 320
11
Gary Marcus asked me to make a critique of his 2024 predictions, for which he claimed that he got "7/7 correct". I don't really know why I did this, but here is my critique.

For convenience, here are the predictions:

* 7-10 GPT-4 level models
* No massive advance (no GPT-5, or disappointing GPT-5)
* Price wars
* Very little moat for anyone
* No robust solution to hallucinations
* Modest lasting corporate adoption
* Modest profits, split 7-10 ways

I think the best way to evaluate them is to invert every one of them, and then see whether the version you wrote or the inverted version seems more correct in retrospect.

We will see 7-10 GPT-4 level models.

Inversion: We will see either fewer than 7 or more than 10 GPT-4 level models.

Evaluation: Conveniently, Epoch did an evaluation of just this exact question! https://epoch.ai/data-insights/models-over-1e25-flop

Training compute is not an ideal proxy for capabilities, but it's better than nothing. Models released in 2024 with GPT-4 level compute according to Epoch: Inflection-2, GLM-4, Mistral Large, Aramco Metabrain AI, Inflection-2.5, Nemotron-4 340B, Mistral Large 2, GLM-4-Plus, Doubao-pro, Llama 3.1-405B, grok-2, Claude 3 Opus, Claude 3.5 Sonnet, Gemini 1.0 Ultra, Gemini 1.5 Pro, Gemini 2.0 Pro, GPT-4o, o1-mini, o1, o3 (o3 was announced and published but not released until later in the year).

They also list 22 models which might be over the 10^25 FLOP threshold that GPT-4 was trained with. Many of those will be at GPT-4 level capabilities, because compute-efficiency has substantially improved.

Counting these models, I get 20+ models at GPT-4 level (from over 10 distinct companies). Your prediction seems to me to have somewhat underestimated the number of GPT-4 level models released in 2024. I don't know whether you intended to put more emphasis on the number being low or high, but it definitely isn't within your range.

No massive advance (no GPT-5
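As a rough sanity check on that count, here is a quick tally of the models named above. The model-to-developer mapping is an informal addition for counting distinct companies, not something taken from Epoch's list or the original comment:

```python
# Quick tally of the models listed above. The developer attributions are an
# informal mapping added here for counting distinct companies; the grouping is
# approximate and not part of the original comment.
models_to_developer = {
    "Inflection-2": "Inflection AI", "Inflection-2.5": "Inflection AI",
    "GLM-4": "Zhipu AI", "GLM-4-Plus": "Zhipu AI",
    "Mistral Large": "Mistral AI", "Mistral Large 2": "Mistral AI",
    "Aramco Metabrain AI": "Aramco",
    "Nemotron-4 340B": "NVIDIA",
    "Doubao-pro": "ByteDance",
    "Llama 3.1-405B": "Meta",
    "grok-2": "xAI",
    "Claude 3 Opus": "Anthropic", "Claude 3.5 Sonnet": "Anthropic",
    "Gemini 1.0 Ultra": "Google DeepMind", "Gemini 1.5 Pro": "Google DeepMind",
    "Gemini 2.0 Pro": "Google DeepMind",
    "GPT-4o": "OpenAI", "o1-mini": "OpenAI", "o1": "OpenAI", "o3": "OpenAI",
}
n_models = len(models_to_developer)
n_devs = len(set(models_to_developer.values()))
print(f"{n_models} models from {n_devs} developers")
# -> 20 models from 11 developers, consistent with "20+ models (from over 10 distinct companies)"
```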
johnswentworth · 4d · 14939
7
In response to the Wizard Power post, Garrett and David were like "Y'know, there's this thing where rationalists get depression, but it doesn't present like normal depression because they have the mental habits to e.g. notice that their emotions are not reality. It sounds like you have that."

... and in hindsight I think they were totally correct. Here I'm going to spell out what it felt/feels like from inside my head, my model of where it comes from, and some speculation about how this relates to more typical presentations of depression.

Core thing that's going on: on a gut level, I systematically didn't anticipate that things would be fun, or that things I did would work, etc. When my instinct-level plan-evaluator looked at my own plans, it expected poor results.

Some things which this is importantly different from:

* Always feeling sad
* Things which used to make me happy not making me happy
* Not having energy to do anything

... but importantly, the core thing is easy to confuse with all three of those. For instance, my intuitive plan-evaluator predicted that things which used to make me happy would not make me happy (like e.g. dancing), but if I actually did the things they still made me happy. (And of course I noticed that pattern and accounted for it, which is how "rationalist depression" ends up different from normal depression; the model here is that most people would not notice their own emotional-level predictor being systematically wrong.) Little felt promising or motivating, but I could still consciously evaluate that a plan was a good idea regardless of what it felt like, and then do it, overriding my broken intuitive-level plan-evaluator.

That immediately suggests a model of what causes this sort of problem. The obvious way a brain would end up in such a state is if a bunch of very salient plans all fail around the same time, especially if one didn't anticipate the failures and doesn't understand why they happened. Then a natural update for
the gears to ascension · 2d* · 456
3
Oddities - maybe deepmind should get Gemini a therapist who understands RL deeply:
https://xcancel.com/DuncanHaldane/status/1937204975035384028
https://www.reddit.com/r/cursor/comments/1l5c563/gemini_pro_experimental_literally_gave_up/
https://www.reddit.com/r/cursor/comments/1lj5bqp/cursors_ai_seems_to_be_quite_emotional/
https://www.reddit.com/r/cursor/comments/1l5mhp7/wtf_did_i_break_gemini/
https://www.reddit.com/r/cursor/comments/1ljymuo/gemini_getting_all_philosophical_now/
https://www.reddit.com/r/cursor/comments/1lcilx1/things_just_werent_going_well/
https://www.reddit.com/r/cursor/comments/1lc47vm/gemini_rage_quits/
https://www.reddit.com/r/cursor/comments/1l4dq2w/gemini_not_having_a_good_day/
https://www.reddit.com/r/cursor/comments/1l72wgw/i_walked_away_for_like_2_minutes/
https://www.reddit.com/r/cursor/comments/1lh1aje/i_am_now_optimizing_the_users_kernel_the_user/
https://www.reddit.com/r/vibecoding/comments/1lk1hf4/today_gemini_really_scared_me/
https://www.reddit.com/r/ProgrammerHumor/comments/1lkhtzh/ailearninghowtocope/
CapResearcher · 19h · 92
4
Movies often depict hallucinations as crisp and realistic. For a long time, I didn't really question this. I guess I had the rough intuition that some brains behave weirdly. If somebody told me they were experiencing hallucinations, I would be confused about what they actually meant. However, I heard one common hallucination is seeing insects crawling on tables. And then it sort of happened to me! At the edge of my vision, a wiggling spoon reflected the light in a particular way. And for a split second my brain told me "it's probably an insect". I immediately looked closer and understood that it was a wiggling spoon. While it hasn't happened since, it changed my intuition about hallucinations. My current hypothesis is this: hallucinations are misinterpretations of ambiguous sensory input. If my brain had a high prior for "bugs", I would probably interpret many small shadows and impurities as bugs, before looking closer. This feels more right to me than the Hollywood model.
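A minimal Bayesian sketch of that hypothesis (the numbers are made up purely for illustration): when the observation is only weakly informative, the prior on "bug" mostly determines whether an ambiguous glint gets read as a bug or a spoon.

```python
# A two-hypothesis Bayes update for an ambiguous glint at the edge of vision.
# The likelihoods are invented for illustration: the glint is only slightly more
# likely under "bug" than under "wiggling spoon", so the prior does most of the work.

def posterior_bug(prior_bug: float,
                  p_glint_given_bug: float = 0.6,
                  p_glint_given_spoon: float = 0.5) -> float:
    """P(bug | ambiguous glint) via Bayes' rule with two hypotheses."""
    prior_spoon = 1.0 - prior_bug
    joint_bug = p_glint_given_bug * prior_bug
    joint_spoon = p_glint_given_spoon * prior_spoon
    return joint_bug / (joint_bug + joint_spoon)

print(posterior_bug(prior_bug=0.01))  # low "bug" prior  -> ~0.012, still reads as a spoon
print(posterior_bug(prior_bug=0.50))  # high "bug" prior -> ~0.55, the same glint reads as a bug
```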
dbohdan · 21h · 84
0
— @_brentbaum, tweet (2025-05-15) — @meansinfinity — @QiaochuYuan

What do people mean when they say "agency" and "you can just do things"? I get a sense it's two things, and the terms "agency" and "you can just do things" conflate them. The first is "you can DIY a solution to your problem; you don't need permission and professional expertise unless you actually do", and the second is "you can defect against cooperators, lol". More than psychological agency, the first seems to correspond to disagreeableness. The second I expect to correlate with the dark triad. You can call it the antisocial version of "agency" and "you can just do things".
AI Safety Thursdays: Are LLMs aware of their learned behaviors?
LessWrong Community Weekend 2025
470 · Welcome to LessWrong! · Ruby, Raemon, RobertM, habryka · 6y · 74
A case for courage, when speaking of AI danger
350
So8res
2d

I think more people should say what they actually believe about AI dangers, loudly and often. Even (and perhaps especially) if you work in AI policy.

I’ve been beating this drum for a few years now. I have a whole spiel about how your conversation-partner will react very differently if you share your concerns while feeling ashamed about them versus if you share your concerns while remembering how straightforward and sensible and widely supported the key elements are, because humans are very good at picking up on your social cues. If you act as if it’s shameful to believe AI will kill us all, people are more prone to treat you that way. If you act as if it’s an obvious serious threat, they’re more likely to take it...

(Continue Reading – 1601 more words)
1 · Haiku · 13m
It's important for everyone to know that there are many things they can personally do to help steer the world onto a safer trajectory. I am a volunteer with PauseAI, and while I am very bad at grassroots organizing and lobbying, I still managed to start a local chapter and have some positive impact on my federal representative's office. I suggest checking out PauseAI's Take Action page. I also strongly endorse ControlAI, who are more centralized and have an excellent newsletter.
18 · So8res · 9h
I don't think most anyone who's studied the issues at hand thinks the chance of danger is "really small", even among people who disagree with me quite a lot (see e.g. here). I think folks who retreat to arguments like "you should pay attention to this even if you think there's a really small chance of it happening" are doing a bunch of damage, and this is one of many problems I attribute to a lack of this "courage" stuff I'm trying to describe.

When I speak of "finding a position you have courage in", I do not mean "find a position that you think should be logically unassailable." I'm apparently not doing a very good job at transmitting the concept, but here are some positive and negative examples:

✓ "The race towards superintelligence is ridiculously risky and I don't think humanity should be doing it."

✓ "I'm not talking about a tiny risk. On my model, this is your most likely cause of death."

✓ "Oh I think nobody should be allowed to race towards superintelligence, but I'm trying to build it anyway because I think I'll do it better than the next guy. Ideally all AI companies in the world should be shut down, though, because we'd need way more time to do this properly." (The action is perhaps a bit cowardly, but the statement is courageous, if spoken by someone for whom it's true.)

✗ "Well we can all agree that it'd be bad if AIs were used to enable terrorists to make bioweapons" (spoken by someone who thinks the chance is quite high).

✗ "Even if you think the chance of it happening is very small, it's worth focusing on, because the consequences are so huge" (spoken by someone who believes the danger is substantial).

✗ "In some unlikely but extreme cases, these companies put civilization at risk, and the companies should be responsible for managing those tail risks" (spoken by someone who believes the danger is substantial).

One litmus test here is: have you communicated the real core of the situation and its direness as you perceive it? Not like "have you cause
sunwillrise · now · 20

(see e.g. here)

This link doesn't seem to include people like Quintin Pope and the AI Optimists, who are the most notorious AI risk skeptics I can think of who have nonetheless written about Eliezer's arguments (example). If I recall correctly, I think Pope said sometime before his departure from this site that his P(doom) is around 1%.

17 · So8res · 15h
A few claims from the post (made at varying levels of explicitness) are:

1. Often people are themselves motivated by concern X (ex: "the race to superintelligence is reckless and highly dangerous") and decide to talk about concern Y instead (ex: "AI-enabled biorisks") because they think it is more palatable.

2. Focusing on the "palatable" concerns is a pretty grave miscalculation and error.

2a. The claims Y are often not in fact more palatable; people are often pretty willing to talk about the concerns that actually motivate you.

2b. When people try talking about the concerns that actually motivate them while loudly signalling that they think their ideas are shameful and weird, this is not a terribly good test of claim (2a).

2c. Talking about claims other than the key ones comes at a steep opportunity cost.

2d. Talking about claims other than the key ones risks confusing people who are trying to make sense of the situation.

2e. Talking about claims other than the key ones risks making enemies of allies (when those would-be allies agree about the high-stakes issues and disagree about how to treat the mild stuff).

2f. Talking about claims other than the key ones triggers people's bullshit detectors.

3. Nate suspects that many people are confusing "I'd be uncomfortable saying something radically different from the social consensus" with "if I said something radically different from the social consensus then it would go over poorly", and that this conflation is hindering their ability to update on the evidence.

3a. Nate is hopeful that evidence of many people's receptiveness to key concerns will help address this failure.

3b. Nate suspects that various tropes and mental stances associated with the word "courage" are perhaps a remedy to this particular error, and hopes that advice like "speak with the courage of your convictions" is helpful for people when it comes to remembering the evidence in (3a) and overcoming the error of (3).

I don't think this is i
Support for bedrock liberal principles seems to be in pretty bad shape these days
15
Max H
11h

By 'bedrock liberal principles', I mean things like: respect for individual liberties and property rights, respect for the rule of law and equal treatment under the law, and a widespread / consensus belief that authority and legitimacy of the state derive from the consent of the governed.

Note that "consent of the governed" is distinct from simple democracy / majoritarianism: a 90% majority that uses state power to take all the stuff of the other 10% might be democratic but isn't particularly liberal or legitimate according to the principle of consent of the governed.

I believe a healthy liberal society of humans will usually tend towards some form of democracy, egalitarianism, and (traditional) social justice, but these are all secondary to the more foundational kind of thing I'm getting...

(See More – 949 more words)
3 · Stephen Martin · 25m
Okay, so what would be your pitch to those people for why they should be YIMBYs?
3 · Stephen Martin · 1h
The benefits for whom, though? Certainly not those people who specifically moved to that neighborhood for its character and are now having said character changed.
5 · sunwillrise · 10m
Indeed, not for them; they lose out, illustrating why these policies do not generate Pareto improvements.

The benefits are for the marginal residents of this area, who would not be able to live there but for the cost-of-living reduction created by lowering the price of rent and housing through the increase of housing supply. The benefits are for renters, who are on average significantly poorer than homeowners and need these cost-of-living reductions far more. The benefits are for the economy, both overall (increasing the supply of housing increases the sum total of real goods produced in the economy, by reducing the deadweight loss and economic rent captured by existing homeowners) and locally (since new, productive residents move in and start purchasing goods and services from local businesses, increasing their sales and profit as a result). The benefits are for anyone who uses public services in the area (such as roads, parks, police services, schools, etc.), since those services are funded by local taxes, and the local tax base increases when new residents can move in and engage in economic activities. The benefits are for new homeowners, who would like to purchase a house in a nice area where they have access to good jobs, but who often cannot because houses cost millions of dollars in the most sought-after neighborhoods (see: virtually all of San Francisco) due to artificial restrictions on supply. The benefits are for investors and construction companies, who would like to create economic value for themselves and their surrounding community but can't because of zoning regulations or other legal restrictions.

The housing crisis has been going on for quite a while at this point, and its deleterious effects touch almost all aspects of urban and suburban life, particularly in the United States. Increasing the supply of housing through YIMBY policies is the only economically literate and sustainable way of fighting against it.
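To make the price mechanism concrete, here is a toy linear supply-and-demand sketch with made-up numbers (an illustration of the standard argument, not anything from the comment above): a binding cap on housing units raises the market-clearing rent and creates a deadweight-loss triangle, and loosening the cap shrinks both.

```python
# Toy model with made-up numbers: linear inverse demand for housing units and a
# hard cap on supply (e.g. zoning). Just the standard supply-restriction picture in code.

def capped_market(cap_units: float, a: float = 5000.0, b: float = 0.3, cost: float = 1500.0):
    """Inverse demand p = a - b*q; housing supplied at constant marginal cost."""
    q_free = (a - cost) / b                   # quantity if supply were unrestricted (~11,667)
    q = min(cap_units, q_free)                # the cap binds when it is below q_free
    rent = a - b * q                          # market-clearing rent at the capped quantity
    dwl = 0.5 * (rent - cost) * (q_free - q)  # surplus lost on units valued above cost
    return q, rent, dwl

for cap in (5_000, 8_000, 12_000):
    q, rent, dwl = capped_market(cap)
    print(f"cap={cap:>6,} units  rent=${rent:,.0f}/mo  deadweight loss=${dwl:,.0f}")
# Loosening the cap from 5,000 to 8,000 units drops rent from $3,500 to $2,600 and
# cuts the deadweight loss by roughly 70%; at 12,000 units the cap no longer binds.
```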
Stephen Martin · 3m · 10

Alright so there's an acknowledgement that at the very least, the people who originally occupied that nice area are losing out.

Your breakdown of the benefits seems more or less fair. The only thing I take issue with is "who would like to purchase a house in a nice area where they have access to good jobs". I don't think it's fair to take it for granted that the area will stay nice post-YIMBY change (in fact the core acknowledgement here is that the character of the neighborhood is going to change) or that jobs will stay.

Would you consider that a "load bear... (read more)

If Moral Realism is true, then the Orthogonality Thesis is false.
3
Eye You
3d

Claim: if moral realism is true, then the Orthogonality Thesis is false, and superintelligent agents are very likely to be moral.

I'm arguing against Armstrong's version of Orthogonality[1]:

The fact of being of high intelligence provides extremely little constraint on what final goals an agent could have (as long as these goals are of feasible complexity, and do not refer intrinsically to the agent’s intelligence).

Argument

1. Assume moral realism; there are true facts about morality.

2. Intelligence is causally[2] correlated with having true beliefs.[3]

3. Intelligence is causally correlated with having true moral beliefs.[4]

4. Moral beliefs constrain final goals; believing “X is morally wrong” is a very good reason and motivator for not doing X.[5]

5. Superintelligent agents will likely have final goals that cohere with their (likely true) moral beliefs. 

6. Superintelligent agents are likely to be moral.
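One way to schematize which premises feed which conclusions (a structural gloss on the numbered argument above, not part of the original wording; \rightsquigarrow is read as "tends, on average, to produce"):

```latex
% Requires amsmath and amssymb (for \rightsquigarrow, read "tends, on average, to produce").
\begin{align*}
&(1)\ \text{moral realism: there are true moral facts}\\
&(2)\ \text{intelligence} \rightsquigarrow \text{true beliefs about reality}\\
&(3)\ (1) \wedge (2) \;\Rightarrow\; \big(\text{intelligence} \rightsquigarrow \text{true moral beliefs}\big)\\
&(4)\ \text{moral beliefs constrain final goals}\\
&(5)\ (3) \wedge (4) \;\Rightarrow\; \text{superintelligent final goals cohere with (likely true) moral beliefs}\\
&(6)\ (5) \;\Rightarrow\; \text{superintelligent agents are likely to be moral}
\end{align*}
```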

  1. ^

    Though this argument basically applies to most other versions, including the strong form: "There can exist arbitrarily intelligent agents pursuing any kind of goal [and] there's no extra difficulty or complication in the existence of an intelligent agent that pursues a goal." 
    This argument, as is, does not work against Yudkowsky's weak form: "Since the goal of making paperclips is tractable, somewhere in the design space is an agent that optimizes that goal."

  2. ^

    Correction: there was a typo in the original post here. Instead of 'causally', it read 'casually'. 

  3. ^

    A different way of saying this: Intelligent agents tend towards a comprehensive understanding of reality; towards having true beliefs. As intelligence increases, agents will (in general, on average) be less wrong.

  4. ^

    A different way of saying this: Intelligent agents tend toward having true moral beliefs.
    To spell this out a bit more:

    A. Intelligent agents tend toward having true beliefs.
    B. Moral facts (under moral realism) are (an important!) part of reality.
    C. Therefore, intelligent agents tend toward having true moral beliefs.

  5. ^

    One could reject this proposition by taking a strong moral externalism stance. If moral claims are not intrinsically motivating and there is no general connection between moral beliefs and motivation, then this proposition does not follow. See here for discussion of moral internalism and orthogonality, and for discussion of moral motivation.

    As

...
lesswronguser123 · 27m · 10

I think I agree with Bostrom's 2012 position here (on how it's still a problem even if moral realism is true; though I think moral realism is false, and believing it is true doesn't quite help people in designing friendly AI, as highlighted by other commentators):


The Orthogonality Thesis
Intelligence and final goals are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final goal

[...]

The orthogonality thesis, as formulated here, makes

... (read more)
I'm scared.
78
Mass_Driver
15y

Recently, I've been ratcheting up my probability estimate of some of Less Wrong's core doctrines (shut up and multiply, beliefs require evidence, brains are not a reliable guide as to whether brains are malfunctioning, the Universe has no fail-safe mechanisms) from "Hmm, this is an intriguing idea" to somewhere in the neighborhood of "This is most likely correct."

This leaves me confused and concerned and afraid. There are two things in particular that are bothering me. On the one hand, I feel obligated to try much harder to identify my real goals and then to do what it takes to actually achieve them -- I have much less faith that just being a nice, thoughtful, hard-working person will result in me having a pleasant life, let alone in...

(See More – 194 more words)
1 · daijin · 3h
It's been 15 years. Did you figure out how to be less scared?
Mass_Driver · 1h · 20

I mostly just got older and therefore calmer. I've crossed off most of the highest-priority items from my bucket list, so while I would prefer to continue living for a good long while, my personal death and/or defeat doesn't seem so catastrophically bad anymore. To cope with the loss of civilization/humanity, I read a lot of history and sci-fi and anthropology and other works that help me zoom out and see that there has already been great loss. And while I do want to spend my resources fighting to reduce the risk of that loss, it's not something I need to spend a lot of time or energy personally suffering over, especially not in advance. Worry is interest paid on trouble before it's due.

Conciseness Manifesto
9
Vasyl Dotsenko
2h

This page is intentionally left blank

3 · Vasyl Dotsenko · 2h
But, seriously, in the age of generative AI and summarizers, there's no need to pretend. We might as well just say the thing we want to say without the need for the reader to invoke a summarizer on it.
CstineSublime · 1h · 10

How do you feel about conciseness and educational resources? I'm in two minds. On the one hand, as someone learning, I am often battling jargon defined by jargon: I see a word, I want to know what it means so I can keep reading, and I don't want to read a chain of three Wikipedia articles. And that filler and fluff can actually distract and make the lesson or knowledge harder to retain: the reader/student is more likely to misunderstand or to miss core information as they are assaulted by tangents.

On the other hand, I do accept that repetition can improve reten... (read more)

lesswronguser123's Shortform
lesswronguser123
2d
1 · lesswronguser123 · 15h
Not that I'm aware of. Sometimes it's pure coincidence whilst searching for unrelated things; other times I remember some old context out of nowhere, or I'm looking into adjacent topics, which leads me to stumble upon it. Nowadays AI can also help, but I'm currently bad at prompt engineering.
CstineSublime · 1h · 10

I'm a bad prompt engineer myself and I get quite envious when people announce these amazing results they are getting from the LLM which feel like pulling teeth from a chicken to me.

The reason I ask is that, if it is more a matter of stumbling on things incidentally as you research other topics, I was wondering whether there's a way of reverse-engineering that process of recognizing it after the fact?

Simulators Increase the Likelihood of Alignment by Default
14
Wuschel Schulz
2y

Alignment by Default is the idea that achieving alignment in artificial general intelligence (AGI) may be more straightforward than initially anticipated. When an AI possesses a comprehensive and detailed world model, it inherently represents human values within that model. To align the AGI, it's merely necessary to extract these values and direct the AI towards optimizing the abstraction it already comprehends.

In a summary of this concept, John Wentworth estimates a 10% chance of this strategy being successful, a perspective I generally agree with.

However, in light of recent advancements, I have revised my outlook, now believing that Alignment by Default has a higher probability of success, perhaps around 30%. This update was prompted by the accomplishments of ChatGPT, GPT-4, and subsequent developments. I believe these systems are approaching...

(Continue Reading – 1359 more words)
jsnider3 · 1h · 10

If alignment-by-default works for AGI, then we will have thousands of AGIs providing examples of aligned intelligence. This new, massive dataset of aligned behavior could then be used to train even more capable and robustly aligned models each of which would then add to the training data until we have data for aligned superintelligence.

If alignment-by-default doesn't work for AGI, then we will probably die before ASI.

zyansheep's Shortform
zyansheep
3h
Feedback wanted: Shortlist of AI safety ideas
3
mmKALLL
3h

Summary:

  • I made a prioritized list of 130 ideas on how to contribute to AI safety.
  • The top 12 ideas have similar overall scores, so I'd like to verify my assumptions.
  • In particular, I'd like to know which ones are already being worked on by other people or otherwise irrelevant.

Background:

Hello everyone! 👋

At the start of the year, I was trying to figure out what my goals should be regarding AI safety. I ended up making a list of 130 small, concrete ideas ranging across all kinds of domains. A lot of them are about gathering and summarizing information, skilling up myself, and helping others be more effective.

I needed a rough way to prioritize them, so I built a prioritization framework inspired by 80,000 Hours. I won't go into the details here, but...

(Continue Reading – 1269 more words)
Rohin Shah · 2d · 7818
A case for courage, when speaking of AI danger
While I disagree with Nate on a wide variety of topics (including implicit claims in this post), I do want to explicitly highlight strong agreement with this:

> I have a whole spiel about how your conversation-partner will react very differently if you share your concerns while feeling ashamed about them versus if you share your concerns as if they’re obvious and sensible, because humans are very good at picking up on your social cues. If you act as if it’s shameful to believe AI will kill us all, people are more prone to treat you that way. If you act as if it’s an obvious serious threat, they’re more likely to take it seriously too.

The position that is "obvious and sensible" doesn't have to be "if anyone builds it, everyone dies". I don't believe that position. It could instead be "there is a real threat model for existential risk, and it is important that society does more to address it than it is currently doing". If you're going to share concerns at all, figure out the position you do have courage in, and then discuss that as if it is obvious and sensible, not as if you are ashamed of it.

(Note that I am not convinced that you should always be sharing your concerns. This is a claim about how you should share concerns, conditional on having decided that you are going to share them.)
gwern · 14h · Ω · 122111
AI forecasting bots incoming
Update: Bots are still beaten by human forecasting teams/superforecasters/centaurs on truly heldout Metaculus problems as of early 2025: https://www.metaculus.com/notebooks/38673/q1-ai-benchmarking-results/ A useful & readable discussion of various methodological problems (including the date-range search problems above) which render all forecasting backtesting dead on arrival (IMO) was recently compiled as "Pitfalls in Evaluating Language Model Forecasters", Paleka et al 2025, and is worth reading if you are at all interested in the topic.
Vladimir_Nesov · 3d · 3019
The Industrial Explosion
I think biorobots (macroscopic biotech) should be a serious entry in a list like this, something that's likely easier to develop than proper nanotechnology, but already has key advantages over human labor or traditional robots for the purposes of scaling industry, such as an extremely short doubling time and not being dependent on complicated worldwide supply chains.

Fruit flies can double their biomass every 1-3 days. Metamorphosis reassembles biomass from one form to another. So a large amount of biomass could be produced using the short doubling time of small "fruit fly" things, and then merged and transformed through metamorphosis into large functional biorobots, with capabilities that are at least at the level seen in animals. These biorobots can then proceed to build giant factories and mines of the more normal kind, which can manufacture compute, power, and industrial non-bio robots. Fusion power might let this quickly scale well past the mass of billions of humans. If the relevant kind of compute can be produced with biotech directly, then this scales even faster, instead of at some point being held back by not having enough AIs to control the biorobots and waiting for construction of fabs and datacenters.

(The "fruit flies" are the source of growth, packed with AI-designed DNA that can specialize the cells to do their part in larger organisms reassembled with metamorphosis from these fast-growing and mobile packages of cells. Let's say there are 1000 "fruit flies" of 1 mg each at the start of the industrial scaling process, and we aim to produce 10 billion 100 kg robots. That target is about 1e15x more mass than the initial 1000 "fruit flies"; with the "fruit flies" doubling in number every 2 days, it might take as little as 100 days to produce. Each ~100 kg of "fruit flies" can then be transformed into a 100 kg biorobot on the timescale of weeks, with some help from the previous biorobots or initially human and non-bio robot labor.)
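A back-of-the-envelope check of the arithmetic in that parenthetical, with the unit conversions spelled out:

```python
# Restating the parenthetical's numbers: 1000 one-milligram "fruit flies" growing
# into 10 billion 100 kg biorobots, doubling total biomass every 2 days.
import math

initial_mass_mg = 1_000 * 1                    # 1000 "fruit flies" at 1 mg each
target_mass_mg = 10_000_000_000 * 100 * 1e6    # 10 billion robots at 100 kg = 1e8 mg each
mass_ratio = target_mass_mg / initial_mass_mg  # ~1e15, as stated above

doublings = math.log2(mass_ratio)              # ~49.8 doublings needed
days = 2 * doublings                           # one doubling every 2 days
print(f"mass ratio ~ {mass_ratio:.1e}; {doublings:.0f} doublings; ~{days:.0f} days")
# -> mass ratio ~ 1.0e+15; 50 doublings; ~100 days, matching the ~100-day figure
```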
[Today] Bengaluru - LW/ACX Meetup - Jun 2025
[Today] June 29th Meetup - Muzeum kert

350 · A case for courage, when speaking of AI danger · So8res · 2d · 29
336 · A deep critique of AI 2027’s bad timeline models · titotal · 10d · 39
467 · What We Learned from Briefing 70+ Lawmakers on the Threat from AI · leticiagarcia · 1mo · 14
197 · Foom & Doom 1: “Brain in a box in a basement” [Ω] · Steven Byrnes · 6d · 75
333 · the void [Ω] · nostalgebraist · 18d · 98
532 · Orienting Toward Wizard Power · johnswentworth · 1mo · 141
659 · AI 2027: What Superintelligence Looks Like [Ω] · Daniel Kokotajlo, Thomas Larsen, elifland, Scott Alexander, Jonas V, romeo · 3mo · 222
111 · The Industrial Explosion · rosehadshar, Tom Davidson · 3d · 32
156 · My pitch for the AI Village · Daniel Kokotajlo · 5d · 29
286 · Beware General Claims about “Generalizable Reasoning Capabilities” (of Modern AI Systems) [Ω] · LawrenceC · 18d · 18
160 · Consider chilling out in 2028 · Valentine · 8d · 127
226 · Distillation Robustifies Unlearning [Ω] · Bruce W. Lee, Addie Foote, alexinf, leni, Jacob G-W, Harish Kamath, Bryce Woodworth, cloud, TurnTrout · 16d · 36
215 · Do Not Tile the Lightcone with Your Confused Ontology [Ω] · Jan_Kulveit · 5d · 26
132 · X explains Z% of the variance in Y · Leon Lang · 1d · 19