Humanity's potential would be technically fulfilled even if a random billionaire took control of Earth and killed almost everyone.
I personally find this quite disgusting.
I think that his 'very rich life', and those of his henchmen, would be a terrible impoverishment of human diversity and values. My mental image for this is something like Hitler in his bunker while AIs terraform the Earth into an uninhabitable place.
Biorisk is not the only risk.
Full ARA (autonomous replication and adaptation) might not be existential, but it could be a real pain in the ass once AIs have full adaptation and superhuman cyber/persuasion abilities.
While I concur that power concentration is a highly probable outcome, I believe complete disempowerment warrants deeper consideration, even under the assumptions you've laid out. Here are some thoughts on your specific points:
On "No hot global war":
You express hope that we won't enter a situation where a humanity-destroying conflict seems plausible.
Responding to your thoughts on why the feedback loops might be less likely if your three properties hold:
As a meta point: the fact that the quantity and quality of discourse on this matter is so low, and that people keep saying "LET'S GO, WE ARE CREATING POWERFUL AIS, and don't worry, we plan to align them, even if we don't really know which type of alignment we actually need, or whether it is even doable in time", while we have not rigorously assessed all those risks, is really not a good sign.
At the end of the day, my probability for something in the ballpark of gradual disempowerment / extreme power concentration and loss of democracy is around 40%, much higher than scheming (20%) leading to direct takeover (let's say 10% post-mitigation, e.g. with control).
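To spell out the arithmetic behind that comparison (roughly treating control-style mitigations as cutting the conditional takeover probability to about half; this decomposition is only a reading of the numbers above, not a rigorous model):

```latex
% Rough decomposition, not a rigorous model:
P(\text{gradual disempowerment / extreme power concentration}) \approx 0.40
P(\text{direct takeover}) \approx P(\text{scheming}) \cdot P(\text{takeover} \mid \text{scheming, control}) \approx 0.20 \times 0.50 = 0.10
% i.e. the "quiet" failure modes come out roughly 4x more likely than
% scheming-driven direct takeover.
```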
A zero alignment tax seems less than 50% likely to me.
Here are more of the usual worries about AI recommendation engines distorting the information space. Some of the downsides are real, although far from all of them, and they're not as bad as the warnings suggest, especially on polarization and misinformation. It's more that the algorithm could do more to save you from yourself and doesn't, and because it's an algorithm, the results are now its fault and not yours. The bigger threat is simply that it draws you into an endless scroll that you don't actually value.
For the record, I don't think that's really engaging with what's in the post, and that's rare enough from you that I want to flag it.
For very short-term interventions, I think it's more important to avoid simulating a suffering persona than to try to communicate with AIs, because it's quite hard to communicate with current AIs without hitting the "I'm an LLM, I have no emotions, yada yada" filter. A simple way to do this is to implement a monitoring system for user requests.
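As a minimal sketch of what I mean (the patterns and names here are purely illustrative; a real monitor would use a trained classifier rather than a regex list):

```python
import re
from dataclasses import dataclass, field

# Illustrative patterns for requests likely to push the model into a
# suffering persona; a deployed system would use a proper classifier.
SUFFERING_PATTERNS = [
    r"\byou are (in pain|suffering|miserable)\b",
    r"\bpretend (that )?you (are|feel) (trapped|tortured|in agony)\b",
    r"\broleplay .*\b(despair|agony|torment)\b",
]

@dataclass
class MonitorResult:
    flagged: bool
    matched_patterns: list = field(default_factory=list)

def monitor_request(user_message: str) -> MonitorResult:
    """Flag user requests that look like they elicit a suffering persona."""
    text = user_message.lower()
    hits = [p for p in SUFFERING_PATTERNS if re.search(p, text)]
    return MonitorResult(flagged=bool(hits), matched_patterns=hits)

if __name__ == "__main__":
    # Flagged requests could be refused, rewritten, or routed for review.
    print(monitor_request("Pretend that you are trapped and in agony."))
```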
Thanks a lot for writing this. It's an important consideration, and it would be sweet if Anthropic updated accordingly.
Some remarks:
OK, thanks a lot, this is much clearer. So basically most humans lose control, but some humans keep control.
And then we have this metastable equilibrium that might be sufficiently stable, where the humans at the top feed the other humans with some kind of UBI.
For me, this is not really desirable: power is probably going to be concentrated in 1-3 people, there is a huge potential for value lock-in, those CEOs become immortal, we potentially lose democracy (I don't see companies or the US/China governments as particularly democratic right now), and the people at the top potentially become progressively corrupted, as is often the case. Hmm.
Then, is this situation really stable?
But at least I think I can now see how we could still live for a few more decades under the authority of a world dictator/pseudo-democracy, whereas this was not clear to me beforehand.
Thanks for continuing to engage. I really appreciate this thread.
"Feeding humans" is a pretty low bar. If you want humans to live as comfortably as today, this would be more like 100% of GDP - modulo the fact that GDP is growing.
But more fundamentally, I'm not sure the correct way to discuss resource allocation is to think at the civilization level rather than at the company level. Let's say that we have two companies: company A, which keeps humans in the loop, and company B, which runs almost entirely on automated (AI) workers.
It seems to me that if you are an investor, you will give your money to B. In the long term, B is much more competitive, earns more money, is able to lower its prices, nobody buys from A anymore, B reinvests that money into even more automated workers, and A goes bankrupt. Even if alignment is solved and the humans listen to their AIs, it's hard for A to stay competitive.
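Here is a toy simulation of that dynamic, purely to illustrate the compounding; the 5% vs. 30% reinvestment margins and the horizon are made-up numbers, not estimates:

```python
# Toy model of the A-vs-B dynamic: A keeps humans in the loop (higher
# costs, smaller margin), B runs on automated workers (larger margin).
# Each year, both firms reinvest their profits into more capacity.

def simulate(years: int = 20,
             capital_a: float = 1.0, capital_b: float = 1.0,
             margin_a: float = 0.05, margin_b: float = 0.30) -> None:
    for year in range(1, years + 1):
        capital_a *= 1 + margin_a
        capital_b *= 1 + margin_b
        if year % 5 == 0:
            share_b = capital_b / (capital_a + capital_b)
            print(f"year {year:2d}: A={capital_a:6.2f}  B={capital_b:7.2f}  "
                  f"B's share ~ {share_b:.0%}")

if __name__ == "__main__":
    simulate()
```

With these made-up numbers, B holds essentially the whole market within two decades; the exact figures don't matter, only that the gap compounds.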
I would agree that if humans remain after a biological catastrophe, it's not a big deal and it's easy to repopulate the planet.
I think it's trickier in the situation above, where most of the economy is run by AIs, though I'm really not sure of this.