All of humanity
Go extinct
Hindsight Experience Replay (HER), a technique for improving the reinforcement learning training signal
Large-scale training and large model size
Chain of Continuous Thought, a technique that makes model chain of thought much less interpretable but which allows the model to reason more efficiently

Double's Shortform

Double, 22d

Is it possible that making an expected utility maximizer might be less dangerous than making something which isn't?
Consider as an alternative an expected log utility maximizer (an agent using the Kelly Criterion, or some approximation of it).

The sooner an AI wins, the more galaxies it can consume. The expected utility maximizer weighs those galaxies against the risk of failure, and is willing to take plans with much higher probabilities of failure. Like SBF, it would take bets which have a 50% chance of more-than-doubling its utility and 50% of losing it all. In many environments, this strategy will almost certainly result in failure, as the agent goes double-or-nothing until losing everything. That means that the effects of the AI are mitigated.

The log utility maximizer carefully plans and succeeds in most or all futures. That looks like humanity dying with near-certainty.
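
A minimal sketch of the double-or-nothing argument above, under assumptions that are mine and not from the post: utility is treated as linear in wealth, the same bet repeats every round, it wins with probability `p = 0.5`, and a win pays a net `b = 1.2` times the stake (so going all-in more than doubles wealth). The function names are hypothetical. It compares the all-in expected utility maximizer with a Kelly (expected log utility) bettor.

```python
import math
import random


def kelly_fraction(p: float, b: float) -> float:
    """Kelly stake fraction for a bet that wins b*stake with probability p
    and loses the stake otherwise. Clamped at 0 for unfavorable bets."""
    return max(0.0, p - (1.0 - p) / b)


def all_in_survival_probability(p: float, rounds: int) -> float:
    """Chance that an agent staking everything each round is still solvent."""
    return p ** rounds


def expected_log_growth(f: float, p: float, b: float) -> float:
    """Expected per-round growth of log(wealth) when staking fraction f."""
    if f >= 1.0:
        # A single loss wipes out all wealth, so log(wealth) is unbounded below.
        return float("-inf")
    return p * math.log(1.0 + b * f) + (1.0 - p) * math.log(1.0 - f)


def simulate(f: float, p: float, b: float, rounds: int, rng: random.Random) -> float:
    """Play the bet repeatedly, starting from wealth 1.0; return final wealth."""
    wealth = 1.0
    for _ in range(rounds):
        stake = f * wealth
        if rng.random() < p:
            wealth += b * stake
        else:
            wealth -= stake
        if wealth <= 0.0:
            return 0.0
    return wealth


if __name__ == "__main__":
    p, b, rounds = 0.5, 1.2, 200  # illustrative numbers only
    f_star = kelly_fraction(p, b)
    print("Kelly fraction:", f_star)
    print("All-in survival probability:", all_in_survival_probability(p, rounds))
    print("All-in final wealth:", simulate(1.0, p, b, rounds, random.Random(0)))
    print("Kelly final wealth:", simulate(f_star, p, b, rounds, random.Random(0)))
```

Each all-in bet has positive expected value, which is why a linear expected utility maximizer takes it, but the chance of surviving every round shrinks geometrically. The Kelly bettor stakes only a small fraction and can never be wiped out by a single loss.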

A hyper-expected utility maximizer (an AI which maximizes expected exp(utility) or similar) would be even safer. Instead of trying to deceive you into letting it out of the box, it asks nicely or does something crazy, because if that works, it works in less time than deception would, which means more galaxies.

So if we were to choose between existing in the world of a superintelligent expected log(resources) maximizer, and a superintelligent expected utility maximizer, we should maybe go for the one which results in us being alive in more futures.

Of course, the expected-log-utility agent would also appear the most capable and useful. The hyper-expected utility maximizer would be near-useless.

Double's Shortform

Double, 1mo

In addition to money, education, careers, and internal organs, citizens of wealthy countries have an additional valuable resource they could direct to effective causes: their hands in marriage, which can be effectively allocated in one of two ways.

For one, professionals are usually much more impactful doing their work in wealthy countries. Otherwise promising EAs in South Sudan have little chance to make a significant impact on existential risks, animal welfare, or even global poverty. The immigration process is difficult and often rejects or holds up good people. Offering to marry them is a more reliable solution.

Secondly, if you are a US citizen, it is possible to be paid $10,000 by a foreigner for a green card marriage. (I learned this from a friend who does not want me to ask him how he knows.)

According to AMF, that money can save around two human lives! (and with current US politics, the demand has likely increased!)

According to brides.com, a wedding ceremony takes between 20 and 30 minutes. Let's be conservative and say 30 minutes.

Therefore, you can make $20,000 an hour by marrying someone who would pay for a green card. That's quite a ways from Bezos level (he makes $3,715 a second), but I'm willing to guess that most EAs don't make $20k an hour.

Conclusion:
As always, EAs need to found a new org, Effective Green Card, to support and pursue this cause area.

Naturally, this also implies Effective Divorce, so that you can instead marry an Effective foreigner.

Patent Trolling to Save the World

Double, 4mo

I’m pretty sure there’s no such use-it-or-lose-it law for patents, since patent trolls already exist.

Patent Trolling to Save the World

Double, 4mo

Your argument about corporate secrets is sufficient to change my mind on activist patent trolling being a productive strategy against AI X-risk.

The part about funding would need to be solved with philanthropy. I don't believe that org exists, but I don't see why it couldn't.

I'm still curious whether there are other cases in which activist patent trolling can be a good option, such as animal welfare, chemistry, public health, or geoengineering (i.e., fracking).

Patent Trolling to Save the World

Double, 4mo

That's fair enough and a good point.

I think that the key difference is that in the case of profitable-but-bad technologies, someone, somewhere, will probably invent them because there's great incentive to do so.

In the case of gain-of-function, if the grants stop and the academics who do it become pariahs, then the incentive to do gain-of-function research is gone.

Double's Shortform

Double, 4mo

One of the most powerful capabilities an AGI will have is its ability to copy itself. Among other things, this allows it to easily avoid shutdown, make use of more compute resources, and collaborate with copies of itself.

Is there research into ways to deny this capability to AI, making them uncopyable? Preferably something harder to circumvent than "just don't give the AI the permissions," since we know people are going to give them root access immediately.

(The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser

Double, 5mo

I'd be interested in buying official LessWrong merch. I know you have some great designers and could make things that look really cool.
The type of thing I'd be most likely to buy would be a baseball cap.

I played the AI box game as the Gatekeeper — and lost

Double, 5mo

IIRC, officially the Gatekeeper pays the AI if the AI wins, but there is no transfer if the Gatekeeper wins. That gives the Gatekeeper more motivation not to give in.

Double's Shortform

Double, 7mo

Just found out about this paper from about a year ago: "Explainability for Large Language Models: A Survey"
(They "use explainability and interpretability interchangeably.")
It "aims to comprehensively organize recent research progress on interpreting complex language models".

I'll post anything interesting I find from the paper as I read.

Have any of you read it? What are your thoughts?
