Comments

mishka

The podcast is here: https://www.dwarkeshpatel.com/p/john-schulman?initial_medium=video

From reading the first 29 minutes of the transcript, my impression is: he is strong enough to lead an org to AGI (it seems many people are strong enough to do that from our current level; the conversation does suggest we are pretty close), but I don't get the feeling that he is strong enough to deal with issues related to AI existential safety. At least, that's my initial impression :-(

mishka

Jan Leike confirms: https://twitter.com/janleike/status/1790603862132596961

Dwarkesh is supposed to release his podcast with John Schulman today, so we can evaluate the quality of his thinking more closely. He is mostly known for reinforcement learning (https://scholar.google.com/citations?user=itSa94cAAAAJ&hl=en), although he has some track record of safety-related publications, including Unsolved Problems in ML Safety (2021-2022, https://arxiv.org/abs/2109.13916) and Let's Verify Step by Step (https://arxiv.org/abs/2305.20050), which includes Jan Leike and Ilya Sutskever among its co-authors.

No confirmation of him becoming the new head of Superalignment yet...

mishka

Ilya's departure is momentous.

What do we know about those other departures? The NYT article has this:

Jan Leike, who ran the Super Alignment team alongside Dr. Sutskever, has also resigned from OpenAI. His role will be taken by John Schulman, another company co-founder.

I have not been able to find any other traces of this information yet.

We do know that Pavel Izmailov has joined xAI: https://izmailovpavel.github.io/

Leopold Aschenbrenner still lists OpenAI as his affiliation everywhere I've looked. The only recent traces of his activity seem to be likes on Twitter: https://twitter.com/leopoldasch/likes

mishka

Thanks!

Interesting. I see a lot of people reporting that their coding experience has improved compared to GPT-4, but it looks like this is not uniform; the experience differs for different people (perhaps depending on what they are doing)...

mishka

What's your setup? Are you using it via the ChatGPT interface, or via the API and a wrapper?

mishka

This also points out that Arena tells you what model is Model A and what is Model B. That is unfortunate, and potentially taints the statistics.

No, https://chat.lmsys.org/ says this:

  • Ask any question to two anonymous models (e.g., ChatGPT, Claude, Llama) and vote for the better one!
  • You can chat for multiple turns until you identify a winner.
  • Votes won't be counted if model identities are revealed during the conversation.

So one can choose to know the names of the models one is talking with, but then one's votes will not be counted toward the statistics.
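To make the rule concrete, here is a minimal sketch (not LMSYS code; the field names are hypothetical) of how a vote-counting pass could exclude conversations where model identities were revealed:

```python
# Toy illustration of the Arena rule quoted above: a vote only counts
# if neither model's identity was revealed during the conversation.

def count_votes(battles):
    """battles: list of dicts with hypothetical keys
    'winner' ('A' or 'B') and 'identities_revealed' (bool)."""
    tally = {"A": 0, "B": 0}
    for b in battles:
        if b["identities_revealed"]:
            continue  # identities revealed -> vote excluded from the statistics
        tally[b["winner"]] += 1
    return tally

print(count_votes([
    {"winner": "A", "identities_revealed": False},
    {"winner": "B", "identities_revealed": True},   # not counted
]))
```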

mishka

Nobody currently knows how to align strongly superhumanly smart AIs to human interests, and we need way more time to solve this problem. Making incremental progress on AI capabilities is shortening the timeline we have left to figure out how to align AI and is thus making human extinction more likely. Thus by far the best action is to stop advancing AI capabilities.

It seems that not much research has been done on the invariant properties of rapidly self-modifying ecosystems. At least, when I did some searching and also asked here a few months ago, not much came up: https://www.lesswrong.com/posts/sDapsTwvcDvoHe7ga/what-is-known-about-invariants-in-self-modifying-systems.

It's not possible to get a handle on the dynamics of rapidly self-modifying ecosystems without a better understanding of how to think about properties conserved during self-modification. And ecosystems with rapidly increasing capabilities will be strongly self-modifying.
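As a toy illustration of what "properties conserved during self-modification" means (this is a made-up example, not a research artifact), one can picture a system that keeps rewriting its own update rule while a monitor checks whether a chosen quantity stays conserved across the rewrites:

```python
# Toy sketch: a "system" that repeatedly replaces its own update rule,
# plus a monitor that checks a candidate invariant after each step.

import random

def make_update_rule(scale):
    # Each self-modification swaps in a new update rule with a different scale.
    return lambda state: [x * scale for x in state]

def conserved_quantity(state):
    # The candidate invariant we try to track: here, simply the sum of the state.
    return sum(state)

state = [1.0, 2.0, 3.0]
update = make_update_rule(1.0)

for step in range(5):
    before = conserved_quantity(state)
    state = update(state)
    after = conserved_quantity(state)
    if abs(before - after) > 1e-9:
        print(f"step {step}: invariant broken ({before:.3f} -> {after:.3f})")
    # Self-modification: the system replaces its own update rule.
    update = make_update_rule(random.choice([0.9, 1.0, 1.1]))
```

The hard open question is what the analogues of such conserved quantities are for real, rapidly self-modifying ecosystems, and whether any of them can be made to encode safety-relevant properties.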

However, any progress in this direction is likely to be dual-use. Knowing how to think about self-modification invariants is very important for AI existential safety and is also likely to be a strong capability booster.

This is a very typical conundrum for AI existential safety. We can try to push harder to make sure that research into invariant properties of self-modifying (eco)systems becomes an active research area again, but the likely side effect of better understanding the properties of potentially fooming systems is making it easier to bring such systems into existence. And we don't have a good understanding of the proper ways to handle this kind of situation (although the topic of dual use is discussed here from time to time).

mishka

No, OpenAI (assuming that it is a well-defined entity) also uses a probability distribution over timelines.

(In reality, every member of its leadership has their own probability distribution, and this translates into OpenAI having a policy and behavior formulated approximately as if there were some resulting single probability distribution.)
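One simple way to picture "many individual distributions translating into one effective distribution" is a weighted mixture. A minimal sketch, with entirely hypothetical numbers and weights:

```python
# Minimal sketch (hypothetical numbers): if each decision-maker has their own
# probability distribution over "years until AGI", the organization's effective
# view can be modeled as a weighted mixture of those distributions.

import numpy as np

years = np.arange(1, 21)  # horizon: 1..20 years from now

# Hypothetical per-person distributions over the same horizon (each sums to 1).
person_a = np.exp(-0.5 * ((years - 4) / 2.0) ** 2)
person_b = np.exp(-0.5 * ((years - 10) / 4.0) ** 2)
person_a /= person_a.sum()
person_b /= person_b.sum()

weights = [0.6, 0.4]  # e.g., how much each view shapes policy (made up)
org_view = weights[0] * person_a + weights[1] * person_b

print("P(AGI within 5 years) ≈", round(float(org_view[years <= 5].sum()), 3))
```

Of course, a real organization doesn't literally compute this; the point is just that uncertainty at the level of individuals can still add up to a coherent-looking policy.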

The important thing is that they are uncertain about timelines themselves: in part because no one knows how perplexity translates into capabilities; in part because capabilities might differ even at the same perplexity if the underlying architectures are different (e.g., in-context learning might depend on architecture even at fixed perplexity, and we have seen a stream of potentially very interesting architectural innovations recently); in part because it's not clear how big the potential of "harness"/"scaffolding" is; and so on.

This does not mean there is no political infighting. But it plays out against the background of them being correctly uncertain about the true timelines...


Compute-wise, inference demands are huge and growing with the popularity of the models (look at how much Facebook did to make Llama 3 more inference-efficient).

So if they expect models to become useful enough that almost everyone will want to use them, they should worry about compute, assuming they do want to serve people as they say they do (I am not sure how this looks for very strong AI systems; they will probably expand access gradually, and the speed of that expansion might vary).

mishka

I think having a probability distribution over timelines is the correct approach. Like, in the comment above:

I think I'm more likely to be better calibrated than any of these opinions, because most of them don't seem to focus very much on "hedging" or "thoughtful doubting", whereas my event space assigns non-zero probability to ensembles that contain such features of possible futures (including these specific scenarios).

mishka

However, none of them talk about each other, and presumably at most one of them can be meaningfully right?

Why can at most one of them be meaningfully right?

Would not a simulation typically be "a multi-player game"?

(But yes, if they assume that their "original self" was the sole creator (?), then they would all be some kind of "clones" of that particular "original self". Which would surely increase the overall weirdness.)
