Carl Feynman

I was born in 1962 (so I’m in my 60s).  I was raised rationalist, more or less, before we had a name for it.  I went to MIT, and have a bachelor’s degree in philosophy and linguistics, and a master’s degree in electrical engineering and computer science.  I got married in 1991, and have two kids.  I live in the Boston area.  I’ve worked as various kinds of engineer: electronics, computer architecture, optics, robotics, software.

Around 1992, I was delighted to discover the Extropians.  I’ve enjoyed being in those kinds of circles since then.  My experience with the Less Wrong community has been “I was just standing here, and a bunch of people gathered, and now I’m in the middle of a crowd.”  A very delightful and wonderful crowd, just to be clear.

I’m signed up for cryonics.  I think it has a 5% chance of working, which is either very small or very large, depending on how you think about it.

I may or may not have qualia, depending on your definition.  I think that philosophical zombies are possible, and I am one.  This is a very unimportant fact about me, but seems to incite a lot of conversation with people who care.

I am reflectively consistent, in the sense that I can examine my behavior and desires, and understand what gives rise to them, and there are no contradictions I’m aware of.  I’ve been that way since about 2015.  It took decades of work and I’m not sure if that work was worth it.

Comments

When I brought up sample inefficiency, I was supporting Mr. Helm-Burger’s statement that “there's huge algorithmic gains in …training efficiency (less data, less compute) … waiting to be discovered”.  You’re right of course that a reduction in training data will not necessarily reduce the amount of computation needed.  But once again, that’s the way to bet.

Here are two arguments for low-hanging algorithmic improvements.

First, in the past few years I have read many papers containing low-hanging algorithmic improvements.  Most such improvements are a few percent or tens of percent.  The largest such improvements are things like transformers or mixture of experts, which are substantial steps forward.  Such a trend is not guaranteed to persist, but that’s the way to bet.

Second, existing models are far less sample-efficient than humans.  We receive about a billion tokens growing to adulthood.  The leading LLMs get orders of magnitude more than that.  We should be able to do much better.  Of course, there’s no guarantee that such an improvement is “low hanging”.  
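
For concreteness, here is a rough back-of-the-envelope version of that comparison.  The per-day exposure figure and the LLM corpus size are assumptions picked to land near the orders of magnitude above, not measurements:

```python
# Back-of-the-envelope comparison of human vs. LLM training data.
# Every number here is an assumption for illustration, not a measurement.
human_tokens_per_day = 150_000   # assumed language exposure (heard + read), chosen to land near ~1e9 total
days_to_adulthood = 18 * 365     # ~18 years
human_tokens = human_tokens_per_day * days_to_adulthood

llm_tokens = 1.5e13              # assumed order of magnitude for a leading LLM's training corpus

print(f"Human exposure to adulthood: ~{human_tokens:.1e} tokens")
print(f"Leading LLM training corpus: ~{llm_tokens:.1e} tokens")
print(f"Ratio: ~{llm_tokens / human_tokens:,.0f}x, i.e. about 4 orders of magnitude")
```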

This question is two steps removed from reality.  Here’s what I mean by that, with brackets around each of the two steps:

what is the threshold that needs meeting [for the majority of people in the EA community] [to say something like] "it would be better if EAs didn't work at OpenAI"?
 

Without these steps, the question becomes 

What is the threshold that needs meeting before it would be better if people didn’t work at OpenAI?

Personally, I find that a more interesting question.  Is there a reason why the question is phrased at two removes like that?  Or am I missing the point?

Some comments:

The word for a drug that causes loss of memory is “amnestic”, not “amnesic”.  The word “amnesic” is a variant spelling of “amnesiac”, which is the person who takes the drug.  This made reading the article confusing.

Midazolam is the benzodiazepine most often prescribed as an amnestic.  The trade name is Versed (accent on the second syllable, like vurSAID).  The period of not making memories lasts less than an hour, but you’re relaxed for several hours afterward.  It makes you pretty stupid and loopy, so I would think the performance on an IQ test would depend primarily on how much midazolam was in the bloodstream at the moment, rather than on any details of setting.

An interesting question!  I looked in “Towards Deep Learning Models Resistant to Adversarial Attacks” to see what they had to say on the question.  If I’m interpreting their Figure 6 correctly, there’s a negligible increase in error rate as epsilon increases, and then at some point the error rate starts swooping up toward 100%.  The transition seems to be about where the perturbed images start to be able to fool humans.  (Or perhaps slightly before.)  So you can’t really blame the model for being fooled, in that case.  If I had to pick an epsilon to train with, I would pick one just below the transition point, where robustness is maximized without getting into the crazy zone.

All this is the result of a cursory inspection of a couple of papers.  There’s about a 30% chance I’ve misunderstood.
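
For what it’s worth, here is a minimal sketch of the kind of epsilon sweep I have in mind, written in PyTorch.  It uses a single-step FGSM attack rather than the multi-step PGD attack from the paper, and the tiny untrained model and random “images” are stand-ins for a real classifier and dataset, so only the sweep procedure itself is meant literally:

```python
# Sweep the L-infinity perturbation budget (epsilon) and measure the error rate
# under a single-step FGSM attack. With a real trained model, the error rate
# stays low for small epsilon and then swoops up past some transition point.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in model and data; substitute a trained classifier and a real test set.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()
images = torch.rand(256, 1, 28, 28)        # pixel values in [0, 1]
labels = torch.randint(0, 10, (256,))
loss_fn = nn.CrossEntropyLoss()

def fgsm_error_rate(eps: float) -> float:
    """Error rate after perturbing each image by eps in the direction of the loss gradient."""
    x = images.clone().requires_grad_(True)
    loss_fn(model(x), labels).backward()
    x_adv = (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()
    preds = model(x_adv).argmax(dim=1)
    return (preds != labels).float().mean().item()

for eps in [0.0, 0.05, 0.1, 0.2, 0.3, 0.4]:
    print(f"epsilon = {eps:.2f}   error rate = {fgsm_error_rate(eps):.3f}")
```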

Here’s an event that would change my p(doom) substantially:

Someone comes up with an alignment method that looks like it would apply to superintelligent entities.  They get extra points for trying it and finding that it works, and extra points for society coming up with a way to enforce that only entities that follow the method will be created.

So far none of the proposed alignment methods seem to stand up to a superintelligent AI that doesn’t want to obey them.  They don’t even stand up to a few minutes of merely human thought.  But it’s not obviously impossible, and lots of smart people are working on it.

In the non-doom case, I think one of the following will be the reason:

—Civilization ceases to progress, probably because of a disaster.

—The governments of the world ban AI progress.

—Superhuman AI turns out to be much harder than it looks, and not economically viable.

—The happy circumstance described above (a workable, enforced alignment method), giving us the marvelous benefits of superintelligence without the omnicidal drawbacks.

You write:

…But I think people can be afraid of heights without past experience of falling…

I have seen it claimed that crawling-age babies are afraid of heights, in that they will not crawl from a solid floor to a glass platform over a yawning gulf.  And they’ve never fallen into a yawning gulf.  At that age, probably all the heights they’ve fallen from have been harmless, since the typical baby is both bouncy and close to the ground.

Various sailors made important discoveries back when geography was cutting-edge science.  And they don't seem to have been particularly bright.

Vasco da Gama discovered that Africa was circumnavigable.

Columbus was wrong about the shape of the Earth, and he discovered America.  He died convinced that his newly discovered islands were just off the coast of Asia, so that's a negative sign for his intelligence (or a positive sign for his arrogance, which he had in plenty).

Cortez discovered that the Aztecs were rich and easily conquered.

Of course, lots of other would-be discoverers didn't find anything, and many died horribly.

So, one could work in a field where bravery to the point of foolhardiness is a necessity for discovery.

We've learned a lot about the visual system by looking at ways to force it to wrong conclusions, which we call optical illusions or visual art.  Can we do a similar thing for this postulated social cognition system?  For example, how do actors get us to have social feelings toward people who don't really exist?  And what rules do movie directors follow to keep us from getting confused by cuts from one camera angle to another?

I would highly recommend getting someone else to debug your subconscious for you.  At least it worked for me.  I don’t think it would be possible for me to have debugged myself.
 

My first therapist was highly directive.  He’d say stuff like “Try noticing when you think X, and asking yourself what happened immediately before that.  Report back next week.”  He’d also list agenda items and draw diagrams on a whiteboard.  As an engineer, I loved it.  My second therapist was more in the “providing supportive comments while I talk about my life” school.  I don’t think that helped much, at least subjectively from the inside.

Here’s a possibly instructive anecdote about my first therapist.  Near the end of a session, I feel like my mind has been stretched in some heretofore-unknown direction.  It’s a sensation I’ve never had before.  So I say, “Wow, my mind feels like it’s been stretched in some heretofore-unknown direction.  How do you do that?”  He says, “Do you want me to explain?”  And I say, “Does it still work if I know what you’re doing?”  And he says, “Possibly not, but it’s important you feel I’m trustworthy, so I’ll explain if you want.”  So I say “Why mess with success?  Keep doing the thing. I trust you.”  That’s an example of a debugging procedure you can’t do to yourself.
