What am I missing about AI?
Last month I blogged about how the mainstream media is focusing on the wrong parts of the Artificial Intelligence/ChatGPT story.
One of the comments left on the post was:
I encourage you to dig a little deeper. If LLMs were just probability machines, no one would be raising any flags. Hinton, Bengio, Tegmark and many others are not simpletons. It is the fact that the architecture and specific training (deep NN, back prop / gradient descent) produces a system with emergent properties, beyond just a probability machine, when the system size reaches some thresholds, that has them spooked. They do understand mathematics and stats and probabilities, I assure you. It is just that you may have only read the layman’s articles and not the scientific ones.
I confess: I haven’t made much progress in this regard. I gave Vicki Boykis’ Embeddings a go, and started to get a handle on the math, but honestly had a hard time following it. I’m open to suggestions from anyone with a few good recommendations for scientific papers accessible to non-math professionals, particularly ones that explain these “emergent” properties and what they mean.
Meanwhile, regardless of the scientific truths or falsehoods around ChatGPT, the mainstream media continues to fail miserably at helping the rest of us understand the implications of this technology.
Most recently, I listened to This American Life’s “First Contact” (part of their “Greetings, People of Earth” show).
They interviewed several Microsoft AI researchers who first experimented with GPT-4 prior to its big release.
The researchers’ focus was: can we demonstrate ChatGPT’s general intelligence by presenting it with logic problems it could not possibly have encountered before? And their answer: YES!
The two examples were:
Stacking: the researcher asked ChatGPT how to stack a number of odd objects in a stable way (a book, a dozen eggs, a nail, etc.), and ChatGPT gave both the correct answer and a reasonable explanation of why it works.
Hidden state: the researcher described two people in a room with a cat. One person put the cat in a basket and left. The other moved the cat to a box and left. And, remarkably, ChatGPT could explain that when they returned, the first person would think the cat was in the basket while the second person would know it was in the box.
I thought this was pretty cool. So I fired up ChatGPT (and even ponied up for ChatGPT 4). I asked it my own stacking question and, hm, ChatGPT thought a plate should be placed on top of a can of soda instead of beneath it. So, mostly right, but I’m pretty sure any reasonable human would put the can of soda on the plate, not the other way around (ChatGPT 3.5 wanted the can of soda balanced on the tip of the nail).
I then asked it my own simple version of the cat problem, and it got it right. Very good. But when I asked it a much more complicated and weird version of the cat problem (involving beetles in a mansion with a movie theater, changing movies, and a butler with a big mustache), it got the answer flat-out wrong.
Did anyone at This American Life try this? Really? Fact-checking the experts seems like a basic responsibility of journalism. Maybe the scientists would have had a convincing response? Or maybe scientists, like everyone else, can get caught up in the excitement and make mistakes?
I am amazed and awed by what ChatGPT can do - it truly is remarkable. And I think a lot of human intelligence amounts to synthesizing what we’ve seen and regurgitating it in a different context - a task ChatGPT is far better at than we are.
But the overriding message of most mainstream media stories is that ChatGPT is somehow going beyond word synthesis and probability and magically tapping into a form of logic. If the scientific papers really do demonstrate this remarkable feat, the media needs to do a far better job of reporting it.