How IBM CEO Arvind Krishna Is Thinking About AI and Quantum Computing

IBM was one of the giants of 20th-century computing. It helped design the modern PC, and created the first AI to defeat a human champion in the game of chess.

But when you think of AI, IBM might not be the first, or even the tenth, company to spring to mind. It doesn’t train big models, and doesn’t make consumer-facing products any more, focusing instead on selling to other businesses. “We are a B2B company, and explaining what we do to the average reader—we’ll take all the help we can get,” IBM CEO Arvind Krishna joked ahead of a recent interview with TIME.

Still, there’s an interesting AI story lurking inside this storied institution. IBM does indeed build AI models—not massive ones like OpenAI’s GPT-4o or Google’s Gemini, but smaller ones designed for use in high-stakes settings, where accuracy comes at a premium. As the AI business matures, this gets at a critical unanswered question on the minds of Wall Street and Silicon Valley investors: will the economic gains from AI mostly accrue to the companies, like OpenAI, that train massive “foundation models”? Or will they flow instead to the companies—like IBM—that can build the leanest, cheapest, most accurate models tailored for specific use cases? The future of the industry could depend on it.

TIME spoke with Krishna in early February, ahead of a ceremony during which he was awarded a TIME100 AI Impact Award. This interview has been condensed and edited for clarity.

IBM built Deep Blue, the first chess AI to beat a human champion, in the 1990s. Then, in 2011, IBM’s Watson became the first AI to win the game show Jeopardy. But today, IBM isn’t training large AI systems in the same way as OpenAI or Google. Can you explain why the decision was made to take a backseat from the AI race?

When you look at chess and Jeopardy, the reason for taking on those challenges was the right one. You pick a thing that people believe computers cannot do, and then if you can do it, you’re conveying the power of the technology.
Here was the place where we went off: We started building systems that I’ll call monolithic. We started saying, let’s go attack a problem like cancer. That turned out to be the wrong approach. Absolutely it is worth solving, so I don’t fault what our teams did at that point. However, are we known for being medical practitioners? No. Do we understand how hospitals and protocols work? No. Do we understand how the regulator works in that area? No. With hindsight, I wish we had thought about that just for a couple of minutes at the beginning.

So then we said, OK, you can produce larger and larger models, and they’ll take more and more compute. So option one, take a billion dollars of compute and you produce a model. Now to get a return on it, you’ve got to charge people a certain amount. But can we distill it down to a much smaller model that may not take as much compute, and is much, much cheaper to run, but is a fit-for-purpose model for a task in a business context? That is what led to the business lens.

But one of the central takeaways of the last 10 years in deep learning seems to be that you can get more out of AI systems by just trying to make them generalist than you can by trying to make them specialized in a single area. Right? That’s what’s referred to as “the bitter lesson.”

I might politely disagree with that. If you’re willing to have an answer that’s only 90% accurate, maybe. But if I’d like to control a blast furnace, it needs to be correct 100% of the time. That model better have some idea of time-series analysis baked into it. It’s not a generalist machine that decided to somehow intuit Moby Dick to come up with its answer. So with respect, no. If you are actually trying to get to places where you need much higher accuracy, you actually may do much better with a smaller model.

I actually believe there will be a few very large models. They’ll cost a couple of billion dollars to train, or maybe even more. And there’s going to be thousands of smaller models that are fit-for-purpose. They’ll leverage the big ones for teaching, but not really for their inherent knowledge.

Are the major economic benefits from AI going to accrue to the biggest companies that train the foundation models? Or to the smaller companies who apply those models to specific use cases?

I think it’s an exact “and.” I think the analogy of AI is probably closest to the early days of the internet. So on the internet, ask yourself the question, is it useful only for very large companies or for very small companies? Take two opposite examples. If I’m going to build a video streaming business, the more content you have, the more people you can serve. You get a network effect, you get an economy of scale. On the other hand, you have a shopfront like Etsy. Suddenly the person who’s an artisan who makes two items a year can still have a presence because the cost of distribution is extremely low.

How has your answer to that question influenced the direction of your business?

We thought deeply about it. Back in 2020, we said: should we put all our investments into trying to build one very large model? If it’s a very large model, the cost of running these models is, let’s call it, the square of the size of the model. So if I have a 10 billion parameter model and I have a 1 trillion parameter model, it’s going to be 10,000 times more expensive to run the very big model. Then you turn around and ask the question, if it’s only 1% better, do I really want to pay 10,000 times more?

And that answer in the business world is almost always no. But if it can be 10 times smaller, hey, that’s well worth it, because that drops more than 90% of the cost of running it. That is what drove our decision.
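Krishna’s numbers follow from the scaling rule he describes. Below is a minimal back-of-the-envelope sketch, assuming inference cost grows roughly with the square of parameter count; the exponent and the relative_cost helper are illustrative assumptions for this check, not an IBM formula.

```python
# Back-of-the-envelope check of the cost arithmetic in the answer above.
# Assumption (the interview's rule of thumb): running cost scales roughly
# with the square of model size. The exponent and helper are illustrative,
# not a measured law or an IBM pricing formula.

def relative_cost(num_params: float, exponent: float = 2.0) -> float:
    """Running cost relative to an arbitrary baseline, up to a constant factor."""
    return num_params ** exponent

small = 10e9   # 10 billion parameters
large = 1e12   # 1 trillion parameters

ratio = relative_cost(large) / relative_cost(small)
print(f"Running the large model costs ~{ratio:,.0f}x more")  # ~10,000x

# A model 10x smaller than a baseline whose size is normalized to 1:
saving = 1 - relative_cost(0.1) / relative_cost(1.0)
print(f"Going 10x smaller cuts ~{saving:.0%} of the running cost")  # ~99%
```

Under that quadratic assumption, 100 times more parameters means roughly 10,000 times the running cost, and a tenfold size reduction removes about 99% of it, consistent with the “more than 90%” figure above.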
Let’s talk about quantum computing. IBM is a big investor in quantum. What’s your bigger picture strategy there?

So we picked quantum as an area for investment more than 10 years ago. We came to the conclusion that it’s an engineering problem more than it’s a science problem. The moment it’s an engineering problem, now you have to ask yourself the question, can you solve the two fundamental issues that are there?

One, the error rates are really high, but so are normal computers’. What people don’t recognize is: there are techniques that make it appear error free. There are errors deep down at the very fundamental level even on the machines we are on, but they correct themselves, and so we don’t see them.

Two, because quantum by its nature is operating at a quantum level, very tiny amounts of energy can cause what’s called coherence loss. So they don’t work for very long. We believed that if we could get close to a millisecond, you could do some really, really careful computations.

And so we went down a path and we think we have made a lot of progress on the error correction. We’re probably at a tenth of a millisecond, not quite at a millisecond yet, on the coherence times. We feel over the next three, four, five years—I give myself till the end of the decade—we will see something remarkable happen on that front, and I’m really happy with where our team is.

If you can make the huge breakthrough that you say you hope to make by the end of the decade, where does that put IBM as a business? Does that leave you in a dominant position over the next wave of technology?

There is hardware, and then there are all the people who will exploit it. So let me first begin with this: The people who will exploit it will be all our clients. They will get the value. Whether it’s materials discovery or better batteries or better fertilizers or better drugs, that value will accrue to our clients. But who can give them a working quantum computer? Assuming the timeline and the breakthroughs I’m talking about happen, I think that gives us a tremendous position and the first-mover advantage in that market, to the point where I think we would become the de facto answer for those technologies.

Technology has always been additive. The smartphone didn’t remove the laptop. I think quantum will be additive. But much like we helped invent mainframes and the PC, maybe on quantum we’ll occupy that same position for quite a while.