The Killer App for Machine Learning: In Conversation with Pedro Domingos, Head of Machine Learning, D.E. Shaw

Best-selling author, Professor of Computer Science at the University of Washington, recent recipient of the prestigious IJCAI John McCarthy Award for excellence in artificial intelligence research (among other awards) and Head of the Machine Learning Research group at D.E. Shaw: Pedro Domingos has one of the most incredible resumes in the world of AI, and we were thrilled to host him for a fireside chat at our most recent Data Driven NYC.

We covered a range of topics, including why finance is a killer app for machine learning, his much-lauded book ‘The Master Algorithm’, and what’s truly scary about AI (hint: not the Terminator).

The video is below, followed by some conversation notes.

Finance is “the killer app for machine learning”:

  • Financial markets are the world’s most advanced information processing systems, so the range of what you can do there with machine learning is vast: it’s an “amazing playground”. Finance is also incredibly important, “the brain of the economy”, and Pedro wanted his work in machine learning to have impact. Hence his taking the reins as Head of Machine Learning at D.E. Shaw last year.
  • The application of machine learning in finance is still in its very early days. Some of what people have been doing in finance for a long time is simple machine learning, and some people were using neural networks back in the 80s and 90s. But now we have far more data and far more computing power. Add the creativity of machine learning research, and “We are so much in the beginning that we can’t even picture where we’re going to be 20 years from now.”
  • Alternative data is a perfect example of this. There is the data from the markets themselves, but now there are also rivers of other data. Some of it is irrelevant, some is relevant in less obvious ways, and for some the value appears only when you combine datasets. We now have data that people couldn’t have imagined 20 years ago, and all of the issues we see in machine learning are present there, except on a much larger scale.

The Master Algorithm:

  • Pedro’s book “The Master Algorithm” takes readers on a journey through the five dominant paradigms of machine learning research on a quest for the master algorithm. Along the way, Pedro wanted to abstract away from the mechanics so that a broad audience, from the CXO to the consumer, can understand how machine learning is shaping our lives.
  • This is not covered in the talk, but for context, the 5 paradigms from the book are:
    • Rule-based learning (decision trees, random forests, etc.)
    • Connectionism (neural networks, etc.)
    • Bayesian (Naive Bayes, Bayesian networks, probabilistic graphical models)
    • Analogy (k-nearest neighbors and SVMs)
    • Evolutionary (genetic algorithms, genetic programming)
  • In short, a master algorithm can theoretically learn anything “in the same way that a master key can open any lock”, and each of these five paradigms has proposed its own master algorithm.
  • The master algorithm doesn’t exist yet.
  • Some of the most prominent supporters of the master algorithm theory have different candidates for which algorithm could be the master – Rich Sutton (for reinforcement learning) and Geoffrey Hinton (for artificial neural networks).
  • Backpropagation has gone from strength to strength, but unlike many optimists, Pedro does not think that more data and more GPUs will be enough to make backprop the master algorithm. While it solves the very difficult problem of credit assignment, at the end of the day a real master algorithm has to solve many problems, not just one.
  • Ultimately, Pedro believes that success will come from unifying the different major types of learning and their master algorithms: not just combining them, but unifying them such that “it feels like using one thing”.
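
For readers who want to see what “credit assignment” means concretely, here is a minimal Python sketch. It is our illustration, not something shown in the talk. It assumes a toy network with just two weights, y = w2 * tanh(w1 * x), and a squared-error loss; the function names and numbers are made up for the example. Backpropagation applies the chain rule backwards from the loss to work out how much each weight contributed to the error, and the sketch compares that answer with a brute-force numerical estimate.

```python
import math


def forward(w1, w2, x):
    """Run the toy network: one hidden unit, one output."""
    h = math.tanh(w1 * x)  # hidden activation
    y = w2 * h             # network output
    return h, y


def loss(w1, w2, x, target):
    """Squared error between the output and the target."""
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2


def backprop(w1, w2, x, target):
    """Assign 'credit' (a gradient) to each weight via the chain rule."""
    h, y = forward(w1, w2, x)
    dL_dy = y - target                # how the loss changes with the output
    dL_dw2 = dL_dy * h                # credit for the output weight
    dL_dh = dL_dy * w2                # error passed back to the hidden unit
    dL_dw1 = dL_dh * (1 - h * h) * x  # credit for the first weight (tanh' = 1 - tanh^2)
    return dL_dw1, dL_dw2


def numerical_grad(w1, w2, x, target, eps=1e-6):
    """Brute-force check: nudge each weight and watch the loss move."""
    g1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
    g2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
    return g1, g2


# Illustrative values only.
print("backprop: ", backprop(0.5, -1.2, 2.0, 0.3))
print("numerical:", numerical_grad(0.5, -1.2, 2.0, 0.3))
```

The two printed pairs should agree to several decimal places. The point of the sketch is only that backprop answers one question (which weight is responsible for how much of the error) very efficiently, which is the single problem the notes above say it solves.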

What is next in AI?

  • While many leaders in the space think that things are running out of steam, Pedro believes that with orders of magnitude more people working on solving problems, there is orders of magnitude more progress every year.
  • Pedro asks: what’s missing? First, even if we succeeded in completely unifying the five major paradigms, we would still be very far from done. Humans’ ability to generalize from data dwarfs that of machines: OpenAI with the Rubik’s Cube and DeepMind with AlphaGo, for example, required the equivalent of thousands of years of gameplay to achieve those milestones.
  • At some point, AI is going to have to stop just porting ideas from other fields, and mature with its own native ideas.

Terminator vs Real Concerns

  • While Pedro says most concerns and criticisms around AI are about the wrong things, and are exaggerated by the media and by people falsely anthropomorphizing AI, there are some legitimate concerns.
  • Something that wasn’t on our radar 10 years ago, and should have been, is that while AI can be a great tool for democracy, it is also “an amazing tool for authoritarianism”.
  • The other real danger of AI is not that computers will get too smart and take over the world, but that they are too stupid and have already taken over the world. With computers making decisions about us, we really need to make AI more competent and stop AI from being a black box. There needs to be better interaction between us and AI so it can understand us better.

Let a thousand flowers bloom – audience questions:

  • “I hear quite a bit about the inaccurate comparison between humans and machines – we have millions of years of transferred learning through DNA, billions of neuron connections compared to a few for deep learning… – if we only had a few we would be declared brain dead, yet we make this connection?”

–      Pedro alluded to this several times in our talk: many of the comparisons people make between humans and AIs are inaccurate in many different ways. Having said that, if you look at the number of connections that state-of-the-art machine learning systems have for some of these problems, they have more than many animals do: many hundreds of millions or billions of connections.

–       As for the evolutionary side of things, nature had 500 million years to evolve, starting with the Cambrian explosion, and this was an incredibly inefficient process. What we’re trying to do with AI is to repeat that, but a million times faster, or even 10 or 100 million times faster, and I think we can. But for evolution there is a bottleneck, which is the size of your genome.

–       Your brain is the master algorithm in a sense, so how many lines of code will that master algorithm have? Pedro believes somebody will write that code, but how long will the program be: 100 lines, 10,000, a million? He doesn’t think it will be 1,000, but probably not 1 million either, which means Microsoft Windows is already very much bigger than that, with much less intelligence.

  • As the state of machine learning advances, will it be possible to explain these algorithms and understand them at a human level? Or will we just have to deal with the decisions they make?

–       There is a tradeoff between explainability and accuracy.  In the long run, the thing that is exciting about machine learning is that it can learn things that are beyond our capability to learn or even our ability to understand.

–       There was this period of a couple of hundred years where we understood our technology. Now we just have to learn to live in a world where we don’t understand the machines that work for us; we just have to be confident they are working for us and doing their best.

–       We’ve always lived in a world which we didn’t completely understand but now we’re living in a world designed by us – for Pedro, that’s actually an improvement.

  • I’m a neuroscience researcher and I’m on the side of Hinton that we need to learn from the brain to figure out how to bring these machine learning models to the next level… Do you agree that we need to learn from the brain?

–       There is a full spectrum of opinions on this, but we should let a thousand flowers bloom. People who criticize any of these approaches always have the upper hand because we really don’t know how “to do” machine learning yet – we should let these flowers bloom and let people explore.

–       Pedro is a big believer in being inspired by the brain, but saying we need to follow the brain is too strong.

–       But at the end of the day, what we know about neuroscience today is not enough to determine what we do in AI; it’s only enough to give us ideas. In fact it’s a two-way street: AI can help us learn how the brain works, and this loop between the two disciplines is a very important one that is growing very rapidly.
