Heroes of Deep Learning: Andrew Ng interviews Ian Goodfellow from Google Brain

In this video from the Heroes of Deep Learning series, Andrew Ng interviews Ian Goodfellow from the Google Brain Project.

Transcript:

Andrew Ng: Hi, Ian. Thanks a lot for joining us today.

Ian Goodfellow: Thank you for inviting me, Andrew. I am glad to be here.

Andrew Ng: Today, you are one of the world’s most visible deep learning researchers. Let’s hear a bit about your personal story. How did you end up doing the work that you now do?

Ian Goodfellow: I guess I first became interested in machine learning right before I met you, actually. I had been working on neuroscience, and my undergraduate adviser at Stanford, Jerry Cain, encouraged me to take your Intro to AI class.

Andrew Ng: Oh, I didn’t know that. Okay.

Ian Goodfellow: So I had always thought that AI was a good idea, but that in practice, the main thing that was happening was game AI, where people have a lot of hard-coded rules for non-player characters in games to say different scripted lines at different points in time. And then, when I took your Intro to AI class and you covered topics like linear regression and the variance decomposition of the error of linear regression, I started to realize that this is a real science and I could actually have a scientific career in AI rather than neuroscience.

Andrew Ng: I see. Great. And then what happened?

Ian Goodfellow: Well, I came back and I TA’d your course later. So a really big turning point for me was while I was TA-ing that course, one of the students, my friend Ethan Dreyfuss, got interested in Geoff Hinton’s deep belief net paper. And the two of us ended up building one of the first GPU CUDA-based machines at Stanford in order to run Boltzmann machines in our spare time over winter break. And at that point, I started to have a very strong intuition that deep learning was the way to go in the future, that a lot of the other algorithms that I was working with, like support vector machines, didn’t seem to have the right asymptotics: you add more training data and they get slower, or for the same amount of training data, it’s hard to make them perform a lot better by changing other settings. At that point, I started to focus on deep learning as much as possible.

Andrew Ng: And I remember Rajat Raina’s very old GPU paper acknowledges you for having done a lot of early work.

Ian Goodfellow: Yeah. Yeah. That was written using some of the machines that we built. Yeah. The first machine I built was just something that Ethan and I built at Ethan’s mom’s house with our own money, and then later, we used lab money to build the first two or three for the Stanford lab.

Andrew Ng: Wow, that’s great. I never knew that story. That’s great. And then, today, one of the things that’s really taken the deep learning world by storm is your invention of GANs. So how did you come up with that?

Ian Goodfellow: I’ve been studying generative models for a long time, so GANs are a way of doing generative modeling where you have a lot of training data and you’d like to learn to produce more examples that resemble the training data, but they’re imaginary. They’ve never been seen exactly in that form before. There were several other ways of doing generative models that had been popular for several years before I had the idea for GANs. And after I’d been working on all those other methods throughout most of my Ph.D., I knew a lot about the advantages and disadvantages of all the other frameworks like Boltzmann machines and sparse coding and all the other approaches that have been really popular for years. I was looking for something that avoided all these disadvantages at the same time. And then finally, when I was arguing about generative models with my friends in a bar, something clicked into place, and I started telling them, “You need to do this, this, and this, and I swear it will work.” And my friends didn’t believe me that it would work. I was supposed to be writing the deep learning textbook at the time.

Andrew Ng: I see.

Ian Goodfellow: But I believed strongly enough that it would work that I went home and coded it up the same night, and it worked.

Andrew Ng: So it took you one evening to implement the first version of GANs?

Ian Goodfellow: I implemented it around midnight after going home from the bar where my friend had his going-away party. And the first version of it worked, which is very, very fortunate. I didn’t have to search for hyperparameters or anything.

Andrew Ng: There was a story, I read it somewhere, where you had a near-death experience and that reaffirmed your commitment to AI. Tell me that one.

Ian Goodfellow: So, yeah. I wasn’t actually near death but I briefly thought that I was. I had a very bad headache and some of the doctors thought that I might have a brain hemorrhage. And during the time that I was waiting for my MRI results to find out whether I had a brain hemorrhage or not, I realized that most of the thoughts I was having were about making sure that other people would eventually try out the research ideas that I had at the time.

Andrew Ng: I see. I see.

Ian Goodfellow: In retrospect, they’re all pretty silly research ideas. But at that point, I realized that this was actually one of my highest priorities in life: carrying out my machine learning research work.

Andrew Ng: That’s great, that when you thought you might be dying soon, you’re just thinking how to get the research done. That’s commitment. So today, you’re still at the center of a lot of the activities with GANs, with Generative Adversarial Networks. So tell me how you see the future of GANs.

Ian Goodfellow: Right now, GANs are used for a lot of different things, like semi-supervised learning, generating training data for other models and even simulating scientific experiments. In principle, all of these things could be done by other kinds of generative models. So I think that GANs are at an important crossroads right now. Right now, they work well some of the time, but it can be more of an art than a science to really bring that performance out of them. It’s more or less how people felt about deep learning in general 10 years ago. And back then, we were using deep belief networks with Boltzmann machines as the building blocks, and they were very, very finicky.

Over time, we switched to things like rectified linear units and batch normalization, and deep learning became a lot more reliable. If we can make GANs become as reliable as deep learning has become, then I think we’ll keep seeing GANs used in all the places they’re used today with much greater success. If we aren’t able to figure out how to stabilize GANs, then I think their main contribution to the history of deep learning is that they will have shown people how to do all these tasks that involve generative modeling, and eventually, we’ll replace them with other forms of generative models. So I spend maybe about 40 percent of my time right now working on stabilizing GANs.

Andrew Ng: Okay. Oh, and so just as a lot of people that joined deep learning about 10 years ago, such as yourself, wound up being pioneers, maybe the people that join GANs today, if it works out, could end up the early pioneers.

Ian Goodfellow: Yeah. A lot of people already are early pioneers of GANs, and I think if you wanted to give any kind of history of GANs so far, you’d really need to mention other groups like Indico and Facebook and Berkeley for all the different things that they’ve done.

Andrew Ng: So in addition to all your research, you also coauthored a book on deep learning. How is that going?

Ian Goodfellow: That’s right, with Yoshua Bengio and Aaron Courville, who are my Ph.D. co-advisers. We wrote the first textbook on the modern version of deep learning, and that has been very popular, both in the English edition and the Chinese edition. We’ve sold, I think, around 70,000 copies total between those two languages. And I’ve had a lot of feedback from students who said that they’ve learned a lot from it. One thing that we did a little bit differently than some other books is we start with a very focused introduction to the kind of math that you need to do in deep learning. I think one thing that I got from your courses at Stanford is that linear algebra and probability are very important, that people get excited about the machine learning algorithms, but if you want to be a really excellent practitioner, you’ve got to master the basic math that underlies the whole approach in the first place. So we make sure to give a very focused presentation of the math basics at the start of the book. That way, you don’t need to go ahead and learn all of linear algebra; you can get a very quick crash course in the pieces of linear algebra that are the most useful for deep learning.

Andrew Ng: So even someone whose math is a little shaky, or who hasn’t seen the math for a few years, will be able to start from the beginning of your book and get that background and get into deep learning?

Ian Goodfellow: All of the facts that you would need to know are there. It would definitely take some focused effort to practice making use of them. If someone’s really afraid of math, it might be a bit of a painful experience. But if you’re ready for the learning experience and you believe you can master it, I think all the tools that you need are there.

Andrew Ng: As someone who has worked in deep learning for a long time, I’d be curious: if you look back over the years, tell me a bit about how your thinking about AI and deep learning has evolved.

Ian Goodfellow: Ten years ago, I felt like, as a community, the biggest challenge in machine learning was just how to get it working for AI-related tasks at all. We had really good tools that we could use for simpler tasks, where we wanted to recognize patterns in how to extract features, where a human designer could do a lot of the work by creating those features and then hand it off to the computer. Now, that was really good for different things like predicting which ads a user would click on or different kinds of basic scientific analysis. But we really struggled to do anything involving millions of pixels in an image or a raw audio wave form where the system had to build all of its understanding from scratch.

We finally got over the hurdle really thoroughly maybe five years ago. And now, we’re at a point where there are so many different paths open that someone who wants to get involved in AI, maybe the hardest problem they face is choosing which path they want to go down.

Do you want to make reinforcement learning work as well as supervised learning works? Do you want to make unsupervised learning work as well as supervised learning works? Do you want to make sure that machine learning algorithms are fair and don’t reflect biases that we’d prefer to avoid? Do you want to make sure that the societal issues surrounding AI work out well, that we’re able to make sure that AI benefits everyone rather than causing social upheaval and trouble with loss of jobs?

I think right now, there’s just really an amazing number of different things that can be done, both to prevent downsides from AI and to make sure that we leverage all of the upsides that it offers us.

Andrew Ng: And so today, there are a lot of people wanting to get into AI. So, what advice would you have for someone like that?

Ian Goodfellow: I think a lot of people that want to get into AI start thinking that they absolutely need to get a Ph.D. or some other kind of credential like that. I don’t think that’s actually a requirement anymore. One way that you could get a lot of attention is to write good code and put it on GitHub. If you have an interesting project that solves a problem that someone working at the top level wanted to solve, once they find your GitHub repository, they’ll come find you and ask you to come work there. A lot of the people that I’ve hired or recruited at OpenAI last year or at Google this year, I first became interested in working with them because of something that I saw that they released in an open-source forum on the Internet.

Writing papers and putting them on arXiv can also be good. A lot of the time, it’s harder to reach the point where you have something polished enough to really be a new academic contribution to the scientific literature, but you can often get to the point of having a useful software product much earlier.

Andrew Ng: So read your book, practice the materials and post on GitHub and maybe on arXiv.

Ian Goodfellow: I think if you learn by reading the book, it’s really important to also work on a project at the same time. You could choose some way of applying machine learning to an area that you are already interested in; for example, if you’re a field biologist and you want to get into deep learning, maybe you could use it to identify birds. Or, if you don’t have an idea for how you’d like to use machine learning in your own life, you could pick something like making a Street View House Numbers classifier, where all the data sets are set up to make it very straightforward for you. And that way, you get to exercise all of the basic skills while you read the book or while you watch Coursera videos that explain the concepts to you.

Andrew Ng: So over the last couple of years, I’ve also seen you do more work on adversarial examples. Tell us a bit about that.

Ian Goodfellow: Yeah. I think adversarial examples are the beginning of a new field that I call machine learning security. In the past, we’ve seen computer security issues where attackers could fool a computer into running the wrong code. That’s called application-level security. And there have been attacks where people can fool a computer into believing that messages on a network come from somebody who is not actually who they say they are. That’s called network-level security.

Now, we’re starting to see that you can also fool machine-learning algorithms into doing things they shouldn’t, even if the program running the machine-learning algorithm is running the correct code, even if the program running the machine-learning algorithm knows who all the messages on the network really came from. And I think it’s important to build security into a new technology near the start of its development. We found that it’s very hard to build a working system first and then add security later. So I am really excited about the idea that if we dive in and start anticipating security problems with machine learning now, we can make sure that these algorithms are secure from the start instead of trying to patch security in retroactively years later.
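
As a rough illustration of the kind of attack described above, here is a minimal Python sketch of an adversarial example against a tiny logistic-regression classifier. The interview does not name a specific attack; the sketch uses the fast gradient sign method as one well-known example. The weights, the input, and the step size `eps` are made-up values chosen only for illustration.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def predict(w, b, x):
    """Probability of the positive class under a logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One fast-gradient-sign step: shift every input feature by eps
    in the direction that increases the cross-entropy loss."""
    p = predict(w, b, x)
    # For logistic regression, d(loss)/dx = (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input (assumed values, not from the interview).
w = [2.0, -3.0, 1.0]
b = 0.0
x = [0.5, -0.2, 0.3]   # classified as positive by the model
y = 1                  # true label

x_adv = fgsm(w, b, x, y, eps=0.4)

print("clean input      :", x, "-> p(positive) =", predict(w, b, x))
print("adversarial input:", x_adv, "-> p(positive) =", predict(w, b, x_adv))
```

The model code is unchanged and runs correctly on both inputs. Only the input moved, by at most `eps` per feature, yet the predicted class flips from positive to negative.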

Andrew Ng: Thank you. That was great. There’s a lot about your story that I thought was fascinating and that, despite having known you for years, I didn’t actually know, so thank you for sharing all that.

Ian Goodfellow: You are very welcome. Thank you for inviting me.

Check out the Deep Learning class on Coursera
