Deep Learning and Automatic Differentiation from Theano to PyTorch


In this video from CSCS-ICS-DADSi Summer School, Atilim Güneş Baydin presents: Deep Learning and Automatic Differentiation from Theano to PyTorch.

Inquisitive minds want to know what causes the universe to expand, how M-theory binds the smallest of the small particles, or how social dynamics can lead to revolutions. In recent centuries, developments in science and technology have brought us closer to exploring the expanding universe, discovering unknown particles such as bosons, and finding out how and why societies interact and react. To explain these fascinating natural phenomena, natural scientists develop complex deterministic or stochastic ‘mechanistic models’. The hard questions are how to choose the best model for our data and how to calibrate the model given the data.

The way that statisticians answer these questions is with Approximate Bayesian Computation (ABC), which we learn on the first day of the summer school and which we combine with High Performance Computing. The second day focuses on a popular machine learning approach, deep learning, which mimics the deep neural network structure of the brain in order to predict complex natural phenomena. The summer school follows a route of open discussion and brainstorming sessions in which we explore two cornerstones of today’s data science, ABC and deep learning accelerated by HPC, with hands-on examples and exercises.

We are ready to start a journey with you towards unveiling the mysteries of nature, sharing and integrating ideas from ABC and deep learning.


Comments

  1. Peter Foelsche says

    I have been an AD expert since around 2000. I did my first forward AD implementation in C++ in 2002 (dual numbers). That implementation has been optimized since then, and it is still being optimized. The problem of different places in the code depending on different subsets of the independent variables was solved (by metaprogramming) around 2009. I also exploit mixing the chain rule with normal forward differentiation to minimize the number of derivatives being carried. In my experience, dual numbers give much better performance than source-code transformation, for a variety of reasons. For higher-order derivatives I have been using truncated Taylor series for a couple of years now, in C++. I find it very curious that people today (2017) are using Python to perform automatic differentiation. What a waste of CPU time!
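
The dual-number technique the comment refers to can be sketched compactly. Below is a minimal, illustrative forward-mode AD example in C++ using operator overloading; the `Dual` type and the sample function `f` are chosen here purely for illustration and are not taken from the commenter's code.

```cpp
#include <cmath>
#include <cstdio>

// Minimal dual number: a value plus one derivative component.
// Overloaded arithmetic propagates derivatives via the chain rule.
struct Dual {
    double val;  // function value
    double der;  // derivative w.r.t. the chosen independent variable
};

Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
Dual operator*(Dual a, Dual b) {
    // Product rule: (uv)' = u'v + uv'
    return {a.val * b.val, a.der * b.val + a.val * b.der};
}
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.der}; }

// Hypothetical example function f(x) = x * sin(x) + x,
// differentiated in forward mode by evaluating it on Dual inputs.
Dual f(Dual x) { return x * sin(x) + x; }

int main() {
    // Seed the derivative of the independent variable with 1.
    Dual x{2.0, 1.0};
    Dual y = f(x);
    // Analytically, f'(x) = sin(x) + x*cos(x) + 1.
    std::printf("f(2)  = %g\n", y.val);
    std::printf("f'(2) = %g\n", y.der);
    return 0;
}
```

Evaluating `f` on a `Dual` carrying the seed derivative yields both the function value and its derivative in a single pass, which is the essence of forward-mode AD with dual numbers.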