In this video from PyCon SG 2015, Anand S. presents: Faster Data Processing in Python. This talk covers methods to process, analyze, and visualize data faster in Python. The primary focus is on technique (should you optimize? what to optimize? how to optimize?), while covering libraries that help with this (line_profiler, Pandas, Numba, etc.).
Working with data in Python requires making a number of choices, ranging from the simple to the complex.
- Should I use pickle, CSV or JSON? (Ans: CSV).
- What do I read it with: csv.DictReader or csv.reader? (Ans: Pandas).
- How should I parse dates? (Ans: Anything but Pandas / dateutil).
- How do I optimize numpy calculations? (Ans: Learn vector algebra).
- How do I run a function in parallel?
- How do I make my program restartable?
- How do I use multiple cores?
This session will explain how to benchmark code and share insights on the patterns of programming that make your application faster.
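As a rough illustration of what benchmarking such a choice can look like, here is a minimal sketch that times `csv.reader` against `csv.DictReader` using only the standard library's `timeit` module. It is not taken from the talk: the in-memory data, the column names, and the helper names (`total_with_reader`, `total_with_dictreader`, `benchmark`) are all assumptions made for the example, and actual timings will vary by machine and Python version.

```python
import csv
import io
import timeit

# Hypothetical in-memory CSV, so the benchmark needs no files on disk.
ROWS = 10_000
CSV_TEXT = "id,value\n" + "".join(f"{i},{i * 2}\n" for i in range(ROWS))


def total_with_reader(text):
    """Sum the 'value' column using csv.reader (each row is a list)."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return sum(int(row[1]) for row in reader)


def total_with_dictreader(text):
    """Sum the 'value' column using csv.DictReader (each row is a dict)."""
    reader = csv.DictReader(io.StringIO(text))
    return sum(int(row["value"]) for row in reader)


def benchmark(func, repeat=3, number=5):
    """Return the best per-call time in seconds across several runs."""
    timings = timeit.repeat(lambda: func(CSV_TEXT), repeat=repeat, number=number)
    return min(timings) / number


if __name__ == "__main__":
    for func in (total_with_reader, total_with_dictreader):
        print(f"{func.__name__}: {benchmark(func) * 1000:.2f} ms per call")
```

Taking the minimum over several repeats, rather than the mean, is a common way to reduce noise from other processes on the machine. The same harness can be pointed at any pair of candidate implementations before deciding which one is worth optimizing.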