Miroslav Šedivý


Using Python to make the sun shine and the wind blow. Greedy polyglot, sustainable urbanist, unicode collector, PSF Fellow.

Solving Two Hard Problems in Computer Science Using Pandas Talk

English language


There are 2 hard problems in computer science: cache invalidation, naming things, and off-by-1 errors. -- Leon Bambrick

When working with time series that represent the production data of power plants, the fundamental step is to define a good data structure. The shape of the dataframes must balance speed against memory usage, so that you work with the right amount of up-to-date data (cache invalidation). The parameters must have good names for fast and unambiguous identification (naming things). The time intervals must fit the observed physics to prevent mismatches (off-by-1 errors).
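To make the naming and interval points concrete, here is a minimal sketch in pandas. It is not taken from the talk: the plant name, the column name, and the readings are hypothetical, and the convention that each timestamp marks the start of a 15-minute interval is an assumption chosen for illustration.

```python
import pandas as pd

# Hypothetical data: eight 15-minute energy readings for one plant.
# Assumed convention: each timestamp labels the START of its interval,
# i.e. the reading at 00:00 covers [00:00, 00:15).
index = pd.date_range(
    "2024-01-01 00:00", periods=8, freq="15min", tz="UTC", name="interval_start"
)
production = pd.DataFrame(
    # The column name carries the plant, the quantity, and the unit.
    {"plant_a_energy_kwh": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]},
    index=index,
)

# Aggregate to hourly values using the same left-closed, left-labelled
# convention as the input, so each hour contains exactly four readings.
hourly = production.resample("60min", closed="left", label="left").sum()

# The same data with a right-closed convention: the 00:00 reading is
# assigned to the previous hour, which shifts every bin by one reading.
hourly_right = production.resample("60min", closed="right", label="right").sum()

print(hourly)
print(hourly_right)
```

Both results add up to the same total energy, but only the first matches the assumed meaning of the timestamps. The second spreads the readings over three bins instead of two, which is the kind of off-by-1 mismatch the abstract refers to.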

I'll show you a few Python tricks and share how I solved these problems in my own projects.