The gold standard for validating any scientific assumption is to run an experiment. Data science isn’t any different. Unfortunately, it’s not always possible to design the perfect experiment. In this talk, we’ll take a realistic look at what data scientists can achieve through experimentation. We’ll cover experimental design, explore some of the reasons a perfect experiment isn’t always possible, and turn to the social sciences for alternatives that allow quasi-experiments with observational data. Finally, we’ll be honest about what the methods at our disposal can do.
Skipper Seabold is Director of Data Science R&D at Civis Analytics in Chicago. He leads a team of data scientists from all walks of life, from physicists and civil engineers to statisticians and computer scientists. Together they drive the data science behind the products Civis offers and push the capabilities of the solutions that Civis provides to its clients. He is an economist by training and has a decade of experience working in the open source Python data community. He started and led the statsmodels Python project, was formerly on the core pandas team, and has contributed to many projects in the Python data stack. He holds strong opinions about writing and barbecue.