Prof. Luke Stein
This repository has material that supplements what is posted on the Babson FIN 6200 Canvas page. It exists mainly to provide publicly accessible URLs for shared data files (in /data/) and template notebook files (in /templates/), a course schedule, and links to external resources.
Python notebooks can run in the cloud using Google Colab or Binder, but you will probably want a local installation. I strongly recommend using the Anaconda Python distribution.
Anaconda includes (almost) everything you need to get going, but in line with these recommendations, I prefer to work in Visual Studio Code with some add-in extensions.
These are the critical packages we will rely on. If you need a package not included with Anaconda, first try to install it using conda install; only if that doesn’t work, install it using pip.
Additional packages that may be useful include YData Profiling (automated EDA), Pyjanitor (data cleaning), and dataprep (data cleaning and automated EDA).
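To see which optional packages still need installing, a small check like the one below may help. It is only a sketch: the helper names (missing_packages, install_hints) are made up for illustration, and the import-name to install-name pairs in the example are assumptions you should confirm against each package's own documentation. A package may also live in a conda channel other than the default one. The script only prints suggested commands, following the conda-first, pip-second order recommended above; it does not install anything.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [
        name
        for name in import_names
        if importlib.util.find_spec(name) is None
    ]


def install_hints(packages):
    """Build suggested install commands for packages that are not importable.

    `packages` maps an import name to the name used when installing, since
    the two sometimes differ. The mapping is supplied by the caller.
    """
    hints = []
    for import_name in missing_packages(packages):
        install_name = packages[import_name]
        hints.append(
            f"conda install {install_name}  "
            f"# if that fails: pip install {install_name}"
        )
    return hints


if __name__ == "__main__":
    # Assumed import-name -> install-name pairs; verify before relying on them.
    optional = {
        "ydata_profiling": "ydata-profiling",
        "janitor": "pyjanitor",
        "dataprep": "dataprep",
    }
    for hint in install_hints(optional):
        print(hint)
```

Running the script in a terminal (or pasting it into a notebook cell) prints one suggested command per package that is not yet importable, and prints nothing if everything is already installed.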
Introduction to Modern Statistics (1st ed., and repo, data repo), Mine Çetinkaya-Rundel and Johanna Hardin
Data Analytics Using Microsoft Excel With Accounting and Finance Datasets (v2.0 for Excel 2016 or v3.0 for Excel 365), Joseph M. Manzo
Relevant resources TBA
Course designed with significant advice/help/inspiration from Don Bowen, Michael Goldstein, Grant McDermott, Cameron Pfiffer, Seth Pruitt, and Arthur Turrell