Hands-on tutorials demonstrating the concepts of prediction engineering, feature engineering, and automation in data science. In a series of notebooks, we show how to build predictive models from raw data within a day, all using open source software.
pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
Featuretools is DARPA-sponsored open source software that enables data scientists to automatically extract features from time-varying data.
scikit-learn is a free software machine learning library for the Python programming language.
Prediction engineering
Feature engineering
NYC-Taxi-Dataset - Learn feature engineering
Retail-Dataset - Learn prediction engineering
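To make the two terms concrete, here is a minimal sketch in plain Python, using only the standard library rather than pandas or Featuretools. The transaction records, the cutoff dates, and the helper names `make_labels` and `make_features` are all hypothetical and are not taken from the notebooks. Prediction engineering is shown as deriving a label from what happens after a cutoff time, and feature engineering as aggregating only the data before that cutoff.

```python
from datetime import datetime

# Hypothetical raw data: (customer_id, timestamp, amount) purchase records.
transactions = [
    ("a", datetime(2020, 1, 5), 20.0),
    ("a", datetime(2020, 1, 20), 35.0),
    ("a", datetime(2020, 2, 10), 50.0),
    ("b", datetime(2020, 1, 8), 15.0),
    ("b", datetime(2020, 1, 25), 10.0),
]

cutoff = datetime(2020, 2, 1)      # features may only use data before this time
window_end = datetime(2020, 3, 1)  # labels look at the window [cutoff, window_end)


def make_labels(rows, cutoff, window_end):
    """Prediction engineering: did the customer buy in the window after the cutoff?"""
    customers = {cid for cid, _, _ in rows}
    buyers = {cid for cid, ts, _ in rows if cutoff <= ts < window_end}
    return {cid: cid in buyers for cid in customers}


def make_features(rows, cutoff):
    """Feature engineering: aggregate each customer's history strictly before the cutoff."""
    features = {}
    for cid, ts, amount in rows:
        if ts >= cutoff:
            continue  # skip rows at or after the cutoff to avoid label leakage
        f = features.setdefault(cid, {"n_purchases": 0, "total_spent": 0.0})
        f["n_purchases"] += 1
        f["total_spent"] += amount
    return features


labels = make_labels(transactions, cutoff, window_end)
features = make_features(transactions, cutoff)
```

Keeping the label window and the feature window on opposite sides of the cutoff is the design choice that matters here: a model trained on these features never sees information from the period it is asked to predict.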
Linux
sh install_linux.sh
source venv/bin/activate
pip install -r requirements.txt
jupyter notebook
Mac
sh install_osx.sh
source venv/bin/activate
pip install -r requirements.txt
jupyter notebook