- This project is written in Python 3.
- The packages used in this project include pandas, numpy, sklearn, scipy, matplotlib, seaborn, pytorch, skorch, xgboost, and lightgbm, among others.
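
Before opening the notebooks, it may help to confirm that the packages listed above are importable. The snippet below is a minimal sketch, not part of the project: the helper name `missing_packages` is hypothetical, and it assumes the usual import names (`sklearn` for scikit-learn, `torch` for pytorch).

```python
from importlib.util import find_spec

# Import names for the packages listed above.
# Assumption: pytorch is imported as "torch", scikit-learn as "sklearn".
REQUIRED = [
    "pandas", "numpy", "sklearn", "scipy", "matplotlib",
    "seaborn", "torch", "skorch", "xgboost", "lightgbm",
]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are available.")
```

An empty list means every listed package can be located; the check does not verify versions.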
The code for loading and preprocessing the data, together with the original and processed data, is kept in the `dataprocessing` folder.
- Run `Previous 2-hour train.ipynb` to generate `2h_route_avg_train.xlsx`.
- Run `Previous 2-hour test.ipynb` to generate `2h_route_avg_test.xlsx`.
- Run `5002_data_preprocessing.ipynb` to generate `training_nn.csv` and `test_data.csv`.
- These files already exist in the `processed_data` folder. To regenerate them, uncomment the corresponding code block.
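
Since the generated files are already shipped, a quick check of which ones are present can tell you whether any notebook needs to be rerun. This is a sketch under assumptions: the helper `missing_outputs` is hypothetical, and the location of `processed_data` relative to your working directory is a guess that you may need to adjust.

```python
from pathlib import Path


def missing_outputs(folder, filenames):
    """Return the file names from `filenames` that do not exist in `folder`."""
    folder = Path(folder)
    return [name for name in filenames if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: processed_data is reachable from the current directory.
    expected = ["training_nn.csv", "test_data.csv"]
    missing = missing_outputs("processed_data", expected)
    print("Missing files:", missing if missing else "none")
```

Any file reported as missing would have to be regenerated by running the matching notebook with its write step uncommented.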
The code for the prediction algorithm and performance evaluation, together with our trained models, is kept in the `models` folder.
- Run the `Training` part in `Train and Ensemble Models.ipynb` to generate the `.pkl` files of our models.
- Run the `Ensemble model and Evaluation in validation dataset` part in `Train and Ensemble Models.ipynb` to load the pre-trained models, get the corresponding predictions on the validation data, and compute the MAPE of the ensemble model.
- Run the `Prediction on test data` part in `Train and Ensemble Models.ipynb` to generate `submission.csv`.
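
For readers unfamiliar with the metric, the evaluation step can be illustrated roughly as follows. This is a minimal sketch, not the notebook's code: it assumes the ensemble is a plain unweighted average of the individual models' predictions, which may differ from how the notebook actually combines them, and the function names `average_ensemble` and `mape` are hypothetical.

```python
import numpy as np


def average_ensemble(predictions):
    """Combine per-model predictions by an unweighted mean (an assumption)."""
    stacked = np.stack([np.asarray(p, dtype=float) for p in predictions])
    return stacked.mean(axis=0)


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (0.1 == 10%).

    Assumes y_true contains no zeros.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)))


if __name__ == "__main__":
    # Made-up numbers purely for illustration, not project results.
    y_val = [100.0, 200.0]
    model_a = [120.0, 170.0]
    model_b = [100.0, 190.0]
    ensemble = average_ensemble([model_a, model_b])
    print("Ensemble MAPE:", mape(y_val, ensemble))
```

In practice the per-model predictions would come from the models loaded from the `.pkl` files, and the true values from the validation split.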
Our submission file, `submission.csv`, is kept in the `submission` folder.