This project was carried out as part of a computer technology workshop training course. The task was to analyze arbitrary datasets and to test various statistical hypotheses. The entire implementation was done in Jupyter Notebook, so the final version is provided in .ipynb and .html formats. Separate source code files are also provided.
- Directories:
- Methods (Functions):
- Python:
  - extract_sport(string): extracts the sport category;
  - grubbs_test(array): implements the Grubbs test;
  - q_dixon_test(array): implements the Dixon Q test;
  - plot_ecdf(array, label, ax): plots the ECDF with seaborn;
  - custom_ecdf(array): computes the ECDF of the data;
  - envelope_method(array, n): implements the envelope method using a bootstrap algorithm and custom_ecdf(data);
  - perform_normality_tests(array, name): runs tests of the hypothesis that the data are normally distributed;
  - f_test_variance(array_x, array_y, alpha): implements the F-test of the hypothesis that the variances are equal;
  - compute_chi2_statistic(table, table_name): calculates the chi-square statistic;
  - fit_polynomial_regression(degree): fits a polynomial regression of the given degree.
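To illustrate the outlier checks listed above, here is a minimal sketch of a Dixon Q test in plain Python. It is not the project's q_dixon_test: the name `dixon_q_sketch`, the returned tuple, and the hard-coded table of commonly tabulated two-sided 95% critical values for n = 3..10 are assumptions made for this example.

```python
# Commonly tabulated two-sided critical values of Dixon's Q at 95% confidence,
# for sample sizes n = 3..10 (assumption: the project may use another table).
Q95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
       7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def dixon_q_sketch(values):
    """Return (suspect value, Q statistic, critical value, is_outlier)."""
    data = sorted(float(v) for v in values)
    n = len(data)
    if n not in Q95:
        raise ValueError("this sketch only covers 3 <= n <= 10")

    spread = data[-1] - data[0]
    if spread == 0:
        # All values are identical, so nothing can be flagged.
        return data[0], 0.0, Q95[n], False

    # Gap between each extreme value and its nearest neighbour, over the range.
    q_low = (data[1] - data[0]) / spread
    q_high = (data[-1] - data[-2]) / spread

    # Test whichever extreme is more isolated.
    if q_high >= q_low:
        suspect, q = data[-1], q_high
    else:
        suspect, q = data[0], q_low
    return suspect, q, Q95[n], q > Q95[n]


suspect, q, q_crit, is_outlier = dixon_q_sketch([10.0, 10.1, 9.9, 10.2, 15.0])
print(suspect, round(q, 3), q_crit, is_outlier)
```

The test assumes approximately normal data and is meant for flagging a single suspect value in a small sample. Applying it repeatedly to the same sample is generally discouraged.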
- R:
  - envelope_ecdf <- function(data) {...}: creates the polygon for the envelope method from the ECDF;
  - the other methods are the same as in Python.
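The envelope method listed for both languages combines an ECDF with bootstrap resampling. The sketch below shows one common way to do this in Python with NumPy: a pointwise percentile band around the ECDF. The function names, the `n_boot`, `level` and `seed` parameters, and the choice of a percentile band are assumptions for illustration. The project's own envelope_method and envelope_ecdf may differ.

```python
import numpy as np


def custom_ecdf_sketch(sample, grid):
    """Fraction of sample values that are <= each grid point."""
    sample = np.sort(np.asarray(sample, dtype=float))
    return np.searchsorted(sample, grid, side="right") / sample.size


def envelope_sketch(sample, n_boot=1000, level=0.95, seed=0):
    """Pointwise bootstrap band around the ECDF (hypothetical helper).

    Returns (grid, ecdf, lower, upper), all evaluated at the sorted sample.
    """
    sample = np.asarray(sample, dtype=float)
    grid = np.sort(sample)
    rng = np.random.default_rng(seed)

    # Resample with replacement; each row is one bootstrap replicate.
    resamples = rng.choice(sample, size=(n_boot, sample.size), replace=True)
    curves = np.array([custom_ecdf_sketch(r, grid) for r in resamples])

    # Take symmetric percentiles of the bootstrap ECDFs at every grid point.
    tail = (1.0 - level) / 2.0
    lower = np.quantile(curves, tail, axis=0)
    upper = np.quantile(curves, 1.0 - tail, axis=0)
    return grid, custom_ecdf_sketch(sample, grid), lower, upper


data = np.random.default_rng(42).normal(size=50)
grid, ecdf, lower, upper = envelope_sketch(data, n_boot=500)
print(grid.shape, float(ecdf[-1]), float(upper[-1]))
```

The resulting `lower` and `upper` arrays can be drawn as a shaded region around the ECDF, for example with matplotlib's `fill_between`.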
The main implementation is in the .ipynb files in the 1-2 steps directory.
All files are available for download on Yandex Disk.
Test of Hypotheses using statistics
MIT license
