This project aims to predict customer churn (identifying customers who are likely to leave a company) using machine learning techniques. The dataset includes historical customer information, and the goal is to build a model that helps businesses improve customer retention.
- Data cleaning and preprocessing
- Feature engineering and selection
- Scaling of features using StandardScaler
- Model building using Random Forest Classifier
- Model evaluation with accuracy, precision, recall, and AUC-ROC
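
The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the project's actual notebook: the data is synthetic, and the column names (`tenure`, `monthly_charges`, `contract_type`), the churn rule, and the hyperparameters are assumptions chosen only so that the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data: column names and the churn rule are assumptions,
# not the project's real dataset.
rng = np.random.default_rng(42)
n = 1000
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, size=n),
    "monthly_charges": rng.uniform(20, 120, size=n),
    "contract_type": rng.choice(["month-to-month", "one-year", "two-year"], size=n),
})
# Hypothetical rule: short tenure, high charges and monthly contracts raise churn risk.
logit = (
    -0.05 * df["tenure"]
    + 0.03 * df["monthly_charges"]
    + 1.0 * (df["contract_type"] == "month-to-month")
    - 1.5
)
df["churn"] = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

# Data cleaning: drop duplicate rows and rows with missing values.
df = df.drop_duplicates().dropna()

X = df.drop(columns="churn")
y = df["churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Preprocessing: scale numeric features, one-hot encode the categorical one.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["tenure", "monthly_charges"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["contract_type"]),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("clf", RandomForestClassifier(n_estimators=200, random_state=42)),
])
model.fit(X_train, y_train)

# Evaluation on the held-out test split.
y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred, zero_division=0),
    "recall": recall_score(y_test, y_pred, zero_division=0),
    "auc_roc": roc_auc_score(y_test, y_score),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Wrapping the scaler and the classifier in a single `Pipeline` means the scaler is fitted on the training split only, which avoids leaking test-set statistics into training. The scores printed here come from synthetic data and will differ from the results reported for the real dataset.
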
- Python
- Pandas
- NumPy
- Scikit-learn
- Matplotlib / Seaborn (for visualization)
- Clone the repository: `git clone https://github.com/yourusername/customer-churn-analysis.git`
- Install the required libraries: `pip install -r requirements.txt`
- Run the Jupyter notebook or Python scripts to explore the data and train the model
- The Random Forest Classifier achieved an accuracy of 91% on the test dataset.
- Precision and recall scores indicate that the model performs well both at identifying churned customers (recall) and at limiting false alarms (precision).
- The AUC-ROC score was 0.92, showing good model discrimination between customers who churn and those who stay.
- Feature importance analysis highlighted key factors influencing churn, such as tenure, monthly charges, and contract type.
- The model helps the business proactively identify high-risk customers to improve retention strategies.
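
As a rough sketch of how the feature importance analysis and the identification of high-risk customers might look, the example below trains a Random Forest on synthetic data, reads its impurity-based importances, and ranks test customers by predicted churn probability. The column names, the churn rule, and the 10% cutoff are assumptions for illustration only; they are not taken from the project's data or code.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: column names and the churn rule are assumptions.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "tenure": rng.integers(1, 72, size=n),
    "monthly_charges": rng.uniform(20, 120, size=n),
    "contract_type": rng.choice(["month-to-month", "one-year", "two-year"], size=n),
})
logit = (
    -0.05 * df["tenure"]
    + 0.03 * df["monthly_charges"]
    + 1.0 * (df["contract_type"] == "month-to-month")
    - 1.5
)
df["churn"] = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

# One-hot encode the categorical column so the forest sees numeric inputs.
X = pd.get_dummies(df.drop(columns="churn"), columns=["contract_type"])
y = df["churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# Impurity-based importances: one value per input column, summing to 1.
importances = pd.Series(clf.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)

# Rank test customers by predicted churn probability and keep the top 10%
# (the cutoff is a hypothetical choice, not a recommendation).
risk = pd.Series(
    clf.predict_proba(X_test)[:, 1], index=X_test.index, name="churn_probability"
)
high_risk = risk.sort_values(ascending=False).head(len(risk) // 10)
print(high_risk.head())
```

Impurity-based importances tend to favor continuous and high-cardinality features, so `sklearn.inspection.permutation_importance` on the test split may be worth comparing against before drawing conclusions about which factors drive churn.
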