Skip to content

πŸ“Š End-to-end data analytics project analyzing a bank’s loan portfolio to identify profitable segments, high-risk borrowers, and strategic insights using Python, Jupyter Notebook, and data visualization.

Notifications You must be signed in to change notification settings

Harsh-Belekar/Bank-Loan-Analysis-Python

Repository files navigation

🏦 Bank Loan Analysis (Data Analytics Project)

A complete end-to-end loan portfolio analysis designed to help the bank understand loan performance, borrower behavior, portfolio risk, and profitability.
This project uses Python (Pandas, Matplotlib, Seaborn) for data cleaning, KPI derivation, exploratory data analysis (EDA), and visualization.

Python Jupyter Notebook Pandas Matplotlib Seaborn NumPy Data Analysis


πŸ“š Table of Contents


πŸ“Œ Business Problem / Problem Statement

The bank receives thousands of loan applications across different states, income groups, employment backgrounds, and loan purposes. However, the bank lacks clear visibility into:

  • Borrower repayment behavior
  • Loan profitability & losses
  • Seasonal trends in loan demand
  • High-risk vs. low-risk customer groups
  • Operational lending KPIs

To solve these challenges, the bank requires an in-depth analysis of its loan portfolio across multiple KPIs, borrower segments, loan types, and repayment patterns.
The goal is to strengthen underwriting decisions, reduce charge-offs, improve profitability, and optimize lending strategy.


🧩 Project Objectives

βœ” Calculate all core lending KPIs
βœ” Compare Good Loans vs Bad Loans
βœ” Identify high-risk & profitable borrower segments
βœ” Analyze trends by month, state, term, employment length, purpose & home ownership
βœ” Provide business recommendations to reduce losses and increase ROI


πŸ“Š Results Snapshot

  • βœ… Portfolio Net Profit: $37.31 Million
  • πŸ“‰ Total Loss from Bad Loans: $28.25 Million
  • πŸ’Ό Good Loan Success Rate: 86.18%
  • πŸ† Top Low-Risk Groups: Mortgage holders & 10+ year employees
  • ⚠️ High-Risk Groups: Renters & <1 year employment
  • 🌍 Highest-performing state: California (CA)

πŸ”— Quick Links

Access all important project files instantly:


πŸ› οΈ Tools & Technologies Used

Tool Purpose
Python Data analysis & visualization
Pandas Data cleaning, preprocessing, aggregation
Matplotlib / Seaborn KPI charts & EDA visualizations
Jupyter Notebook Exploratory analysis & reporting
CSV / Excel Dataset source

πŸ“Š Dataset Description

The dataset contains information on borrower demographics, financial metrics, loan attributes, and repayment status.

Preprocessing performed:

  • Removed missing or invalid values
  • Standardized formats (dates, categories, percentages)
  • Converted DTI, income, term & interest rate into numeric formats
  • Derived month & year columns
  • Categorized loans into Good Loans (Fully Paid) and Bad Loans (Charged Off)
  • Filtered incomplete or irrelevant rows
  • Prepared aggregated datasets for KPIs & charts

πŸ“ˆ BRD 1 β€” KPI Requirements

KPI Value
Total Loan Applications 38,576
Total Funded Amount $435.76M
Total Amount Received $473.07M
Net Portfolio Return $37.31M
Average Interest Rate 12.05%
Average DTI 13.33%

πŸ” Insights

  • Strong demand with 38k+ loan applications
  • Healthy repayment inflow exceeding total funded amount
  • 12% interest rate indicates moderate lending risk
  • Low DTI reflects financially stable borrowers

πŸ”„ BRD 1 β€” Good Loan vs Bad Loan Analysis

βœ… Good Loans (Fully Paid)

  • Applications: 33,243
  • Funded Amount: $370.22M
  • Amount Received: $435.79M
  • Share: 86.18%
  • Profit: $65.56M

Insight:
Good Loans form the bank’s profitable foundation but are highly concentrated in specific states and loan purposes.


❌ Bad Loans (Charged Off)

  • Applications: 5,333
  • Funded Amount: $65.53M
  • Amount Received: $37.28M
  • Share: 13.82%
  • Loss: $28.25M

Insight: Bad Loans are the main source of losses and require tighter underwriting and better borrower assessment.


πŸ“‰ BRD 2 β€” Visualization Requirements & Chart Insights

1️⃣ Total Funded Amount by Month

Total Amount Received by Month Β  Insight: Funding is stable with a strong rise in December, indicating peak demand.

2️⃣ Total Received Amount by Month

Total Amount Received by Month Β  Insight: Repayments follow the same pattern β€” highest collection in December.

3️⃣ Total Loan Applications by Month

Total Amount Received by Month Β  Insight: Consistent demand with a noticeable year-end increase.

4️⃣ Total Funded Amount by State

Total Amount Received by Month Β  Insight: California dominates funding β€” a major regional concentration risk.

5️⃣ Total Amount Received by State

Total Amount Received by Month Β  Insight:
CA also generates the highest repayments β€” confirms over-dependency.

6️⃣ Total Funded Amount by Term

Total Amount Received by Month Β  Insight: 36-month loans are the preferred and most funded option.

7️⃣ Total Amount Received by Term

Total Amount Received by Month Β  Insight: Shorter-term loans produce maximum repayments.

8️⃣ Total Funded Amount by Employee Length

Total Amount Received by Month Β  Insight: 10+ year employees receive the most funding; <1 year employees remain high-risk.

9️⃣ Total Amount Received by Employee Length

Total Amount Received by Month Β  Insight: Long-term employees generate reliable and high repayments.

πŸ”Ÿ Total Funded Amount by Loan Purpose

Total Amount Received by Month Β  Insight:
Debt Consolidation dominates β€” a single point of product risk.

1️⃣1️⃣ Total Amount Received by Loan Purpose

Total Amount Received by Month Β  Insight: Debt Consolidation also leads revenue β€” increasing dependency risk.

1️⃣2️⃣ Total Funded Amount by Home Ownership

Total Amount Received by Month Β  Insight: Mortgage holders receive the most funding β€” lowest risk group.

1️⃣3️⃣ Total Amount Received by Home Ownership

Total Amount Received by Month Β  Insight: Mortgage owners drive the highest repayments.


🧠 Key Insights Summary

  • Portfolio is profitable with $37.31M net return
  • 86% Good Loans demonstrate strong lending practices
  • Bad Loans cause $28.25M loss β€” key risk area
  • Heavy dependency on CA, Debt Consolidation, 36-month loans
  • Most reliable customers: Mortgage holders + 10+ year employees
  • Highest-risk customers: Renters + <1 year employment
  • December is the peak month for applications, funding, and repayments

🏦 Business Recommendations

πŸ“Œ 1. Improve Risk Control

  • Stricter underwriting for renters & new employees
  • Apply risk-based pricing for high-risk groups
  • Strengthen DTI and income verification

πŸ“Œ 2. Reduce Concentration Risk

  • Expand lending into TX, NY, FL, and other states
  • Reduce dependency on Debt Consolidation loans
  • Diversify product offerings

πŸ“Œ 3. Strengthen Collections

  • Improve early-stage collection reminders
  • Increase recovery efforts for high-risk borrowers

πŸ“Œ 4. Optimize Lending Profitability

  • Focus on long-term employees & mortgage holders
  • Promote 36-month loans β€” highest repayment efficiency

🧾 Final Conclusion: Loan Portfolio Risk & Strategy 🎯

The analysis confirms the bank’s loan business is in a phase of rapid, profitable growth, but with significant risk concentration in a few key areas.

Key Takeaways:

  • Profitability: The bank earns 37.31M Dollar net profit despite $28.25M losses.
  • Risk Concentration: Over-reliance on Debt Consolidation loans and California market poses serious exposure.
  • Customer Insights:
    • Reliable: 10+ years employed, Mortgage holders
    • Risky: Renters, <1 year employed

πŸ”‘ Recommended Actions:

  1. Tighten Underwriting for Debt Consolidation loans.
  2. Implement Risk-Based Pricing for Renters and short-term employees.
  3. Diversify Markets beyond California to reduce regional dependency.

βœ… Outcome:

By improving risk control and portfolio balance, the bank can increase profits, reduce default losses, and build a sustainable, data-driven lending strategy.


πŸ“ Project Structure

β”œβ”€β”€ Bank Loan Analysis.ipynb      # Main analysis notebook
β”œβ”€β”€ Bank_loan_data.csv            # Dataset file
β”œβ”€β”€ Business Problem              # Business Problem
β”œβ”€β”€ Bank Loan Analysis Report.pdf # Full Project Report
β”‚
β”œβ”€β”€ images/                       # Folder containing chart images
β”‚ β”œβ”€β”€ 01_Total_Funded_Amount_by_Month.png
β”‚ β”œβ”€β”€ 02_Total_Received_Amount_by_Month.png
β”‚ β”œβ”€β”€ 03_Total_Loan_Applications_by_Month.png
β”‚ β”œβ”€β”€ 04_Total_Funded_Amount_by_State.png
β”‚ β”œβ”€β”€ 05_Total_Amount_Received_by_State.png
β”‚ β”œβ”€β”€ 06_Total_Funded_Amount_by_Term.png
β”‚ β”œβ”€β”€ 07_Total_Amount_Received_by_Term.png
β”‚ β”œβ”€β”€ 08_Total_Funded_Amount_by_Employee_Length.png
β”‚ β”œβ”€β”€ 09_Total_Amount_Received_by_Employee_Length.png
β”‚ β”œβ”€β”€ 10_Total_Funded_Amount_by_Loan_Purpose.png
β”‚ β”œβ”€β”€ 11_Total_Amount_Received_by_Loan_Purpose.png
β”‚ β”œβ”€β”€ 12_Total_Funded_Amount_by_Home_Ownership.png
β”‚ └── 13_Total_Amount_Received_by_Home_Ownership.png
β”‚
└── README.md                     # Project documentation

πŸ§‘β€πŸ’» Author

πŸ‘€ Harsh Belekar
πŸ“ Data Analyst | Python | SQL | Power BI | Excel | Data Visualization
πŸ“¬ LinkedIn | πŸ”—GitHub

πŸ“§ [email protected]


⭐ If you found this project helpful, feel free to star the repo and connect with me for collaboration!

About

πŸ“Š End-to-end data analytics project analyzing a bank’s loan portfolio to identify profitable segments, high-risk borrowers, and strategic insights using Python, Jupyter Notebook, and data visualization.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published