A complete end-to-end loan portfolio analysis designed to help the bank understand loan performance, borrower behavior, portfolio risk, and profitability.
This project uses Python (Pandas, Matplotlib, Seaborn) for data cleaning, KPI derivation, exploratory data analysis (EDA), and visualization.
- Bank Loan Analysis (Data Analytics Project)
- Business Problem / Problem Statement
- Project Objectives
- Results Snapshot
- Quick Links
- Tools & Technologies Used
- Dataset Description
- BRD 1: KPI Requirements
- BRD 1: Good Loan vs Bad Loan Analysis
- BRD 2: Visualization Requirements & Chart Insights
- Key Insights Summary
- Business Recommendations
- Final Conclusion: Loan Portfolio Risk & Strategy
- Project Structure
- Author
The bank receives thousands of loan applications across different states, income groups, employment backgrounds, and loan purposes. However, the bank lacks clear visibility into:
- Borrower repayment behavior
- Loan profitability & losses
- Seasonal trends in loan demand
- High-risk vs. low-risk customer groups
- Operational lending KPIs
To solve these challenges, the bank requires an in-depth analysis of its loan portfolio across multiple KPIs, borrower segments, loan types, and repayment patterns.
The goal is to strengthen underwriting decisions, reduce charge-offs, improve profitability, and optimize lending strategy.
- Calculate all core lending KPIs
- Compare Good Loans vs Bad Loans
- Identify high-risk & profitable borrower segments
- Analyze trends by month, state, term, employment length, purpose & home ownership
- Provide business recommendations to reduce losses and increase ROI
- Portfolio Net Profit: $37.31 Million
- Total Loss from Bad Loans: $28.25 Million
- Good Loan Success Rate: 86.18%
- Top Low-Risk Groups: Mortgage holders & 10+ year employees
- High-Risk Groups: Renters & <1 year employment
- Highest-performing state: California (CA)
Access all important project files instantly:
- Project Report (PDF): Bank Loan Analysis Report
- Business Problem Document (PDF): Business Problem
- Jupyter Notebook: Bank Loan Analysis.ipynb
- Dataset: Bank_loan_data.csv
- Visualization Images: Images Folder
- GitHub Repository: Bank-Loan-Analysis-Python
| Tool | Purpose |
|---|---|
| Python | Data analysis & visualization |
| Pandas | Data cleaning, preprocessing, aggregation |
| Matplotlib / Seaborn | KPI charts & EDA visualizations |
| Jupyter Notebook | Exploratory analysis & reporting |
| CSV / Excel | Dataset source |
The dataset contains information on borrower demographics, financial metrics, loan attributes, and repayment status.
Preprocessing performed:
- Removed missing or invalid values
- Standardized formats (dates, categories, percentages)
- Converted DTI, income, term & interest rate into numeric formats
- Derived month & year columns
- Categorized loans into Good Loans (Fully Paid) and Bad Loans (Charged Off)
- Filtered incomplete or irrelevant rows
- Prepared aggregated datasets for KPIs & charts
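The preprocessing steps above could be sketched in pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`issue_date`, `loan_status`, `int_rate`, `term`, `loan_amount`), the date format, and the sample rows are all assumptions, and dropping rows whose status is neither "Fully Paid" nor "Charged Off" is done here purely for illustration.

```python
import pandas as pd

# Tiny made-up sample; column names and values are hypothetical, not the real schema.
raw = pd.DataFrame({
    "issue_date": ["11-02-2021", "01-12-2021", "05-03-2021", None],
    "loan_status": ["Fully Paid", "Charged Off", "Current", "Fully Paid"],
    "int_rate": ["12.5%", "15.0%", "9.9%", "11.0%"],
    "term": [" 36 months", " 60 months", " 36 months", " 36 months"],
    "loan_amount": [10000, 5000, 8000, 12000],
})


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Clean a raw loan frame and derive month, year, and loan category."""
    # Remove rows with a missing issue date.
    out = df.dropna(subset=["issue_date"]).copy()
    # Standardize dates (day-month-year format is an assumption).
    out["issue_date"] = pd.to_datetime(out["issue_date"], format="%d-%m-%Y")
    # Convert "12.5%" style strings to a numeric fraction.
    out["int_rate"] = out["int_rate"].str.rstrip("%").astype(float) / 100
    # Convert " 36 months" style strings to an integer number of months.
    out["term_months"] = out["term"].str.extract(r"(\d+)", expand=False).astype(int)
    # Derive month and year columns.
    out["issue_month"] = out["issue_date"].dt.month
    out["issue_year"] = out["issue_date"].dt.year
    # Categorize into Good Loans (Fully Paid) and Bad Loans (Charged Off).
    status_map = {"Fully Paid": "Good Loan", "Charged Off": "Bad Loan"}
    out["loan_category"] = out["loan_status"].map(status_map)
    # Drop rows that fall outside the two categories (illustrative choice).
    return out.dropna(subset=["loan_category"]).reset_index(drop=True)


clean = preprocess(raw)
print(clean[["issue_month", "issue_year", "int_rate", "term_months", "loan_category"]])
```

Keeping the cleaning in one function makes it easy to rerun on a refreshed CSV before the KPI and chart steps.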
| KPI | Value |
|---|---|
| Total Loan Applications | 38,576 |
| Total Funded Amount | $435.76M |
| Total Amount Received | $473.07M |
| Net Portfolio Return | $37.31M |
| Average Interest Rate | 12.05% |
| Average DTI | 13.33% |
- Strong demand with 38k+ loan applications
- Healthy repayment inflow exceeding total funded amount
- An average interest rate of about 12% suggests moderate lending risk
- A low average DTI (13.33%) suggests financially stable borrowers
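As a rough illustration of how KPIs like those in the table can be derived, the sketch below computes them from a cleaned frame. The column names (`loan_amount`, `total_payment`, `int_rate`, `dti`) and the three sample rows are assumptions for demonstration only; they are not taken from the actual dataset.

```python
import pandas as pd

# Made-up sample rows; column names are hypothetical.
loans = pd.DataFrame({
    "loan_amount": [10000, 5000, 8000],
    "total_payment": [11500, 2500, 9600],
    "int_rate": [0.12, 0.15, 0.09],   # stored as fractions
    "dti": [0.10, 0.20, 0.15],        # stored as fractions
})


def portfolio_kpis(df: pd.DataFrame) -> dict:
    """Compute the core lending KPIs for a loan frame."""
    funded = df["loan_amount"].sum()
    received = df["total_payment"].sum()
    return {
        "total_applications": len(df),
        "total_funded": funded,
        "total_received": received,
        "net_return": received - funded,          # received minus funded
        "avg_int_rate_pct": df["int_rate"].mean() * 100,
        "avg_dti_pct": df["dti"].mean() * 100,
    }


kpis = portfolio_kpis(loans)
for name, value in kpis.items():
    print(f"{name}: {value}")
```

Net return is defined here as total amount received minus total funded amount, which matches how the reported $37.31M relates to the other two totals.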
Good Loans (Fully Paid):
- Applications: 33,243
- Funded Amount: $370.22M
- Amount Received: $435.79M
- Share: 86.18%
- Profit: $65.56M
Insight:
Good Loans form the bank's profitable foundation but are highly concentrated in specific states and loan purposes.
Bad Loans (Charged Off):
- Applications: 5,333
- Funded Amount: $65.53M
- Amount Received: $37.28M
- Share: 13.82%
- Loss: $28.25M
Insight: Bad Loans are the main source of losses and require tighter underwriting and better borrower assessment.
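The Good Loan and Bad Loan figures can be cross-checked against the portfolio KPIs with a few lines of arithmetic. The sketch below uses only the rounded values reported in this README; because the inputs are already rounded to two decimals, the derived profit and net return can differ from the reported $65.56M and $37.31M by about $0.01M.

```python
# Reported figures (amounts in millions of dollars, as published above).
good = {"applications": 33_243, "funded_m": 370.22, "received_m": 435.79}
bad = {"applications": 5_333, "funded_m": 65.53, "received_m": 37.28}

# Application counts and shares.
total_apps = good["applications"] + bad["applications"]
good_share = round(100 * good["applications"] / total_apps, 2)
bad_share = round(100 * bad["applications"] / total_apps, 2)

# Profit on good loans, loss on bad loans, and the resulting net return.
good_profit_m = round(good["received_m"] - good["funded_m"], 2)
bad_loss_m = round(bad["funded_m"] - bad["received_m"], 2)
net_m = round(good_profit_m - bad_loss_m, 2)

print(f"Total applications: {total_apps}")
print(f"Good share: {good_share}%  Bad share: {bad_share}%")
print(f"Good profit: ${good_profit_m}M  Bad loss: ${bad_loss_m}M  Net: ${net_m}M")
```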
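Chart: Total Funded Amount by Month (images/01_Total_Funded_Amount_by_Month.png)
Insight: Funding is stable with a strong rise in December, indicating peak demand.

Chart: Total Received Amount by Month (images/02_Total_Received_Amount_by_Month.png)
Insight: Repayments follow the same pattern, with the highest collection in December.

Chart: Total Loan Applications by Month (images/03_Total_Loan_Applications_by_Month.png)
Insight: Consistent demand with a noticeable year-end increase.

Chart: Total Funded Amount by State (images/04_Total_Funded_Amount_by_State.png)
Insight: California dominates funding, a major regional concentration risk.

Chart: Total Amount Received by State (images/05_Total_Amount_Received_by_State.png)
Insight: CA also generates the highest repayments, which confirms the over-dependency.

Chart: Total Funded Amount by Term (images/06_Total_Funded_Amount_by_Term.png)
Insight: 36-month loans are the preferred and most funded option.

Chart: Total Amount Received by Term (images/07_Total_Amount_Received_by_Term.png)
Insight: Shorter-term loans produce the highest total repayments.

Chart: Total Funded Amount by Employee Length (images/08_Total_Funded_Amount_by_Employee_Length.png)
Insight: 10+ year employees receive the most funding; <1 year employees remain high-risk.

Chart: Total Amount Received by Employee Length (images/09_Total_Amount_Received_by_Employee_Length.png)
Insight: Long-term employees generate reliable and high repayments.

Chart: Total Funded Amount by Loan Purpose (images/10_Total_Funded_Amount_by_Loan_Purpose.png)
Insight: Debt Consolidation dominates, a single point of product risk.

Chart: Total Amount Received by Loan Purpose (images/11_Total_Amount_Received_by_Loan_Purpose.png)
Insight: Debt Consolidation also leads revenue, increasing dependency risk.

Chart: Total Funded Amount by Home Ownership (images/12_Total_Funded_Amount_by_Home_Ownership.png)
Insight: Mortgage holders receive the most funding and are the lowest-risk group.

Chart: Total Amount Received by Home Ownership (images/13_Total_Amount_Received_by_Home_Ownership.png)
Insight: Mortgage owners drive the highest repayments.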
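Each chart above is a total of funded or received amounts grouped by one dimension. A minimal sketch of that aggregation step follows, assuming hypothetical column names (`issue_date`, `address_state`, `loan_amount`, `total_payment`) and made-up rows; the resulting frames are the kind of input a Matplotlib or Seaborn bar chart would take.

```python
import pandas as pd

# Made-up sample rows; column names are hypothetical.
loans = pd.DataFrame({
    "issue_date": pd.to_datetime(
        ["2021-01-15", "2021-01-20", "2021-12-05", "2021-12-18"]
    ),
    "address_state": ["CA", "TX", "CA", "NY"],
    "loan_amount": [10000, 5000, 8000, 12000],
    "total_payment": [11000, 5500, 9000, 13000],
})


def totals_by(df: pd.DataFrame, key: str) -> pd.DataFrame:
    """Sum funded and received amounts per group, largest funded first."""
    return (
        df.groupby(key)[["loan_amount", "total_payment"]]
        .sum()
        .sort_values("loan_amount", ascending=False)
    )


# Monthly totals need a month column derived from the issue date.
by_month = totals_by(loans.assign(month=loans["issue_date"].dt.month), "month")
by_state = totals_by(loans, "address_state")

print(by_month)
print(by_state)
```

The same helper could be reused for term, employment length, purpose, and home ownership by passing the corresponding column name.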
- Portfolio is profitable with $37.31M net return
- 86% Good Loans demonstrate strong lending practices
- Bad Loans cause a $28.25M loss, the key risk area
- Heavy dependency on CA, Debt Consolidation, 36-month loans
- Most reliable customers: Mortgage holders + 10+ year employees
- Highest-risk customers: Renters + <1 year employment
- December is the peak month for applications, funding, and repayments
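One way to back the reliable versus high-risk segment findings with numbers is a Bad Loan rate per segment. The sketch below is an assumption-laden example: `home_ownership` and `loan_status` are hypothetical column names, the six rows are invented, and the rates it prints say nothing about the real portfolio.

```python
import pandas as pd

# Invented sample rows; column names are hypothetical.
loans = pd.DataFrame({
    "home_ownership": ["RENT", "RENT", "MORTGAGE", "MORTGAGE", "MORTGAGE", "RENT"],
    "loan_status": [
        "Charged Off", "Fully Paid", "Fully Paid",
        "Fully Paid", "Charged Off", "Charged Off",
    ],
})


def bad_loan_rate(df: pd.DataFrame, segment: str) -> pd.Series:
    """Percentage of Charged Off loans within each value of `segment`."""
    is_bad = df["loan_status"].eq("Charged Off")
    # The mean of a boolean series is the fraction of True values.
    rate = is_bad.groupby(df[segment]).mean() * 100
    return rate.sort_values(ascending=False).round(2)


rates = bad_loan_rate(loans, "home_ownership")
print(rates)
```

Running the same function on an employment-length column would give the comparable view for new versus long-term employees.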
- Stricter underwriting for renters & new employees
- Apply risk-based pricing for high-risk groups
- Strengthen DTI and income verification
- Expand lending into TX, NY, FL, and other states
- Reduce dependency on Debt Consolidation loans
- Diversify product offerings
- Improve early-stage collection reminders
- Increase recovery efforts for high-risk borrowers
- Focus on long-term employees & mortgage holders
- Promote 36-month loans, which show the highest repayment efficiency
The analysis confirms the bank's loan business is in a phase of rapid, profitable growth, but with significant risk concentration in a few key areas.
- Profitability: The bank earns a $37.31M net profit despite $28.25M in losses.
- Risk Concentration: Over-reliance on Debt Consolidation loans and the California market poses serious exposure.
- Customer Insights:
  - Reliable: 10+ years employed, Mortgage holders
  - Risky: Renters, <1 year employed
- Tighten Underwriting for Debt Consolidation loans.
- Implement Risk-Based Pricing for Renters and short-term employees.
- Diversify Markets beyond California to reduce regional dependency.
By improving risk control and portfolio balance, the bank can increase profits, reduce default losses, and build a sustainable, data-driven lending strategy.
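The concentration risk discussed in this conclusion can be tracked as each category's share of total funded amount. The following is a sketch under assumed column names (`address_state`, `purpose`, `loan_amount`) with invented rows, so the percentages it prints are illustrative only.

```python
import pandas as pd

# Invented sample rows; column names are hypothetical.
loans = pd.DataFrame({
    "address_state": ["CA", "CA", "TX", "NY", "FL"],
    "purpose": [
        "Debt consolidation", "Debt consolidation",
        "Debt consolidation", "car", "credit card",
    ],
    "loan_amount": [12000, 8000, 5000, 3000, 2000],
})


def funding_share(df: pd.DataFrame, key: str) -> pd.Series:
    """Each group's percentage share of total funded amount, largest first."""
    totals = df.groupby(key)["loan_amount"].sum()
    return (100 * totals / totals.sum()).sort_values(ascending=False).round(2)


state_share = funding_share(loans, "address_state")
purpose_share = funding_share(loans, "purpose")

print(state_share)
print(purpose_share)
```

Monitoring the top share over time would show whether diversification beyond California and Debt Consolidation is actually taking effect.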
├── Bank Loan Analysis.ipynb          # Main analysis notebook
├── Bank_loan_data.csv                # Dataset file
├── Business Problem                  # Business problem document
├── Bank Loan Analysis Report.pdf     # Full project report
│
├── images/                           # Folder containing chart images
│   ├── 01_Total_Funded_Amount_by_Month.png
│   ├── 02_Total_Received_Amount_by_Month.png
│   ├── 03_Total_Loan_Applications_by_Month.png
│   ├── 04_Total_Funded_Amount_by_State.png
│   ├── 05_Total_Amount_Received_by_State.png
│   ├── 06_Total_Funded_Amount_by_Term.png
│   ├── 07_Total_Amount_Received_by_Term.png
│   ├── 08_Total_Funded_Amount_by_Employee_Length.png
│   ├── 09_Total_Amount_Received_by_Employee_Length.png
│   ├── 10_Total_Funded_Amount_by_Loan_Purpose.png
│   ├── 11_Total_Amount_Received_by_Loan_Purpose.png
│   ├── 12_Total_Funded_Amount_by_Home_Ownership.png
│   └── 13_Total_Amount_Received_by_Home_Ownership.png
│
└── README.md                         # Project documentation
Harsh Belekar
Data Analyst | Python | SQL | Power BI | Excel | Data Visualization
LinkedIn | GitHub
[email protected]
If you found this project helpful, feel free to star the repo and connect with me for collaboration!