Updates to KDE for income distribution #37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff             @@
##             main      #37       +/-   ##
===========================================
- Coverage   72.81%   41.02%   -31.79%
===========================================
  Files           3        3
  Lines         103      195       +92
===========================================
+ Hits           75       80        +5
- Misses         28      115       +87
@john-p-ryan I've added some code to the
Here's some code I was using to test the results:
# %%
import numpy as np
import pandas as pd
from iot import iot_user
# %%
biden2020_path = "https://raw.githubusercontent.com/PSLmodels/examples/main/psl_examples/taxcalc/Biden2020.json"
b = iot_user.iot_comparison(
policies=[biden2020_path],
labels=["Biden 2020"],
years=[2021],
dist_type="kde", # "log_normal",
kde_bw=1.5,
data="CPS",
)
# %%
# Plot theta_z
theta_plot = b.plot(var="theta_z")
theta_plot.show()
# %%
# Plot g_z
g_z_plot = b.plot(var="g_z")
g_z_plot.show()
# %%
# Plot f
f_plot = b.plot(var="f")
f_plot.show()
# %%
# Plot f_prime
f_prime_plot = b.plot(var="f_prime")
f_prime_plot.show()
# %%
# plot income dist
# NOTE: this is not a function of the dist_type, just raw data
inc_plot = b.SaezFig2(upper_bound=2_000_000)
inc_plot.show()
# %%
A few notes:
With the latest changes to this branch, here's what the results are looking like (with a
@john-p-ryan thoughts? In my view,
@jdebacker I think this looks really good. Just so I understand: what this does is directly calculate the Pareto
@john-p-ryan asks:
That's correct. The value it gives does vary with the KDE. But the KDE with a bandwidth of 1.5 (which is used for the plots above) gives an
It did seem that a higher bandwidth would give less curvature on the KDE at the cutoff and the
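For readers following the thread, the idea of splicing a Pareto tail onto a KDE at a cutoff can be sketched roughly as below. This is not the code in this PR: the helper name `splice_pareto_tail`, the synthetic log-normal sample, the 90th-percentile cutoff, and the choice to pin the Pareto shape with alpha = z f(z) / (1 - F(z)) at the cutoff are all assumptions made for illustration. Under that choice the Pareto shape is computed directly from the KDE, so it moves whenever the bandwidth changes the KDE's density or tail mass at the cutoff.

```python
import numpy as np
from scipy.stats import gaussian_kde


def splice_pareto_tail(incomes, cutoff, bw_method=None):
    """Hypothetical helper: KDE density below `cutoff`, Pareto density above.

    Returns (pdf, alpha), where alpha is the Pareto shape implied by the KDE.
    """
    kde = gaussian_kde(incomes, bw_method=bw_method)
    f_cut = kde(cutoff)[0]  # KDE density at the cutoff
    tail_mass = kde.integrate_box_1d(cutoff, np.inf)  # 1 - F(cutoff) under the KDE
    # Shape that keeps the spliced density continuous at the cutoff while
    # preserving the KDE's probability mass above it.
    alpha = cutoff * f_cut / tail_mass

    def pdf(z):
        z = np.atleast_1d(np.asarray(z, dtype=float))
        out = kde(z)
        above = z >= cutoff
        # Pareto density scaled to carry `tail_mass` above the cutoff.
        out[above] = tail_mass * alpha / z[above] * (cutoff / z[above]) ** alpha
        return out

    return pdf, alpha


# Synthetic incomes (assumed log-normal) stand in for the real data.
rng = np.random.default_rng(0)
incomes = rng.lognormal(mean=10.5, sigma=0.8, size=5_000)
cutoff = np.quantile(incomes, 0.90)  # illustrative cutoff, not the PR's value

pdf, alpha = splice_pareto_tail(incomes, cutoff)
print(f"implied Pareto shape at the cutoff: {alpha:.2f}")
```

Re-running the last two lines with different `bw_method` values is a quick way to see how sensitive the implied shape is to the bandwidth.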
@john-p-ryan Here's what the same plots look like with a cutoff of $200k (rather than $350k) to splice the Pareto tail. The
This PR updates how the KDE of the income distribution is handled. It does 3 main things:
bw_method argument.
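As a point of reference for the bandwidth setting, here is a minimal sketch of how a `bw_method` argument behaves in `scipy.stats.gaussian_kde`. Whether this PR passes its bandwidth through to SciPy unchanged is an assumption, and the sample below is synthetic rather than taken from the CPS data.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic stand-in for (log) incomes; not the data used in this PR.
rng = np.random.default_rng(42)
log_income = rng.normal(loc=10.5, scale=0.8, size=2_000)

kde_scott = gaussian_kde(log_income)  # default: Scott's rule
kde_silverman = gaussian_kde(log_income, bw_method="silverman")
kde_wide = gaussian_kde(log_income, bw_method=1.5)  # scalar is used directly as kde.factor

# A larger factor smooths more, which flattens the estimated density.
grid = np.linspace(log_income.min(), log_income.max(), 200)
peak_scott = kde_scott(grid).max()
peak_wide = kde_wide(grid).max()
print(kde_scott.factor, kde_silverman.factor, kde_wide.factor)
print(peak_scott, peak_wide)
```

In SciPy the kernel's standard deviation is the factor times the sample standard deviation, so a scalar of 1.5 is a much heavier smoother than either rule of thumb on a sample of this size.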