Tax Function Estimation Functions
txfunc.py modules
ogcore.txfunc
tau_{s,t} is the effective tax rate, the marginal tax rate on labor income, or the marginal tax rate on capital income for a given age (s) in a particular year (t). x is total labor income, and y is total capital income.
- ogcore.txfunc.find_outliers(sse_mat, age_vec, se_mult, start_year, varstr, graph=False)
This function takes a matrix of sum of squared errors (SSE) from tax function estimations for each age (s) in each year of the budget window (t) and marks estimations that have outlier SSE.
- Parameters:
sse_mat (Numpy array) – SSE for each estimated tax function, size is BW x S
age_vec (numpy array) – vector of ages, length S
se_mult (scalar) – number of standard deviations beyond which an estimate is considered an outlier
start_year (int) – first year of budget window
varstr (str) – name of tax function being evaluated
graph (bool) – whether to output graphs
- Returns:
indicators of whether each tax function estimate is an outlier, size is BW x S
- Return type:
sse_big_mat (bool array_like)
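The flagging step can be sketched as below. This is a minimal illustration, not the library's code: it assumes the cutoff is the mean SSE plus `se_mult` standard deviations taken over the whole matrix, and `flag_outliers` is a hypothetical helper name.

```python
import numpy as np


def flag_outliers(sse_mat, se_mult):
    """Hypothetical helper: flag SSE entries far above the mean.

    Assumption: the cutoff is mean(SSE) + se_mult * std(SSE), computed
    over the whole BW x S matrix. The rule used in ogcore may differ.
    """
    sse_mat = np.asarray(sse_mat, dtype=float)
    thresh = sse_mat.mean() + se_mult * sse_mat.std()
    return sse_mat > thresh


# Toy data: 2 budget-window years x 4 ages, with one obviously poor fit
sse = np.array([[1.0, 1.2, 0.9, 1.1],
                [1.0, 50.0, 1.1, 0.8]])
sse_big_mat = flag_outliers(sse, se_mult=2.0)
```

The result has the same BW x S shape as the input, so it can be used directly as a boolean mask over the estimated parameters.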
- ogcore.txfunc.get_tax_rates(params, X, Y, wgts, tax_func_type, rate_type, analytical_mtrs=False, mtr_capital=False, for_estimation=True)
Generates tax rates given income data and the parameters of the tax functions.
- Parameters:
params (list) – list of parameters of the tax function, or nonparametric function for tax function type “mono”
X (array_like) – labor income data
Y (array_like) – capital income data
wgts (array_like) – weights for data observations
tax_func_type (str) – functional form of tax functions
rate_type (str) – type of tax rate: mtrx, mtry, etr
analytical_mtrs (bool) – whether to compute marginal tax rates from the total tax function (for DEP functions only)
mtr_capital (bool) – whether the analytical marginal tax rate is the one on capital income (otherwise labor income)
for_estimation (bool) – whether the results are used in estimation; if True, tax rates are computed as deviations from the mean
- Returns:
model tax rates for each observation
- Return type:
txrates (array_like)
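The idea behind `analytical_mtrs`, deriving a marginal tax rate from the total tax function, can be illustrated with a toy example. Total tax is T(x, y) = ETR(x, y) * (x + y), and the marginal tax rate on labor income is the partial derivative of T with respect to x. The functional form `etr_toy` below is hypothetical (it is not the DEP form), and the finite-difference check is included only to confirm the hand-derived expression.

```python
import numpy as np

K = 50_000.0  # hypothetical income scale for the toy ETR function


def etr_toy(x, y):
    """Hypothetical effective tax rate that rises with total income toward 0.3."""
    inc = x + y
    return 0.3 * inc / (inc + K)


def total_tax(x, y):
    """Total tax liability implied by the toy ETR function."""
    return etr_toy(x, y) * (x + y)


def mtrx_analytical(x, y):
    """Partial derivative of total_tax with respect to labor income x.

    T = 0.3 * inc**2 / (inc + K), so dT/dx = 0.3 * inc * (inc + 2K) / (inc + K)**2.
    """
    inc = x + y
    return 0.3 * inc * (inc + 2 * K) / (inc + K) ** 2


x = np.array([20_000.0, 60_000.0, 150_000.0])  # labor income
y = np.array([1_000.0, 5_000.0, 40_000.0])     # capital income

# Central finite difference on total tax as a numerical check
h = 1.0
mtrx_numeric = (total_tax(x + h, y) - total_tax(x - h, y)) / (2 * h)
```

Because this toy ETR is increasing in income, the marginal rate lies above the effective rate at every observation.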
- ogcore.txfunc.monotone_spline(x, y, weights, bins=None, lam=12, kap=10000000.0, incl_uncstr=False, show_plot=False, method='eilers', splines=None, plot_start=0, plot_end=100)
- Parameters:
method (string) – ‘eilers’ or ‘pygam’
splines (None or array-like) – for ‘pygam’ only (otherwise set to None); number of splines used for each feature, the default is used if None
plot_start/plot_end (number between 0 and 100) – for ‘pygam’ only and only if show_plot = True; start and end percentiles of the data used in the plot, which can give better visualizations if the original data has strong outliers
- Returns:
- xNew (numpy array): 2d with second dimension m; the first dimension is the product of the elements in bins, with each entry representative of a bin across all the features
- yNew (numpy array): 1d with the same length as the first dimension of xWeight; weighted average of the y’s corresponding to each entry of xWeight
- weightsNew (numpy array): 1d with the same length as yNew; weight corresponding to each xNew, yNew row
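The binning described in the return values, where each bin is represented by one point and the weighted average of the y values that fall in it, can be sketched for a single feature as follows. Equal-width bins and the helper name `bin_weighted_mean` are assumptions made for illustration; the library's binning scheme may differ.

```python
import numpy as np


def bin_weighted_mean(x, y, weights, n_bins):
    """Hypothetical helper: collapse (x, y, weights) into equal-width bins.

    Returns the bin midpoints, the weighted mean of y in each non-empty
    bin, and the total weight in each non-empty bin.
    """
    x, y, weights = map(np.asarray, (x, y, weights))
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    # Bin index 0..n_bins-1; the maximum x is assigned to the last bin
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    w_sum = np.bincount(idx, weights=weights, minlength=n_bins)
    wy_sum = np.bincount(idx, weights=weights * y, minlength=n_bins)
    keep = w_sum > 0
    mids = 0.5 * (edges[:-1] + edges[1:])
    return mids[keep], wy_sum[keep] / w_sum[keep], w_sum[keep]


x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 10.0])
y = np.array([0.10, 0.12, 0.20, 0.22, 0.25, 0.40])
w = np.array([1.0, 3.0, 1.0, 1.0, 2.0, 1.0])
x_new, y_new, w_new = bin_weighted_mean(x, y, w, n_bins=2)
```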
- ogcore.txfunc.replace_outliers(param_list, sse_big_mat)
This function replaces outlier estimated tax function parameters with linearly interpolated tax function parameters.
- Parameters:
param_list (list) – estimated tax function parameters or nonparametric functions, size is BW x S x #TaxParams
sse_big_mat (bool, array_like) – indicators of whether tax function is outlier, size is BW x S
- Returns:
estimated and interpolated tax function parameters, size BW x S x #TaxParams
- Return type:
param_arr_adj (array_like)
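A minimal sketch of the replacement step for a single year is shown below. It assumes interpolation runs along the age dimension and that flagged ages at either end take the nearest unflagged value; `interpolate_flagged` is a hypothetical helper, not the library function.

```python
import numpy as np


def interpolate_flagged(params, flagged):
    """Hypothetical helper: replace flagged rows by linear interpolation over age.

    params  : (S, n_params) array of estimates for a single year
    flagged : (S,) boolean array, True where the estimate is an outlier
    Assumes at least one age is not flagged. Flagged ages outside the
    range of unflagged ages take the nearest unflagged value, which is
    the default edge behavior of np.interp.
    """
    params = np.asarray(params, dtype=float)
    flagged = np.asarray(flagged, dtype=bool)
    ages = np.arange(params.shape[0])
    out = params.copy()
    for j in range(params.shape[1]):
        out[flagged, j] = np.interp(
            ages[flagged], ages[~flagged], params[~flagged, j]
        )
    return out


params = np.array([[1.0, 10.0], [2.0, 20.0], [99.0, -5.0], [4.0, 40.0]])
flagged = np.array([False, False, True, False])
adj = interpolate_flagged(params, flagged)
```

Here the flagged row at age index 2 is replaced by the midpoint of its two neighbors, and all other rows are left unchanged.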
- ogcore.txfunc.tax_data_sample(data, max_etr=0.65, min_income=5, max_mtr=0.99)
Function to create sample tax data for estimation by dropping observations with extreme values.
- Parameters:
data (DataFrame) – raw data from microsimulation model
- Returns:
selected sample
- Return type:
data (DataFrame)
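Dropping observations with extreme values might look like the sketch below. It applies only the three thresholds that appear in the signature; the column names (`income`, `etr`, `mtr_labinc`, `mtr_capinc`) and the helper `sample_filter` are assumptions, and the library may apply further conditions.

```python
import pandas as pd


def sample_filter(data, max_etr=0.65, min_income=5, max_mtr=0.99):
    """Hypothetical filter using only the three thresholds in the signature.

    Assumed column names: 'income', 'etr', 'mtr_labinc', 'mtr_capinc'.
    """
    keep = (
        (data["income"] >= min_income)
        & (data["etr"] <= max_etr)
        & (data["mtr_labinc"] <= max_mtr)
        & (data["mtr_capinc"] <= max_mtr)
    )
    return data.loc[keep].copy()


raw = pd.DataFrame({
    "income": [3.0, 40_000.0, 90_000.0, 120_000.0],
    "etr": [0.00, 0.15, 0.70, 0.25],
    "mtr_labinc": [0.10, 0.22, 0.35, 0.30],
    "mtr_capinc": [0.00, 0.15, 0.20, 1.20],
})
sample = sample_filter(raw)
```

In this toy data only the second row survives: the first has income below `min_income`, the third exceeds `max_etr`, and the fourth exceeds `max_mtr`.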
- ogcore.txfunc.tax_func_estimate(micro_data, BW, S, starting_age, ending_age, start_year=2025, analytical_mtrs=False, tax_func_type='DEP', age_specific=False, desc_data=False, graph_data=False, graph_est=False, client=None, num_workers=1, tax_func_path=None)
This function performs analysis on the source data from microsimulation model and estimates functions for the effective tax rate (ETR), marginal tax rate on labor income (MTRx), and marginal tax rate on capital income (MTRy).
- Parameters:
micro_data (dict) – Dictionary of DataFrames with micro data
BW (int) – number of years in the budget window (the period over which tax policy is assumed to vary)
S (int) – number of model periods a model agent is economically active for
starting_age (int) – minimum age to estimate tax functions for
ending_age (int) – maximum age to estimate tax functions for
start_year (int) – first year of budget window
analytical_mtrs (bool) – whether to use the analytical derivation of the marginal tax rates (and thus only need to estimate the effective tax rate functions)
tax_func_type (str) – functional form of tax functions
age_specific (bool) – whether to estimate age specific tax functions
desc_data (bool) – whether to print descriptive statistics
graph_data (bool) – whether to plot data
graph_est (bool) – whether to plot estimated coefficients
client (Dask client object) – Dask client used to run the estimation in parallel
num_workers (int) – number of workers to use for parallelization with Dask
tax_func_path (str) – path to save pickle with estimated tax function parameters to
- Returns:
dictionary with tax function parameters
- Return type:
dict_param (dict)
- ogcore.txfunc.tax_func_loop(t, data, start_year, s_min, s_max, age_specific, tax_func_type, analytical_mtrs, desc_data, graph_data, graph_est, output_dir, numparams)
Estimates tax functions for a particular year; intended to be called in a loop over years.
- Parameters:
t (int) – year of tax data to estimate tax functions for
data (Pandas DataFrame) – tax return data for year t
start_year (int) – first year of budget window
s_min (int) – minimum age to estimate tax functions for
s_max (int) – maximum age to estimate tax functions for
age_specific (bool) – whether to estimate age specific tax functions
tax_func_type (str) – functional form of tax functions
analytical_mtrs (bool) – whether to use the analytical derivation of the marginal tax rates (and thus only need to estimate the effective tax rate functions)
desc_data (bool) – whether to print descriptive statistics
graph_data (bool) – whether to plot data
graph_est (bool) – whether to plot estimated coefficients
output_dir (str) – path to save output to
numparams (int) – number of parameters in tax functions
- Returns:
tax function estimation output:
- TotPop_yr (int): total population derived from micro data
- Pct_age (Numpy array): fraction of observations that are in each age bin
- AvgInc (scalar): mean income in the data
- AvgETR (scalar): mean effective tax rate in the data
- AvgMTRx (scalar): mean marginal tax rate on labor income in the data
- AvgMTRy (scalar): mean marginal tax rate on capital income in the data
- frac_tax_payroll (scalar): fraction of total tax revenue that comes from payroll taxes
- etrparam_arr (Numpy array): parameters of the effective tax rate functions
- etr_wsumsq_arr (Numpy array): weighted sum of squares from estimation of the effective tax rate functions
- etr_obs_arr (Numpy array): number of observations used in estimation of the effective tax rate functions
- mtrxparam_arr (Numpy array): parameters of the marginal tax rate on labor income functions
- mtrx_wsumsq_arr (Numpy array): weighted sum of squares from estimation of the marginal tax rate on labor income functions
- mtrx_obs_arr (Numpy array): number of observations used in estimation of the marginal tax rate on labor income functions
- mtryparam_arr (Numpy array): parameters of the marginal tax rate on capital income functions
- mtry_wsumsq_arr (Numpy array): weighted sum of squares from estimation of the marginal tax rate on capital income functions
- mtry_obs_arr (Numpy array): number of observations used in estimation of the marginal tax rate on capital income functions
- Return type:
(tuple)
- ogcore.txfunc.txfunc_est(df, s, t, rate_type, tax_func_type, numparams, output_dir, graph, params_init=None, global_opt=False)
This function uses tax rate and income data for individuals of a particular age (s) and a particular year (t) to estimate the parameters of a Cobb-Douglas aggregation function of two ratios of polynomials in labor income and capital income, respectively.
- Parameters:
df (Pandas DataFrame) – 11 variables with N observations of tax rates
s (int) – age of individual, >= 21
t (int) – year of analysis, >= 2016
rate_type (str) – type of tax rate: mtrx, mtry, etr
tax_func_type (str) – functional form of tax functions
numparams (int) – number of parameters in the tax functions
output_dir (str) – output directory for saving plot files
graph (bool) – whether to plot the estimated functions compared to the data
- Returns:
tax function estimation output:
- params (Numpy array or function object): vector of estimated parameters or nonparametric function object
- wsse (scalar): weighted sum of squared deviations from minimization
- obs (int): number of observations in the data, > 600
- Return type:
(tuple)
- ogcore.txfunc.wsumsq(params, *args)
This function generates the weighted sum of squared deviations of predicted values of tax rates (ETR, MTRx, or MTRy) from the tax rates from the data for the Cobb-Douglas functional form of the tax function.
- Parameters:
params (tuple) – tax function parameter values
args (tuple) – contains (fixed_tax_func_params, X, Y, txrates, wgts, tax_func_type, rate_type)
fixed_tax_func_params (tuple) – value of parameters of tax functions that are not estimated
X (array_like) – labor income data
Y (array_like) – capital income data
txrates (array_like) – tax rates data
wgts (array_like) – weights for data observations
tax_func_type (str) – functional form of tax functions
rate_type (str) – type of tax rate: mtrx, mtry, etr
- Returns:
weighted sum of squared deviations, >= 0
- Return type:
wssqdev (scalar)
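The objective itself is a weighted sum of squared deviations between predicted and observed tax rates. A minimal sketch of that calculation, with the predicted rates supplied directly rather than generated from a tax function, is shown below; `weighted_ssq` is a hypothetical helper name.

```python
import numpy as np


def weighted_ssq(predicted, observed, wgts):
    """Weighted sum of squared deviations of predicted from observed rates."""
    predicted, observed, wgts = map(np.asarray, (predicted, observed, wgts))
    return float(np.sum(wgts * (predicted - observed) ** 2))


observed = np.array([0.10, 0.20, 0.30])   # tax rates from the data
predicted = np.array([0.12, 0.18, 0.30])  # tax rates from a candidate function
wgts = np.array([1.0, 2.0, 5.0])          # observation weights
wssqdev = weighted_ssq(predicted, observed, wgts)
```

An optimizer would call such an objective repeatedly with different parameter values, which is why the fixed inputs are passed through `args` in the documented signature.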