# Tax Function Estimation Functions

txfunc.py modules

## ogcore.txfunc

$\tau_{s,t}$ is the effective tax rate, the marginal tax rate on labor income, or the marginal tax rate on capital income for a given age ($s$) in a particular year ($t$). $x$ is total labor income, and $y$ is total capital income.

ogcore.txfunc.find_outliers(sse_mat, age_vec, se_mult, start_year, varstr, graph=False)

This function takes a matrix of sum of squared errors (SSE) from tax function estimations for each age (s) in each year of the budget window (t) and marks estimations that have outlier SSE.

Parameters:
• sse_mat (Numpy array) – SSE for each estimated tax function, size is BW x S

• age_vec (numpy array) – vector of ages, length S

• se_mult (scalar) – number of standard deviations beyond which an estimate is considered an outlier

• start_year (int) – first year of budget window

• varstr (str) – name of tax function being evaluated

• graph (bool) – whether to output graphs

Returns:

indicators of whether each tax function is an outlier, size is BW x S

Return type:

sse_big_mat (bool array_like)
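The flagging step can be illustrated with a minimal sketch. The exact rule used by `find_outliers` is not stated here, so the threshold below (mean plus `se_mult` standard deviations of the positive SSEs) is an assumption, and `flag_sse_outliers` is a hypothetical helper, not part of `ogcore`.

```python
from statistics import mean, pstdev


def flag_sse_outliers(sse_mat, se_mult):
    """Flag SSE entries that look like outliers.

    sse_mat is a BW x S nested list of SSE values. ASSUMPTION: an entry is
    an outlier when it exceeds mean + se_mult * std of the positive entries.
    """
    positive = [v for row in sse_mat for v in row if v > 0]
    if not positive:
        # Nothing was estimated, so nothing can be flagged.
        return [[False for _ in row] for row in sse_mat]
    thresh = mean(positive) + se_mult * pstdev(positive)
    return [[v > thresh for v in row] for row in sse_mat]


# Hypothetical SSEs for BW = 2 years and S = 4 ages.
sse_mat = [
    [1.0, 1.2, 0.9, 1.1],
    [1.0, 50.0, 1.3, 0.8],
]
sse_big_mat = flag_sse_outliers(sse_mat, se_mult=2.0)
```

Only the entry with an SSE of 50.0 exceeds the threshold in this toy data, so it is the single flagged estimate.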

ogcore.txfunc.get_tax_rates(params, X, Y, wgts, tax_func_type, rate_type, analytical_mtrs=False, mtr_capital=False, for_estimation=True)

Generates tax rates given income data and the parameters of the tax functions.

Parameters:
• params (list) – list of parameters of the tax function, or nonparametric function for tax function type “mono”

• X (array_like) – labor income data

• Y (array_like) – capital income data

• wgts (array_like) – weights for data observations

• tax_func_type (str) – functional form of tax functions

• rate_type (str) – type of tax rate: mtrx, mtry, etr

• analytical_mtrs (bool) – whether to compute marginal tax rates from the total tax function (for DEP functions only)

• mtr_capital (bool) – whether the analytical marginal tax rate is the one on capital income

• for_estimation (bool) – whether the results are used in estimation; if True, tax rates are computed as deviations from the mean

Returns:

model tax rates for each observation

Return type:

txrates (array_like)
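To illustrate what deriving marginal rates from the total tax function means, the sketch below assumes total tax is $T(x, y) = \tau(x, y)\,(x + y)$, so the marginal rate on labor income is $\partial T / \partial x$. It uses a finite difference rather than a closed form, and both `mtr_from_etr` and `toy_etr` are hypothetical names that do not come from `ogcore`.

```python
def mtr_from_etr(etr, x, y, wrt="x", h=1e-4):
    """Approximate a marginal tax rate from an effective tax rate function.

    ASSUMPTION: total tax is T(x, y) = etr(x, y) * (x + y). The marginal
    rate is the partial derivative of T, taken by central finite difference.
    """
    def total_tax(x_, y_):
        return etr(x_, y_) * (x_ + y_)

    if wrt == "x":
        return (total_tax(x + h, y) - total_tax(x - h, y)) / (2 * h)
    if wrt == "y":
        return (total_tax(x, y + h) - total_tax(x, y - h)) / (2 * h)
    raise ValueError("wrt must be 'x' or 'y'")


# Hypothetical ETR function that rises linearly with total income.
def toy_etr(x, y):
    return 0.10 + 0.001 * (x + y)


mtrx = mtr_from_etr(toy_etr, x=50.0, y=10.0, wrt="x")
mtry = mtr_from_etr(toy_etr, x=50.0, y=10.0, wrt="y")
```

For this toy function the marginal rate exceeds the effective rate at the same income, as expected for a progressive schedule.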

ogcore.txfunc.monotone_spline(x, y, weights, bins=None, lam=12, kap=10000000.0, incl_uncstr=False, show_plot=False, method='eilers', splines=None, plot_start=0, plot_end=100)

Parameters:
• method (string) – ‘eilers’ or ‘pygam’

• splines (None or array-like) – for ‘pygam’ only (otherwise set None), number of splines used for each feature, if None use default

• plot_start/plot_end (number between 0 and 100) – for ‘pygam’ only, and only if show_plot = True: start and end percentiles of the data used in the plot; narrowing the range can give better visualizations if the original data has strong outliers

Returns:

• xNew (numpy array): 2d with second dimension m, first dimension is product of elements in bins, with each entry representative of bin across all the features

• yNew (numpy array): 1d with length same as first dimension of xWeight, weighted average of y’s corresponding to each entry of xWeight

• weightsNew (numpy array): 1d with length same as yNew, weight corresponding to each xNew, yNew row

Return type:

(tuple)
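The return description suggests that observations are compressed into bins, each summarized by a weighted average and a total weight. A rough sketch of that idea for a single feature follows. Equal-count bins and strictly positive weights are assumptions, and `bin_weighted_average` is a hypothetical helper rather than the library's implementation.

```python
def bin_weighted_average(x, y, weights, n_bins):
    """Compress (x, y, weights) into at most n_bins rows of weighted averages.

    ASSUMPTION: observations are sorted by x and split into bins holding
    roughly equal numbers of observations; the library may bin differently.
    Weights are assumed to be strictly positive.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    x_new, y_new, w_new = [], [], []
    for b in range(n_bins):
        lo = b * len(order) // n_bins
        hi = (b + 1) * len(order) // n_bins
        idx = order[lo:hi]
        if not idx:
            continue  # more bins than observations
        w_tot = sum(weights[i] for i in idx)
        x_new.append(sum(weights[i] * x[i] for i in idx) / w_tot)
        y_new.append(sum(weights[i] * y[i] for i in idx) / w_tot)
        w_new.append(w_tot)
    return x_new, y_new, w_new


x = [1.0, 2.0, 3.0, 4.0]
y = [0.10, 0.20, 0.30, 0.50]
weights = [1.0, 3.0, 1.0, 1.0]
x_new, y_new, w_new = bin_weighted_average(x, y, weights, n_bins=2)
```

Fitting the spline to the binned rows instead of the raw observations keeps the problem small while preserving each bin's share of the total weight.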

ogcore.txfunc.replace_outliers(param_list, sse_big_mat)

This function replaces outlier estimated tax function parameters with linearly interpolated tax function parameters.

Parameters:
• param_list (list) – estimated tax function parameters or nonparametric functions, size is BW x S x #TaxParams

• sse_big_mat (bool, array_like) – indicators of whether tax function is outlier, size is BW x S

Returns:

estimated and interpolated tax function parameters, size BW x S x #TaxParams
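The interpolation idea can be sketched for a single year as follows. The sketch assumes that interpolation runs across ages and that flagged ages at either end copy the nearest unflagged estimate; neither detail is stated above. `interpolate_flagged` is a hypothetical helper and works only on numeric parameter vectors.

```python
def interpolate_flagged(params_by_age, flagged):
    """Replace flagged parameter vectors by linear interpolation over age.

    params_by_age: list of S parameter lists for one year.
    flagged: list of S booleans, True where the estimate is an outlier.
    ASSUMPTION: interpolation runs across ages within a year, and flagged
    ages at either end copy the nearest unflagged age.
    """
    good = [s for s, bad in enumerate(flagged) if not bad]
    if not good:
        raise ValueError("no unflagged estimates to interpolate from")
    out = [list(p) for p in params_by_age]
    for s, bad in enumerate(flagged):
        if not bad:
            continue
        left = max((g for g in good if g < s), default=None)
        right = min((g for g in good if g > s), default=None)
        if left is None:
            out[s] = list(params_by_age[right])
        elif right is None:
            out[s] = list(params_by_age[left])
        else:
            w = (s - left) / (right - left)  # distance share toward right
            out[s] = [
                (1 - w) * a + w * b
                for a, b in zip(params_by_age[left], params_by_age[right])
            ]
    return out


# Hypothetical parameters for S = 4 ages; ages 1 and 2 are flagged.
params = [[1.0, 10.0], [99.0, -5.0], [77.0, 0.0], [4.0, 40.0]]
flagged = [False, True, True, False]
fixed = interpolate_flagged(params, flagged)
```

The two flagged rows end up evenly spaced between the unflagged rows at ages 0 and 3, while unflagged rows are returned unchanged.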

ogcore.txfunc.tax_data_sample(data, max_etr=0.65, min_income=5, max_mtr=0.99)

Function to create sample tax data for estimation by dropping observations with extreme values.

Parameters:

data (DataFrame) – raw data from microsimulation model

Returns:

selected sample

Return type:

data (DataFrame)
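A minimal sketch of this kind of sample selection is shown below, using plain dictionaries instead of a DataFrame so that it stays self-contained. The keys `etr`, `mtrx`, `mtry`, and `income` are hypothetical, and treating each threshold as a simple inclusive bound is an assumption; the actual function may apply additional or different filters.

```python
def sample_rows(rows, max_etr=0.65, min_income=5, max_mtr=0.99):
    """Keep observations whose rates and income fall inside the bounds.

    rows: list of dicts with HYPOTHETICAL keys 'etr', 'mtrx', 'mtry',
    and 'income'. ASSUMPTION: each threshold is a simple inclusive bound.
    """
    return [
        r for r in rows
        if r["etr"] <= max_etr
        and r["income"] >= min_income
        and r["mtrx"] <= max_mtr
        and r["mtry"] <= max_mtr
    ]


rows = [
    {"etr": 0.20, "mtrx": 0.30, "mtry": 0.25, "income": 50_000},
    {"etr": 0.90, "mtrx": 0.30, "mtry": 0.25, "income": 40_000},  # ETR too high
    {"etr": 0.10, "mtrx": 0.15, "mtry": 0.10, "income": 2},       # income too low
    {"etr": 0.25, "mtrx": 1.50, "mtry": 0.20, "income": 60_000},  # MTR too high
]
sample = sample_rows(rows)
```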

ogcore.txfunc.tax_func_estimate(micro_data, BW, S, starting_age, ending_age, start_year=2021, analytical_mtrs=False, tax_func_type='DEP', age_specific=False, desc_data=False, graph_data=False, graph_est=False, client=None, num_workers=1, tax_func_path=None)

This function performs analysis on the source data from microsimulation model and estimates functions for the effective tax rate (ETR), marginal tax rate on labor income (MTRx), and marginal tax rate on capital income (MTRy).

Parameters:
• micro_data (dict) – Dictionary of DataFrames with micro data

• BW (int) – number of years in the budget window (the period over which tax policy is assumed to vary)

• S (int) – number of model periods a model agent is economically active for

• starting_age (int) – minimum age to estimate tax functions for

• ending_age (int) – maximum age to estimate tax functions for

• start_year (int) – first year of budget window

• analytical_mtrs (bool) – whether to use the analytical derivation of the marginal tax rates (and thus only need to estimate the effective tax rate functions)

• tax_func_type (str) – functional form of tax functions

• age_specific (bool) – whether to estimate age specific tax functions

• client (Dask client object) – Dask client used for parallelization

• num_workers (int) – number of workers to use for parallelization with Dask

• tax_func_path (str) – path to save pickle with estimated tax function parameters to

Returns:

dictionary with tax function parameters

Return type:

dict_param (dict)

ogcore.txfunc.tax_func_loop(t, data, start_year, s_min, s_max, age_specific, tax_func_type, analytical_mtrs, desc_data, graph_data, graph_est, output_dir, numparams)

Estimates tax functions for a particular year. This function is looped over, once for each year.

Parameters:
• t (int) – year of tax data to estimate tax functions for

• data (Pandas DataFrame) – tax return data for year t

• start_year (int) – first year of budget window

• s_min (int) – minimum age to estimate tax functions for

• s_max (int) – maximum age to estimate tax functions for

• age_specific (bool) – whether to estimate age specific tax functions

• tax_func_type (str) – functional form of tax functions

• analytical_mtrs (bool) – whether to use the analytical derivation of the marginal tax rates (and thus only need to estimate the effective tax rate functions)

• desc_data (bool) – whether to print descriptive statistics

• graph_data (bool) – whether to plot data

• graph_est (bool) – whether to plot estimated coefficients

• output_dir (str) – path to save output to

• numparams (int) – number of parameters in tax functions

Returns:

tax function estimation output:

• TotPop_yr (int): total population derived from micro data

• Pct_age (Numpy array): fraction of observations that are in each age bin

• AvgInc (scalar): mean income in the data

• AvgETR (scalar): mean effective tax rate in data

• AvgMTRx (scalar): mean marginal tax rate on labor income in data

• AvgMTRy (scalar): mean marginal tax rate on capital income in data

• frac_tax_payroll (scalar): fraction of total tax revenue that comes from payroll taxes

• etrparam_arr (Numpy array): parameters of the effective tax rate functions

• etr_wsumsq_arr (Numpy array): weighted sum of squares from estimation of the effective tax rate functions

• etr_obs_arr (Numpy array): number of observations used in estimation of the effective tax rate functions

• mtrxparam_arr (Numpy array): parameters of the marginal tax rate on labor income functions

• mtrx_wsumsq_arr (Numpy array): weighted sum of squares from estimation of the marginal tax rate on labor income functions

• mtrx_obs_arr (Numpy array): number of observations used in estimation of the marginal tax rate on labor income functions

• mtryparam_arr (Numpy array): parameters of the marginal tax rate on capital income functions

• mtry_wsumsq_arr (Numpy array): weighted sum of squares from estimation of the marginal tax rate on capital income functions

• mtry_obs_arr (Numpy array): number of observations used in estimation of the marginal tax rate on capital income functions

Return type:

(tuple)

ogcore.txfunc.txfunc_est(df, s, t, rate_type, tax_func_type, numparams, output_dir, graph, params_init=None, global_opt=False)

This function uses tax rate and income data for individuals of a particular age (s) and a particular year (t) to estimate the parameters of a Cobb-Douglas aggregation function of two ratios of polynomials in labor income and capital income, respectively.

Parameters:
• df (Pandas DataFrame) – 11 variables with N observations of tax rates

• s (int) – age of individual, >= 21

• t (int) – year of analysis, >= 2016

• rate_type (str) – type of tax rate: mtrx, mtry, etr

• tax_func_type (str) – functional form of tax functions

• numparams (int) – number of parameters in the tax functions

• output_dir (str) – output directory for saving plot files

• graph (bool) – whether to plot the estimated functions compared to the data

Returns:

tax function estimation output:

• params (Numpy array or function object): vector of estimated parameters or nonparametric function object

• wsse (scalar): weighted sum of squared deviations from minimization

• obs (int): number of observations in the data, > 600

Return type:

(tuple)
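To make the "Cobb-Douglas aggregation of two ratios of polynomials" concrete, here is a sketch of one functional form that fits that description. The exact parameterization is an assumption (it follows the twelve-parameter form usually associated with the `DEP` label), and the function names and parameter values are hypothetical.

```python
def ratio_poly(z, a, b, min_z, max_z):
    """Ratio of polynomials in income z, bounded between min_z and max_z."""
    num = a * z**2 + b * z
    return (max_z - min_z) * num / (num + 1.0) + min_z


def dep_tax_rate(x, y, A, B, C, D, min_x, max_x, min_y, max_y,
                 shift_x, shift_y, shift, phi):
    """Cobb-Douglas aggregate of the labor and capital income components.

    ASSUMPTION: this parameterization is illustrative and may differ in
    detail from the form estimated by the library.
    """
    tau_x = ratio_poly(x, A, B, min_x, max_x)  # labor income component
    tau_y = ratio_poly(y, C, D, min_y, max_y)  # capital income component
    return (tau_x + shift_x) ** phi * (tau_y + shift_y) ** (1 - phi) + shift


# Hypothetical parameter values chosen only to show the limiting behavior.
toy = dict(A=1e-6, B=1e-3, C=1e-6, D=1e-3,
           min_x=0.1, max_x=0.4, min_y=0.1, max_y=0.4,
           shift_x=0.0, shift_y=0.0, shift=0.0, phi=0.5)
rate_zero = dep_tax_rate(0.0, 0.0, **toy)   # rate at zero income
rate_high = dep_tax_rate(1e9, 1e9, **toy)   # rate at very high income
```

With these values the rate starts at the lower bounds at zero income and approaches the upper bounds as income grows, which is the bounded, monotone shape the ratio of polynomials is meant to deliver.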

ogcore.txfunc.wsumsq(params, *args)

This function generates the weighted sum of squared deviations of predicted values of tax rates (ETR, MTRx, or MTRy) from the tax rates from the data for the Cobb-Douglas functional form of the tax function.

Parameters:
• params (tuple) – tax function parameter values

• args (tuple) – contains (fixed_tax_func_params, X, Y, txrates, wgts, tax_func_type, rate_type)

• fixed_tax_func_params (tuple) – value of parameters of tax functions that are not estimated

• X (array_like) – labor income data

• Y (array_like) – capital income data

• txrates (array_like) – tax rates data

• wgts (array_like) – weights for data observations

• tax_func_type (str) – functional form of tax functions

• rate_type (str) – type of tax rate: mtrx, mtry, etr

Returns:

weighted sum of squared deviations, >0

Return type:

wssqdev (scalar)
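The objective itself is simple to state: each squared deviation between a predicted and an observed tax rate is multiplied by the observation's weight, and the results are summed. The sketch below shows only that final step, with predicted rates supplied directly; `weighted_sum_sq` and the data are hypothetical.

```python
def weighted_sum_sq(predicted, observed, wgts):
    """Weighted sum of squared deviations of predictions from data."""
    return sum(w * (p - o) ** 2 for p, o, w in zip(predicted, observed, wgts))


predicted = [0.20, 0.25, 0.30]  # hypothetical model tax rates
observed = [0.22, 0.25, 0.27]   # hypothetical tax rates from the data
wgts = [2.0, 1.0, 1.0]          # observation weights
wssqdev = weighted_sum_sq(predicted, observed, wgts)
```

A minimizer would call a function like this repeatedly, recomputing the predicted rates from candidate parameter values each time.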