RegressionTables.jl
This package provides publication-quality regression tables for use with FixedEffectModels.jl and GLM.jl, as well as any package that implements the RegressionModel abstraction.
In its objective it is similar to (and heavily inspired by) the Stata command esttab and the R package stargazer.
Installation
To install the package, type the following at the Julia command prompt:
] add RegressionTables
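For scripted setups (for example a project bootstrap script), the same installation can be done through the Pkg standard library instead of the REPL's package mode. A minimal sketch:

```julia
using Pkg

# Equivalent to typing `] add RegressionTables` at the REPL prompt.
Pkg.add("RegressionTables")
```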
A brief demonstration
using RegressionTables, DataFrames, FixedEffectModels, RDatasets, CategoricalArrays
df = dataset("datasets", "iris")
df[!,:SpeciesDummy] = categorical(df[!,:Species])
rr1 = reg(df, @formula(SepalLength ~ SepalWidth + fe(SpeciesDummy)))
rr2 = reg(df, @formula(SepalLength ~ SepalWidth + PetalLength + fe(SpeciesDummy)))
rr3 = reg(df, @formula(SepalLength ~ SepalWidth + PetalLength + PetalWidth + fe(SpeciesDummy)))
rr4 = reg(df, @formula(SepalWidth ~ SepalLength + PetalLength + PetalWidth + fe(SpeciesDummy)))
regtable(rr1,rr2,rr3,rr4; renderSettings = asciiOutput())
yields
----------------------------------------------------------
                        SepalLength             SepalWidth
               ------------------------------   ----------
                    (1)        (2)        (3)          (4)
----------------------------------------------------------
SepalWidth     0.804***   0.432***   0.496***
                (0.106)    (0.081)    (0.086)
PetalLength               0.776***   0.829***      -0.188*
                           (0.064)    (0.069)      (0.083)
PetalWidth                            -0.315*     0.626***
                                      (0.151)      (0.123)
SepalLength                                       0.378***
                                                   (0.066)
----------------------------------------------------------
SpeciesDummy        Yes        Yes        Yes          Yes
----------------------------------------------------------
Estimator           OLS        OLS        OLS          OLS
----------------------------------------------------------
N                   150        150        150          150
R2                0.726      0.863      0.867        0.635
----------------------------------------------------------
LaTeX output can be generated by using
regtable(rr1,rr2,rr3,rr4; renderSettings = latexOutput())
which yields
\begin{tabular}{lrrrr}
\toprule
& \multicolumn{3}{c}{SepalLength} & \multicolumn{1}{c}{SepalWidth} \\
\cmidrule(lr){2-4} \cmidrule(lr){5-5}
& (1) & (2) & (3) & (4) \\
\midrule
SepalWidth & 0.804*** & 0.432*** & 0.496*** & \\
& (0.106) & (0.081) & (0.086) & \\
PetalLength & & 0.776*** & 0.829*** & -0.188* \\
& & (0.064) & (0.069) & (0.083) \\
PetalWidth & & & -0.315* & 0.626*** \\
& & & (0.151) & (0.123) \\
SepalLength & & & & 0.378*** \\
& & & & (0.066) \\
\midrule
SpeciesDummy & Yes & Yes & Yes & Yes \\
\midrule
Estimator & OLS & OLS & OLS & OLS \\
\midrule
$N$ & 150 & 150 & 150 & 150 \\
$R^2$ & 0.726 & 0.863 & 0.867 & 0.635 \\
\bottomrule
\end{tabular}
Similarly, HTML tables can be created with htmlOutput().
Send the output to a text file by passing the destination file string to the asciiOutput(), latexOutput(), or htmlOutput() functions:
regtable(rr1,rr2,rr3,rr4; renderSettings = latexOutput("myoutputfile.tex"))
Then use \input in LaTeX to include that file in your document. Be sure to load the booktabs package:
\documentclass{article}
\usepackage{booktabs}
\begin{document}
\begin{table}
\label{tab:mytable}
\input{myoutputfile}
\end{table}
\end{document}
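The other renderers accept a destination file in the same way. The sketch below reuses the iris setup from the demonstration above and writes an ASCII and an HTML copy of a table; the file names mytable.txt and mytable.html are placeholders chosen for illustration.

```julia
using RegressionTables, DataFrames, FixedEffectModels, RDatasets, CategoricalArrays

df = dataset("datasets", "iris")
df[!, :SpeciesDummy] = categorical(df[!, :Species])
rr1 = reg(df, @formula(SepalLength ~ SepalWidth + fe(SpeciesDummy)))
rr2 = reg(df, @formula(SepalLength ~ SepalWidth + PetalLength + fe(SpeciesDummy)))

# Placeholder file names (assumptions for this example).
regtable(rr1, rr2; renderSettings = asciiOutput("mytable.txt"))
regtable(rr1, rr2; renderSettings = htmlOutput("mytable.html"))

# Read the ASCII version back to inspect what was written.
print(read("mytable.txt", String))
```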
regtable() can also print TableRegressionModels from GLM.jl (and output from other packages that produce TableRegressionModels):
using GLM
dobson = DataFrame(Counts = [18.,17,15,20,10,20,25,13,12],
    Outcome = categorical(repeat(["A", "B", "C"], outer = 3)),
    Treatment = categorical(repeat(["a","b", "c"], inner = 3)))
lm1 = fit(LinearModel, @formula(SepalLength ~ SepalWidth), df)
gm1 = fit(GeneralizedLinearModel, @formula(Counts ~ 1 + Outcome + Treatment), dobson,
    Poisson())
regtable(rr1,lm1,gm1; renderSettings = asciiOutput())
yields
---------------------------------------------
                   SepalLength         Counts
               -------------------   --------
                    (1)        (2)        (3)
---------------------------------------------
(Intercept)    6.526***   6.526***   3.045***
                (0.479)    (0.479)    (0.171)
SepalWidth       -0.223     -0.223
                (0.155)    (0.155)
Outcome: B                             -0.454
                                      (0.202)
Outcome: C                             -0.293
                                      (0.193)
Treatment: b                            0.000
                                      (0.200)
Treatment: c                            0.000
                                      (0.200)
---------------------------------------------
Estimator           OLS        OLS         NL
---------------------------------------------
N                   150        150          9
R2                0.014      0.014
---------------------------------------------
Printing of StatsBase.RegressionModels is experimental; please file an issue if you encounter problems printing them.
Function Reference
Function Arguments
- rr::Union{FixedEffectModel,DataFrames.TableRegressionModel}... are the FixedEffectModels from FixedEffectModels.jl (or TableRegressionModels from GLM.jl) that should be printed. Only required argument.
- regressors is a Vector of regressor names (Strings) that should be shown, in that order. Defaults to an empty vector, in which case all regressors will be shown.
- fixedeffects is a Vector of FE names (Strings) that should be shown, in that order. Defaults to an empty vector, in which case all FEs will be shown. Note that strings need to match the displayed label exactly, otherwise they will not be shown.
- align is a Symbol from the set [:l,:c,:r] indicating the alignment of results columns (default :r, right-aligned). Currently affects only LaTeX and ASCII output.
- labels is a Dict that contains displayed labels for variables (strings) and other text in the table. If no label for a variable is found, it defaults to the variable name. See Label Codes below for special values.
- estimformat is a String that describes the format of the estimate. Defaults to "%0.3f".
- estim_decoration is a Function that takes the formatted string and the p-value, and applies decorations (such as the beloved stars). Defaults to (* p<0.05, ** p<0.01, *** p<0.001).
- statisticformat is a String that describes the format of the number below the estimate (se/t). Defaults to "%0.3f".
- below_statistic is a Symbol that describes a statistic that should be shown below each point estimate. Recognized values are :blank, :se, :tstat, and :none. :none suppresses the line. Defaults to :se.
- below_decoration is a Function that takes the formatted statistic string and applies a decoration. Defaults to round parentheses.
- regression_statistics is a Vector of Symbols that describe statistics to be shown at the bottom of the table. Recognized symbols are :nobs, :r2, :r2_a, :r2_within, :f, :p, :f_kp, :p_kp, and :dof. Defaults to [:nobs, :r2].
- custom_statistics is a NamedTuple that takes user-specified statistics to be shown just above regression_statistics. By default each statistic will be labelled by its key (e.g. __LABEL_STATISTIC_mystat__ for the statistic mystat). Defaults to missing. See test/RegressionTables.jl for an example of how to use this.
- number_regressions is a Bool that governs whether regressions should be numbered. Defaults to true.
- number_regressions_decoration is a Function that governs the decorations to the regression numbers. Defaults to s -> "($s)".
- groups is a Vector of labels used to group regressions. This can be useful if results are shown for different data sets or sample restrictions. Defaults to [].
- print_fe_section is a Bool that governs whether a section on fixed effects should be shown. Defaults to true.
- print_estimator_section is a Bool that governs whether to print a section on which estimator (OLS/IV) is used. Defaults to true.
- print_result is a Bool that governs whether the table should be printed to stdout. Defaults to true.
- standardize_coef is a Bool that governs whether the table should show standardized coefficients. Note that this only works with TableRegressionModels, and that only coefficient estimates and the below_statistic are being standardized (i.e. the R^2 etc. still pertain to the non-standardized regression).
- out_buffer is an IOBuffer that the output gets sent to (unless an output file is specified, in which case the output is only sent to the file).
- renderSettings::RenderSettings is a RenderSettings composite type that governs how the table should be rendered. Standard supported types are ASCII (via asciiOutput(outfile::String)), LaTeX (via latexOutput(outfile::String)), and HTML (via htmlOutput(outfile::String)). If no argument is given to these functions, the output is sent to STDOUT. Defaults to ASCII with STDOUT.
- transform_labels is a function or a Dict that is used to transform labels. Defaults to identity.
  Some common use cases can be achieved by passing a Symbol instead: :latex, :ampersand, :underscore, :underscore2space. For illustration, here are the three ways to escape forbidden LaTeX characters.
  # Option 1
  regtable(rr; renderSettings = latexOutput(), transform_labels = :latex)
  # Option 2
  repl_dict = Dict("&" => "\\&", "%" => "\\%", "\$" => "\\\$", "#" => "\\#", "_" => "\\_", "{" => "\\{", "}" => "\\}")
  regtable(rr; renderSettings = latexOutput(), transform_labels = repl_dict)
  # Option 3
  function transform(s, repl_dict=repl_dict)
      for (old, new) in repl_dict
          s = replace.(s, Ref(old => new))
      end
      s
  end
  regtable(rr; renderSettings = latexOutput(), transform_labels = transform)
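To illustrate how several of these arguments combine, here is a hedged sketch based on the iris regressions from the demonstration. The helper plus_marker and the label texts are made up for this example; the keyword names are the ones listed above, and the decoration signature follows the description of estim_decoration (formatted string plus p-value).

```julia
using RegressionTables, DataFrames, FixedEffectModels, RDatasets, CategoricalArrays

df = dataset("datasets", "iris")
df[!, :SpeciesDummy] = categorical(df[!, :Species])
rr1 = reg(df, @formula(SepalLength ~ SepalWidth + fe(SpeciesDummy)))
rr2 = reg(df, @formula(SepalLength ~ SepalWidth + PetalLength + fe(SpeciesDummy)))

# Hypothetical decoration: a single "+" for p < 0.05 instead of stars.
plus_marker(s, pval) = pval < 0.05 ? s * "+" : s

regtable(rr1, rr2;
    renderSettings = asciiOutput(),
    regressors = ["PetalLength", "SepalWidth"],        # shown in this order
    labels = Dict("SepalWidth" => "Sepal width",
                  "__LABEL_STATISTIC_N__" => "Observations"),
    estimformat = "%0.2f",                             # two decimals for estimates
    estim_decoration = plus_marker,
    below_statistic = :tstat,                          # t statistics instead of standard errors
    below_decoration = s -> "[$s]",                    # square brackets instead of parentheses
    regression_statistics = [:nobs, :r2, :r2_a])
```

Keeping the decoration as a small standalone function makes it easy to check on its own before passing it to regtable.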
Label Codes
The following is the exhaustive list of strings that govern the output of labels. Use e.g.
labels = Dict("__LABEL_STATISTIC_N__" => "Number of observations")
to change the label for the row showing the number of observations in each regression.
- __LABEL_ESTIMATOR__ (default: "Estimator")
- __LABEL_ESTIMATOR_OLS__ (default: "OLS")
- __LABEL_ESTIMATOR_IV__ (default: "IV")
- __LABEL_ESTIMATOR_NL__ (default: "NL")
- __LABEL_FE_YES__ (default: "Yes")
- __LABEL_FE_NO__ (default: "")
- __LABEL_STATISTIC_N__ (default: "N" in asciiOutput())
- __LABEL_STATISTIC_R2__ (default: "R2" in asciiOutput())
- __LABEL_STATISTIC_R2_A__ (default: "Adjusted R2" in asciiOutput())
- __LABEL_STATISTIC_R2_WITHIN__ (default: "Within-R2" in asciiOutput())
- __LABEL_STATISTIC_F__ (default: "F" in asciiOutput())
- __LABEL_STATISTIC_P__ (default: "F-test p value" in asciiOutput())
- __LABEL_STATISTIC_F_KP__ (default: "First-stage F statistic" in asciiOutput())
- __LABEL_STATISTIC_P_KP__ (default: "First-stage p value" in asciiOutput())
- __LABEL_STATISTIC_DOF__ (default: "Degrees of Freedom" in asciiOutput())
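Several codes can be overridden in one Dict. The following sketch assumes the iris setup from the demonstration; the replacement texts are arbitrary examples, while the keys are taken from the list above.

```julia
using RegressionTables, DataFrames, FixedEffectModels, RDatasets, CategoricalArrays

df = dataset("datasets", "iris")
df[!, :SpeciesDummy] = categorical(df[!, :Species])
rr1 = reg(df, @formula(SepalLength ~ SepalWidth + fe(SpeciesDummy)))

# Keys are label codes from the list above; values are example replacement texts.
custom_labels = Dict(
    "__LABEL_ESTIMATOR__"    => "Method",
    "__LABEL_FE_YES__"       => "Included",
    "__LABEL_STATISTIC_N__"  => "Observations",
    "__LABEL_STATISTIC_R2__" => "R-squared",
)

regtable(rr1; renderSettings = asciiOutput(), labels = custom_labels)
```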
Frequently Asked Questions
What's the best way to render regression tables in Pluto.jl?
Use renderSettings = htmlOutput() and print_result = false, and print the resulting String as text/html. This page shows an example.