***************************************
*Margins/ Model Visualization in Stata*
**************Alex Kirss***************
***************************************

* Presented December 2018

* Some of this code has been cynically repurposed from Alex Fisher's notes circa Fall 2016

* It is slightly harder to visualize model results in Stata than in R
* This is mainly because in Stata you can only have one data frame in memory at a time
* In order to 'crack' and 'pack' model results, therefore, you have to first 'crack' and save model outputs out to a specified file path
* You then 'pack' these saved files together using the 'append' command

* The main benefits of estimating models in Stata are: 
* 1) It is easier to estimate non linear fixed effects models 
* 2) It is easier to calculate complicated marginal effects (e.g. w/ interaction terms)
* 3) Standard error calculations can be easier

******************
*Note on Graphics*
******************

* Stata graphics are harder to customize than ggplot
* You can just set your scheme permanently to 'plotplainblind' from Dan Bischof if you don't want to be as picky

clear
ssc install blindschemes, replace all
set scheme plotplainblind, permanently

* To go back to the default just use 's2color' as your scheme, note however that it looks BAD

set scheme s2color, permanently

* For schemes helps see

h schemes

**************************************
*Example Using Data From Alexis Blanc* 
**************************************

*GOAL: Estimate the effect of ballistic missile possession on crisis initiation

* First, set your working directory to a "Figures" folder that you can save output to

clear
cd "/Users/alexanderkirss/Desktop/Ongoing Projects/StatsWorkshops/181213Margins/Figures"

* Then load your data

use "/Users/alexanderkirss/Desktop/Ongoing Projects/StatsWorkshops/181213Margins/MettlerReiterJCRreplication_27 Oct 2018.dta"

* Then estimate your model
* I also generally use 'qui' to model it quietly (i.e. not display output)

*Model 1: Mettler & Reiter's original model
qui probit chall sqrtcmindist contig cincparity majpowa jointdem abmrange nukadum bbmrange nukbdum pcyrs pcyrs2 pcyrs3, robust cluster (id)

*****************************
*Estimating Marginal Effects*
*****************************

* What are Margins?
* The 'margins' command computes so-called margins of responses
* A “margin” is a statistic computed from predictions of a model while manipulating the values of the covariates
* “conditional margin”: a prediction from a model where all covariates are set to fixed values
* “predictive margin”: if some covariates are not fixed

* For a continuous covariate, margins computes the first derivative of the response with respect to the covariate
* For a discrete covariate, margins computes the effect of a discrete change of the covariate (discrete change effects)
* MEM: marginal effects at mean values, AME: average marginal effects at underlyings values, MER: marginal effects at representative values that you choose

* Since this is a nonlinear model, I want to extract the marginal effect of my treatment variable rather than the coefficient
* I.e. what is effect on the predicted probability in my model of a change in my treatment variable?
* Make sure to "post" the marginal effect in order to export it (no one knows exactly what this does)

* Since this is a binary treatment, we take the first difference using the 'dydx' indicator
* Margins can take a while to calculate, so be patient

margins, dydx(abmrange) post

* Can plot margins using the 'marginsplot' command

marginsplot

* This looks dumpy, so let's clean it up

marginsplot, horizontal unique recastci(rspike) plotop(m(O) mc(gs0)) ciop(lc(gs0)) ///
	title("Figure 1: Marginal Effect of Ballistic Missiles on Conflict Initiation") ///
	xlab(, nogrid) ylab(, nogrid)

* Alternatively, save out the marginal effect using the 'parmest' command

ssc install parmest

* If I hadn't set up a working directory then I would need to write out the full file path

parmest, label list(parm estimate min* max* p) saving(marg1, replace)

* Does the effect vary across values of another covariate?
* You need a model in your memory to calculate
* Use the 'at' command in order to specify which values of the covariate to sample
* This will give you predicted probabilities at the specified values
* Weirdly I have to make sure to indicate the binary and continuous variables in this margins command

qui probit chall c.sqrtcmindist contig cincparity majpowa jointdem i.abmrange nukadum bbmrange nukbdum pcyrs pcyrs2 pcyrs3, robust cluster (id)

* Calculating this margin takes a while, because we have seven representative values

qui margins i.abmrange, at(c.sqrtcmindist=(0(20)140))

marginsplot, unique recast(scatter) recastci(rspike) plotop(m(O)) ///
	title("Figure 2: Marginal Effect of Ballistic Missiles on Challenge Over Various Ranges") ///
	xlab(, nogrid) ylab(, nogrid)
	
graph export "Figure2.png", replace

* Now THAT is a cool plot

* Let's calculate the AME of the difference at these values

qui probit chall c.sqrtcmindist contig cincparity majpowa jointdem i.abmrange nukadum bbmrange nukbdum pcyrs pcyrs2 pcyrs3, robust cluster (id)
qui margins, dydx(abmrange) at(c.sqrtcmindist=(0(20)140))

marginsplot, unique recast(scatter) recastci(rspike) plotop(m(O)) ///
	title("Figure 3: Marginal Effect of Ballistic Missiles on Challenge Over Various Ranges") ///
	xlab(, nogrid) ylab(, nogrid)

graph export "Figure3.png", replace

	
* Now that is a REALLY cool plot

***************************
*Another Graphical Wrapper*
***************************

*Model 2: Mettler & Reiter's original model (FFT, 150km, no range maximum) with better Year and Range data, and fixed Nuk dummy variable 

*NOTE FOR ALEXIS: in "Probit H1-H2" file this *doesn't* have abmbbm in the controls, in "Estsimp Probit H3-5" it does

qui probit chall sqrtcmindist contig cincparity majpowa jointdem ab_MR_IR_150_FFT_a nukadum_df ab_MR_IR_150_FFT_b nukbdum_df abmbbm ///
	pcyrs pcyrs2 pcyrs3, robust cluster (id)

* Let's calculate and save out multiple margins

margins, dydx(ab_MR_IR_150_FFT_a ab_MR_IR_150_FFT_b abmbbm) post
parmest, label list(parm estimate min* max* p) saving(marg2, replace)

* We will be using the 'eclplot' command as a graphical wrapper
ssc install eclplot

* It is slightly more customizable than the 'marginsplot' command, which generally is used for only one model
* Can also play around with the 'coefplot' package if you are interested

clear
use "marg2.dta"

* You can see we now have a data frame in our memory with our parameter estimates of interest

* Label your models with a destringed identifier

replace label = "1" in 1
replace label = "2" in 2
replace label = "3" in 3

destring label, generate(label2) 

* Then generate the eclplot, saving it out to your figures folder

* What is in this code?
* 1) The first line sets up the actual plot, highlighting the parameter of interest (estimate) the CI (min95 max95) and label (label2)
* 2) Next line determines the plot type (rspike) and colors the points and CIs
* 3) Third line puts a line at zero and flips the plot on its side
* 4) Fourth, I label my labels, placing text so I can tell each variable apart (could also label different models if those are my estimates)
* 5) Fifth, I turn off the grid lines
* 6) Finally I title both the axes and whole figure

eclplot estimate min95 max95 label2, ///
	rplottype(rspike) estopts(m(O) mc(gs0)) ciopts(lc(gs0)) ///
	hori xline(0, lpattern(shortdash) lc(gs8)) ///
	ylabel(1 "ab_MR_IR_150_FFT_a" 2 "ab_MR_IR_150_FFT_b" 3 "abmbbm") ///
	xlab(, nogrid) ylab(, nogrid) ///
	ytitle("Challenge") xtitle(Marginal Effect (AME)) ///
	title("Figure 1: Effect of Ballistic Missile Possession on Challenge")

graph export "Figure1.png", replace
	
	
********************************
*Doing this with Simpler Models*
********************************

* Going back to "The Other Guys" Database from Emrich and Kirss

clear
cd "/Users/alexanderkirss/Desktop/Ongoing Projects/StatsWorkshops/181213Margins/Figures2"
use "/Users/alexanderkirss/Desktop/Ongoing Projects/StatsWorkshops/181213Margins/togd.dta"

* Run your model and calculate margins
* I generally use the 'eststo' command first in order to store the full model results

ssc install estout

eststo: qui logit PresRun ForeignCom Party Age AgeQuad

margins, dydx(ForeignCom) post

parmest, label list(parm estimate min* max* p) saving(marg3, replace)

* Iterate this across your various models sequentially

* Model 2
eststo: qui logit PresRun ForeignCom Party Age AgeQuad Chamberseni
margins, dydx(ForeignCom) post
parmest, label list(parm estimate min* max* p) saving(marg4, replace)

* Model 3
eststo: qui logit PresRun ForeignCom Party Age AgeQuad AveragePresApproval
margins, dydx(ForeignCom) post
parmest, label list(parm estimate min* max* p) saving(marg5, replace)

* Model 4
eststo: qui logit PresRun ForeignCom Party Age AgeQuad Chamberseni AveragePresApproval
margins, dydx(ForeignCom) post
parmest, label list(parm estimate min* max* p) saving(marg6, replace)

* You can output the full model results in tabular form using 'esttab'

esttab using logit.csv, replace
eststo clear

* Build the plot
clear

use "marg3.dta"
append using marg4.dta
append using marg5.dta
append using marg6.dta

replace label = "1" in 1
replace label = "2" in 2
replace label = "3" in 3
replace label = "4" in 4

destring label, generate(label2) 

eclplot estimate min95 max95 label2, ///
	rplottype(rspike) estopts(m(O) mc(gs0)) ciopts(lc(gs0)) ///
	hori xline(0, lpattern(shortdash) lc(gs8)) ///
	xlab(, nogrid) ylab(, nogrid) ///
	ylabel(1 "Model 1 (Logit)" 2 "Model 2 (Logit)" 3 "Model 3 (Logit)" 4 "Model 4 (Logit)") ///
	ytitle("P(Presidential Run)") xtitle(Marginal Effect (AME)) ///
	title("Figure 1: Effect of SFRC Membership on Presidential Run")

graph export "Figure1.png", replace

********************************
*Fixed Effects Logit Estimation*
********************************

* It is FAR easier to calculate fixed effect logit models in Stata than R
* Just use the 'xtlogit' command and the code from above

* Have to reload data

clear
use "/Users/alexanderkirss/Desktop/Ongoing Projects/StatsWorkshops/181213Margins/togd.dta"

* First, define your indicators using 'xtset'
xtset ICPSR

* Then calculate your models as above, saving out the marginal effects

* Model 1
eststo: qui xtlogit PresRun ForeignCom Party Age AgeQuad, fe
margins, dydx(ForeignCom) post
parmest, label list(parm estimate min* max* p) saving(margFE1, replace)

* Model 2
eststo: qui xtlogit PresRun ForeignCom Party Age AgeQuad Chamberseni, fe
margins, dydx(ForeignCom) post
parmest, label list(parm estimate min* max* p) saving(margFE2, replace)

* Model 3
eststo: qui xtlogit PresRun ForeignCom Party Age AgeQuad AveragePresApproval, fe
margins, dydx(ForeignCom) post
parmest, label list(parm estimate min* max* p) saving(margFE3, replace)

* Model 4
eststo: qui xtlogit PresRun ForeignCom Party Age AgeQuad Chamberseni AveragePresApproval, fe
margins, dydx(ForeignCom) post
parmest, label list(parm estimate min* max* p) saving(margFE4, replace)

esttab using logitfe.csv, replace
eststo clear

clear
use "marg3.dta"
append using marg4.dta
append using marg5.dta
append using marg6.dta
append using margFE1.dta
append using margFE2.dta
append using margFE3.dta
append using margFE4.dta


replace label = "1" in 1
replace label = "2" in 2
replace label = "3" in 3
replace label = "4" in 4
replace label = "5" in 5
replace label = "6" in 6
replace label = "7" in 7
replace label = "8" in 8
destring label, generate(label2)


eclplot estimate min95 max95 label2, ///
	rplottype(rspike) estopts(m(O) mc(gs0)) ciopts(lc(gs0)) hori xline(0, lpattern(shortdash) lc(gs8)) ///
	xlab(, nogrid) ylab(, nogrid) ///
	ylabel(1 "Model 1 (Logit)" 2 "Model 2 (Logit)" 3 "Model 3 (Logit)" 4 "Model 4 (Logit)" 5 "Model 1 (FE)" 6 "Model 2 (FE)" 7 "Model 3 (FE)" 8 "Model 4 (FE)") ///
	ytitle("P(Presidential Run)") xtitle(Marginal Effect (AME)) title("Figure 2: Effect of SFRC Membership on Presidential Run, Logit vs. FE")

graph export "Figure2.png", replace


* As usual, check out the help files for 'margins,' 'marginsplot,' and 'eclplot' for more information 

h margins

h marginsplot

h eclplot