ht5

Dmitriy Tsimokha

07.12.20

Turn on the dark theme if you are reading this at night! (the button with the sun, on the left)

library(foreign); library(knitr); library(ggplot2); library(kableExtra)
library(jtools); library(memisc); library(sjPlot); library(ggstance)
library(broom.mixed); library(stargazer); library(ggthemr)
df <- read.spss("dataht5.sav", use.value.labels = TRUE, to.data.frame = TRUE)
df <- na.omit(df[c("happy", "gndr", "inwyys", "yrbrn", "sclmeet")])

1. Visualizing distributions

Really nice graphing guide!

Even nicer amount of themes for plotting!

I just can’t stop choosing between different themes, help

Happy (dependent variable)

g <- ggplot(df, aes(x = happy))
g + geom_bar(stat = 'count') + 
  labs(x = "Level of happiness", y = "Count")

The level of happiness looks skewed to the right, but the distribution can be considered quite close to normal.

Gender

g <- ggplot(df, aes(x = gndr))
g + geom_bar(stat = 'count') + 
  labs(x = "Gender", y = "Count")

There are more female respondents than male ones: the female group is larger by nearly one fourth.

Age

df$age <- as.numeric(levels(df$inwyys))[df$inwyys] - as.numeric(levels(df$yrbrn))[df$yrbrn]
ggplot(df, aes(x = age)) + 
  geom_density(fill = "salmon", stat = 'density') + 
  stat_function(fun = dnorm, args = list(mean = mean(df$age), sd = sd(df$age))) +
  labs(x = "Age", y = "Density")

The distribution of age has two peaks, at ~30 and at ~60, but it can still be considered close to normal (if we really want to).
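The age computation above relies on converting factor columns back to the numbers their labels spell out. A minimal sketch of why the `as.numeric(levels(f))[f]` idiom is needed, using hypothetical toy vectors (`interview_year`, `birth_year` and the helper `to_number` are illustrative names, not part of the dataset):

```r
# Hypothetical toy data: years stored as factors, as an SPSS import may return them
interview_year <- factor(c("2018", "2018", "2019"))
birth_year     <- factor(c("1990", "1956", "2001"))

# as.numeric() on a factor returns the internal level codes, not the labels
codes <- as.numeric(birth_year)

# Indexing the numeric levels by the factor recovers the original values
to_number <- function(f) as.numeric(levels(f))[f]
age <- to_number(interview_year) - to_number(birth_year)

print(codes)  # 2 1 3
print(age)    # 28 62 18
```

The same caveat applies to any factor passed through `as.numeric()` directly: the result is the position of each level, which matches the label only when the levels happen to be coded that way.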

Meeting with friends and colleagues frequency

ggplot(df, aes(x = sclmeet)) + 
  geom_bar(stat = 'count') + 
  labs(x = "Frequency of meeting", y = "Count")

This one, too, can be considered normally distributed (if you look from far enough away).

2. Simple model with control variables (age & gender)

Great visualization tool for regression!

model1 <- lm(data = df, as.numeric(happy) ~ age + gndr)
tab_model(model1)

	as.numeric(happy)
Predictors	Estimates	CI	p
(Intercept)	8.25	8.00 – 8.49	<0.001
age	-0.02	-0.02 – -0.01	<0.001
gndr [Female]	-0.13	-0.32 – 0.05	0.145
Observations	2346
R² / R² adjusted	0.019 / 0.018

And gender is not significant! This can be seen not only from the p-value, which is larger than 0.05, but also from the confidence interval, which includes 0.
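The link between the p-value and the confidence interval can be checked directly. The sketch below uses simulated data (the variables `x`, `g`, `y` are made up for illustration and are not the survey data) and verifies that, for every coefficient of an `lm` fit, the 95% interval contains 0 exactly when p > 0.05:

```r
set.seed(42)
n <- 200
x <- rnorm(n)
g <- factor(sample(c("Male", "Female"), n, replace = TRUE))
y <- 5 - 0.5 * x + rnorm(n)   # simulated outcome: depends on x, not on g

fit <- lm(y ~ x + g)

# p-values from the coefficient table and 95% confidence intervals
p  <- summary(fit)$coefficients[, "Pr(>|t|)"]
ci <- confint(fit, level = 0.95)

# TRUE where the interval covers zero
ci_has_zero <- ci[, 1] < 0 & ci[, 2] > 0

print(data.frame(lower = ci[, 1], upper = ci[, 2], p = p, ci_has_zero))
```

Both quantities come from the same t-distribution, so the two criteria always agree at matching levels (95% interval and a 0.05 threshold).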

Even though age is significant, its effect looks very small at first glance. But age does not start from 0, and two respondents can differ by 50 years or more, so the difference in expected happiness can be really large (-0.02 for each year).

Looks like the older we get, the less happy we feel :(
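To make the size of the age effect concrete, here is a small worked calculation. It assumes the rounded coefficient from the table (-0.02) and an illustrative 50-year age gap; the exact value depends on the unrounded estimate:

```r
age_coef <- -0.02   # rounded estimate for age from model1
age_gap  <- 50      # illustrative difference in age between two respondents

# Expected difference on the scale used in the model
effect <- age_coef * age_gap
print(effect)  # -1
```

So a gap of 50 years corresponds to roughly one point on the scale the model uses for happiness.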

3. More complex model with meeting frequency

model2 <- lm(data = df, as.numeric(happy) ~ age + gndr + sclmeet)
# https://strengejacke.github.io/sjPlot/articles/tab_model_estimates.html
tab_model(model2)

	as.numeric(happy)
Predictors	Estimates	CI	p
(Intercept)	7.15	6.67 – 7.64	<0.001
age	-0.01	-0.02 – -0.01	<0.001
gndr [Female]	-0.14	-0.32 – 0.04	0.122
sclmeet [Less than once a month]	0.59	0.15 – 1.02	0.008
sclmeet [Once a month]	0.80	0.34 – 1.25	0.001
sclmeet [Several times a month]	0.94	0.51 – 1.37	<0.001
sclmeet [Once a week]	1.19	0.74 – 1.64	<0.001
sclmeet [Several times a week]	1.20	0.74 – 1.66	<0.001
sclmeet [Every day]	1.20	0.70 – 1.70	<0.001
Observations	2346
R² / R² adjusted	0.038 / 0.035

The second model is more complex and includes the frequency of meeting with friends and colleagues:

- gender is again not significant;
- frequency of meeting is significant and has a strong positive effect on the level of happiness!

stargazer(anova(model1, model2), type = "html", title="ANoVA", intercept.bottom = FALSE,
          single.row = TRUE, align=TRUE, ci=TRUE, ci.level=0.95, no.space=TRUE)

**ANoVA**

Statistic	N	Mean	St. Dev.	Min	Pctl(25)	Pctl(75)	Max

Res.Df	2	2,340.000	4.243	2,337	2,338.5	2,341.5	2,343
RSS	2	10,810.390	147.594	10,706.020	10,758.210	10,862.570	10,914.750
Df	1	6.000		6.000	6.000	6.000	6.000
Sum of Sq	1	208.730		208.730	208.730	208.730	208.730
F	1	7.594		7.594	7.594	7.594	7.594
Pr(> F)	1	0.00000		0.00000	0.00000	0.00000	0.00000

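model2, which adds the frequency of meeting, fits significantly better than model1: the F-test gives F = 7.594 on 6 degrees of freedom with p < 0.001, and RSS drops from 10,914.75 to 10,706.02. (A lower RSS alone is expected whenever predictors are added; the F-test is what shows the improvement is significant.)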
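The F statistic in the table can be reproduced by hand from the two residual sums of squares. This is a sketch using the rounded numbers printed above, so the result may differ from the exact `anova()` output in the last digits:

```r
rss1      <- 10914.75  # RSS of model1 (fewer predictors)
rss2      <- 10706.02  # RSS of model2 (adds sclmeet)
df_resid2 <- 2337      # residual degrees of freedom of model2
df_diff   <- 6         # extra parameters in model2 (sclmeet dummies)

# F = (drop in RSS per added parameter) / (residual variance of the larger model)
f_stat  <- ((rss1 - rss2) / df_diff) / (rss2 / df_resid2)
p_value <- pf(f_stat, df_diff, df_resid2, lower.tail = FALSE)

print(round(f_stat, 3))
print(p_value)
```

Calling `anova(model1, model2)` directly, without `stargazer`, prints the same quantities as a regular ANOVA table instead of summary statistics of that table.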

tab_model(model1, model2)

	as.numeric(happy)			as.numeric(happy)
Predictors	Estimates	CI	p	Estimates	CI	p
(Intercept)	8.25	8.00 – 8.49	<0.001	7.15	6.67 – 7.64	<0.001
age	-0.02	-0.02 – -0.01	<0.001	-0.01	-0.02 – -0.01	<0.001
gndr [Female]	-0.13	-0.32 – 0.05	0.145	-0.14	-0.32 – 0.04	0.122
sclmeet [Less than once a month]				0.59	0.15 – 1.02	0.008
sclmeet [Once a month]				0.80	0.34 – 1.25	0.001
sclmeet [Several times a month]				0.94	0.51 – 1.37	<0.001
sclmeet [Once a week]				1.19	0.74 – 1.64	<0.001
sclmeet [Several times a week]				1.20	0.74 – 1.66	<0.001
sclmeet [Every day]				1.20	0.70 – 1.70	<0.001
Observations	2346			2346
R² / R² adjusted	0.019 / 0.018			0.038 / 0.035

Still, even the better model explains less than 4% of the variance (R² = 0.038), and that is quite poor!
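The adjusted R² values in the table follow from R², the number of observations and the number of predictors. A rough check, assuming the rounded R² values from the table and a hypothetical helper `adj_r2`:

```r
# Adjusted R-squared: penalises R-squared for the number of predictors k
adj_r2 <- function(r2, n, k) 1 - (1 - r2) * (n - 1) / (n - k - 1)

n <- 2346
adj1 <- adj_r2(0.019, n, k = 2)  # model1: age + gender
adj2 <- adj_r2(0.038, n, k = 8)  # model2: age + gender + 6 sclmeet dummies

print(round(c(adj1, adj2), 3))
```

With this many observations the penalty is tiny, so adjusted R² stays close to R² in both models.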

4. Interpretation

Assumptions

plot_model(model2, type = "diag")[[1]]

We don’t have any multicollinearity!

plot_model(model2, type = "diag")[[2]]

The residuals are even distributed normally, and there are no major outliers!

plot_model(model2, type = "diag")[[3]]

Here again, the residuals are in great shape and fit very nicely under the normal distribution curve.

plot_model(model2, type = "diag")[[4]]

And here is the worst picture: homoscedasticity. It looks as bad as the R² of our models.
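Besides the diagnostic plot, constant residual variance can be checked numerically. The sketch below is not part of the original analysis: it runs a studentized Breusch-Pagan-style check in base R (regress squared residuals on the predictors, then compare n·R² with a chi-squared distribution) on simulated data whose noise grows with `x`:

```r
set.seed(1)
n <- 500
x <- runif(n, 1, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 0.5 * x)  # simulated: noise grows with x

fit <- lm(y ~ x)

# Auxiliary regression of squared residuals on the predictor
aux <- lm(resid(fit)^2 ~ x)

# Test statistic n * R^2, chi-squared with df = number of predictors in aux
bp_stat <- n * summary(aux)$r.squared
bp_p    <- pchisq(bp_stat, df = 1, lower.tail = FALSE)

print(c(statistic = bp_stat, p_value = bp_p))
```

A small p-value suggests the residual variance is not constant. Applying the same idea to `model2` would mean regressing `resid(model2)^2` on its predictors and using their count as the degrees of freedom.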

5. Visualization of the better model

shine bright!

plot_summs(model2, plot.distributions = TRUE, inner_ci_level = .9)

Here again are the coefficients of the better (second) model with their confidence intervals:

- age is nearly on the zero line;
- the confidence interval for gender crosses the zero line;
- for the different levels of meeting frequency everything looks great!

Effects

plot_model(model2, type = "pred")[[1]]

Age has a significant negative effect: the expected level of happiness decreases as age increases.

plot_model(model2, type = "pred")[[2]]

Gender is not significant, so this picture can be skipped.

plot_model(model2, type = "pred")[[3]]

And the level of happiness is significantly and positively affected by meeting frequency!

Summary

It does not matter what gender you are: just spend more time with your friends while you are young, and be happy!