ht5

Dmitriy Tsimokha

07.12.20

Turn the dark theme if you are reading it at night! (button with sun at the left)

library(foreign); library(knitr); library(ggplot2); library(kableExtra)
library(jtools); library(memisc); library(sjPlot); library(ggstance)
library(broom.mixed); library(stargazer); library(ggthemr)
df <- read.spss("dataht5.sav", use.value.labels = T, to.data.frame = T)
df <- na.omit(df[c("happy", "gndr", "inwyys", "yrbrn", "sclmeet")])

1. Visualizing distributions

Really nice graphing guide!

Even nicer amount of themes for plotting!

I just can’t stop choosing between different themes, help

Happy (dependent variable)

g <- ggplot(df, aes(x = happy))
g + geom_bar(stat = 'count') + 
  labs(x = "Level of happiness", y = "Count")

Level of happiness looks like skewed to the right distribution, but can be considered as quite close to normal distribution.

Gender

g <- ggplot(df, aes(x = gndr))
g + geom_bar(stat = 'count') + 
  labs(x = "Gender", y = "Count")

There are more female respondents than male ones - nearly on one fourth larger sample.

Age

df$age <- as.numeric(levels(df$inwyys))[df$inwyys] - as.numeric(levels(df$yrbrn))[df$yrbrn]
ggplot(df, aes(x = age)) + 
  geom_density(fill = "salmon", stat = 'density') + 
  stat_function(fun = dnorm, args = list(mean = mean(df$age), sd = sd(df$age))) +
  labs(x = "Age", y = "Density")

Distrubution of age have two peakes at ~30 and at ~60 but still can be considered as close to equal (if we really want to).

Meeting with friends and colleagues frequency

ggplot(df, aes(x = sclmeet)) + 
  geom_bar(stat = 'count') + 
  labs(x = "Frequency of meeting", y = "Count")

Too can be considered as normal distribution (if look from far distance).

2. Simple model with control variables (age & gender)

Great visualization tool for regression!

model1 <- lm(data = df, as.numeric(happy) ~ age + gndr)
tab_model(model1)
  as.numeric(happy)
Predictors Estimates CI p
(Intercept) 8.25 8.00 – 8.49 <0.001
age -0.02 -0.02 – -0.01 <0.001
gndr [Female] -0.13 -0.32 – 0.05 0.145
Observations 2346
R2 / R2 adjusted 0.019 / 0.018

And gender is not significant! It can be seen not only by p-value larger than 0.05, but also that confident interval includes 0.

Even though age is significant it still causes very small effect at first glance - but if we consider that age don’t start from 0 and difference can be 50 or more years difference in effect can be really large (per -0.02 for each year).

Looks like the older we get the less happier we feel :(

3. More complex model with meeting frequency

model2 <- lm(data = df, as.numeric(happy) ~ age + gndr + sclmeet)
# https://strengejacke.github.io/sjPlot/articles/tab_model_estimates.html
tab_model(model2)
  as.numeric(happy)
Predictors Estimates CI p
(Intercept) 7.15 6.67 – 7.64 <0.001
age -0.01 -0.02 – -0.01 <0.001
gndr [Female] -0.14 -0.32 – 0.04 0.122
sclmeet [Less than once a
month]
0.59 0.15 – 1.02 0.008
sclmeet [Once a month] 0.80 0.34 – 1.25 0.001
sclmeet [Several times a
month]
0.94 0.51 – 1.37 <0.001
sclmeet [Once a week] 1.19 0.74 – 1.64 <0.001
sclmeet [Several times a
week]
1.20 0.74 – 1.66 <0.001
sclmeet [Every day] 1.20 0.70 – 1.70 <0.001
Observations 2346
R2 / R2 adjusted 0.038 / 0.035

The second model is more complex and include frequency of meeting with friends and colleagues: - gender again is not significant! - frequency of meeting is significant and have strong positive effect on level of happiness!

stargazer(anova(model1, model2), type = "html", title="ANoVA", intercept.bottom = FALSE,
          single.row = TRUE, align=TRUE, ci=TRUE, ci.level=0.95, no.space=TRUE)
ANoVA
Statistic N Mean St. Dev. Min Pctl(25) Pctl(75) Max
Res.Df 2 2,340.000 4.243 2,337 2,338.5 2,341.5 2,343
RSS 2 10,810.390 147.594 10,706.020 10,758.210 10,862.570 10,914.750
Df 1 6.000 6.000 6.000 6.000 6.000
Sum of Sq 1 208.730 208.730 208.730 208.730 208.730
F 1 7.594 7.594 7.594 7.594 7.594
Pr(> F) 1 0.00000 0.00000 0.00000 0.00000 0.00000

The model2 with frequency of meeting is significantly better than model1: lower RSS shows that.

tab_model(model1, model2)
  as.numeric(happy) as.numeric(happy)
Predictors Estimates CI p Estimates CI p
(Intercept) 8.25 8.00 – 8.49 <0.001 7.15 6.67 – 7.64 <0.001
age -0.02 -0.02 – -0.01 <0.001 -0.01 -0.02 – -0.01 <0.001
gndr [Female] -0.13 -0.32 – 0.05 0.145 -0.14 -0.32 – 0.04 0.122
sclmeet [Less than once a
month]
0.59 0.15 – 1.02 0.008
sclmeet [Once a month] 0.80 0.34 – 1.25 0.001
sclmeet [Several times a
month]
0.94 0.51 – 1.37 <0.001
sclmeet [Once a week] 1.19 0.74 – 1.64 <0.001
sclmeet [Several times a
week]
1.20 0.74 – 1.66 <0.001
sclmeet [Every day] 1.20 0.70 – 1.70 <0.001
Observations 2346 2346
R2 / R2 adjusted 0.019 / 0.018 0.038 / 0.035

Still, each model describes only up to 0.04 variance max at the best case and that’s too bad!

4. Interpretation

Assumptions

plot_model(model2, type = "diag")[[1]]

We don’t have any multicollinearity!

plot_model(model2, type = "diag")[[2]]

And even residuals distributed normally and there are no major outliers!

plot_model(model2, type = "diag")[[3]]

Here again - residuals at the best shape and very nicely fits under the normal distribution line.

plot_model(model2, type = "diag")[[4]]

And here the baddest picture - homoscedasticity. Looks as bad as \[R^2\] of our models.

5. Visualization of the better model

shine bright!

plot_summs(model2, plot.distributions = TRUE, inner_ci_level = .9)

Again, here can be seen coefficients with confidence intervals of the best (second) model: - age is nearly on the line - gender crosses the line with it’s confidence interval - and for different levels of meeting frequency everything great!

Effects

plot_model(model2, type = "pred")[[1]]

Age has significant negative effect and with greater age level of happiness expected to lower.

plot_model(model2, type = "pred")[[2]]

Gender is not significant so that picture can be scipped.

plot_model(model2, type = "pred")[[3]]

And level of happiness significantly affected by meeting frequency with positive direction!

Summary

It does not matter what gender are you - just spend time with your friends more while you are young and be happy!