Category Archives for "data driven decision making"

Who Suffers Most From Travel Stress?

​Mo' Travel Mo' Problems.

Do you think travel stress increases with the number of trips per year?

Only half the time.

​Stress can increase or decrease with frequency of travel. It normalizes over time. Infrequent travelers show a wild variability in the amount of stress they experience. Some experience a high level of stress while reshuffling home and work responsibilities whereas others who rarely travel may view the experience as a treat.

Regardless of where they start, most people's stress normalizes over time. The novelty wears off. Reshuffling gives way to routine.​

Difference in Difference Estimation [Notes]

difference in difference estimation

Difference in Difference estimation is a linear regression methodology used to analyse the effect of some event in time (the treatment) by comparing results over time. The idea is to compare data before and after the treatment. If the treatment was effective then you will find material differences in the outcomes for the target variable, or Y.

Before and After. This is a basic comparison of means for the time periods before and after treatment. For this specification we assume the target (Y) in the post-treatment period is equal to the target (Y) in the pre-treatment period in the absence of treatment. Thus, any change is attributable to the treatment.

Difference in Difference estimation is the natural extension of the before and after analysis is to include a control group for comparison. The difference in difference specification allows us to do this. We can compare Y as we did for the before and after analysis. We can also compare Ys between treated and untreated groups. To complete a difference in difference specification we use two dummy variables that partition the sample into four groups. The first dummy variable, treatment, partitions the sample in two halves based on their treatment status. The second dummy variable, post, partitions the halves in quarters based on the time period. We then interact treatment and post (Post*Treatment); the coefficient on Post*Treatment estimates the statistical difference in Y.

​The Model:

Y = B0 + B1*Post + B2*Treatment + B3*Post*Treatment.

  • Y, is the target variable we are interested in estimating.
  • Post, is a binary dummy variable indicating whether an observation is in the post treatment period.
  • Treatment, is a binary dummy variable indicating whether an received treatment or not.
  • B3 is the estimator, the coefficient on Post*Treatment.