turnover. The dataset we will use (see the Excel file named “Regression Project”) contains a wide variety of workforce data (employee demographics and attitudes) on approximately 1,000 employees. The primary dependent variables are “Attrition” and “Probability of Turnover.”

Your CEO wants to better understand the factors driving employee turnover, and she has asked you to take the lead in conducting the analyses.

You should begin with data cleaning and range checks. Expect problems here, as with almost any large dataset! Please address any problems that you find, and document any changes you make in your memo to me.
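
To make "range checks" concrete, here is a minimal Python sketch of one way to flag missing or implausible values. The column names (EmployeeID, Age, JobSatisfaction), the sample rows, and the valid ranges are all hypothetical; they are not taken from the Regression Project file, and you are free to do this step in whatever software you prefer.

```python
# Hypothetical records; the real file's column names and values may differ.
records = [
    {"EmployeeID": 1, "Age": 34, "JobSatisfaction": 3},
    {"EmployeeID": 2, "Age": 210, "JobSatisfaction": 4},    # implausible age
    {"EmployeeID": 3, "Age": 45, "JobSatisfaction": None},  # missing value
    {"EmployeeID": 4, "Age": 29, "JobSatisfaction": 7},     # outside assumed 1-5 scale
]

# Assumed plausible ranges (inclusive) for each variable.
VALID_RANGES = {"Age": (18, 70), "JobSatisfaction": (1, 5)}


def range_check(rows, valid_ranges):
    """Return (EmployeeID, column, value, problem) for every suspect cell."""
    problems = []
    for row in rows:
        for column, (low, high) in valid_ranges.items():
            value = row.get(column)
            if value is None:
                problems.append((row["EmployeeID"], column, value, "missing"))
            elif not low <= value <= high:
                problems.append((row["EmployeeID"], column, value, "out of range"))
    return problems


problems = range_check(records, VALID_RANGES)
for problem in problems:
    print(problem)
```

A list like this doubles as a record of what you found, which makes it easier to document your changes in the memo.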

Then, move on to the basics (e.g., are departing employees older or younger, and do they have higher education levels, lower job satisfaction, etc.?). You should also determine whether attrition differs across departments and, if so, why.
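
As an illustration of these basic comparisons, the sketch below computes the mean of a variable for leavers versus stayers and the attrition rate within each department. The data are made up, and the column names and the 0/1 coding of Attrition are assumptions rather than facts about the project file.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows; Attrition is assumed to be coded 1 = left, 0 = stayed.
employees = [
    {"Department": "Sales", "Age": 29, "Attrition": 1},
    {"Department": "Sales", "Age": 41, "Attrition": 0},
    {"Department": "Sales", "Age": 33, "Attrition": 1},
    {"Department": "R&D", "Age": 38, "Attrition": 0},
    {"Department": "R&D", "Age": 45, "Attrition": 0},
    {"Department": "R&D", "Age": 27, "Attrition": 1},
]


def mean_by_attrition(rows, column):
    """Mean of `column` for leavers (key 1) and stayers (key 0)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Attrition"]].append(row[column])
    return {status: mean(values) for status, values in groups.items()}


def attrition_rate_by_department(rows):
    """Share of employees who left, within each department."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["Department"]].append(row["Attrition"])
    return {dept: mean(flags) for dept, flags in groups.items()}


age_means = mean_by_attrition(employees, "Age")
dept_rates = attrition_rate_by_department(employees)
print(age_means)
print(dept_rates)
```

Differences in raw means or rates are only descriptive; whether they are statistically meaningful, and why they arise, is a separate question.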

Once you have outlined the basics, develop a multivariate regression model to determine which factors appear to be the most important predictors of Probability of Turnover and/or Turnover (be sure to use a logistic regression model if you focus on the latter). Note that you have a lot of discretion in how you approach this problem, so I am intentionally not providing step-by-step details on what you should do with these projects. I want you to show me how you would approach the problem.
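
For those who want to see what a logistic regression is doing under the hood, here is a minimal Python sketch that fits one by plain gradient descent on made-up data. It is a teaching illustration only, not a prescribed method: the single predictor (job satisfaction, centered at 3), the data, and the function names are all hypothetical, and in practice you would use your statistics package, which also reports standard errors and significance tests.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, features):
    """Predicted probability of leaving; weights[0] is the intercept."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)


def fit_logistic(X, y, learning_rate=0.1, iterations=5000):
    """Fit intercept and slopes by gradient descent on the mean log-loss."""
    n = len(X)
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(iterations):
        gradient = [0.0] * len(weights)
        for features, label in zip(X, y):
            error = predict(weights, features) - label
            gradient[0] += error
            for j, x in enumerate(features):
                gradient[j + 1] += error * x
        weights = [w - learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


# Made-up data: job satisfaction on an assumed 1-5 scale, 1 = employee left.
satisfaction = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
left = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

X = [[s - 3] for s in satisfaction]  # center the predictor at the scale midpoint
weights = fit_logistic(X, left)
odds_ratio = math.exp(weights[1])    # change in odds per one-point increase

print("intercept and slope:", weights)
print("odds ratio per satisfaction point:", odds_ratio)
```

Because each row of X is a list, the same function accepts several predictors at once, which is what a multivariate model requires. A negative slope (an odds ratio below 1) would indicate that higher satisfaction is associated with lower odds of leaving.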

Please summarize your findings in a (maximum) five-page, double-spaced memo to the CEO. Any tables, figures, etc., can be placed in an Appendix to your memo.

