Find a dataset with a categorical response variable that has exactly two categories, several (3 to 6) potential
predictors, and a sufficiently large sample size. At least two of the predictors should be quantitative variables; the others can be 0/1 indicators or categorical variables. Do not use a dataset from the textbook or the Stat2Data package.
1) Briefly describe the variables in your dataset and where you found the data.
2) Was your data in a format that you could immediately upload into RStudio and analyze, or was some
manipulation of the data needed? If you manipulated the data, explain what you did and how it was done.
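As an illustration of the kind of manipulation item 2 asks you to document, here is a minimal sketch in R. Everything in it is hypothetical: the data frame `raw`, its columns `outcome`, `age`, and `income`, and the "n/a" missing-value code are made up for the example, and in practice `raw` would come from your own file (for instance via `read.csv()`). It shows three common steps: converting text to numbers, recoding a two-category response as 0/1, and dropping incomplete rows.

```r
# Hypothetical raw data, standing in for a file you would load yourself.
raw <- data.frame(
  outcome = c("yes", "no", "yes", NA, "no", "yes"),
  age     = c("34", "51", "n/a", "29", "45", "62"),
  income  = c(42.5, 38.0, 55.2, 61.3, NA, 47.8),
  stringsAsFactors = FALSE
)

# Convert text that should be numeric; entries such as "n/a" become NA.
raw$age <- suppressWarnings(as.numeric(raw$age))

# Recode the two-category response as a 0/1 indicator (1 = "yes").
raw$y <- ifelse(raw$outcome == "yes", 1, 0)

# Keep only rows with no missing values in the variables to be modeled.
clean <- raw[complete.cases(raw[, c("y", "age", "income")]), ]

str(clean)       # check variable types
table(clean$y)   # check the balance of the two response categories
```

Whatever steps your own data require, the write-up should state each one and the function used to carry it out.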
1) Choose a single quantitative predictor and construct a logistic regression model.
2) Plot the raw data and the logistic curve on the same axes.
3) Construct an empirical logit plot and comment on the linearity of the data.
4) Use the summary of your logistic model to perform a hypothesis test to determine if there is significant evidence
of a relationship between the response and predictor variable. State your hypotheses and conclusion.
5) Construct a confidence interval for the odds ratio and include a sentence interpreting the interval in the context of your data.
6) Compute the G-statistic and use it to test the effectiveness of your model.
7) Repeat (1)-(6) for a second model with a different single quantitative predictor.
8) Compare the effectiveness of your two models to each other (a formal test is not required).
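Steps (1) through (6) can be sketched in base R as follows. This is one possible workflow under stated assumptions, not a required template: the data are simulated, the names `dat`, `x`, and `y` are placeholders for your own data frame and variables, the empirical logit plot uses ten equal-count groups with a 0.5 adjustment (other groupings are reasonable), and the interval in step (5) is a Wald interval from `confint.default()` rather than a profile-likelihood interval.

```r
set.seed(1)

# Simulated stand-in data: x is a quantitative predictor, y a 0/1 response.
n <- 500
x <- rnorm(n, mean = 50, sd = 10)
y <- rbinom(n, size = 1, prob = plogis(-5 + 0.1 * x))
dat <- data.frame(x = x, y = y)

# (1) Fit the single-predictor logistic regression model.
fit <- glm(y ~ x, data = dat, family = binomial)
b <- coef(fit)

# (2) Raw data (jittered vertically) with the fitted logistic curve.
plot(dat$x, jitter(dat$y, amount = 0.05), xlab = "x", ylab = "y")
curve(plogis(b[1] + b[2] * x), add = TRUE, col = "blue")

# (3) Empirical logit plot: group x into deciles, then plot the
#     adjusted log odds in each group against the group mean of x.
breaks <- quantile(dat$x, probs = seq(0, 1, by = 0.1))
groups <- cut(dat$x, breaks = breaks, include.lowest = TRUE)
successes <- as.numeric(tapply(dat$y, groups, sum))
trials    <- as.numeric(tapply(dat$y, groups, length))
x_mean    <- as.numeric(tapply(dat$x, groups, mean))
emp_logit <- log((successes + 0.5) / (trials - successes + 0.5))
plot(x_mean, emp_logit, xlab = "group mean of x", ylab = "empirical logit")
abline(lm(emp_logit ~ x_mean), lty = 2)  # reference line to judge linearity

# (4) Wald z-test of H0: slope = 0 versus Ha: slope != 0.
wald <- summary(fit)$coefficients["x", ]
print(wald)

# (5) 95% Wald confidence interval for the odds ratio per one-unit increase in x.
or_ci <- exp(confint.default(fit)["x", ])
print(or_ci)

# (6) G-statistic: drop in deviance from the null model, compared to chi-square.
G    <- fit$null.deviance - fit$deviance
g_df <- fit$df.null - fit$df.residual
g_p  <- pchisq(G, df = g_df, lower.tail = FALSE)
cat("G =", G, " df =", g_df, " p-value =", g_p, "\n")
```

For step (7), the same code can be rerun with the second predictor in place of `x`. For the informal comparison in step (8), one option is to set the two G-statistics (or the residual deviances reported by `summary()`) side by side, since both models have one predictor and are fit to the same response.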