Parts (a), (b), and (f) are already done. I have been struggling with the messy_impute() function for a few hours, but I would probably know how to turn it into tidy_impute() once it works.
(a) Create a simulated dataset in R called gradebook that represents a possible gradebook in the basic format given above:
• Each row of the gradebook should contain measurements for a single student.
• Each column should contain scores for individual assignments.
• The last column should be “Section.”
The simulated gradebook should contain the grades for at least 150 students (80 in section A and 70 in section B) and scores for 13 assignments. Set the seed for simulating your dataset to be your UID.
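One possible way to simulate such a gradebook is sketched below. The split into 10 homeworks and 3 exams, the normal score distribution, and the seed value are all assumptions made for illustration; only the names Homework_10 and Exam_3 and the section sizes come from the problem statement.

```r
set.seed(123456789)  # placeholder; the assignment asks for your own UID

n_a <- 80            # students in section A
n_b <- 70            # students in section B
n <- n_a + n_b

# Assumed layout: 10 homeworks and 3 exams, 13 assignments in total
assignments <- c(paste0("Homework_", 1:10), paste0("Exam_", 1:3))

# Draw scores from a normal distribution, then round and clamp to 0-100
scores <- matrix(rnorm(n * length(assignments), mean = 80, sd = 10), nrow = n)
scores <- pmin(pmax(round(scores), 0), 100)
colnames(scores) <- assignments

# One row per student, one column per assignment, Section as the last column
gradebook <- data.frame(scores)
gradebook$Section <- rep(c("A", "B"), times = c(n_a, n_b))

str(gradebook)
```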
(b) Randomly replace 10% of the scores in the Homework_10 and Exam_3 columns with NA values. For each section, print out one student with an NA value in Homework_10, one with an NA value in Exam_3, and one without NA values in either column. You will use those six students for the rest of the problems to demonstrate your results.
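A minimal sketch of the NA replacement on a small stand-in data frame follows. The data frame toy and its values are hypothetical; the same loop should carry over to the real gradebook.

```r
set.seed(1)  # arbitrary seed for this toy example only

# Small stand-in for gradebook (hypothetical values)
toy <- data.frame(
  Homework_10 = round(runif(20, 60, 100)),
  Exam_3      = round(runif(20, 60, 100)),
  Section     = rep(c("A", "B"), each = 10)
)

# Blank out 10% of the entries in each target column, chosen at random
for (col in c("Homework_10", "Exam_3")) {
  n_missing <- round(0.10 * nrow(toy))
  toy[sample(nrow(toy), n_missing), col] <- NA
}

# Count the missing values per column
colSums(is.na(toy[c("Homework_10", "Exam_3")]))
```

A demonstration student can then be located with an expression such as which(is.na(toy$Homework_10) & toy$Section == "A")[1], which returns NA if that section happens to have no missing value in the column.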
Imputation is the process of replacing missing values by estimated values. The simplest (far from preferred) method to impute values is to replace missing values by the most typical (or “average”) value.
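As a small illustration of the idea, the sketch below imputes a made-up score vector with its mean and with its median.

```r
x <- c(90, NA, 70, 80, NA)  # made-up scores with two missing values

# Replace each NA by the center of the observed values
x_mean   <- replace(x, is.na(x), mean(x, na.rm = TRUE))
x_median <- replace(x, is.na(x), median(x, na.rm = TRUE))

x_mean    # 90 80 70 80 80
x_median  # 90 80 70 80 80
```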
(c) Write a function messy_impute() that will impute missing values in a data frame (or tibble) that is organized in the same non-tidy way as gradebook. You may present pseudocode or a flowchart here.
The messy_impute() function should have the following optional arguments:
- The center argument specifies whether to impute using the mean or the median.
- The margin argument specifies one of two ways to impute values:
◦ Impute the missing values using the center of the observed (non-missing) values in the column.
◦ Impute the missing values using the center of the observed values in the row.
- The range argument specifies the columns/the rows for computing the typical value. It could be names or numeric indices.
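A rough sketch of how messy_impute() might look is given below. Several points are assumptions rather than part of the problem statement: margin follows the apply() convention (1 = rows, 2 = columns), range is read as "the rows to use" when imputing by column and "the columns to use" when imputing by row, and only the cells inside range are imputed. The toy data frame is made up.

```r
messy_impute <- function(df, center = c("mean", "median"), margin = 2,
                         range = NULL) {
  center <- match.arg(center)
  center_fun <- if (center == "mean") mean else median
  num_cols <- names(df)[vapply(df, is.numeric, logical(1))]

  if (margin == 2) {
    # Column-wise: range picks the rows (indices or row names) to use
    rows <- if (is.null(range)) seq_len(nrow(df)) else range
    if (is.character(rows)) rows <- match(rows, rownames(df))
    for (col in num_cols) {
      vals <- df[[col]][rows]
      vals[is.na(vals)] <- center_fun(vals, na.rm = TRUE)
      df[[col]][rows] <- vals
    }
  } else if (margin == 1) {
    # Row-wise: range picks the (numeric) columns to use, by name or index
    cols <- if (is.null(range)) num_cols else names(df[range])
    for (i in seq_len(nrow(df))) {
      vals <- unlist(df[i, cols])
      if (anyNA(vals)) {
        vals[is.na(vals)] <- center_fun(vals, na.rm = TRUE)
        df[i, cols] <- as.list(vals)
      }
    }
  } else {
    stop("margin must be 1 (rows) or 2 (columns)")
  }
  df
}

# Made-up data in the same wide layout as gradebook
toy <- data.frame(
  Homework_1 = c(90, 80, 70, 60),
  Homework_2 = c(NA, 85, 75, NA),
  Exam_1     = c(88, NA, 72, 64),
  Section    = c("A", "A", "B", "B")
)

# Column means computed from section A rows only
by_col <- messy_impute(toy, "mean", margin = 2,
                       range = which(toy$Section == "A"))

# Each student's own median over the homework columns only
by_row <- messy_impute(toy, "median", margin = 1,
                       range = c("Homework_1", "Homework_2"))
```

Under this reading, section-wise imputation amounts to one call per section with range set to that section's rows, and student-wise imputation amounts to one call for the homework columns and one for the exam columns. A row or column with no observed values would be filled with NaN or NA, which this sketch does not guard against.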
(d) Using the gradebook variable, without reshaping or tidying, impute the missing homework and exam scores with messy_impute() using both the mean and the median of the observed homework and exam scores for sections A and B respectively.
(e) Using the gradebook variable, without reshaping or tidying, impute each missing homework and exam score with messy_impute() using both the mean and the median of the individual student’s observed homework and exam scores.
(f) Transform the gradebook variable into tidy format. Call the transformed variable gradebook_tidy.
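For reference, a base R sketch of the wide-to-long transformation on a tiny made-up data frame is shown below. The Student ID column and the column names Assignment and Score are assumptions; tidyr::pivot_longer() would be the more usual tool for this step.

```r
# Tiny made-up data frame in the wide gradebook layout
toy <- data.frame(
  Homework_1 = c(90, 80),
  Exam_1     = c(88, NA),
  Section    = c("A", "B")
)
toy$Student <- seq_len(nrow(toy))  # hypothetical ID so rows stay identifiable

score_cols <- setdiff(names(toy), c("Student", "Section"))

# One row per (student, assignment) pair; unlist() stacks column by column
toy_tidy <- data.frame(
  Student    = rep(toy$Student, times = length(score_cols)),
  Section    = rep(toy$Section, times = length(score_cols)),
  Assignment = rep(score_cols, each = nrow(toy)),
  Score      = unlist(toy[score_cols], use.names = FALSE)
)

toy_tidy
```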
(g) Write a function tidy_impute() that will impute missing values from a specified column in a tibble that is organized in the same tidy way as gradebook_tidy. The tidy_impute() function should have optional arguments to impute values that correspond to imputing in the same ways as in the messy_impute() function. You may present your pseudocode or a flowchart here.
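In tidy format the imputation reduces to a grouped fill, which could be sketched in base R as follows. The argument names group and value, the column names Student, Section, Assignment, and Score, and the toy data are all assumptions for illustration.

```r
tidy_impute <- function(df, center = c("mean", "median"),
                        group = c("Assignment", "Section"), value = "Score") {
  center <- match.arg(center)
  center_fun <- if (center == "mean") mean else median

  # Build one grouping key per row from the chosen grouping columns
  key <- do.call(paste, c(df[group], sep = "\r"))

  # Within each group, replace NA by the center of the observed values
  df[[value]] <- ave(df[[value]], key, FUN = function(v) {
    v[is.na(v)] <- center_fun(v, na.rm = TRUE)
    v
  })
  df
}

# Made-up tidy data: one row per (student, assignment) pair
toy_tidy <- data.frame(
  Student    = c(1, 2, 3, 4, 1, 2, 3, 4),
  Section    = rep(c("A", "A", "B", "B"), 2),
  Assignment = rep(c("Homework_1", "Exam_1"), each = 4),
  Score      = c(90, NA, 70, 60, 88, 80, NA, 64)
)

# Center of each assignment within each section
by_section <- tidy_impute(toy_tidy, "mean")

# Center of each student's own observed scores
by_student <- tidy_impute(toy_tidy, "median", group = "Student")
```

Grouping by assignment and section corresponds to the column-wise imputation of the wide version, and grouping by student corresponds to the row-wise one.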
(h) Using the gradebook_tidy variable, impute the missing homework and exam scores with tidy_impute() using both the mean and the median of the observed homework and exam scores for sections A and B respectively.
(i) Using the gradebook_tidy variable, impute each missing homework and exam score with tidy_impute() using both the mean and the median of the individual student’s observed homework and exam scores.
6 freelancers have bid an average of $31 for this job
I am a data scientist with more than 4 years of programming experience in R, and I have completed more than 30 projects in R on this platform. I can help you with the issue you are facing with the function.
Hi, I have understood the project requirements and am interested in working on this project. I will start right away and finish within your time frame. Please send a message so we can discuss further and start the project. Thanks.