Hi everyone,
I'm looking for a clever bit of code to replace NAs with a specific score
depending on an indicator variable.
I can see how to do it using lots of if statements, but I'm sure there must
be a neater, better way of doing it.
Any ideas at all will be much appreciated. I'm dreading coding up all those
if statements!
My problem is as follows:
I have a data set with lots of missing data:
EG Raw Data Set
Category  variable1  variable2  variable3
1         5          NA         NA
1         NA         3          4
2         NA         7          NA
etc
Now I want to replace the NAs with the average for each category, so if
these averages were:
EG Averages
Category  variable1  variable2  variable3
1         4.5        3.2        2.5
2         3.5        7.4        5.9
So I'd like my data set to look like the following once I've replaced the
NAs with the appropriate category average:
EG Imputed Data Set
Category  variable1  variable2  variable3
1         5          3.2        2.5
1         4.5        3          4
2         3.5        7          5.9
etc
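To make the target concrete, here is a minimal sketch of one way the replacement might be done without a chain of if statements, using only the three example rows and the averages quoted above. The names dat, avgs, vars, idx and impute_mean are made up for illustration, and the sketch assumes the averages are held in a table with one row per category. It is one possible approach, not necessarily the neatest.

```r
# Hypothetical data frame built from the three example rows above
dat <- data.frame(
  Category  = c(1, 1, 2),
  variable1 = c(5, NA, NA),
  variable2 = c(NA, 3, 7),
  variable3 = c(NA, 4, NA)
)

# Category averages as quoted above (assumed layout: one row per category)
avgs <- data.frame(
  Category  = c(1, 2),
  variable1 = c(4.5, 3.5),
  variable2 = c(3.2, 7.4),
  variable3 = c(2.5, 5.9)
)

vars <- c("variable1", "variable2", "variable3")

# For each data row, the position of its category in the averages table
idx <- match(dat$Category, avgs$Category)

# Replace each NA with the average for that row's category
imputed <- dat
for (v in vars) {
  miss <- is.na(imputed[[v]])
  imputed[[v]][miss] <- avgs[[v]][idx[miss]]
}

print(imputed)

# Alternative sketch: derive the group means from the data itself with ave().
# With only these three rows, a group whose values are all NA has no mean,
# so those cells come out as NaN rather than a number.
impute_mean <- function(x) {
  x[is.na(x)] <- mean(x, na.rm = TRUE)
  x
}
imputed2 <- dat
imputed2[vars] <- lapply(dat[vars], function(x) ave(x, dat$Category, FUN = impute_mean))
```

The lookup version reproduces the imputed table above because it uses the quoted averages. The ave() version would only agree with it on the full data set, where the group means are computed from all the rows.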
Any ideas would be very much appreciated!
Thank you,
Chris Howden
Founding Partner
Tricky Solutions