I have an individual record-level database of deaths in Florida for one year. Most of the records contain the census tract of the individual, and all records contain race and Hispanic origin. However, some records could not be geocoded, and only the county is noted. I also have census-tract-level data on population by race and Hispanic origin. The task is to probabilistically assign the non-geocoded deaths in each county to census tracts. So, if a county has 300 deaths that are not geocoded, they should be assigned to the tracts where they are most likely to have occurred, based on the race of the person and the population of the county. (The way I thought about writing the code was a loop: calculate the difference between each tract's share of the population of a certain type and that tract's share of deaths of that type; assign an unassigned death to the maximum-likelihood census tract; update the share differences; proceed to the next unassigned death.) There are 600,000 deaths, of which about 12,000 are unassigned to census tracts. Currently these are just ASCII files.
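To make the loop above concrete, here is a minimal sketch in Python for one county and one race/Hispanic-origin group. It is only one reading of the description: it assumes the "difference" is the tract's population share minus its current share of deaths, and it breaks ties by tract ID. The function name, argument names, and dictionary layout are hypothetical, not part of the actual files. Note that this greedy rule is deterministic; a truly probabilistic version would instead draw a tract at random with weights based on these shares.

```python
def assign_unlocated_deaths(tract_pop, tract_deaths, n_unassigned):
    """Greedy assignment for ONE county and ONE race/Hispanic-origin group.

    tract_pop:    dict tract_id -> population of the group in that tract
    tract_deaths: dict tract_id -> geocoded deaths of the group in that tract
    n_unassigned: number of county-only deaths of the group to place
    Returns a list of tract_ids, one per unassigned death, in assignment order.
    (All names here are illustrative assumptions.)
    """
    total_pop = sum(tract_pop.values())
    if total_pop == 0:
        raise ValueError("group has no population in this county")

    # Each tract's share of the county's population for this group.
    pop_share = {t: p / total_pop for t, p in tract_pop.items()}
    # Working copy of death counts; tracts with no recorded deaths start at 0.
    deaths = {t: tract_deaths.get(t, 0) for t in tract_pop}

    assigned = []
    for _ in range(n_unassigned):
        total_deaths = sum(deaths.values())

        def gap(t):
            # Population share minus current death share (assumed definition).
            death_share = deaths[t] / total_deaths if total_deaths else 0.0
            return pop_share[t] - death_share

        # Tract with the largest shortfall; ties go to the smallest tract ID.
        best = max(sorted(deaths), key=gap)
        deaths[best] += 1  # update counts so the next gap reflects this death
        assigned.append(best)
    return assigned


# Two equally sized tracts; tract "B" has fewer geocoded deaths than "A".
print(assign_unlocated_deaths({"A": 50, "B": 50}, {"A": 4, "B": 2}, 2))
# → ['B', 'B']
```

Running this once per county and group would cover all of the roughly 12,000 unassigned deaths; each pass costs on the order of (unassigned deaths × tracts), which is small at this scale.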
Must be an experienced programmer.
9 freelancers are bidding on average $561 for this job
Hi, I have 15+ years of SAS programming experience and an M.S. in Statistics. From what I've read, you're going to need both programming and stats help with this merge. I'd be happy to help. - Stephanie