[url removed, login to view]
This is the link where you can get the dataset. The first step is to run k-means clustering and create 8 clusters, one for each year from 2010 to 2017, then compare the clusters against the Risk, Results and Violation Code columns. The second step is to build a decision tree classifier for the Risk column, training on 80% of the data and testing on the remaining 20%, based on the 8 yearly clusters and taking the risk levels into account. The third step is to apply the Apriori algorithm to the Violation Code column to find the frequent violations in each year, again based on the 8 yearly clusters. The output should show which violations occur most frequently in each year so that inspectors can monitor them in the next inspection. Using the latitude and longitude, create a heatmap of Chicago showing the frequent violations for each year (8 clusters). Finally, visualize the results of each algorithm with charts, graphs and tables, broken down by year.
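To make the third step concrete, here is a minimal sketch of Apriori-style frequent itemset mining per year. It is written in plain Python rather than R or a library, purely for illustration. It assumes each inspection has already been reduced to a year plus a list of violation code strings; the function names `frequent_itemsets` and `frequent_by_year`, the record layout, and the support threshold are all hypothetical and not taken from the dataset.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for itemsets with support >= min_support.

    transactions: list of sets of violation codes (one set per inspection).
    min_support: fraction of transactions, between 0 and 1.
    """
    n = len(transactions)
    if n == 0:
        return {}
    # Level 1 candidates: every single code that appears anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its k-item subsets were frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return result


def frequent_by_year(records, min_support):
    """Group (year, violation_codes) records by year and mine each group."""
    by_year = defaultdict(list)
    for year, codes in records:
        by_year[year].append(set(codes))
    return {y: frequent_itemsets(ts, min_support) for y, ts in by_year.items()}


# Made-up example records, NOT real inspection data.
records = [
    (2010, ["32", "33", "34"]),
    (2010, ["32", "34"]),
    (2010, ["33"]),
    (2011, ["18", "32"]),
    (2011, ["18"]),
]
result = frequent_by_year(records, min_support=0.6)
for year in sorted(result):
    for itemset, support in sorted(result[year].items(), key=lambda kv: -kv[1]):
        print(year, sorted(itemset), round(support, 2))
```

Mining each year separately keeps the support threshold relative to that year's inspection count, so a year with fewer inspections is not drowned out by a busier one. A real run would first need to parse the codes out of the dataset's violation column, which this sketch does not attempt.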
28 freelancers have bid on this job
Hi, I am a data scientist, so I do this kind of work every day. I use R for most of my work too, so I think I am a perfect candidate for this project. I look forward to working with you. Thanks, Worrawat
Hi, I have worked with R for 5 years. I have experience with clustering (k-means, hierarchical), trees (regression and classification), and machine learning with random forests.