Write a Java program to output the following statistics of a dataset:
1. For a nominal attribute
(a) output the number of possible attribute values
(b) the number of instances for each attribute value
2. For a numeric attribute
(a) output the maximum and minimum value
(b) split the range from minimum to maximum into two equal-length intervals (say lower and
upper), and output the number of instances in the lower and upper intervals, respectively.
You will need to read the documentation of the package called [login to view URL], and in particular the [login to view URL], to complete this part.
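The statistics listed above can be sketched in plain Java as follows. This is only a minimal illustration, not the required solution: it works on in-memory collections instead of the package the assignment refers to, and the names `DatasetStats`, `countNominal`, and `numericSummary` are hypothetical. It also counts the attribute values actually observed in the data, which may be fewer than the values declared as possible in the dataset header, and it assumes a value exactly at the midpoint belongs to the upper interval.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DatasetStats {

    /** Counts instances per nominal value; the map size is the number of distinct observed values. */
    public static Map<String, Integer> countNominal(List<String> values) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    /** Returns {min, max, lowerCount, upperCount} for a non-empty numeric column. */
    public static double[] numericSummary(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        // Midpoint splits [min, max] into two equal-length intervals.
        double mid = min + (max - min) / 2.0;
        int lower = 0;
        int upper = 0;
        for (double v : values) {
            if (v < mid) {
                lower++;
            } else {
                upper++; // assumption: the midpoint itself counts as "upper"
            }
        }
        return new double[] {min, max, lower, upper};
    }

    public static void main(String[] args) {
        // Made-up example data, for illustration only.
        Map<String, Integer> counts = countNominal(List.of("sunny", "rainy", "sunny"));
        System.out.println("distinct values: " + counts.size());
        counts.forEach((value, n) -> System.out.println(value + ": " + n));

        double[] s = numericSummary(new double[] {1.0, 2.0, 3.0, 9.0});
        System.out.println("min=" + s[0] + " max=" + s[1]
                + " lower=" + (int) s[2] + " upper=" + (int) s[3]);
    }
}
```

In a real solution the values would be read from the dataset through the package mentioned above rather than hard-coded.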
In the second part, you will first implement the above two distance measures (Euclidean and Mahalanobis), and then give an experimental evaluation of them. Usually, evaluation is performed in a specific data mining task, such as classification or clustering. Since we have not yet covered these topics, we will simply check the consistency between the two measures on random datasets. Specifically, you will need to
1. generate a random dataset of two-attribute instances drawn from a Gaussian distribution
2. build the covariance matrix Σ
3. set counter = 0
4. for each instance,
(a) Compute the nearest neighbor using Euclidean distance
(b) Compute the nearest neighbor using Mahalanobis distance
(c) if the nearest neighbors are the same, counter++
5. output the consistency ratio (counter / number-of-instances)
You will need to read the documentation of the packages [login to view URL] and [login to view URL], and
in particular the class Matrix, to complete this part. To generate data from a Gaussian distribution, you may
refer to the standard Java class Random, and use the Java method nextGaussian().
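The consistency check described above could look roughly like the sketch below. It is an illustration under several assumptions, not the expected submission: it avoids the Matrix class mentioned above and inverts the 2x2 covariance matrix in closed form instead, it uses the sample covariance (dividing by n - 1), it compares squared distances (which gives the same nearest neighbor as the unsquared ones), and it breaks ties by taking the lowest index. The class and method names, the seed, the sample size, and the scaling of the second attribute are all made up for the example.

```java
import java.util.Random;

public class ConsistencyCheck {

    /** Sample covariance matrix (2x2) of two-attribute data; needs at least two instances. */
    public static double[][] covariance(double[][] data) {
        int n = data.length;
        double mx = 0, my = 0;
        for (double[] p : data) { mx += p[0]; my += p[1]; }
        mx /= n;
        my /= n;
        double sxx = 0, sxy = 0, syy = 0;
        for (double[] p : data) {
            double dx = p[0] - mx, dy = p[1] - my;
            sxx += dx * dx;
            sxy += dx * dy;
            syy += dy * dy;
        }
        double d = n - 1;
        return new double[][] {{sxx / d, sxy / d}, {sxy / d, syy / d}};
    }

    /** Closed-form inverse of a 2x2 matrix. */
    public static double[][] inverse2x2(double[][] m) {
        double det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
        if (det == 0.0) throw new ArithmeticException("singular covariance matrix");
        return new double[][] {{m[1][1] / det, -m[0][1] / det}, {-m[1][0] / det, m[0][0] / det}};
    }

    /** Squared distance (a-b)^T M (a-b); M = identity gives squared Euclidean distance. */
    public static double squaredDistance(double[] a, double[] b, double[][] m) {
        double dx = a[0] - b[0], dy = a[1] - b[1];
        return dx * (m[0][0] * dx + m[0][1] * dy) + dy * (m[1][0] * dx + m[1][1] * dy);
    }

    /** Index of the nearest neighbor of instance i (excluding i itself) under metric matrix m. */
    public static int nearest(double[][] data, int i, double[][] m) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int j = 0; j < data.length; j++) {
            if (j == i) continue;
            double d = squaredDistance(data[i], data[j], m);
            if (d < bestDist) { bestDist = d; best = j; }
        }
        return best;
    }

    /** Fraction of instances whose Euclidean and Mahalanobis nearest neighbors coincide. */
    public static double consistencyRatio(double[][] data) {
        double[][] identity = {{1, 0}, {0, 1}};
        double[][] sigmaInv = inverse2x2(covariance(data));
        int counter = 0;
        for (int i = 0; i < data.length; i++) {
            if (nearest(data, i, identity) == nearest(data, i, sigmaInv)) counter++;
        }
        return (double) counter / data.length;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);           // fixed seed, chosen arbitrarily
        double[][] data = new double[200][2];  // 200 instances, chosen arbitrarily
        for (double[] p : data) {
            p[0] = rng.nextGaussian();
            p[1] = 3.0 * rng.nextGaussian();   // stretch one attribute so the two metrics can differ
        }
        System.out.println("consistency ratio = " + consistencyRatio(data));
    }
}
```

If both attributes had the same variance and no correlation, the Mahalanobis distance would be a rescaled Euclidean distance and the ratio would be close to 1; unequal variances or correlation between the attributes are what make the two measures disagree.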