Data Mining Homework COSC2110/COSC2111 Data Mining

RMIT University

School of Computer Science and Information Technology

COSC2110/COSC2111 Data Mining

Assignment 2

This assignment counts for 25% of the total marks in this course.

Submit through Canvas.

You can work on this assignment individually or in a group of 2. If you are working in a group, please establish a group in Assignment 2 Group on Canvas.

In this assignment you are asked to explore the use of neural networks for classification and numeric prediction. You are also asked to carry out a data mining investigation on a real-world data file. You are required to write a report on your findings. You will be assessed on methodology, analysis of results and conclusions.

PART 1: CLASSIFICATION WITH NEURAL NETWORKS (15 marks)

This part involves the following file: heart-v1.arff in the directory:

/KDrive/SEH/SCSIT/Students/Courses/COSC2111/DataMining/data/arff/UCI/

For the neural network training runs, build a table with the following headings:

Run No   Architecture   Parameters   Train MSE   Train Error   Epochs   Test MSE   Test Error
------   ------------   ----------   ---------   -----------   ------   --------   ----------
1        23-10-5        lr=.2        0.5         30%           500      0.6        40%
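The MSE and Error columns in the table measure different things: MSE is computed from the raw output activations, while Error is the fraction of misclassified patterns. The sketch below shows one plausible way to compute both from a list of network outputs and one-of-N targets. The function names are illustrative, the winner-takes-all rule is only one possible "analyze" strategy, and the exact normalisation that JavaNNS reports may differ.

```python
def mean_squared_error(outputs, targets):
    """Average squared difference over every output unit of every pattern."""
    total = 0.0
    count = 0
    for out_row, tgt_row in zip(outputs, targets):
        for o, t in zip(out_row, tgt_row):
            total += (o - t) ** 2
            count += 1
    return total / count


def winner_takes_all_error(outputs, targets):
    """Fraction of patterns whose most active output is not the target class."""
    wrong = 0
    for out_row, tgt_row in zip(outputs, targets):
        predicted = max(range(len(out_row)), key=out_row.__getitem__)
        actual = max(range(len(tgt_row)), key=tgt_row.__getitem__)
        if predicted != actual:
            wrong += 1
    return wrong / len(outputs)


# Made-up activations for three patterns and two classes (not real results).
outputs = [[0.9, 0.1], [0.4, 0.6], [0.7, 0.3]]
targets = [[1, 0], [0, 1], [0, 1]]

print("MSE:  ", mean_squared_error(outputs, targets))
print("Error:", winner_takes_all_error(outputs, targets))
```

In this toy example only the third pattern is misclassified, so a network can have a moderate MSE while the classification error stays low, or the reverse.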

1. Describe the data encoding that is required for this task. How many outputs and how many inputs will there be?

2. Develop a script to generate the necessary training, validation and test files. You might want to normalize the numeric attributes with Weka beforehand. Include your data preparation script as an appendix (not part of the page count).

3. Determine the “analyze” strategy that you will use.

4. Using JavaNNS, carry out 5 train and test runs for a network with 10 hidden nodes. Comment on the variation in the training runs and the degree of overfitting.

5. Experiment with different numbers of hidden nodes. What seems to be the right number of hidden nodes for this problem?

6. For 10 hidden nodes, explore different values of the learning rate. What do you conclude?

7. [Optional] Change the learning function to backprop-momentum. Explore different combinations of learning rate and momentum. What do you conclude?

8. Perform a run with 10 hidden nodes and no validation data. Stop training when the MSE is no longer changing. Get the classification error on the training and test data. Comment on the degree of overfitting.

9. Compare the classification accuracy of the neural classifiers with the classification accuracy of Weka J48 and MultilayerPerceptron.
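As an illustration of the kind of data preparation that tasks 1 and 2 refer to, here is a minimal sketch of one-of-N encoding for nominal attributes, min-max scaling for numeric attributes, and a shuffled train/validation/test split. The attribute names, category values, records and split fractions are all hypothetical and are not taken from heart-v1.arff. Writing the result out in the pattern-file format that JavaNNS expects is left out.

```python
import random


def one_hot(value, categories):
    """Encode a nominal value as one input (or output) unit per category."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def split_rows(rows, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle with a fixed seed, then cut into train/validation/test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])


# Hypothetical records: (age, chest_pain, class_label).
records = [(63, "typical", "sick"), (41, "atypical", "healthy"),
           (57, "none", "sick"), (35, "typical", "healthy"),
           (70, "none", "sick")]
pain_values = ["typical", "atypical", "none"]    # assumed categories
class_values = ["healthy", "sick"]               # assumed class labels

ages = min_max_scale([r[0] for r in records])
patterns = [([age] + one_hot(r[1], pain_values), one_hot(r[2], class_values))
            for age, r in zip(ages, records)]

train, valid, test = split_rows(patterns)
print(len(patterns[0][0]), "inputs,", len(patterns[0][1]), "outputs")
```

With this encoding the number of network inputs is the number of numeric attributes plus the total number of categories across the nominal attributes, and the number of outputs is the number of class values.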

Report Length: Up to two pages.
