Ait 664 Information For Representation Assessment Answers

You want to organize information stored in several documents in order to identify patterns that lead to a better understanding of the data. Use clustering as the data mining technique, with the Weka tool, to find patterns among similar documents, and then provide a visual analysis of the discovered patterns using a tool of your choice.

Answer:

Introduction

The main objective of this project is to analyse the provided data file using data mining tools. The project is divided into five tasks: data acquisition, data pre-processing, mining tool preparation, clustering analysis and visualization. In data acquisition, the user downloads the project data file, Ebola Discussion. In data pre-processing, the user extracts the relevant substring from each field. Pre-processing steps such as tokenization, stemming, named entity recognition and stop word removal prepare the data for mining and improve performance; missing values in each field are also imputed. In mining tool preparation, the user downloads and installs Weka, opens the Explorer, loads the provided data file, and removes the attributes or fields that are not meaningful for pattern analysis. In clustering analysis, the user clusters the provided data file. Finally, the user provides a visualization of the results. These tasks are discussed in detail in the following sections.

Data Acquisition

In data acquisition, the user downloads the project data file, Ebola Discussion. The provided data file is illustrated below (Han, Kamber & Pei, 2012).

Data Pre-processing

In data pre-processing, the user extracts the relevant substring from each field. Pre-processing steps such as tokenization, stemming, named entity recognition and stop word removal prepare the data for mining and improve performance. Missing values in each field are also imputed. The provided data file successfully completed the pre-processing stage (Hancock, 2012).
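The pre-processing steps named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stop word list is a tiny hypothetical one, `crude_stem` is a naive suffix stripper rather than a real stemming algorithm, and named entity recognition is not covered. None of the function names come from Weka or from the original project.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "in", "is", "that", "we", "of", "to", "and"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def crude_stem(token):
    """Strip a few common suffixes. This is NOT a real stemmer such as Porter's."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [crude_stem(t) for t in remove_stop_words(tokenize(text))]


def impute_mode(values):
    """Replace None entries with the most frequent non-missing value."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to impute from
    mode = Counter(present).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


print(preprocess("We Conquered Ebola"))        # ['conquer', 'ebola']
print(impute_mode(["en", None, "en", "fr"]))   # ['en', 'en', 'en', 'fr']
```

Imputing with the mode suits nominal fields; for numeric fields the mean would be the usual counterpart.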

Mining Tool Preparation

Weka is data mining software that provides a collection of machine learning algorithms for data mining tasks. It includes tools for:

  • Regression
  • Clustering
  • Association
  • Data pre-processing
  • Classification
  • Visualisation

Here, the user needs to download and install Weka, open the Explorer, and then open the provided data file. This is illustrated below (Mitsa, 2010).

Finally, remove the attributes or fields that are not meaningful for pattern analysis. Then choose Filter and apply StringToWordVector to transform the MESSAGE string attribute into a vector of words.
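To make the idea of turning a string into a vector of words concrete, here is a minimal bag-of-words sketch in Python. It only illustrates the general technique and does not reproduce Weka's filter, which has its own options (for example, word presence versus word counts). The two sample messages are made up for the example and are not taken from the data file.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(messages):
    """Map every distinct word to a column index, in alphabetical order."""
    words = sorted({w for m in messages for w in tokenize(m)})
    return {w: i for i, w in enumerate(words)}


def to_word_vector(message, vocabulary):
    """Return the word counts for one message, one column per vocabulary word."""
    vector = [0] * len(vocabulary)
    for w in tokenize(message):
        if w in vocabulary:
            vector[vocabulary[w]] += 1
    return vector


# Hypothetical sample messages, for illustration only.
messages = ["Ebola news", "Ebola Ebola outbreak"]
vocabulary = build_vocabulary(messages)
vectors = [to_word_vector(m, vocabulary) for m in messages]

print(vocabulary)  # {'ebola': 0, 'news': 1, 'outbreak': 2}
print(vectors)     # [[1, 1, 0], [2, 0, 1]]
```

Once every message is a numeric vector of the same length, a distance-based algorithm such as k-means can compare messages directly.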

Clustering Analysis

Cluster analysis is used to identify groups of similar instances within the provided data file, Ebola Discussion. In Weka, a clusterer can be evaluated in four modes: use training set, supplied test set, percentage split, and classes to clusters evaluation. There is also an option to ignore selected attributes of the data file, depending on the requirements. The available clustering schemes include FarthestFirst, XMeans, EM, SimpleKMeans and Cobweb. Here, we use k-means to analyse the Ebola Discussion data file. In general, clustering lets a user group the data in order to discover patterns in it. One defining benefit of clustering compared to classification is that every attribute is used to analyse the data, rather than one attribute being singled out as the class (Stahlbock, Abou-Nasr & Weiss, 2018).
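The k-means procedure itself is short enough to sketch in plain Python. The version below is a minimal illustration, not Weka's SimpleKMeans: it picks random starting points, assigns each point to its nearest centroid by squared Euclidean distance, recomputes each centroid as the mean vector of its members, and reports the within-cluster sum of squared errors. The function names, the `seed` parameter and the toy points are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def k_means(points, k, seed=0, max_iterations=100):
    """Return (centroids, assignments, within-cluster sum of squared errors)."""
    rng = random.Random(seed)
    # Initial starting points: k distinct instances chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    sse = sum(squared_distance(p, centroids[a]) for p, a in zip(points, assignments))
    return centroids, assignments, sse


# Toy data: two obvious groups in two dimensions.
points = [[0, 0], [0, 1], [10, 10], [10, 11]]
centroids, assignments, sse = k_means(points, k=2)
print(sorted(centroids))  # [[0.0, 0.5], [10.0, 10.5]]
print(sse)                # 1.0
```

The result depends on the random starting points in general; on data with less clear-cut groups, different seeds can give different clusterings.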

In clustering analysis, the user clusters the provided data file using the steps below.

K Means

======

Number of iterations: 2

Within cluster sum of squared errors: 60131.00000000001

Initial starting points (random):

Cluster 0: 'A professor in U S is telling Liberians that the Defense Department manufactured Ebola _URL_ via','Mon Sep 29 13:51:10 +0000 2014'

Cluster 1: 'Goodluck Jonathan We Conquered Ebola We ll Crush Boko Haram President says President Goodluck Jonathan sai _URL_','Mon Sep 29 12:35:57 +0000 2014'

Missing values globally replaced with mean/mode

Final cluster centroids:

Time taken to build model (full training data) : 0.07 seconds

=== Model and evaluation on training set ===

Clustered Instances

0 30434 (100%)

1 4 (0%) 

The visualization of the K-Means result is illustrated below.

The K-Means result window displays the centroid of each cluster, together with statistics on the number and percentage of instances assigned to each cluster. Cluster centroids are the mean vectors of each cluster, so they can be used to characterize the clusters. The parameters of the clustering algorithm can be adjusted by clicking SimpleKMeans. The output of the simple K-Means algorithm shows two clusters, cluster 0 and cluster 1. The random starting point for cluster 0 was the message "A professor in U S is telling Liberians that the Defense Department manufactured Ebola _URL_ via", and the starting point for cluster 1 was the message "Goodluck Jonathan We Conquered Ebola We ll Crush Boko Haram President says President Goodluck Jonathan sai _URL_". Each cluster represents one type of behaviour in the provided data file. Evaluation on the training set gave the following results (Veart, 2013).

Clustered Instances

Cluster    Instances
0          30434 (100%)
1          4 (0%)
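The percentages in this summary follow directly from the instance counts. The short Python sketch below recomputes them, assuming the percentages are rounded to the nearest whole number (an assumption about how the figures were displayed, not something stated in the output). The function name is hypothetical.

```python
def cluster_summary(counts):
    """Return (cluster, count, whole-number percentage) for each cluster."""
    total = sum(counts.values())
    return [
        (cluster, n, round(100 * n / total))
        for cluster, n in sorted(counts.items())
    ]


# Instance counts taken from the clustering output above.
counts = {0: 30434, 1: 4}
for cluster, n, pct in cluster_summary(counts):
    print(f"{cluster} {n} ({pct}%)")
# 0 30434 (100%)
# 1 4 (0%)
```

With 30438 instances in total, the 4 instances in cluster 1 are about 0.013% of the data, which is why they appear as 0%.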

Conclusion

This project successfully analysed the provided data file using data mining tools. The project was divided into five tasks: data acquisition, data pre-processing, mining tool preparation, clustering analysis and visualization. In data acquisition, the user downloaded the project data file, Ebola Discussion. In data pre-processing, the user extracted the relevant substring from each field. Pre-processing steps such as tokenization, stemming, named entity recognition and stop word removal prepared the data for mining and improved performance; missing values in each field were also imputed. In mining tool preparation, the user downloaded and installed Weka, opened the Explorer, loaded the provided data file, and removed the attributes or fields that were not meaningful for pattern analysis. In clustering analysis, the user clustered the provided data file with k-means. Finally, the user provided a visualization of the results. Each of these tasks has been discussed and analysed in detail.

References

Han, J., Kamber, M., & Pei, J. (2012). Data mining. Waltham: Morgan Kaufmann.

Hancock, M. (2012). Practical data mining. Boca Raton, FL: CRC Press.

Mitsa, T. (2010). Temporal Data Mining. Hoboken: CRC Press.

Spendler, L. (2010). Data mining and management. New York: Nova Science Publishers.

Stahlbock, R., Abou-Nasr, M., & Weiss, G. (2018). Data Mining. Bloomfield: C. S. R. E. A.

Veart, D. (2013). First, Catch Your Weka. New York: Auckland University Press.



Copyright © 2009-2023 UrgentHomework.com, All right reserved.