
7Com1073 | Data Pre-Processing Assessment Answer

Assignment

This coursework is an individual assignment. You need to write your own Python code in a Jupyter Notebook.

Task 1: Data pre-processing and data exploration 

a. Use Pandas to load the data and report the number of data points (rows) in the dataset.

b. Consider “quality” as class labels. Report the number of features in the dataset and the number of data points in each class.

c. Randomly permute the rows of the data using the shuffle function from sklearn.utils. You must set a value for the random_state parameter. Assign the shuffled data to a new variable named white_wine.

d. Produce one scatter plot, that is, one feature plotted against another feature. You are free to choose which two features to use.
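As an illustration of Task 1 (a)-(c), the following is a minimal sketch, not a model answer. The helper name summarise_and_shuffle, the seed value 42, and the stand-in columns "alcohol" and "pH" are assumptions made only so that the example runs without the real file. With the real dataset you would first load it with pd.read_csv, using the path and delimiter that match your copy of the file.

```python
import numpy as np
import pandas as pd
from sklearn.utils import shuffle


def summarise_and_shuffle(df, label_col="quality", seed=42):
    """Return row count, feature count, per-class counts, and a shuffled copy.

    Hypothetical helper: the name and the default seed are illustrative.
    """
    n_rows = len(df)
    n_features = df.shape[1] - 1  # every column except the class label
    class_counts = df[label_col].value_counts().sort_index()
    # random_state makes the permutation reproducible between runs
    shuffled = shuffle(df, random_state=seed).reset_index(drop=True)
    return n_rows, n_features, class_counts, shuffled


# Stand-in data (assumption): random values, NOT the real wine measurements.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "alcohol": rng.normal(10.5, 1.2, 200),
    "pH": rng.normal(3.2, 0.15, 200),
    "quality": rng.integers(3, 10, 200),
})

n_rows, n_features, class_counts, white_wine = summarise_and_shuffle(demo)
print("rows:", n_rows, "features:", n_features)
print(class_counts)
```

For part (d), one option is the pandas call white_wine.plot.scatter(x="alcohol", y="pH"), which needs matplotlib installed. The two column names here are again only placeholders for whichever features you choose.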

Task 2: PCA Analysis on the white-wine dataset Using Scikit-Learn

a. Perform a PCA analysis on the whole white_wine dataset.

b. Plot the data in the PC1 and PC2 projections and label/colour the data in the plot according to their class labels.

c. Report the variance captured by each principal component.
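The PCA steps in Task 2 could be sketched as below. This is one possible reading of the task, under two assumptions that the brief does not state: the "quality" column is left out of the PCA and used only for colouring, and the features are standardised first because PCA is sensitive to feature scale. The helper name run_pca and the stand-in data are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def run_pca(df, label_col="quality"):
    """Return PC scores and the fraction of variance captured by each PC.

    Assumption: the label column is excluded and features are standardised.
    """
    features = df.drop(columns=[label_col])
    scaled = StandardScaler().fit_transform(features)
    pca = PCA()  # keep all components so every variance ratio is reported
    scores = pca.fit_transform(scaled)
    return scores, pca.explained_variance_ratio_


# Stand-in data (assumption): four random features plus a label column.
rng = np.random.default_rng(1)
demo = pd.DataFrame(rng.normal(size=(200, 4)), columns=["f1", "f2", "f3", "f4"])
demo["quality"] = rng.integers(3, 10, 200)

scores, variance_ratio = run_pca(demo)
for i, ratio in enumerate(variance_ratio, start=1):
    print(f"PC{i}: {ratio:.3f}")
```

For part (b), the first two columns of scores are the PC1 and PC2 projections. With matplotlib they can be drawn with plt.scatter(scores[:, 0], scores[:, 1], c=demo["quality"]), so that each point is coloured by its class label.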

Task 3: Divide the white_wine dataset into a training set, a validation set, and a test set.

a. Take out the first 1000 rows from white_wine and save them as the validation set.

b. Take out the last 1000 rows from white_wine and save them as the test set.

c. Save the remaining rows from white_wine as the training set.
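The split described in Task 3 can be written with positional slicing. The sketch below assumes the data has already been shuffled as in Task 1. The helper name split_first_last and the stand-in frame of 2500 rows are illustrative only.

```python
import numpy as np
import pandas as pd


def split_first_last(df, n_val=1000, n_test=1000):
    """Split by position: first rows -> validation, last rows -> test."""
    if len(df) <= n_val + n_test:
        raise ValueError("not enough rows left over for a training set")
    validation = df.iloc[:n_val]
    test = df.iloc[len(df) - n_test:]
    train = df.iloc[n_val:len(df) - n_test]
    return train, validation, test


# Stand-in data (assumption): 2500 rows with a single numbered column.
demo = pd.DataFrame({"row_id": np.arange(2500)})
train, validation, test = split_first_last(demo)
print(len(train), len(validation), len(test))
```

The three slices do not overlap, so no row appears in more than one set.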

Task 4: Investigate how the size of the training dataset affects the model performance on the test set.

In this task, let us consider the last column ‘quality’ of the white_wine dataset as a real-valued target rather than a class label. You need to use the linear regression model to finish the following tasks (a)-(c). Note that you should use all available features in the dataset.

a. Produce a learning curve of the size of training set against the performance measurements. The performance should be measured on both the training set and the validation set. You need to choose at least 10 different sizes for the training set. For example, the first size may be 10% of the total training set produced in Task 3.

• Remember to scale the corresponding training set and the validation set.

b. Report the training set size that you consider best for this work and explain why you chose it.

c. Report the performance on the test set obtained using the model trained from the best size.

• Remember to scale the corresponding training and test sets.
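One way to approach the learning curve in Task 4 is sketched below. It rests on several assumptions: mean squared error is used as the performance measure (the brief does not name one), the training subsets are taken from the start of the already shuffled training set, and the scaler is fitted on each training subset only and then applied to the validation set. The function name learning_curve_points and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler


def learning_curve_points(X_train, y_train, X_val, y_val, fractions):
    """Return (subset size, training MSE, validation MSE) for each fraction."""
    results = []
    for frac in fractions:
        n = max(2, int(round(frac * len(X_train))))
        X_sub, y_sub = X_train[:n], y_train[:n]
        # Fit the scaler on the training subset only, then reuse it,
        # so that no validation statistics leak into training.
        scaler = StandardScaler().fit(X_sub)
        X_sub_scaled = scaler.transform(X_sub)
        X_val_scaled = scaler.transform(X_val)
        model = LinearRegression().fit(X_sub_scaled, y_sub)
        train_mse = mean_squared_error(y_sub, model.predict(X_sub_scaled))
        val_mse = mean_squared_error(y_val, model.predict(X_val_scaled))
        results.append((n, train_mse, val_mse))
    return results


# Synthetic stand-in data (assumption): a noisy linear target.
rng = np.random.default_rng(2)
X = rng.normal(size=(600, 5))
y = X @ np.array([0.5, -1.0, 0.3, 0.0, 2.0]) + rng.normal(scale=0.5, size=600)
X_train, y_train, X_val, y_val = X[:400], y[:400], X[400:], y[400:]

curve = learning_curve_points(
    X_train, y_train, X_val, y_val, np.linspace(0.1, 1.0, 10)
)
for n, train_mse, val_mse in curve:
    print(n, round(train_mse, 3), round(val_mse, 3))
```

Plotting the size against the two error columns gives the learning curve for part (a). For part (c), the same pattern applies with the test set in place of the validation set: refit the scaler and the model on the chosen training size, transform the test features with that scaler, and score the predictions.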

Task 5: Critical Discussion: write your conclusions using critical thinking (no more than 150 words) in your Jupyter notebook submission.

a. Summarize your findings for each task.

b. For Task 4, discuss whether there is any problem with that experimental design. If there is, what is it? How may you further improve it so that the experimental results are more reliable? 




Copyright © 2009-2023 UrgentHomework.com. All rights reserved.