7022DATSCI: Mini-projects
Master of Sensors Data and Management
Big Data Analysis
Instructions:
The aim of the Big Data Analysis project is to apply a machine learning method in a practical setting. In each of the following projects you are asked to...
You will work on your projects in groups of 3-5 students. Additional topics might become available, and you can also suggest alternative topics. The following list contains suggestions for project topics:
Apply principal component analysis to the MNIST data set (http://yann.lecun.com/exdb/mnist/) to recognise handwritten digits, as explained by Lu (2017) but without the pre-processing using Histograms of Oriented Gradients (HOG).
Implement the variant of the PageRank algorithm described by Allesina and Pascual (2009) and reproduce the study for some of the food webs from this article. Note that some of the food webs are available in R by installing the cheddar package.
A highly original application of Markov chain Monte Carlo (MCMC) was presented by Diaconis (2009) and extended by Chen and Rosenthal (2012). Implement and test the approach by reproducing the example described by Diaconis (2009).
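For the MCMC topic, the following is a minimal sketch of a random-walk Metropolis sampler over substitution keys, which may help as a starting point. It is not the implementation from Diaconis (2009) or Chen and Rosenthal (2012). The function names (`bigram_log_probs`, `log_plausibility`, `apply_key`, `metropolis_decrypt`), the 27-symbol alphabet, the add-one smoothing and the default number of steps are all assumptions made for illustration. The scoring idea, as we understand the example, is to rate a candidate key by how plausible the decoded text looks under symbol-to-symbol transition frequencies estimated from a reference text.

```python
import math
import random
import string
from collections import Counter

# Assumption: the text uses only lowercase letters and the space character.
ALPHABET = string.ascii_lowercase + " "


def bigram_log_probs(reference_text):
    """Estimate log transition probabilities between consecutive symbols."""
    symbols = [c for c in reference_text.lower() if c in ALPHABET]
    pair_counts = Counter(zip(symbols, symbols[1:]))
    first_counts = Counter(symbols[:-1])
    size = len(ALPHABET)
    # Add-one smoothing (our choice) keeps unseen pairs at a finite log-probability.
    return {
        (a, b): math.log((pair_counts[(a, b)] + 1) / (first_counts[a] + size))
        for a in ALPHABET
        for b in ALPHABET
    }


def log_plausibility(text, log_probs):
    """Sum of log transition probabilities over consecutive symbol pairs."""
    return sum(log_probs[pair] for pair in zip(text, text[1:]))


def apply_key(cipher_text, key):
    """Map every cipher symbol through the candidate substitution key."""
    return "".join(key[c] for c in cipher_text)


def metropolis_decrypt(cipher_text, log_probs, n_steps=10000, seed=0):
    """Random-walk Metropolis over substitution keys; returns the best key seen."""
    rng = random.Random(seed)
    key = {c: c for c in ALPHABET}  # start from the identity key
    current = log_plausibility(apply_key(cipher_text, key), log_probs)
    best_key, best_score = dict(key), current
    for _ in range(n_steps):
        # Propose swapping the images of two randomly chosen symbols.
        a, b = rng.sample(ALPHABET, 2)
        proposal = dict(key)
        proposal[a], proposal[b] = key[b], key[a]
        score = log_plausibility(apply_key(cipher_text, proposal), log_probs)
        # Accept uphill moves always, downhill moves with the plausibility ratio.
        if score >= current or rng.random() < math.exp(score - current):
            key, current = proposal, score
            if current > best_score:
                best_key, best_score = dict(key), current
    return best_key, best_score
```

Calling `metropolis_decrypt(cipher_text, bigram_log_probs(reference_text))` returns a dictionary from cipher symbols to candidate plain symbols together with its log-plausibility. The cipher text must already be restricted to the assumed alphabet. How long a reference text and how many steps are needed for a readable decoding is something to investigate in the project itself.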
References:
Allesina, S., Pascual, M., 2009. Googling food webs: Can an eigenvector measure species’ importance for coextinctions? PLOS Computational Biology 5 (9), 1–6.
URL https://doi.org/10.1371/journal.pcbi.1000494
Chen, J., Rosenthal, J., 2012. Decrypting classical cipher text using Markov chain Monte Carlo. Statistics and Computing 22, 397–413.
URL https://doi.org/10.1007/s11222-011-9232-5
Diaconis, P., 2009. The Markov Chain Monte Carlo Revolution. Bulletin of the American Mathematical Society 46 (2), 179–205.
Lu, W., 2017. Handwritten digits recognition using PCA of histogram of oriented gradient. In: 2017 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM). pp. 1–5.
Important! All group members will receive the same mark for the PowerPoint presentation. The one-page summary and the code demonstration will be marked individually.

| Presentation/One-page summary | Partial mark |
| --- | --- |
| Introduction: Brief description of your application. Motivation: Which challenge are you going to address? | 5% |
| Implementation: What are the challenges of implementing the algorithm? Explain how you implemented the method. | 15% |
| Results: What have you found out about your data set? Show how your machine learning method addresses the challenge described in the Introduction. | 10% |
| Discussion: Brief summary of the analysis of the data. Critically reflect on how well the challenge described in the Introduction was solved by your machine learning approach. | 10% |
| Formal marks: Visual presentation, delivery of the talk, time keeping | 10% |
| Total | 50% |

| Source code (submitted to Canvas and demonstration) | Partial mark |
| --- | --- |
| Completeness of the implementation | 20% |
| Demonstration | 10% |
| Clarity of the code | 10% |
| Quality of comments | 10% |
| Total | 50% |