1) Download the dataset of handwritten digits collected by USPS and divide them
Question
1) Download the dataset of handwritten digits collected by USPS and divide them into two sets of 5500 images: the training set and the testing set. The goal of this project is to develop and test methods for classification of the handwritten data, which are optimized on a new representation of the dataset and then applied to the testing set to assess the performance of the developed methodology. (With the permission of the instructor, you can use some other dataset of comparable size and difficulty to the handwritten digits data.) (MATLAB)
2) Find and describe a method to represent the digit images (or elements from your dataset) as vectors.
3) Use the nearest neighbors classification scheme, or any other suitable classifier that you may already know, to verify the success rate of your classifications applied to the original vectorized data. Optimize the parameters (including, but not limited to, the selected metric) to maximize the global success rate.
4) Next, use a data representation based on eigenvectors of the Graph Laplacian for this dataset, to find the best possible representation of the handwritten digits dataset, for the purposes of classification and identification of those digits. The project work includes weight and neighborhood selection, parameter optimization for the Graph Laplacian representation, as well as optimization of digit classifications.
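Steps 2) and 3) can be illustrated with a minimal sketch. It is written in plain Python rather than MATLAB, purely for illustration, and it makes several assumptions that are not part of the question: each image is a small grid of pixel intensities, the vector representation is a simple row-by-row flattening, the metric is a Minkowski distance with exponent `p`, and the helper names (`vectorize`, `minkowski`, `knn_predict`, `success_rate`) and the toy 2x2 "images" are hypothetical.

```python
from collections import Counter


def vectorize(image):
    """Flatten a 2-D grid of pixel intensities (a list of rows) into one vector."""
    return [float(pixel) for row in image for pixel in row]


def minkowski(u, v, p=2):
    """Minkowski distance: p=1 is the Manhattan metric, p=2 the Euclidean metric."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def knn_predict(train_X, train_y, x, k=3, p=2):
    """Label x by majority vote among its k nearest training vectors."""
    order = sorted(range(len(train_X)), key=lambda i: minkowski(train_X[i], x, p))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def success_rate(train_X, train_y, test_X, test_y, k=3, p=2):
    """Fraction of test vectors whose predicted label matches the true label."""
    hits = sum(knn_predict(train_X, train_y, x, k, p) == y
               for x, y in zip(test_X, test_y))
    return hits / len(test_X)


# Toy data (hypothetical): class 0 is bright in the top row, class 1 in the bottom row.
train_images = [[[1, 1], [0, 0]], [[0.9, 1], [0.1, 0]],
                [[0, 0], [1, 1]], [[0, 0.1], [1, 0.9]]]
train_y = [0, 0, 1, 1]
test_images = [[[1, 0.8], [0, 0.1]], [[0.1, 0], [0.9, 1]]]
test_y = [0, 1]

train_X = [vectorize(img) for img in train_images]
test_X = [vectorize(img) for img in test_images]

# Small grid search over the neighborhood size k and the metric exponent p.
grid = [(k, p) for k in (1, 3) for p in (1, 2)]
best_k, best_p = max(grid, key=lambda kp: success_rate(train_X, train_y,
                                                       test_X, test_y, *kp))
print(best_k, best_p, success_rate(train_X, train_y, test_X, test_y, best_k, best_p))
```

In a real experiment the grid search would normally be scored on a held-out part of the training set, so that the testing set is used only once for the final success rate.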
Explanation / Answer
Use a data representation based on eigenvectors of the Graph Laplacian for this dataset, to find the best possible representation of the handwritten digits dataset, for the purposes of classification and identification of those digits. The project work includes weight and neighborhood selection, parameter optimization for the Graph Laplacian representation, as well as optimization of digit classifications.
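One common way to build such a representation is the Laplacian eigenmaps construction; whether it is the variant intended in the course is an assumption. The sketch below, in plain Python for illustration rather than MATLAB, builds a symmetric k-nearest-neighbor graph with heat-kernel weights exp(-d^2 / t), forms L = D - W, and approximates the smallest nontrivial solutions of L f = lambda D f. The function names, the parameters `k`, `t`, and `dim`, the toy points, and the use of a simple orthogonal iteration in place of a library eigensolver are all illustrative choices, not something taken from the question.

```python
import math
import random


def knn_heat_weights(X, k, t):
    """Symmetric k-nearest-neighbor graph with heat-kernel weights exp(-d^2 / t)."""
    n = len(X)
    d2 = [[sum((a - b) ** 2 for a, b in zip(X[i], X[j])) for j in range(n)]
          for i in range(n)]
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        nearest = sorted((j for j in range(n) if j != i), key=lambda j: d2[i][j])[:k]
        for j in nearest:
            # Connect i and j if either one is among the other's k nearest neighbors.
            W[i][j] = W[j][i] = math.exp(-d2[i][j] / t)
    return W


def laplacian_embedding(W, dim, iters=500, seed=0):
    """Approximate the dim smallest nontrivial solutions of L f = lambda D f.

    Uses orthogonal iteration on M = I + D^{-1/2} W D^{-1/2}, whose largest
    eigenvalues correspond to the smallest eigenvalues of the normalized Laplacian.
    """
    n = len(W)
    s = [math.sqrt(sum(row)) for row in W]  # square roots of the degrees
    M = [[(1.0 if i == j else 0.0) + W[i][j] / (s[i] * s[j]) for j in range(n)]
         for i in range(n)]
    rng = random.Random(seed)
    # The first vector is the known trivial eigenvector D^{1/2} 1; the rest start random.
    Q = [s[:]] + [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(dim)]
    for _ in range(iters):
        Q = [[sum(M[i][j] * q[j] for j in range(n)) for i in range(n)] for q in Q]
        # Gram-Schmidt keeps the vectors orthonormal and removes the trivial direction.
        for a in range(len(Q)):
            for b in range(a):
                dot = sum(x * y for x, y in zip(Q[a], Q[b]))
                Q[a] = [x - dot * y for x, y in zip(Q[a], Q[b])]
            norm = math.sqrt(sum(x * x for x in Q[a]))
            Q[a] = [x / norm for x in Q[a]]
    # Map back with f = D^{-1/2} v; row i holds the new coordinates of point i.
    return [[Q[c][i] / s[i] for c in range(1, dim + 1)] for i in range(n)]


# Toy data (hypothetical): two well-separated clusters of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
embedding = laplacian_embedding(knn_heat_weights(points, k=2, t=2.0), dim=1)
for point, coords in zip(points, embedding):
    print(point, coords)
```

On this toy input the single embedding coordinate takes opposite signs on the two clusters, which is the property a nearest neighbors classifier can then exploit. For the full 5500-image set a dense pure-Python iteration would be far too slow; a sparse eigensolver such as MATLAB's `eigs` would be the natural replacement, with `k`, `t`, and `dim` tuned alongside the classifier parameters.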