Write a technical report responding to the following questions and how you are g
Question
Write a technical report that answers the following questions and explains how you will handle each one as you prepare your data to be mined for your term project. What measures are you going to apply to your data, and what actions will you take to decide on the next steps and the activities associated with them?
• How are you going to clean up your data? (Data Cleansing)
• How are you going to express data variables? (Data Transformation)
• How are you going to handle missing values? (Data Imputation)
• Are all variables treated the same? (Data Weighting and Balancing)
• What are you going to do about outliers and other unwanted data? (Data Filtering)
• How are you going to handle temporal (time-series) data? (Data Abstraction)
• Can you reduce the amount of data to use? (Data Reduction)
  - Records? (Data Sampling)
  - Variables? (Dimensionality Reduction)
  - Values? (Data Discretization)
• Do you think you are going to need to create some new variables? If not, why not? (Data Derivation)
You may adapt this homework to update your project proposal/paper.
Explanation / Answer
Remember that the quality of your inputs determines the quality of your output. So, once your business hypothesis is ready, it makes sense to spend a lot of time and effort on this stage. By my personal estimate, data exploration, cleaning, and preparation can take up to 70% of your total project time.
Below are the steps involved in understanding, cleaning, and preparing your data for building your predictive model:
Finally, we will need to iterate over steps 4 – 7 multiple times before we come up with our refined model.
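To make a few of the preparation tasks named in the question more concrete, here is a minimal sketch in plain Python covering imputation, outlier filtering, transformation, and discretization. It is an illustration under stated assumptions, not a prescribed method: the `ages` column and all function names are hypothetical, and median imputation, the 1.5 x IQR rule, min-max scaling, and equal-width binning are just common default choices that may not suit your data.

```python
from statistics import median, quantiles


def impute_median(values):
    """Data imputation: replace None entries with the median of observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def filter_outliers_iqr(values, k=1.5):
    """Data filtering: keep values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def min_max_scale(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(scaled_values, bins=3):
    """Data discretization: equal-width bins over [0, 1], labelled 0..bins-1."""
    return [min(int(v * bins), bins - 1) for v in scaled_values]


# Hypothetical example column: one missing value and one implausible age.
ages = [23, 25, None, 31, 29, 27, 120, 24]

cleaned = impute_median(ages)            # None is replaced by the median
filtered = filter_outliers_iqr(cleaned)  # 120 falls outside the IQR fence
scaled = min_max_scale(filtered)
binned = discretize(scaled)

print(cleaned)
print(filtered)
print(binned)
```

Whatever techniques you choose, state in the report why each one is appropriate for your variables. For example, median imputation is robust to skew, but it hides the fact that a value was missing unless you also keep an indicator for it.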