Thursday, June 6, 2019

Data Preprocessing Essay Example for Free

Today's real-world databases are highly susceptible to noisy, missing, and inconsistent data due to their typically huge size (often several gigabytes or more) and their likely origin from multiple, heterogeneous sources. Low-quality data will lead to low-quality mining results. How can the data be preprocessed in order to help improve the quality of the data and, consequently, of the mining results? How can the data be preprocessed so as to improve the efficiency and ease of the mining process?

There are several data preprocessing techniques. Data cleaning can be applied to remove noise and correct inconsistencies in data. Data integration merges data from multiple sources into a coherent data store such as a data warehouse. Data reduction can reduce data size by, for instance, aggregating, eliminating redundant features, or clustering. Data transformations (e.g., normalization) may be applied, where data are scaled to fall within a smaller range like 0.0 to 1.0. This can improve the accuracy and efficiency of mining algorithms involving distance measurements. These techniques are not mutually exclusive; they may work together. For example, data cleaning can involve transformations to correct wrong data, such as by transforming all entries for a date field to a common format.

In Chapter 2, we learned about the different attribute types and how to use basic statistical descriptions to study data characteristics. These can help identify erroneous values and outliers, which will be useful in the data cleaning and integration steps. Data preprocessing techniques, when applied before mining, can substantially improve the overall quality of the patterns mined and/or the time required for the actual mining.
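Two of the steps mentioned above, scaling values into the range 0.0 to 1.0 and converting all entries of a date field to a common format, can be illustrated with a short Python sketch. This is only a minimal example under stated assumptions, not a definitive implementation: the function names `min_max_normalize` and `to_common_date_format`, the list of accepted date formats, and the sample values are all hypothetical and chosen for illustration. Min-max scaling is one common form of normalization; the essay does not say which form is meant.

```python
from datetime import datetime


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale numbers into [new_min, new_max] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to scale;
        # map everything to the lower bound instead of dividing by zero.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


# Assumed (hypothetical) input formats that may appear in the date field.
KNOWN_DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def to_common_date_format(text):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # Try the next candidate format.
    return None  # Left for manual inspection rather than guessed.


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    print(min_max_normalize([0, 5, 10]))
    print([to_common_date_format(d) for d in ["2019-06-06", "06/06/2019", "20190606", "n/a"]])
```

Returning `None` for an unparseable date, rather than guessing, keeps erroneous values visible so they can be handled in the cleaning step.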
