Process of knowledge discovery in databases

Data mining is an integral part of knowledge discovery in databases (KDD), which is the overall process of converting raw data into useful information.

The process of knowledge discovery in databases:

Input Data

-> Data Preprocessing (Feature Selection, Dimensionality Reduction, Normalization, Data Subsetting) (the most laborious and time-consuming task)

-> Data Mining

-> Postprocessing (Filtering Patterns, Visualization, Pattern Interpretation)

-> Information
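
The flow above can be sketched as three small functions chained together. This is only a toy illustration: the function names (preprocess, mine, postprocess, kdd), the shopping-basket data, and the choice of item counting as the "mining" step are all assumptions made for the example, not part of the original notes.

```python
from collections import Counter

def preprocess(raw_records):
    """Normalize item names and drop empty records (data subsetting)."""
    cleaned = []
    for basket in raw_records:
        # Normalization: trim whitespace and lowercase; the set removes repeats.
        items = {item.strip().lower() for item in basket if item.strip()}
        if items:
            cleaned.append(items)
    return cleaned

def mine(baskets):
    """Toy mining step: count how many baskets contain each item."""
    counts = Counter()
    for basket in baskets:
        counts.update(basket)
    return counts

def postprocess(counts, n_baskets, min_support=0.5):
    """Filter patterns: keep only items whose support reaches min_support."""
    return {
        item: count / n_baskets
        for item, count in counts.items()
        if count / n_baskets >= min_support
    }

def kdd(raw_records, min_support=0.5):
    """Input data -> preprocessing -> data mining -> postprocessing -> information."""
    baskets = preprocess(raw_records)
    counts = mine(baskets)
    return postprocess(counts, len(baskets), min_support)

# Hypothetical raw input with inconsistent casing, stray spaces and an empty record.
raw = [["Milk ", "bread"], ["milk", "Eggs"], [], ["bread", "MILK", "milk"]]
info = kdd(raw)
```

Here info maps each sufficiently frequent item to its support, that is, the fraction of non-empty baskets that contain it.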

The purpose of preprocessing is to transform the raw input data into an appropriate format for the subsequent data mining step.

Steps involved in data preprocessing:

1. fusing data from multiple sources;

2. cleaning data to remove noise and duplicate observations;

3. selecting records and features that are relevant to the data mining task at hand.
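
The three steps can be sketched in plain Python as follows. Everything here is hypothetical and chosen only for illustration: the two "store" sources, the field names (id, age, city, spend), and the rule that an age outside 0..120 counts as noise.

```python
def fuse(*sources):
    """Step 1: fuse records from several sources into one list."""
    return [dict(record) for source in sources for record in source]

def clean(records, min_age=0, max_age=120):
    """Step 2: drop exact duplicate observations and records with noisy ages."""
    seen = set()
    cleaned = []
    for record in records:
        key = tuple(sorted(record.items()))  # hashable fingerprint of the record
        if key in seen:
            continue  # duplicate observation
        seen.add(key)
        age = record.get("age")
        if age is None or not (min_age <= age <= max_age):
            continue  # treated as noise under the assumed rule
        cleaned.append(record)
    return cleaned

def select(records, features, keep):
    """Step 3: keep only the relevant records and the relevant features."""
    return [{f: r[f] for f in features} for r in records if keep(r)]

# Hypothetical data from two sources with the same schema.
store_a = [
    {"id": 1, "age": 34, "city": "Oslo", "spend": 120},
    {"id": 2, "age": -5, "city": "Rome", "spend": 80},   # noisy age
]
store_b = [
    {"id": 1, "age": 34, "city": "Oslo", "spend": 120},  # duplicate of store_a
    {"id": 3, "age": 51, "city": "Lima", "spend": 0},
    {"id": 4, "age": 27, "city": "Kyiv", "spend": 45},
]

fused = fuse(store_a, store_b)
cleaned = clean(fused)
prepared = select(cleaned, features=["age", "spend"], keep=lambda r: r["spend"] > 0)
```

After these steps, prepared holds only the age and spend features of the customers who actually spent something, which is the kind of table a mining algorithm would take as input.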

Original article: https://www.cnblogs.com/johnpher/p/2866950.html