Some key terms in data mining

Outlier mining - A data mining task that aims to find a specified number of objects that are considerably dissimilar, exceptional, or inconsistent with respect to the majority of records in the input database.

Subspace - A combination of features (attributes) of a database.


Outlying subspaces - An outlying subspace of a point is a subspace (a subset of features) in which the point is considerably dissimilar, exceptional, or inconsistent with respect to the remaining population in the database.
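
To make the definition concrete, here is a minimal sketch that lists the outlying subspaces of one point. It assumes a particular outlier-ness measure, the sum of Euclidean distances to the k nearest neighbours, and a user-chosen threshold; the definition above does not prescribe either, and the names `knn_distance_sum`, `outlying_subspaces`, `k` and `threshold` are hypothetical.

```python
from itertools import combinations
import math

def knn_distance_sum(data, idx, subspace, k):
    """Sum of distances from data[idx] to its k nearest neighbours,
    measured only on the feature indices in `subspace` (assumed measure)."""
    p = data[idx]
    dists = sorted(
        math.sqrt(sum((p[f] - q[f]) ** 2 for f in subspace))
        for j, q in enumerate(data) if j != idx
    )
    return sum(dists[:k])

def outlying_subspaces(data, idx, k, threshold):
    """Return every non-empty subspace in which data[idx] scores at least `threshold`."""
    n_features = len(data[0])
    result = []
    for size in range(1, n_features + 1):
        for subspace in combinations(range(n_features), size):
            if knn_distance_sum(data, idx, subspace, k) >= threshold:
                result.append(subspace)
    return result

# Toy data: the last point is ordinary in feature 0 but extreme in feature 1.
data = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.2), (1.2, 1.1), (1.0, 5.0)]
print(outlying_subspaces(data, idx=4, k=2, threshold=1.0))
```

This brute-force version checks every subspace, so its cost grows exponentially with the number of features; it is meant only to illustrate the definition.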

Genetic algorithm - A genetic algorithm (GA) is a search technique used in computer science to find approximate solutions to optimization and search problems. It evolves a population of candidate solutions through selection, crossover, and mutation.
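
As an illustration, the following is a small sketch of a genetic algorithm over bit strings. The operator choices (tournament selection of size 2, one-point crossover, bit-flip mutation, keeping the best individual each generation) and all parameter values are assumptions made for this example, not a prescribed design. The toy fitness function simply counts the 1-bits.

```python
import random

def genetic_algorithm(fitness, n_bits, pop_size=20, generations=30,
                      mutation_rate=0.05, seed=0):
    """Maximise `fitness` over bit lists of length n_bits (n_bits >= 2)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        best = max(pop, key=fitness)
        history.append(fitness(best))
        next_pop = [best[:]]  # elitism: carry the best individual over unchanged
        while len(next_pop) < pop_size:
            # Tournament selection: the fitter of two random individuals.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # One-point crossover.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history

# Toy problem: maximise the number of 1-bits.
best, history = genetic_algorithm(sum, n_bits=12)
print(best, history[-1])
```

In a subspace search, each bit could mark whether a feature belongs to the candidate subspace, with the fitness function replaced by an outlier-ness score; that mapping is one possible encoding, not the only one.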

Space lattice - A space lattice is a lattice that contains all possible subspaces of a data space. Each subspace in the lattice is represented as the combination of features that make it up.
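
A short sketch of how such a lattice could be enumerated, grouping subspaces by their number of features. The feature names are made up for illustration, and leaving out the empty subspace is an assumption of this example.

```python
from itertools import combinations

def subspace_lattice(features):
    """Map each dimensionality k to the list of k-feature subspaces."""
    levels = {}
    for k in range(1, len(features) + 1):
        levels[k] = [frozenset(c) for c in combinations(features, k)]
    return levels

# Hypothetical feature names, used only for illustration.
lattice = subspace_lattice(["age", "income", "height"])
for k, subspaces in lattice.items():
    print(k, [sorted(s) for s in subspaces])
```

A data space with d features has 2^d - 1 non-empty subspaces, which is why exhaustive search over the lattice becomes impractical as d grows.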

Random sampling - Random sampling is a sampling technique in which a group of subjects (a sample) is selected for study from a larger group (a population). Each individual is chosen entirely by chance, and each member of the population has a known, but possibly unequal, chance of being included in the sample.
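
The two cases in this definition, equal and unequal chances, can be sketched with Python's standard `random` module. The population, the weights and the sample size below are arbitrary values chosen for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = list(range(100))

# Simple random sample: every member has the same chance, and no member repeats.
equal_sample = rng.sample(population, k=10)

# Known but unequal chances: members below 50 get twice the weight of the rest.
# Note that choices() draws WITH replacement, so a member may appear more than once.
weights = [2 if x < 50 else 1 for x in population]
weighted_sample = rng.choices(population, weights=weights, k=10)

print(equal_sample)
print(weighted_sample)
```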

Example-based outlier mining - Given a set of outlier examples, find more outliers in the dataset that exhibit outlier-ness similar to that of the given examples.

Original article: https://www.cnblogs.com/johnpher/p/2582160.html