Data Mining is an important analytic process designed to explore data. Much like the real-life process of mining diamonds or gold from the earth, the most important task in data mining is to extract non-trivial nuggets from large amounts of data.
The tasks of data mining are twofold: create predictive power—using features to predict unknown or future values of the same or other feature—and create a descriptive power—find interesting, human-interpretable patterns that describe the data. In this post, we’ll cover four data mining techniques:
- Regression (predictive)
- Association Rule Discovery (descriptive)
- Classification (predictive)
- Clustering (descriptive)
Data mining is the practice of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis. Data mining uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events. Data mining is also known as Knowledge Discovery in Data (KDD).
The key properties of data mining are:
- Automatic discovery of patterns
- Prediction of likely outcomes
- Creation of actionable information
- Focus on large data sets and databases