Data Mining

Author: Janet Peter
Posted: Mar 18, 2019

Introduction

Data mining is the process of analyzing data from different perspectives and summarizing it into useful information. Unlike the raw data, the information obtained should be usable to increase revenues and cut costs. Data mining draws on several analytical tools, among them data mining software. The software allows users to analyze data along diverse dimensions, group it, and summarize the relationships within it. The aim of the entire process is to find correlations among the different types of data held in relational databases. Data mining helps to extract information from a data set and transform it into another structure that is easy to use (Rouse, 2015). It is a broad field that also takes in data management, data pre-processing, model considerations, visualization, and online updating. However, the principal data mining task is the analysis of large quantities of data to extract useful information in the form of patterns, records, and dependencies. The ultimate payoff of data mining is prediction, which involves applying the detected patterns to new subsets of data.

Data mining is in the initial stages of use in various applications, but many industries have already adopted it, among them retail, finance, healthcare, manufacturing, transportation, and aerospace. These industries use data mining tools and techniques to take advantage of the historical data available in their databases. The techniques used include pattern recognition technologies and statistical and mathematical methods that help to extract warehoused information. They help analysts to identify significant facts, relationships, trends, exceptions, and anomalies in the data that would otherwise go unnoticed.

In business, data mining helps to discover relationships in the data that make decisions easier. The data mining technologies in use assist business owners in spotting sales trends, running better marketing campaigns, and predicting customer loyalty. Data mining also has specific applications. In market segmentation, it helps to identify the shared characteristics of clients who tend to buy the same product. It is also useful in fraud detection, where the user can identify the transactions that are most likely to be fraudulent (Furnas, 2012).

The basis of data mining

Data mining approaches arise from a long process of research and product development, and the techniques used have been researched extensively. Initially, many businesses relied on computers only for data storage. Later developments made data easy to access, and recent technologies allow data navigation in real time. Data mining goes a step further, moving from improved data access and navigation to the prospective and proactive delivery of information. Three technologies support the application of data mining techniques in business: massive data collection, powerful multiprocessor computers, and data mining algorithms (Han & Kamber, 2000). Many commercial databases are available in the market, and they are growing rapidly. The accompanying need for improved computational engines is met by parallel multiprocessor computer technology. Data mining algorithms are better suited to analyzing data on this scale than conventional statistical methods.

The major processes involved in converting business data into business information build upon one another. They are data collection, data access, data warehousing and decision support, and data mining. Data collection, the earliest step, was retrospective and delivered static data. The next step was data access, which was retrospective and delivered dynamic data at the record level. The next evolution in data handling was data warehousing and decision support, which is retrospective and delivers dynamic data at multiple levels. The most recent development is data mining, which is still in its infancy. It is prospective and delivers information proactively. Thus, data mining techniques make it possible for businesses to plan for the future and anticipate their level of growth.

Scope of data mining

Data mining derives its name from the similarity between searching for valuable business information in a huge database and mining for something valuable. The process involves sifting through a large amount of material and probing it intensively to find exactly the information required. Data mining enables businesses to generate new business opportunities in various ways. The first is the automated prediction of market trends and behavior. Data mining techniques automate the process of finding prospective information in databases (Berry & Linoff, 2000). Traditional data analysis involved many hands-on procedures that were slow to answer certain questions about the business. Current data mining technologies help to solve problems such as targeted marketing: they use past promotional data to identify the targets with the highest probability of maximizing the return on investment (Falavi & Abdoli, 2015). The techniques also enable business executives to predict problems that may arise, for example by forecasting bankruptcy and other forms of shortfall.

The second business opportunity is the automated discovery of previously unknown patterns. Data mining tools sift through databases and identify previously hidden patterns in a single step. A business can analyze retail sales data to identify seemingly unrelated products that are often purchased together. Data mining tools can also detect fraudulent transactions from patterns in the transaction data and identify data entry keying errors in credit card transactions.

Data mining tools and techniques bring the benefits of automation to existing hardware and software platforms. The tools can be implemented on existing systems as they are upgraded, or on new systems. Data mining techniques make it easy to experiment with different data models to gain a comprehensive understanding of complex data. Their high speed makes it possible for users to analyze large quantities of data within a short time. The result is better-informed predictions.

Key properties of data mining

Data mining is characterized by the automatic discovery of patterns, the prediction of likely outcomes, the creation of actionable information, and a focus on large data sets and databases. Through automatic discovery, data mining provides solutions to problems that are difficult to address with conventional techniques. Data mining tools build models by applying an algorithm to a set of data, and automatic discovery is the execution of those models. Most models can also be applied to new data beyond the data on which they were built (Hand, Mannila & Smyth, 2001).

Data mining processes are predictive in nature. The predictions give the likely outcomes of an investment or a trend in business activities, and they give the user confidence in relying on the information generated by data mining. Data mining predictions apply rules and support. A rule is a condition that implies a given outcome, and its support is the portion of the available data that satisfies the rule. Data mining tools also enable the grouping of data within a population, and the groupings guide decision-making about that population. The information generated through data mining techniques is actionable (Hand, Mannila & Smyth, 2001). Thus, the user can take specific actions depending on the outcomes of the data mining process.
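The idea of a rule and its support can be made concrete with a small sketch. It assumes the common convention that support is the fraction of all records in which both the condition and the outcome hold; it also reports confidence (the fraction of records matching the condition in which the outcome holds), a measure the text above does not define. The customer records, field names, and threshold are hypothetical.

```python
# Hypothetical customer records; field names are illustrative assumptions.
customers = [
    {"visits_per_month": 6, "renewed": True},
    {"visits_per_month": 5, "renewed": True},
    {"visits_per_month": 1, "renewed": False},
    {"visits_per_month": 4, "renewed": False},
    {"visits_per_month": 2, "renewed": False},
]

def rule_condition(record):
    """The 'if' part of the rule: the customer is a frequent visitor."""
    return record["visits_per_month"] >= 4

def rule_outcome(record):
    """The 'then' part of the rule: the customer renewed."""
    return record["renewed"]

def support_and_confidence(records, condition, outcome):
    """Return (support, confidence) for the rule 'condition -> outcome'."""
    matching = [r for r in records if condition(r)]
    satisfied = [r for r in matching if outcome(r)]
    support = len(satisfied) / len(records)
    confidence = len(satisfied) / len(matching) if matching else 0.0
    return support, confidence

support, confidence = support_and_confidence(customers, rule_condition, rule_outcome)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule with high confidence but very low support rests on only a handful of records, which is one reason the two figures are usually read together.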

Data mining has a close relationship with online analytical processing (OLAP). OLAP provides fast analysis of shared multidimensional data and supports activities such as data summarization, cost allocation, and time-series analysis. OLAP systems lack the ability to perform inductive inference, the process of drawing general conclusions from specific observations. Inductive inference, however, is a characteristic of data mining. OLAP and data mining can be integrated in various ways. Thus, the two work together in a complementary way.
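As a rough illustration of the summarization side of OLAP, the sketch below totals a measure over chosen dimensions of a tiny fact table. It is not how an OLAP engine is implemented; the sales records, the dimension names, and the roll_up helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions and one measure.
sales = [
    {"region": "north", "product": "tea", "quarter": "Q1", "revenue": 120.0},
    {"region": "north", "product": "coffee", "quarter": "Q1", "revenue": 200.0},
    {"region": "south", "product": "tea", "quarter": "Q1", "revenue": 80.0},
    {"region": "south", "product": "tea", "quarter": "Q2", "revenue": 95.0},
    {"region": "north", "product": "tea", "quarter": "Q2", "revenue": 130.0},
]

def roll_up(facts, dimensions, measure):
    """Sum a measure over every combination of the chosen dimensions."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)

by_region = roll_up(sales, ["region"], "revenue")
by_product_quarter = roll_up(sales, ["product", "quarter"], "revenue")
print(by_region)
print(by_product_quarter)
```

Summaries like these describe what is already in the data. Inferring a rule that should hold for records not yet seen is the part left to data mining.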

Key techniques in data mining

Several techniques are in use, each describing a type of mining and data recovery operation. Association is a data mining technique that draws simple correlations between items with the aim of identifying patterns (Han, Kamber & Pei, 2006). In a business application of the technique, the user can track customer behavior through the products that customers buy. One might observe that whenever a customer buys a particular product, a second product usually accompanies it. The business owner thus learns the association between the two products from the buying behavior of the customers.
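A minimal sketch of the association idea, assuming each purchase is recorded as a set of items: count how often every pair of items appears in the same basket and keep the pairs that reach a chosen count. The baskets and the min_count threshold are hypothetical, and practical association mining uses more scalable algorithms than this exhaustive pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (illustrative data only).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
    {"milk", "bread"},
]

def frequent_pairs(transactions, min_count):
    """Count how often each pair of items appears in the same basket."""
    pair_counts = Counter()
    for basket in transactions:
        # Sorting gives every pair a canonical order, e.g. ("bread", "butter").
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}

pairs = frequent_pairs(baskets, min_count=3)
print(pairs)
```

With this data, bread and butter is the only pair bought together at least three times, which is the kind of association a retailer might act on.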

Data mining also applies the classification technique, in which the user describes several attributes that help to identify a particular class. It is thus possible to predict the type of a customer and the item that customer is likely to buy. Products can also be classified according to their attributes. Classification can serve as an input to other techniques or as a result of them.
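One simple way to classify by attributes is a nearest-centroid classifier, sketched below. It is offered as one possible illustration, not as the method the cited authors describe: each class is summarized by the average of its members' attributes, and a new customer receives the class whose average is closest. The attributes (age, yearly spend), the labels, and the data are hypothetical.

```python
import math

# Hypothetical labelled customers: (age, yearly_spend) -> class label.
training = [
    ((22, 300.0), "occasional"),
    ((25, 350.0), "occasional"),
    ((41, 1200.0), "frequent"),
    ((45, 1500.0), "frequent"),
]

def class_centroids(examples):
    """Average the attribute vectors of each class."""
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}

def classify(features, centroids):
    """Assign the class whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

centroids = class_centroids(training)
print(classify((43, 1300.0), centroids))
```

The attributes here are left unscaled, so yearly spend dominates the distance. In practice the attributes would usually be normalized first.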

Clustering is another technique; it groups individual pieces of data to form a structure. Unlike classification, clustering needs no predefined classes, although the two techniques are often used together. The technique assists in identifying distinct groups of information by correlating each piece with the others. Prediction is also a technique in data mining, and a broad area of application in different industries. It ranges from predicting the failure of certain tools and identifying fraud to forecasting business profits (Han, Kamber & Pei, 2006). The technique is applicable in combination with other data mining techniques, and it is comprehensive, incorporating trend analysis, classification, pattern matching, and the formation of relationships. The analysis of past events is a major step towards making predictions about a future event.
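The grouping that clustering performs can be sketched with a very small k-means routine on one-dimensional data. K-means is one widely used clustering algorithm, chosen here for illustration; the text does not name a specific one. The purchase amounts and starting centers are hypothetical.

```python
def k_means_1d(values, centers, iterations=10):
    """Very small k-means for one-dimensional data with given starting centers."""
    centers = list(centers)
    groups = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break  # Centers stopped moving, so the grouping is stable.
        centers = new_centers
    return centers, groups

# Hypothetical purchase amounts; the starting centers are arbitrary guesses.
amounts = [10, 12, 11, 50, 52, 49]
centers, groups = k_means_1d(amounts, centers=[0, 60])
print(centers, groups)
```

No labels are supplied. The two groups of small and large purchases emerge from the data alone.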

Sequential patterns are a technique useful for identifying trends and the regular occurrence of events (Muthuselvan & Sundaram, 2015). The technique examines the frequency and history of a particular set of data to build the patterns. A decision tree is another data mining technique, closely associated with classification and prediction. It starts with a simple question that can have two or more answers. Each answer leads to a further question, which helps to classify the data into categories. Decision trees help to define the type of information, and they can be used within predictive systems. The predictions rest on past experience, which helps to create the structure of the decision tree and its likely results (Han, Kamber & Pei, 2006).
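The question-and-answer structure of a decision tree can be sketched as nested nodes. The tree below is written by hand purely for illustration, whereas a data mining tool would derive the questions and thresholds from historical data. The field names, thresholds, and class labels are hypothetical.

```python
# A hand-built decision tree: each inner node asks a question, each leaf is a class.
# The questions, thresholds, and labels are hypothetical.
tree = {
    "question": lambda c: c["purchases_last_year"] >= 5,
    "yes": {
        "question": lambda c: c["avg_basket"] >= 50.0,
        "yes": "high value",
        "no": "regular",
    },
    "no": "occasional",
}

def decide(node, record):
    """Follow yes/no answers from the root until a leaf label is reached."""
    while isinstance(node, dict):
        node = node["yes"] if node["question"](record) else node["no"]
    return node

print(decide(tree, {"purchases_last_year": 8, "avg_basket": 72.0}))
```

Each record follows a single path from the root to a leaf, so the sequence of answers doubles as an explanation of why the record received its class.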

Stages of data mining

The process of data mining follows three major stages: (1) initial exploration, (2) model building, and (3) deployment. Stage one involves data preparation, which may entail cleaning the data, transforming it, and selecting records. It also involves reducing the number of variables to a manageable level. Depending on the problem, this stage may range from simply choosing the predictors for a regression model to more elaborate methods of analysis. The procedures aim to identify the most relevant variables and to determine the nature of the models to consider.
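A minimal sketch of the first stage, under simplifying assumptions: cleaning means dropping records with missing values, and variable selection means keeping the variables whose correlation with the target is strong. The records, field names, and the 0.8 threshold are hypothetical, and real projects typically use richer cleaning and selection methods.

```python
import math

# Hypothetical raw records; None marks a missing value.
raw = [
    {"ad_spend": 10.0, "store_id": 3.0, "sales": 105.0},
    {"ad_spend": 20.0, "store_id": 1.0, "sales": 198.0},
    {"ad_spend": None, "store_id": 2.0, "sales": 150.0},
    {"ad_spend": 30.0, "store_id": 2.0, "sales": 310.0},
    {"ad_spend": 40.0, "store_id": 3.0, "sales": 395.0},
]

def clean(records):
    """Cleaning: drop records with any missing value."""
    return [r for r in records if all(v is not None for v in r.values())]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def select_predictors(records, target, threshold):
    """Selection: keep variables whose |correlation| with the target is high."""
    ys = [r[target] for r in records]
    names = [k for k in records[0] if k != target]
    return [n for n in names if abs(pearson([r[n] for r in records], ys)) >= threshold]

prepared = clean(raw)
predictors = select_predictors(prepared, target="sales", threshold=0.8)
print(len(prepared), predictors)
```

Here the record with the missing value is dropped and only ad_spend survives as a predictor, since store_id is an identifier with little relation to sales.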

The second stage involves building and validating models, and choosing the best one to use for prediction. Many models exist, but their predictive performance varies. Various techniques have been developed to help choose among them, and the process may involve comparing different models by observing their performance on the same data. Common techniques in predictive data mining include bagging, boosting, stacking, and meta-learning.
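Comparing models by their performance can be sketched as follows: fit each candidate on a training portion of the data, score it on a held-out validation portion, and keep the one with the lowest error. The two candidates (a constant baseline and a least-squares line) and the data are hypothetical; the sketch does not implement bagging, boosting, stacking, or meta-learning.

```python
def fit_mean(train):
    """Baseline model: always predict the average target seen in training."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg

def fit_line(train):
    """Least-squares straight line y = a + b*x."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    b = sum((x - mx) * (y - my) for x, y in train) / sum((x - mx) ** 2 for x, _ in train)
    a = my - b * mx
    return lambda x: a + b * x

def mean_squared_error(model, data):
    """Average squared difference between predictions and actual values."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical (x, y) observations, split into training and validation parts.
train = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
validation = [(5, 10.1), (6, 11.9)]

candidates = {"mean": fit_mean, "line": fit_line}
scores = {name: mean_squared_error(fit(train), validation) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

The selected model is the one that would then be applied to new data. Scoring on data the models did not see during fitting guards against choosing a model that merely memorized its training set.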

The last stage in data mining is deployment, which involves applying the selected model to a new set of data. The results generated help in making predictions of business patterns. Data mining is thus useful in business information management, where it reveals knowledge structures that guide decision-making.

Conclusion

Data mining techniques find uses in many areas thanks to advances in the underlying technologies. Data mining is a process that guides decision-making through data analysis and the use of the models generated. The process extracts important information from huge databases and reveals relationships after transforming the data into another structure that is easy to use. It has attracted many users in retail, marketing, and research because of its predictive capability. A user can analyze data and generate predictive trends for the future. In the business world, data mining is a breakthrough in information system management and in decision-making. Though the use of data mining is in its infancy, its benefits are far-reaching. There have been several earlier developments in data handling and analysis, but data mining goes beyond them. Data mining involves three stages, namely initial exploration, model building, and deployment.

The stages ensure that the information obtained from data mining is usable. Data mining facilitates the automatic discovery of patterns, the prediction of likely outcomes, and the creation of actionable information. The techniques used in data mining are association, classification, clustering, prediction, sequential patterns, and decision trees.

References

Berry, M. J. A., & Linoff, G. S. (2000). Mastering data mining. New York: Wiley.

Falavi, M., & Abdoli, M. R. (2015). The efficiency of data mining models in determining the effect of working capital management on corporate performance. International Journal of Academic Research, 7(1), 339-343. doi:10.7813/2075-4124.2015/7-1/B.57

Furnas, A. (2012). A guide to what data mining is, how it works, and why it's important. The Atlantic.

Han, J., & Kamber, M. (2000). Data mining: Concepts and techniques. New York.

Han, J., Kamber, M., & Pei, J. (2006). Data mining: Concepts and techniques (2nd ed., Southeast Asia edition). Morgan Kaufmann Publishers. ISBN 0080475582, 9780080475585.

Hand, D. J., Mannila, H., & Smyth, P. (2001). Principles of data mining. Adaptive Computation and Machine Learning. MIT Press. ISBN 026208290X, 9780262082907.

Muthuselvan, S., & Sundaram, K. S. (2015). A survey of sequence patterns in data mining techniques. International Journal of Applied Engineering Research, 10(1), 1807-1815.

Rouse, M. (2015). Data mining.

Carolyn Morgan is the author of this paper and a senior editor at MeldaResearch.Com.

About the Author

Janet Peter is the Managing Director of a globally competitive essay writing company.
