Big data refers to extremely large data sets that keep growing over time. The data comes from disparate sources and is so complex that traditional methods and tools cannot manage it. Big data is not a data repository like a data warehouse; it is a technology developed to manage extremely large volumes of data. In this section, we will be …
Data Warehouse and Mining
Data Cube
A data cube is a multidimensional structure used to store data in a data warehouse. It was originally designed for OLAP tools, which need easy access to multidimensional data, but it can also be used for data mining. A data cube represents data in terms of dimensions and facts and is used to represent aggregated data. A data cube is …
Difference Between Data Warehouse and Data Lake
A data warehouse and a data lake are both centralized storage for an enterprise. The basic difference is that a data warehouse holds structured, pre-processed data, whereas a data lake holds heterogeneous data in its raw format. Data in a data warehouse is analyzed for retrieving strategic information which …
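The contrast between raw and pre-processed storage can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the two lists, the function names and the `order_id`/`amount` schema are hypothetical and do not correspond to any particular product.

```python
import json

lake = []       # raw payloads, kept exactly as they arrive
warehouse = []  # validated rows that follow a fixed schema


def store_in_lake(payload):
    """A lake accepts anything; structure is worked out later, when read."""
    lake.append(payload)


def load_into_warehouse(payload):
    """A warehouse accepts only data that can be parsed into its schema."""
    try:
        record = json.loads(payload)
        row = {
            "order_id": int(record["order_id"]),
            "amount": float(record["amount"]),
        }
    except (ValueError, KeyError, TypeError):
        return False  # rejected: not structured as the schema expects
    warehouse.append(row)
    return True


incoming = [
    '{"order_id": "17", "amount": "9.50", "note": "gift"}',
    "free-text log line with no structure",
]

for payload in incoming:
    store_in_lake(payload)
    load_into_warehouse(payload)

print(len(lake), "items in the lake;", len(warehouse), "row in the warehouse")
```

Both payloads land in the lake unchanged, while only the one that fits the schema reaches the warehouse, already converted to typed columns.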
Data Transformation
Data transformation is a data preprocessing technique used to reorganize or restructure raw data so that data mining can retrieve strategic information efficiently and easily. Data transformation includes data cleaning and data reduction processes such as smoothing, clustering, binning, regression and histograms. In this section, we will study different …
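As one example of the techniques named above, binning can be used to smooth noisy values. The sketch below assumes equal-frequency bins and smoothing by bin means; the function name and the sample prices are illustrative only.

```python
def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of `bin_size` items,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        mean = sum(bin_values) / len(bin_values)
        smoothed.extend([mean] * len(bin_values))
    return smoothed


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
smoothed = smooth_by_bin_means(prices, 3)
print(smoothed)
# [9.0, 9.0, 9.0, 22.0, 22.0, 22.0, 29.0, 29.0, 29.0]
```

Other variants replace each value with the bin median or with the nearest bin boundary; the bin mean is used here only because it is the simplest to follow.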
Data Reduction
Data reduction is a process that reduces the volume of the original data and represents it in a much smaller form. Data reduction techniques preserve the integrity of the data while reducing it. The time spent on data reduction should not exceed the time saved by mining the reduced data set. In this section, we will discuss data reduction in brief and we …
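A histogram is one simple way to represent data in a smaller volume: individual values are replaced by bucket counts. The following is a minimal sketch assuming integer values and equal-width buckets; the function name and the sample data are made up for illustration.

```python
from collections import Counter


def equal_width_histogram(values, width):
    """Replace individual integer values with counts per bucket of `width`."""
    counts = Counter((v // width) * width for v in values)
    return {(low, low + width - 1): counts[low] for low in sorted(counts)}


prices = [1, 1, 5, 5, 5, 8, 10, 10, 14, 15, 15, 18, 20, 21, 25]
histogram = equal_width_histogram(prices, 10)
print(histogram)
# {(0, 9): 6, (10, 19): 6, (20, 29): 3}
```

Fifteen values are reduced to three bucket counts. The exact values are lost, so this kind of reduction only suits analyses that can work with the bucketed distribution.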
Data Integration
Data integration means merging data from several heterogeneous sources. While performing data integration, you have to deal with issues such as data redundancy, inconsistency, duplication and many more. In this section, we are going to discuss data integration and, along with that, we will also discuss the issues or challenges faced during data …
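The sketch below illustrates two of these issues on a very small scale: the same attribute appearing under different names in two sources, and conflicting values for the same entity. The source names (`crm_rows`, `billing_rows`), the field names and the records are all hypothetical.

```python
def integrate(crm_rows, billing_rows):
    """Merge two sources keyed by customer, reporting name conflicts."""
    merged = {}
    conflicts = []
    for row in crm_rows:
        merged[row["customer_id"]] = {
            "customer_id": row["customer_id"],
            "name": row["name"],
            "city": row["city"],
        }
    for row in billing_rows:
        key = row["cust_no"]  # same attribute as customer_id, different name
        record = merged.setdefault(
            key, {"customer_id": key, "name": row["cust_name"], "city": None}
        )
        if record["name"] != row["cust_name"]:
            conflicts.append((key, record["name"], row["cust_name"]))
        record["balance"] = row["balance"]
    return merged, conflicts


crm_rows = [
    {"customer_id": 1, "name": "Asha Rao", "city": "Pune"},
    {"customer_id": 2, "name": "Ben Ito", "city": "Osaka"},
]
billing_rows = [
    {"cust_no": 1, "cust_name": "Asha Rao", "balance": 120.0},
    {"cust_no": 2, "cust_name": "Benjamin Ito", "balance": 0.0},
    {"cust_no": 3, "cust_name": "Cy Lee", "balance": 45.5},
]

merged, conflicts = integrate(crm_rows, billing_rows)
print(len(merged), "customers after integration")
print("conflicts:", conflicts)
```

Here the conflict is only reported, not resolved. Deciding which source wins is a policy question that depends on the enterprise.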
Data Cleaning
Data cleaning is a technique used to eliminate inconsistencies and irregularities in the data. Redundant or irrelevant data only increases the amount of storage required. So it is very important to clean the data, as inaccurate data not only confuses data mining programs but also degrades the quality of the data. In this section, we will discuss data cleaning in brief along …
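A small sketch of what such cleaning might look like in practice: removing redundant records, normalizing inconsistent formatting and filling in a missing value. The record layout and the choice of the mean as the fill value are assumptions for illustration, not a recommended policy.

```python
def clean(records):
    """Drop duplicate records, normalize names, fill missing ages with the mean."""
    seen = set()
    unique = []
    for rec in records:
        # Treat records as duplicates when the normalized name and age match.
        key = (rec["name"].strip().lower(), rec["age"])
        if key in seen:
            continue
        seen.add(key)
        unique.append({"name": rec["name"].strip().title(), "age": rec["age"]})

    known_ages = [r["age"] for r in unique if r["age"] is not None]
    if known_ages:
        mean_age = sum(known_ages) / len(known_ages)
        for r in unique:
            if r["age"] is None:
                r["age"] = mean_age
    return unique


raw = [
    {"name": " alice ", "age": 30},
    {"name": "Alice", "age": 30},   # redundant copy of the first record
    {"name": "bob", "age": None},   # missing value
    {"name": "Carol", "age": 40},
]

cleaned = clean(raw)
print(cleaned)
```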
Data Warehouse Architecture
Data warehouse architecture is about organizing the building blocks, or components, in such a way that they deliver the most benefit to an enterprise. It depends on the requirements of the enterprise whether to emphasize a specific component or to strengthen another component with tools and services. In this context, we are going to discuss the architecture of the data …
Data Warehouse and its Features
A data warehouse can be defined as an informational environment that helps extract strategic information useful for making strategic decisions for the betterment of the enterprise. In this context, we will define the data warehouse in brief along with the features that explain how a data warehouse provides an integrated view of an entire enterprise. We will …
Need of Data Warehouse
A company needs a data warehouse when its executives and managers require strategic information to make decisions that keep the company competitive in the market and ensure its survival. In the early 1960s, computer systems emerged as an essential requirement, as they could perform order processing, payment billing, …