- 1. Introduction to Data Warehousing, A Multi-dimensional Data Model & Schemas,
- OLAP Operations & Servers (6 Lect.)
- • An overview and definition, along with a clear understanding of the four keywords
- appearing in the definition.
- • Differences between Operational Database Systems and Data Warehouses; Difference
- between OLTP & OLAP
- • Overview of Multi-dimensional Data Model, and the basic differentiation between "Fact"
- and "Dimension"; Multi-dimensional Cube
- • Concept Hierarchies of "Dimensions" Parameters: Examples and the advantages
- • Star, Snowflake, and Fact Constellation Schemas for Multi-dimensional Databases
- • Measures: Their Categorization and Computation
- • Pre-computation of Cubes, Constraint on Storage Space, Possible Solutions
- • OLAP Operations in Multi-dimensional Data Model: Roll-up, Drill-down, Slice & Dice,
- Pivot (Rotate)
- • Indexing OLAP Data; Efficient Processing of OLAP Queries
- • Types of OLAP Servers: ROLAP versus MOLAP versus HOLAP
- • Metadata Repository
- 2. Data Warehouse Architecture; Further Development of Data Cube & OLAP
- Technology (3 Lect.)
- • The Design of A Data Warehouse: A Business Analysis Framework; The Process of Data
- Warehouse Design
- • A 3-Tier Data Warehouse Architecture; Enterprise Warehouse, Data mart, Virtual
- Warehouse
- • Discovery-Driven Exploration of Data Cubes; Complex Aggregation at Multiple
- Granularity: Multi-feature Cubes
- • Constrained Gradient Analysis of Data Cubes
- 3. Pre-processing (7 Lect.)
- • The need for Pre-processing, Descriptive Data Summarization
- • Data Cleaning: Missing Values, Noisy Data, Data Cleaning as a Process
- • Data Integration & Transformation
- • Data Cube Aggregation; Attribute Subset Selection
- • Dimensionality Reduction: Basic Concepts only
- • Numerosity Reduction: Regression & Log-linear Models, Histograms, Clustering,
- Sampling
- • Data Discretization & Concept Hierarchy Generation
- • For Numerical Data: Binning, Histogram Analysis, Entropy-based Discretization,
- Interval Merging by χ² Analysis, Cluster Analysis, Discretization by Intuitive Partitioning
- • For Categorical Data
- 4. Data Mining: Introduction (4 Lect.)
- • An Overview; What is Data Mining; Data Mining - on What Kind of Data
- • Data Mining Functionalities - What Kind of Patterns Can be Mined; Concept/Class
- Description: Characterization & Discrimination; Mining Frequent Patterns, Associations,
- and Correlations; Classification & Prediction; Cluster Analysis; Outlier Analysis
- • Are All of the Patterns Interesting?
- • Classification of Data Mining Systems
- • Data Mining Task Primitives
- • Integration of a Data Mining System with a Database or Data Warehouse System
- • Major Issues in Data Mining
- 5. Attribute-Oriented Induction: An Alternate Method for Data Generalization &
- Concept Description (4 Lect.)
- • Attribute-Oriented Induction for Data Characterization, and Its Efficient Implementation;
- Presentation of the Derived Generalization
- • Mining Class Comparisons: Discrimination between Different Classes
- • Class Descriptions: Presentation of both Characterization & Comparison
- 6. Mining Frequent Patterns, Associations, and Correlations (4 Lect.)
- • Basic Concepts: Market Basket Analysis; Frequent Itemsets, Closed Itemsets, and
- Association Rules; Frequent Pattern Mining: A Roadmap
- • Apriori Algorithm: Finding Frequent Itemsets Using Candidate Generation; Generating
- Association Rules from Frequent Itemsets; Improving the Efficiency of Apriori
- • From Association Mining to Correlation Analysis; Strong Rules Are Not Necessarily
- Interesting: An Example; From Association Analysis to Correlation Analysis
- 7. Classification & Prediction (9+2 Lect.)
- • Introduction to Classification and Prediction; Basics of Supervised & Unsupervised
- Learning; Preparing the Data for Classification and Prediction; Comparing Classification
- and Prediction Methods
- • Classification by Decision Tree Induction, Attribute Selection Measures; Tree Pruning;
- Scalability and Decision Tree Induction
- • Rule-based Classification: Using IF-THEN Rules for Classification; Rule Extraction
- from a Decision Tree; Rule Induction Using a Sequential Covering Algorithm
- • Bayesian Classification: Bayes' Theorem, Naive Bayesian Classification; Bayesian
- Belief Networks
- • An Overview of Other Classification Methods (2 Lectures)
- • Prediction: Linear Regression; Non-linear Regression; Other Regression Models
- • Classifier Accuracy and Error Measures: Classifier Accuracy Measures; Predictor Error
- Measures
- • Evaluating the Accuracy of a Classifier or Predictor: Holdout Method and Random
- Subsampling; Cross Validation; Bootstrap
- • Ensemble Methods - Increasing the Accuracy: Bagging; Boosting
- 8. Cluster Analysis (6+2 Lect.)
- • Introduction to Cluster Analysis; Types of Data in Cluster Analysis; A Categorization of
- major Clustering Methods
- • Partitioning Methods; Centroid-Based Technique: K-Means Method; Overview of Other
- Clustering Methods
- • An Overview of Other Clustering Methods (2 Lectures)
- • Outlier Analysis; Statistical Distribution-based Outlier Detection; Distance-based Outlier
- Detection; Density-based Outlier Detection; Deviation-based Outlier Detection
- 9. Data Mining Applications (3 Lect.)
- • Data Mining for: (a) Financial Data Analysis; (b) The Retail Industry; (c) The
- Telecommunication Industry; (d) Biological Data Analysis; (e) Other Scientific
- Applications; (f) Intrusion detection
- • Data Mining Systems: (a) How to Choose; (b) Examples of Commercial Data Mining
- Systems