Data Mining and Warehousing Question Bank - All Units

MANAKULA VINAYAGAR INSTITUTE OF TECHNOLOGY
KALITHEERTHALKUPPAM, PUDUCHERRY – 605107
DEPARTMENT OF INFORMATION TECHNOLOGY
YEAR/SEM: IV/VIII    ACADEMIC YEAR: 2013–14 (EVEN)

IT E82 DATA MINING AND WAREHOUSING - QUESTION BANK

UNIT – I
PART – A
1. Define Data mining.
2. What are the other terminologies referring to data mining?
3. List out the applications of data mining.
4. Differentiate data mining tools and query tools.
5. What is meant by machine learning?
6. List out the data mining processing steps.
7. What are the techniques used in data mining?
8. Define clustering.
9. Define regression.
10. Give the types of regression.
11. What is classification?
12. What is an association rule?
13. Define prediction.
14. Define binning.
15. Why is machine learning done?
16. What is the difference between supervised learning and unsupervised learning?
17. Define data cleaning.
18. What is pattern evaluation?
19. What are descriptive and predictive data mining?
20. What are the goals of time series analysis?
21. Classify data mining systems.

PART – B
1. What is data mining? Explain the steps in data mining process.
2. Explain major requirements and challenges in data mining.
3. Explain the data mining functionalities.
4. Explain the contrast between data mining tools and query tools.
5. Explain the data mining techniques in detail.
6. What is machine learning? Why must machine learning be performed? Explain its types.
7. Describe the taxonomy of data mining tasks.
8. Explain the various data mining issues.
9. Explain the various data repositories on which mining can be performed.


UNIT – II
PART – A
1. Define data warehouse.
2. What is the need of data warehouses?
3. Define OLAP.
4. Define multidimensional data model.
5. What is a data cube?
6. Define dimensions.
7. What are facts?
8. Define OLTP.
9. Define OLAP.
10. Define dimension table.
11. Define fact table.
12. What is a lattice of cuboids?
13. What is the apex cuboid?
14. List out the various OLAP operations.
15. Give the names of warehouse schemas.
16. Define star schema.
17. Define snowflake schema.
18. Draw a neat diagram of data warehouse architecture.
19. Define data mart.
20. Define metadata.
21. What are the applications of metadata?
22. List out the types of metadata.
23. What processes are carried out in the backend of a data warehouse?
24. What are the phases in the development cycle of a data warehouse?
25. Give the differences between a database and a data warehouse.
26. What is meant by operational environment?
27. How does a pivot operation act on a data cube?
28. What is generalization?

PART – B
1. Explain multidimensional data model with a neat diagram.
2. List out the OLAP operations and explain the same with an example.
3. Describe dimensional modeling in detail.
4. Explain the various schemas of a data warehouse.
5. Define data warehouse. Draw the architecture of data warehouse and explain the three tiers in detail.
6. Explain in detail the implementation of a data warehouse.
7. Define metadata and explain the types of metadata.
8. Discuss the development lifecycle of a data warehouse.
9. Explain the processes taking place in the backend of a data warehouse.

UNIT – III
PART – A
1. Give some examples of data preprocessing techniques.
2. List out the preprocessing techniques available in data mining.
3. Define data cleaning.
4. What are the smoothing techniques available to remove noise?
5. Define data transformation.
6. Define a concept hierarchy.
7. Define data reduction.
8. What are the data mining task primitives?
9. Define association rule mining.
10. Define DMQL.
11. What is generalization?
12. What is summarization?
13. Define discretization.
14. Define transactional databases.
15. Define relational databases.
16. Differentiate the two types of regression.
17. Define apriori algorithm.
18. What is the anti-monotone property?
19. Define support and confidence.
20. How are association rules mined from large databases?
21. Define aggregation.
22. What do you mean by numerosity reduction?

PART – B
1. Explain various issues related to data cleaning.
2. Explain the data preprocessing techniques in detail.
3. Explain the smoothing techniques.
4. Explain data transformation in detail.
5. Discuss normalization in detail.
6. Explain data reduction in detail.
7. Explain parametric and non-parametric methods of data reduction.
8. Discuss data discretization and concept hierarchy generation.
9. Explain generalization and summarization.
10. How are association rules mined from databases?
11. Explain mining multidimensional data from transactional databases and relational databases.

UNIT – IV
PART – A
1. What are the steps involved in preparing the data for classification?
2. Define the concept of classification.
3. What is a decision tree?
4. What is an attribute selection measure?
5. List out the tree pruning methods.
6. Define pre-pruning.
7. Define post-pruning.
8. What is meant by a pattern?
9. Define prediction.
10. Define back propagation.
11. What are outliers?
12. Define the centroid of the cluster.
13. What are the hierarchical methods used in classification?
14. What are Bayesian classifiers?
15. Write notes on k-means algorithm.
16. List out the density-based methods.
17. Define bagging.
18. Define boosting.
19. List out the partitioning methods.
20. Define attribute-oriented induction.
21. Define CLARA and CLARANS.
22. Differentiate agglomerative and divisive hierarchical clustering.
23. What is DBSCAN?
24. What is STING?
25. What are interval-scaled variables?
26. What is CURE?

PART – B
1. Explain Indexing with suitable examples.
2. Explain the Back Propagation technique.
3. Explain Naïve Bayesian classification in detail with an example.
4. Explain various classification methods.
5. Discuss Classifier accuracy with examples.
6. Elaborate the various partitioning methods in detail.
7. Explain the hierarchical methods of classification.
8. Discuss the classification by decision tree induction.
9. Explain density-based clustering methods in detail.
10. Discuss the grid-based methods.
11. Explain outlier analysis.
12. Discuss prediction in detail.

UNIT – V
PART – A
1. Define web mining.
2. What is a multimedia database?
3. Define web content mining.
4. Define web structure mining.
5. Define web usage mining.
6. What is spatial mining?
7. What is time series analysis?
8. Define sequence mining.
9. Define graph mining.
10. What are the applications of data mining?
11. What are the additional themes in data mining?
12. What is PageRank?

PART – B
1. Explain the process of mining the World Wide Web.
2. Explain the partitioning methods.
3. Discuss model-based clustering methods.
4. Explain outlier analysis in detail.
5. Explain the various types of web mining.
6. Explain spatial mining and time series mining.
7. Discuss graph mining.
8. Discuss some case studies in data mining applications.