What are the Steps Involved in the Data Mining Process?
University | Amity blog |
Service Type | Assignment |
Course | |
Semester | |
Short Name or Subject Code | Data warehousing and Mining |
Product | of Assignment (Amity blog) |
Pattern | Section A,B,C Wise |
Data warehousing and Mining
Assignment A
1. What are the steps involved in the data mining process?
2. What is the difference between OLTP and data warehouse?
3. What is spatial mining?
4. Why pre-process the data?
5. Define gain ratio.
6. Compare clustering and classification.
7. List any four data mining applications.
8. What are the goals of time series analysis?
Assignment B
Case Detail:
Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order)
13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
1. What is the mean of the data? What is the median?
2. What is the mode of the data? Comment on the data’s modality (i.e., bimodal, trimodal, etc.).
3. What is the midrange of the data?
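The three questions above are plain descriptive statistics, so they can be checked with a short script. The following is a minimal sketch using Python's standard `statistics` module; the variable names and the `midrange` helper are illustrative assumptions, not part of the assignment.

```python
from statistics import mean, median, multimode

# Age values copied from the case detail (already in increasing order).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def midrange(values):
    """Hypothetical helper: average of the smallest and largest value."""
    return (min(values) + max(values)) / 2


age_mean = mean(ages)          # sum of values divided by their count
age_median = median(ages)      # middle value of the sorted data
age_modes = multimode(ages)    # every value tied for the highest frequency
age_midrange = midrange(ages)

print(f"mean     = {age_mean:.2f}")
print(f"median   = {age_median}")
print(f"modes    = {age_modes}")
print(f"midrange = {age_midrange}")
```

With 27 values summing to 809, this gives a mean of about 29.96 and a median of 25 (the 14th value). Both 25 and 35 occur four times, so the data is bimodal, and the midrange is (13 + 70) / 2 = 41.5.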
Assignment C
Question No. 1 Marks - 10
Data mining can also be applied to other forms of data such as................
i) Data streams
ii) Sequence data
iii) Networked data
iv) Text data
v) Spatial data
Options
i, ii, iii and v only
ii, iii, iv and v only
i, iii, iv and v only
All i, ii, iii, iv and v
Question No. 2 Marks - 10
Which of the following is not a data mining functionality?
Options
Characterization and Discrimination
Classification and regression
Selection and interpretation
Clustering and Analysis
Question No. 3 Marks - 10
............................. is a summarization of the general characteristics or features of a target class of data.
Options
Data Characterization
Data Classification
Data discrimination
Data selection
Question No. 4 Marks - 10
............................. is a comparison of the general features of the target class data objects against the general features of objects from one or multiple contrasting classes.
Options
Data Characterization
Data Classification
Data discrimination
Data selection
Question No. 5 Marks - 10
Strategic value of data mining is......................
Options
cost-sensitive
work-sensitive
time-sensitive
technical-sensitive
Question No. 6 Marks - 10
............................. is the process of finding a model that describes and distinguishes data classes or concepts.
Options
Data Characterization
Data Classification
Data discrimination
Data selection
Question No. 7 Marks - 10
The various aspects of data mining methodology is/are...................
i) Mining various and new kinds of knowledge
ii) Mining knowledge in multidimensional space
iii) Pattern evaluation and pattern or constraint-guided mining.
iv) Handling uncertainty, noise, or incompleteness of data
Options
i, ii and iv only
ii, iii and iv only
i, ii and iii only
All i, ii, iii and iv
Question No. 8 Marks - 10
The full form of KDD is
Options
Knowledge Database
Knowledge Discovery Database
Knowledge Data House
Knowledge Data Definition
Question No. 9 Marks - 10
The output of KDD is
Options
Data
Information
Query
Useful information
Question No. 10 Marks - 10
__________ is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management decisions.
Options
Data Mining.
Data Warehousing.
Web Mining.
Text Mining.
Question No. 11 Marks - 10
The data warehouse is __________.
Options
Read only.
Write only.
Read write only.
None.
Question No. 12 Marks - 10
Expansion for DSS in DW is:
Options
Decision Support system.
Decision Single System.
Data Storable System.
Data Support System
Question No. 13 Marks - 10
The important aspect of the data warehouse environment is that data found within the data warehouse is ___________.
Options
Subject-oriented.
Time-variant.
Integrated.
All of the above.
Question No. 14 Marks - 10
The time horizon in a data warehouse is usually __________.
Options
1-2 years.
3-4 years.
5-6 years.
5-10 years.
Question No. 15 Marks - 10
The data is stored, retrieved & updated in ____________.
Options
OLAP.
OLTP.
SMTP.
FTP.
Question No. 16 Marks - 10
__________ describes the data contained in the data warehouse.
Options
Relational data.
Operational data.
Metadata.
Informational data.
Question No. 17 Marks - 10
____________ predicts future trends & behaviours, allowing business managers to make proactive, knowledge-driven decisions.
Options
Data warehouse.
Data mining.
Datamarts.
Metadata.
Question No. 18 Marks - 10
__________ is the heart of the warehouse.
Options
Data mining database servers.
Data warehouse database servers.
Data mart database servers.
Relational database servers.
Question No. 19 Marks - 10
________________ is the specialized data warehouse database.
Options
Oracle.
DB2.
Informix.
Redbrick.
Question No. 20 Marks - 10
________________ defines the structure of the data held in operational databases and used by operational applications.
Options
User-level metadata.
Data warehouse metadata.
Operational metadata.
Data mining metadata.
Question No. 21 Marks - 10
________________ is held in the catalog of the warehouse database system.
Options
Application level metadata.
Algorithmic level metadata.
Departmental level metadata.
Core warehouse metadata.
Question No. 22 Marks - 10
_________ maps the core warehouse metadata to business concepts, familiar and useful to end users.
Options
Application level metadata.
User level metadata.
Enduser level metadata.
Core level metadata.
Question No. 23 Marks - 10
What consists of formal definitions, such as a COBOL layout or a database schema?
Options
Classical metadata.
Transformation metadata.
Historical metadata.
Structural metadata.
Question No. 24 Marks - 10
__ consists of information in the enterprise that is not in classical form.
Options
Mushy metadata.
Differential metadata.
Data warehouse.
Data mining.
Question No. 25 Marks - 10
______________ databases are owned by particular departments or business groups.
Options
Informational.
Operational.
Both informational and operational.
Flat.
Question No. 26 Marks - 10
The star schema is composed of __________ fact table.
Options
one
two
three
four
Question No. 27 Marks - 10
The time horizon in the operational environment is ___________.
Options
30-60 days.
60-90 days.
90-120 days.
120-150 days.
Question No. 28 Marks - 10
The key used in the operational environment may not have an element of __________.
Options
Time.
Cost.
Frequency.
Quality.
Question No. 29 Marks - 10
Data can be updated in _____ environment.
Options
Data warehouse.
Data mining.
Operational.
Informational
Question No. 30 Marks - 10
Record cannot be updated in _____________.
Options
OLTP
files
RDBMS
data warehouse
Question No. 31 Marks - 10
The source of all data warehouse data is the ____________.
Options
Operational environment.
Informal environment.
Formal environment.
Technology environment.
Question No. 32 Marks - 10
Data warehouse contains _____________ data that is never found in the operational environment.
Options
Normalized.
Informational.
Summary.
Denormalized.
Question No. 33 Marks - 10
The modern CASE tools belong to _______ category.
Options
Analysis.
Development
Coding
Delivery
Question No. 34 Marks - 10
Bill Inmon has estimated that ___________ of the time required to build a data warehouse is consumed in the conversion process.
Options
10 percent.
20 percent.
40 percent
80 percent.
Question No. 35 Marks - 10
Detail data in a single fact table is otherwise known as __________.
Options
Mono atomic data.
Diatomic data.
Atomic data.
Multi atomic data.
Question No. 36 Marks - 10
_______ test is used in an online transactional processing environment.
Options
MEGA.
MICRO.
MACRO.
ACID.
Question No. 37 Marks - 10
___________ is a good alternative to the star schema.
Options
Star schema.
Snowflake schema.
Fact constellation.
Star-snowflake schema.
Question No. 38 Marks - 10
The biggest drawback of the level indicator in the classic star schema is that it limits _________.
Options
Quantify.
Qualify.
Flexibility.
Ability.
Question No. 39 Marks - 10
A data warehouse is _____________.
Options
Updated by end users.
Contains numerous naming conventions and formats.
Organized around important subject areas.
Contains only current data.
Question No. 40 Marks - 10
An operational system is _____________.
Options
Used to run the business in real time and is based on historical data.
Used to run the business in real time and is based on current data.
Used to support decision making and is based on current data.
Used to support decision making and is based on historical data.