What are the Steps Involved in the Data Mining Process? | SolveZone


What are the Steps Involved in the Data Mining Process?

University: Amity blog
Service Type: Assignment
Course:
Semester:
Short Name or Subject Code: Data Warehousing and Mining
Product of Assignment (Amity blog)
Pattern: Section A, B, C Wise

Data Warehousing and Mining

Assignment A

1. What are the steps involved in the data mining process?


2. What is the difference between OLTP and data warehouse?

3. What is spatial mining?

4. Why pre-process the data?


5. Define gain ratio.

    
6. Compare clustering and classification.


7. List any four data mining applications.

    
8. What are the goals of time series analysis?


Assignment B

Case Detail:

Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order):

13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.

1. What is the mean of the data? What is the median? 

2. What is the mode of the data? Comment on the data’s modality (i.e., bimodal, trimodal, etc.). 


3. What is the midrange of the data?
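
The three questions above can be checked with a short script. This is a minimal sketch, assuming Python 3.8 or later (for `statistics.multimode`); the variable names are illustrative and are not part of the assignment.

```python
from statistics import mean, median, multimode

# Age values copied from the case detail (already in increasing order).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]

age_mean = mean(ages)                        # arithmetic average of all values
age_median = median(ages)                    # middle value of the sorted list
age_modes = multimode(ages)                  # every value tied for the highest frequency
age_midrange = (min(ages) + max(ages)) / 2   # average of the smallest and largest values

print(f"n        = {len(ages)}")
print(f"mean     = {age_mean:.2f}")
print(f"median   = {age_median}")
print(f"modes    = {age_modes}")
print(f"midrange = {age_midrange}")
```

With the 27 values listed, the script reports a mean of about 29.96, a median of 25, two modes (25 and 35, so the data is bimodal), and a midrange of 41.5.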

Assignment C

Question No.  1    Marks - 10
Data mining can also be applied to other forms of data, such as ................


i) Data streams

ii) Sequence data

iii) Networked data

iv) Text data

v) Spatial data    
 
Options    
i, ii, iii and v only

ii, iii, iv and v only

i, iii, iv and v only

All i, ii, iii, iv and v

Question No.  2    Marks - 10
Which of the following is not a data mining functionality?
     
Options    
Characterization and Discrimination

Classification and regression

Selection and interpretation

Clustering and Analysis

Question No.  3    Marks - 10
............................. is a summarization of the general characteristics or features of a target class of data.    

Options    
Data Characterization 

Data Classification 

Data discrimination

Data selection

Question No.  4    Marks - 10
............................. is a comparison of the general features of the target class data objects against the general features of objects from one or multiple contrasting classes.    
 
Options    
Data Characterization 

Data Classification 

Data discrimination

Data selection

Question No.  5    Marks - 10
Strategic value of data mining is......................    
 
Options    
cost-sensitive

work-sensitive

time-sensitive

technical-sensitive    

Question No.  6    Marks - 10
............................. is the process of finding a model that describes and distinguishes data classes or concepts. 
     
Options    
Data Characterization 

Data Classification 

Data discrimination

Data selection

Question No.  7    Marks - 10
The various aspects of data mining methodologies is/are...................

i) Mining various and new kinds of knowledge

ii) Mining knowledge in multidimensional space

iii) Pattern evaluation and pattern or constraint-guided mining.

iv) Handling uncertainty, noise, or incompleteness of data
     
Options    
i, ii and iv only

ii, iii and iv only

i, ii and iii only

All i, ii, iii and iv

Question No.  8    Marks - 10
The full form of KDD is
 
Options    
Knowledge Database

Knowledge Discovery Database

Knowledge Data House

Knowledge Data Definition

Question No.  9    Marks - 10
The output of KDD is
     
 Options    

Data

Information

Query

Useful information

Question No.  10    Marks - 10
__________ is a subject-oriented, integrated, time-variant, non-volatile collection of data in support of management decisions.
 
Options    
Data Mining.

Data Warehousing.

Web Mining.

Text Mining.

Question No.  11    Marks - 10
The data warehouse is __________.
 
Options    
Read only.

Write only.

Read write only.

None.

Question No.  12    Marks - 10
Expansion for DSS in DW is:    
 
Options    
Decision Support system.

Decision Single System.

Data Storable System.

Data Support System    

Question No.  13    Marks - 10
The important aspect of the data warehouse environment is that data found within the data warehouse is___________.
     
Options    
Subject-oriented.

Time-variant.

Integrated.

All of the above.

Question No.  14    Marks - 10
The time horizon in Data warehouse is usually __________.

Options    
1-2 years.

3-4 years.

5-6 years.

5-10 years.

Question No.  15    Marks - 10
The data is stored, retrieved & updated in ____________.    
 
Options    
OLAP.

OLTP.

SMTP.

FTP.

Question No.  16    Marks - 10
__________ describes the data contained in the data warehouse.
     
Options    
Relational data.    
Operational data.    
Metadata.      
Informational data.    

Question No.  17    Marks - 10
____________ predicts future trends & behaviours, allowing business managers to make proactive, knowledge-driven decisions.
 
Options    
Data warehouse.

Data mining.

Datamarts.

Metadata.

Question No.  18    Marks - 10
__________ is the heart of the warehouse.
     
Options    
Data mining database servers.

Data warehouse database servers.

Data mart database servers.

Relational database servers.

Question No.  19    Marks - 10
________________ is the specialized data warehouse database.
      
Options    
Oracle.

DB2.

Informix.

Redbrick.

Question No.  20    Marks - 10
________________ defines the structure of the data held in operational databases and used by operational applications.
     
Options    
User-level metadata.

Data warehouse metadata.

Operational metadata.

Data mining metadata.

Question No.  21    Marks - 10
________________ is held in the catalog of the warehouse database system.

Options    
Application level metadata.

Algorithmic level metadata.

Departmental level metadata.

Core warehouse metadata.

Question No.  22    Marks - 10
_________maps the core warehouse metadata to business concepts, familiar and useful to end users.
     
Options    
Application level metadata.

User level metadata.

End-user level metadata.

Core level metadata.    

Question No.  23    Marks - 10
What consists of formal definitions, such as a COBOL layout or a database schema?    
 
Options    
Classical metadata.

Transformation metadata.

Historical metadata.

Structural metadata.    

Question No.  24    Marks - 10
__ consists of information in the enterprise that is not in classical form.
     
Options    
Mushy metadata.

Differential metadata.

Data warehouse.

Data mining.

Question No.  25    Marks - 10
______________ databases are owned by particular departments or business groups.

Options    
Informational.

Operational.

Both informational and operational.

Flat.

Question No.  26    Marks - 10
The star schema is composed of __________ fact table.
    
Options    
one    
two    
three    
four

Question No.  27    Marks - 10
The time horizon in operational environment is ___________.    
 
Options    
30-60 days.

60-90 days.

90-120 days.

120-150 days.

Question No.  28    Marks - 10
The key used in operational environment may not have an element of__________.    
 
Options    
Time.


Cost.

Frequency.

Quality.

Question No.  29    Marks - 10
Data can be updated in _____ environment.
  
Options    
Data warehouse.

Data mining.

Operational.

Informational

Question No.  30    Marks - 10
Record cannot be updated in _____________.
  
Options    
OLTP

files

RDBMS

data warehouse

Question No.  31    Marks - 10
The source of all data warehouse data is the____________.    
 
Options    
Operational environment.

Informal environment.

Formal environment.

Technology environment.

Question No.  32    Marks - 10
Data warehouse contains _____________ data that is never found in the operational environment.
      
Options    
 
Normalized.    

Informational.

Summary.

Denormalized.

Question No.  33    Marks - 10
The modern CASE tools belong to _______ category.    
 
Options    
Analysis.


Development

Coding

Delivery

Question No.  34    Marks - 10
Bill Inmon has estimated that ___________ of the time required to build a data warehouse is consumed in the conversion process.
     
Options    
10 percent.


20 percent.

40 percent

80 percent.

Question No.  35    Marks - 10
Detail data in a single fact table is otherwise known as __________.
     
Options    
Mono atomic data.


Diatomic data.

Atomic data.

Multi atomic data.    

Question No.  36    Marks - 10
_______ test is used in an online transaction processing environment.
 
Options    
MEGA.


MICRO.

MACRO.

ACID.

Question No.  37    Marks - 10
___________ is a good alternative to the star schema.    
 
Options    
Star schema.


Snowflake schema.

Fact constellation.

Star-snowflake schema.

Question No.  38    Marks - 10
The biggest drawback of the level indicator in the classic star-schema is that it limits_________.
     
Options
    
Quantify.

Qualify.

Flexibility.

Ability.

Question No.  39    Marks - 10
A data warehouse is _____________.
     
Options    
Updated by end users.

Contains numerous naming conventions and formats.

Organized around important subject areas.

Contains only current data.

Question No.  40    Marks - 10
An operational system is _____________.    
 
Options    

Used to run the business in real time and is based on historical data.  

Used to run the business in real time and is based on current data.

Used to support decision making and is based on current data.

Used to support decision making and is based on historical data.