Final Term Solved Paper CS614 - Data Warehousing, Spring, July 2011

Question No: 1 ( Marks: 1 ) – Please choose one
It is observed that every year the amount of data recorded in an organization

Doubles (handouts page # 6)

Triples

Quadruples

Remains same as previous year
Question No: 2 ( Marks: 1 ) – Please choose one
Multidimensional databases typically use proprietary __________ format to store
pre-summarized cube structures.

File ( Page # 69 )

Application

Aggregate

Database
Question No: 3 ( Marks: 1 ) – Please choose one
Pre-computed _______ can solve performance problems

Aggregates (page # 101)

Facts

Dimensions
Question No: 4 ( Marks: 1 ) – Please choose one
_______________, if it fits into memory, costs only one disk I/O access to locate a
record by a given key.

A Dense Index (page # 211)

A Sparse Index

An Inverted Index

None of These
Question No: 5 ( Marks: 1 ) – Please choose one
The degree of similarity between two records, often measured by a numerical
value between _______, usually depends on application characteristics.

0 and 1 (page # 157 )

0 and 10

0 and 100

0 and 99
Question No: 6 ( Marks: 1 ) – Please choose one
The purpose of the House of Quality technique is to reduce ______ types of risk.

Two (page # 181)

Three

Four

All
Question No: 7 ( Marks: 1 ) – Please choose one
NUMA stands for __________

Non-uniform Memory Access ( page # 194)

Non-updateable Memory Architecture

New Universal Memory Architecture
Question No: 8 ( Marks: 1 ) – Please choose one
Which is the least appropriate join operation for Pipeline parallelism?

Hash Join

Inner Join

Outer Join

Sort-Merge Join
Question No: 9 ( Marks: 1 ) – Please choose one
There are many variants of the traditional nested-loop join. If the index is built as
part of the query plan and subsequently dropped, it is called

Naive nested-loop join

Index nested-loop join

Temporary index nested-loop join ( page # 230)

None of these

Question No: 10 ( Marks: 1 ) – Please choose one
Data mining derives its name from the similarities between searching for valuable
business information in a large database, for example, finding linked products in
gigabytes of store scanner data, and mining a mountain for a _________ of
valuable ore.

Furrow

Streak

Trough

Vein
Question No: 11 ( Marks: 1 ) – Please choose one
With data mining, the best way to accomplish this is by setting aside some of
your data in a ________ to isolate it from the mining process; once the mining is
complete, the results can be tested against the isolated data to confirm the
model’s validity.

Cell

Disk

Folder

Vault
Question No: 12 ( Marks: 1 ) – Please choose one
Kimball’s iterative data warehouse development approach drew on decades
of experience to develop the _____________.

Business Dimensional Lifecycle (page # 276 )

Data Warehouse Dimension

Business Definition Lifecycle

OLAP Dimension
Question No: 13 ( Marks: 1 ) – Please choose one
We must try to find the one access tool that will handle all the needs of their
users.

True

False
Question No: 14 ( Marks: 1 ) – Please choose one
For a smooth DWH implementation we must be a technologist.

True

False (page # 306)
Question No: 15 ( Marks: 1 ) – Please choose one
During the application specification activity, we also must give consideration to
the organization of the applications.

True ( page # 294 )

False
Question No: 16 ( Marks: 1 ) – Please choose one
Investing years in architecture and forgetting the primary purpose of solving
business problems results in an inefficient application. This is an example of the
_________ mistake.

Extreme Technology Design

Extreme Architecture Design

None of these (page # 303)
Question No: 17 ( Marks: 1 ) – Please choose one
The most recent attack is the ________ attack on the cotton crop during 2003-04,
resulting in a loss of nearly 0.5 million bales.

Boll Worm (video lecture # 38)

Purple Worm

Blue Worm

Cotton Worm
Question No: 18 ( Marks: 1 ) – Please choose one
The users of a data warehouse are knowledge workers; in other words, they are
_________ in the organization.

Decision maker (page# 10 )

Manager

Database Administrator

DWH Analyst
Question No: 19 ( Marks: 1 ) – Please choose one
_________ breaks a table into multiple tables based upon common column
values.

Horizontal splitting (page # 46 )

Vertical splitting
Question No: 20 ( Marks: 1 ) – Please choose one
Execution can be completed successfully or it may be stopped due to some
error. In case of successful completion of execution all the transactions will be
___________

Committed to the database (page # 398 last line)

Rolled back
Question No: 21 ( Marks: 2 )
What is meant by the statement “Be a diplomat NOT a technologist” in the
context of a data warehouse development project?

7. Be a diplomat NOT a technologist

The biggest problem you will face during a warehouse implementation will be people, not the technology or the development. You’re going to have senior management complaining about completion dates and unclear objectives. You’re going to have development people protesting that everything takes too long and why can’t they do it the old way? You’re going to have users with outrageously unrealistic expectations, who are used to systems that require mouse-clicking but not much intellectual investment on their part. And you’re going to grow exhausted, separating out Needs from Wants at all levels. Commit from the outset to work very hard at communicating the realities, encouraging investment, and cultivating the development of new skills in your team and your users (and even your bosses).

Question No: 22 ( Marks: 2 )
Elaborate the concept of data parallelism.

  • Parallel execution of a single data manipulation task across multiple partitions of data.
  • Partitions may be static or dynamic.
  • Tasks are executed almost independently across partitions.
  • A “query coordinator” must coordinate between the independently executing processes.

Data parallelism is arguably the simplest form of parallelization: a single data operation executes in parallel across multiple partitions of data. The partitions may be defined statically or dynamically, but the same operator runs across all of them concurrently. The idea of data parallelism has existed for a very long time.
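
The points above can be illustrated with a minimal Python sketch. It is not taken from the handouts: the names `partition`, `partial_sum`, and `parallel_total`, the round-robin split, and the sample `sales` rows are all assumptions made for illustration. The same operator runs on every partition, and a coordinator combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Statically split rows into n_parts round-robin partitions (hypothetical scheme)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def partial_sum(part):
    """The single operator that is applied to every partition."""
    return sum(amount for _, amount in part)


def parallel_total(rows, n_parts=4):
    """Acts as the query coordinator: fan out the operator, then merge partial results."""
    parts = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_sum, parts))
    return sum(partials)


# Hypothetical (id, amount) rows, for illustration only.
sales = [(i, i * 10) for i in range(1, 101)]
total = parallel_total(sales)
print(total)
```

Threads are used here only to keep the sketch short; in CPython, CPU-bound work would more likely use separate processes, and a parallel RDBMS would run each partition's task on its own node or processor.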

Question No: 23 ( Marks: 2 )
What will be the effect if we program a package by using the DTS object model?

Question No: 24 ( Marks: 3 )
What is meant by the classification process? How do we measure the accuracy of
classifiers?

Classification means that, based on the properties of existing data, we form groups, i.e. we assign each record to a class.
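
For the second part of the question, one common way to measure a classifier's accuracy is to compare its predictions against known labels on data that was held back from training. The sketch below assumes this approach; the function names and the sample labels are hypothetical and not taken from the handouts.

```python
from collections import Counter


def accuracy(predicted, actual):
    """Fraction of records whose predicted class matches the known class."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def confusion_counts(predicted, actual):
    """Count each (actual class, predicted class) pair, i.e. a confusion matrix."""
    return Counter(zip(actual, predicted))


# Hypothetical held-out labels and classifier output, for illustration only.
actual = ["yes", "no", "yes", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "no"]

print(accuracy(predicted, actual))
print(confusion_counts(predicted, actual))
```

The confusion counts show which classes are being mistaken for which, something a single accuracy figure hides.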
Question No: 25 ( Marks: 3 )
How does the page dimension capture the static and dynamic nature of different web
pages?

Question No: 26 ( Marks: 3 )
Write down the limitations of pipeline parallelism.

Pipeline parallelism is a good fit for data warehousing (where we are working with lots of data), but it makes no sense for OLTP because OLTP tasks are not big enough to justify breaking them down into subtasks.
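
The idea of breaking a task into subtasks can be sketched as a chain of stages, each consuming rows as the previous stage produces them. The Python generators below only model this dataflow (they do not run in parallel), and the stage names and sample rows are assumptions for illustration.

```python
def scan(rows):
    """Stage 1: read rows one at a time."""
    for row in rows:
        yield row


def select(rows, min_amount):
    """Stage 2: pass on only the rows that satisfy the predicate."""
    for row in rows:
        if row["amount"] >= min_amount:
            yield row


def project(rows):
    """Stage 3: keep only the amount column."""
    for row in rows:
        yield row["amount"]


# Hypothetical rows, for illustration only.
rows = [{"id": i, "amount": i * 10} for i in range(1, 6)]
result = list(project(select(scan(rows), 30)))
print(result)
```

Each stage can start as soon as its predecessor emits its first row. An operator such as a sort cannot emit anything until it has read all of its input, so it stalls the stages after it. With only a handful of rows, as in a typical OLTP request, setting up the stages costs more than it saves.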

Question No: 27 ( Marks: 5 )
For maximum performance of a bitmapped index, what characteristics should a query
have?

Question No: 28 ( Marks: 5 )
How do the three parallel tracks capture the user requirements in Kimball’s data
warehouse life cycle road map?

Question No: 29 ( Marks: 5 )
How are time-contiguous log entries and the HTTP Secure Sockets Layer used for user
session identification? What are the limitations of these techniques?

Question No: 30 ( Marks: 10 )
What are the issues regarding the record management tools at campuses where
text files are used to store data?

Main issues

Data duplication

Update the data

Data deletion

We can easily elaborate these issues

Question No: 31 ( Marks: 10 )
Shared-nothing RDBMS architecture requires static partitioning. How do you perform the
partitioning?
