Final term solved paper: CS614 - Data Warehousing, current Spring, July 2011
Question No: 1 ( Marks: 1 ) – Please choose one
It is observed that every year the amount of data recorded in an organization
Doubles (handouts page # 6)
Remains same as previous year
Question No: 2 ( Marks: 1 ) – Please choose one
Multidimensional databases typically use proprietary __________ format to store
pre-summarized cube structures.
File ( Page # 69 )
Question No: 3 ( Marks: 1 ) – Please choose one
Pre-computed _______ can solve performance problems
Aggregates (page # 101)
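As a rough illustration of why pre-computed aggregates help, the sketch below summarizes detail rows once and then answers queries from the summary. The sample rows and the names build_aggregate, monthly_sales and total_for are hypothetical, not taken from the handouts.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, month, sales_amount)
sales = [
    ("North", "2011-01", 120.0),
    ("North", "2011-01", 80.0),
    ("South", "2011-01", 50.0),
    ("North", "2011-02", 30.0),
]


def build_aggregate(rows):
    """Summarize the detail rows once, for example at load time."""
    totals = defaultdict(float)
    for region, month, amount in rows:
        totals[(region, month)] += amount
    return dict(totals)


monthly_sales = build_aggregate(sales)


def total_for(region, month):
    """Answer a query from the aggregate instead of scanning the detail rows."""
    return monthly_sales.get((region, month), 0.0)
```

Each call to total_for is a single dictionary lookup, whereas the same question asked of the detail rows would need a full scan.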
Question No: 4 ( Marks: 1 ) – Please choose one
_______________, if fits into memory, costs only one disk I/O access to locate a
record by given key.
A Dense Index (page # 211)
A Sparse Index
An Inverted Index
None of These
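A minimal sketch of the dense-index idea, assuming fixed-width records and using an in-memory buffer as a stand-in for the data file on disk. The record layout and the names dense_index and lookup are illustrative assumptions.

```python
import io

# Stand-in for a data file on disk: fixed-width 16-byte records.
RECORD_SIZE = 16
records = [(101, "Ali"), (205, "Sara"), (317, "Omar")]

data_file = io.BytesIO()
dense_index = {}  # one entry per record: key -> byte offset in the data file

for key, name in records:
    dense_index[key] = data_file.tell()
    data_file.write(f"{key}:{name}".ljust(RECORD_SIZE).encode("ascii"))


def lookup(key):
    """Find the offset in memory, then do a single read of the data file."""
    offset = dense_index.get(key)
    if offset is None:
        return None  # a missing key is ruled out without touching the file
    data_file.seek(offset)
    return data_file.read(RECORD_SIZE).decode("ascii").strip()
```

Because the index holds an entry for every record and is assumed to fit in memory, the only access to the data file is the one read that fetches the record itself.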
Question No: 5 ( Marks: 1 ) – Please choose one
The degree of similarity between two records, often measured by a numerical
value between _______, usually depends on application characteristics.
0 and 1 (page # 157 )
0 and 10
0 and 100
0 and 99
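One possible similarity measure that falls between 0 and 1 is the Jaccard coefficient over word tokens. It is used here only as an example; the handouts do not prescribe a specific measure, and the function name is hypothetical.

```python
def jaccard_similarity(a, b):
    """Return a similarity in [0, 1] based on shared word tokens."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty records are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

A value of 1 means the two records share all their tokens and 0 means they share none. Which threshold counts as "similar enough" depends on the application.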
Question No: 6 ( Marks: 1 ) – Please choose one
The purpose of the House of Quality technique is to reduce ______ types of risk.
Two (page # 181)
Question No: 7 ( Marks: 1 ) – Please choose one
NUMA stands for __________
Non-uniform Memory Access ( page # 194)
Non-updateable Memory Architecture
New Universal Memory Architecture
Question No: 8 ( Marks: 1 ) – Please choose one
Which is the least appropriate join operation for Pipeline parallelism?
Question No: 9 ( Marks: 1 ) – Please choose one
There are many variants of the traditional nested-loop join. If the index is built as
part of the query plan and subsequently dropped, it is called
Naive nested-loop join
Index nested-loop join
Temporary index nested-loop join ( page # 230)
None of these
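The temporary index variant can be sketched as follows, assuming tables are lists of dictionaries and using a hash map as the index. A real DBMS would typically build a different index structure; the function and column names are hypothetical.

```python
from collections import defaultdict


def temp_index_nested_loop_join(outer, inner, outer_key, inner_key):
    """Join two row lists using an index that exists only for this query."""
    # Build the index on the inner table as part of the query plan.
    temp_index = defaultdict(list)
    for row in inner:
        temp_index[row[inner_key]].append(row)

    # Outer loop: probe the index instead of scanning the inner table.
    result = []
    for outer_row in outer:
        for inner_row in temp_index.get(outer_row[outer_key], []):
            result.append({**outer_row, **inner_row})

    # Drop the index once the join is finished.
    del temp_index
    return result
```

For example, joining a list of orders to a list of customers on cust_id returns one merged row per matching pair, and no index remains afterwards.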
Question No: 10 ( Marks: 1 ) – Please choose one
Data mining derives its name from the similarities between searching for valuable
business information in a large database, for example, finding linked products in
gigabytes of store scanner data, and mining a mountain for a _________ of
Question No: 11 ( Marks: 1 ) – Please choose one
With data mining, the best way to accomplish this is by setting aside some of
your data in a ________ to isolate it from the mining process; once the mining is
complete, the results can be tested against the isolated data to confirm the
Question No: 12 ( Marks: 1 ) – Please choose one
Kimball's iterative data warehouse development approach drew on decades
of experience to develop the _____________.
Business Dimensional Lifecycle (page # 276 )
Data Warehouse Dimension
Business Definition Lifecycle
Question No: 13 ( Marks: 1 ) – Please choose one
We must try to find the one access tool that will handle all the needs of their
Question No: 14 ( Marks: 1 ) – Please choose one
For a smooth DWH implementation we must be a technologist.
False (page # 306)
Question No: 15 ( Marks: 1 ) – Please choose one
During the application specification activity, we also must give consideration to
the organization of the applications.
True ( page # 294 )
Question No: 16 ( Marks: 1 ) – Please choose one
Investing years in architecture and forgetting the primary purpose of solving
business problems results in an inefficient application. This is an example of
Extreme Technology Design
Extreme Architecture Design
None of these (page # 303)
Question No: 17 ( Marks: 1 ) – Please choose one
The most recent attack is the ________ attack on the cotton crop during 2003-
04, resulting in a loss of nearly 0.5 million bales.
Boll Worm (VIDEO LECTURE # 38)
Question No: 18 ( Marks: 1 ) – Please choose one
The users of a data warehouse are knowledge workers; in other words, they are
_________ in the organization.
Decision maker (page# 10 )
Question No: 19 ( Marks: 1 ) – Please choose one
_________ breaks a table into multiple tables based upon common column
Horizontal splitting (page # 46 )
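A small sketch of horizontal splitting, assuming a table is a list of dictionaries and the split is made on the values of one column. The function name and sample data are hypothetical.

```python
from collections import defaultdict


def horizontal_split(rows, column):
    """Split one table into several tables, one per distinct column value."""
    tables = defaultdict(list)
    for row in rows:
        tables[row[column]].append(row)
    return dict(tables)
```

Splitting a student table on a campus column, for instance, gives one smaller table per campus, each with the same columns as the original.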
Question No: 20 ( Marks: 1 ) – Please choose one
Execution can be completed successfully or it may be stopped due to some
error. In case of successful completion of execution all the transactions will be
Committed to the database (page # 398 last line)
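The commit-on-success behaviour can be illustrated with Python's built-in sqlite3 module. This is only an analogy under assumed table and function names, not the tool the question refers to.

```python
import sqlite3


def make_staging_db():
    """Create an in-memory database with a hypothetical staging table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE staging (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    return conn


def load_rows(conn, rows):
    """Insert all rows in one transaction; return True if it was committed."""
    try:
        # The connection context manager commits if the block succeeds
        # and rolls back if it raises.
        with conn:
            conn.executemany("INSERT INTO staging (id, name) VALUES (?, ?)", rows)
        return True
    except sqlite3.Error:
        return False
```

If any row fails, for example on a duplicate key, none of the rows from that call remain in the table.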
Question No: 21 ( Marks: 2 )
What is meant by the statement "Be a diplomat, NOT a technologist" in the
context of a data warehouse development project?
Be a diplomat, NOT a technologist
The biggest problem you will face during a warehouse implementation will be people, not the technology or the development. You’re going to have senior management complaining about completion dates and unclear objectives. You’re going to have development people protesting that everything takes too long and why can’t they do it the old way? You’re going to have users with outrageously unrealistic expectations, who are used to systems that require mouse-clicking but not much intellectual investment on their part. And you’re going to grow exhausted, separating out Needs from Wants at all levels. Commit from the outset to work very hard at communicating the realities, encouraging investment, and cultivating the development of new skills in your team and your users (and even your bosses).
Question No: 22 ( Marks: 2 )
Elaborate the concept of data parallelism.
- Parallel execution of a single data manipulation task across multiple partitions of data.
- Partitions static or dynamic.
- Tasks executed almost independently across partitions.
- “Query coordinator” must coordinate between the independently executing processes.
Data parallelism is arguably the simplest form of parallelization: a single data operation is executed in parallel across multiple partitions of the data. The partitions may be defined statically or dynamically, but the same operator is applied to all of them concurrently. The idea of data parallelism has existed for a very long time.
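The points above can be sketched as follows, assuming static round-robin partitioning and a simple sum as the single operation. The function names are hypothetical, and because CPython threads do not speed up CPU-bound work, the sketch shows the structure rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n):
    """Static round-robin partitioning into n partitions."""
    return [rows[i::n] for i in range(n)]


def partial_sum(part):
    """The single operation that every partition runs independently."""
    return sum(part)


def parallel_sum(rows, n_partitions=4):
    parts = partition(rows, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(partial_sum, parts))
    # The "query coordinator" role: combine the partial results.
    return sum(partials)
```

Calling parallel_sum on a list of numbers gives the same total as the built-in sum, with the work divided among the partitions.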
Question No: 23 ( Marks: 2 )
What will be the effect if we program a package by using DTS object model?
Question No: 24 ( Marks: 3 )
What is meant by the classification process? How we measure the accuracy of
Classification means that, based on the properties of existing data, we form classes or groups, i.e. we classify the data.
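One common way to measure accuracy, offered here as an assumption since the answer above does not cover it, is the fraction of records whose predicted class matches the known class.

```python
def accuracy(actual, predicted):
    """Fraction of records whose predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one record is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)
```

In practice the known classes would come from data that was kept out of model building, so the score reflects how the classifier behaves on unseen records.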
Question No: 25 ( Marks: 3 )
How page dimension captures the static and dynamic nature of different web
Question No: 26 ( Marks: 3 )
Write down the limitations of pipeline parallelism.
Pipeline parallelism is a good fit for data warehousing (where we are working with lots of data), but it makes no sense for OLTP because OLTP tasks are not big enough to justify breaking them down into subtasks.
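To make the idea of breaking a task into subtasks concrete, here is a minimal sketch in which each stage runs in its own thread and passes rows to the next stage through a queue. The names stage and run_pipeline are hypothetical, and error handling is omitted.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the input stream


def stage(func, inbox, outbox):
    """Apply func to each item as it arrives and pass the result on."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)
            break
        outbox.put(func(item))


def run_pipeline(items, funcs):
    """Run items through the stages in funcs, one thread per stage."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(SENTINEL)

    results = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

With only a handful of rows, the cost of starting the stages and moving items between queues outweighs any benefit, which mirrors why the approach suits large warehouse workloads rather than small OLTP tasks.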
Question No: 27 ( Marks: 5 )
For a maximum performance of Bitmapped index, what characteristics a query
Question No: 28 ( Marks: 5 )
How do the three parallel tracks capture the user requirements in Kimball's data
warehouse life cycle Road Map?
Question No: 29 ( Marks: 5 )
How time contiguous log entries and HTTP secure socket layer are used for user
session identification? What are the limitations of these techniques?
Question No: 30 ( Marks: 10 )
What are the issues regarding the record management tools at campuses where
text files are used to store data?
Update the data
We can easily elaborate these issues
Question No: 31 ( Marks: 10 )
Shared RDBMS architecture requires a static partitioning. How do you perform the