Hello,
I am going to be a bit cheeky here and ask for some information on Analyst
Services and the like that I have been asked to answer. Unfortunatly I do not
posess the knowledge to do most of this so I was wondering if someone out
there did, or could forward me a web site that could. Anyway here are the
questions.
1. What decides or who decides what physical implementation model to use for
OLAP? From what I've read there seems to be choices about what is pyhiscally
stored and what is referenced either from storage (through some sort of
cache) or from the source fact tables. ROLAP/HOLAP/MOLAP are mentioned at
some point. I'm guessing that this could be important but something that
could be changed as the logical view to the outside world wouldn't change
even if the physical implementation was changed at a later date. However,
having to go back and tune things later on creates a bit of a support
overhead.
2. The beginning of the book mentions concepts such as OLAP databases and
dimension libraries. Is an OLAP database a way of implementing a
datawarehouse for specific contexts such as for particular departments (e.g.
a finance database, a marketing database etc.) or is there something more
subtle to it than that? Similarly, what is the concept of dimension libraries
all about? More importantly, if we don't take an approach that uses these
structures, will we get ourselves into an organisational mess in the longer
term? Currently DNx is used by all parts of the business so the concept of
creating seperate data marts for different purposes doesn't really exist at
Dawson News.
3. Another cocept mentioned at the start of the book is "private" dimensions
that can only be used in certain OLAP cubes or OLAP databases. Whats this all
about? Superficially it seems to be a way of creating dimensions that only
operate in certain contexts which on the face of it seems to be a bit of a
cop out (ie making information meaningful to everybody but then allowing
exceptions when the going gets tough) .
4. This is a simple technical question! Is the OLAP repository in Access
(the default) or have we migrated it over to SQL Server.
5. This is a much bigger question. What is the best way of updating cubes?
One of the problems we've got is that the base table(s) currently in DNx SQL
Server (RTLSAL tables) are a summary so selective fields are incrementally
updated each day. Any ideas how the OLAP refreshes would work in this
specific case? Also, dimensions change from time to time so what is the best
way to ensure that cubes are kept up to date as the "static" data changes?
Its quite important that any data reported against is both non-volatile (ie
update once a day) and current (ie everything in synch).
6. Also linked to 5. but possibly even more genreal, what is DTS capable of?
We currently populate the RTLSAL and other tables using VB scripts which
contain the logic to deternine when to insert new rows and when to udpate
existing ones and what rules to apply when applying updates. Is this
something we could replace with DTS?
Thanks
J
I'll do my best to answer your questions:
1) The decision of which storage mode to use (MOLAP/HOLAP/ROLAP)
depends on many things. A good rule of thumb: If you need real-time
OLAP because your data changes throughout the day, you need to use
ROLAP. However the trade-off is query performance. Microsoft
recommends MOLAP for better query performance. The trade-off is
storage space, which is usually cheap nowadays. HOLAP is rarely used
from my observation.
2) OLAP is a term that refers to systems that deal with
multi-dimensional data storage and related capabilities. Conformed
dimensions are a important to an integrated data warehouse and the data
warehouse bus architecture (read Ralph Kimballs book Data Warehouse
Toolkit)
3)You should use shared dimensions whenever possible instead of Private
ones. The drawback is shared dimensions are always in memory, but if
they are small, it's not a problem.
4)Microsoft recommends you migrate your repository to SQL Server for
all the reasons Microsoft wants everything in SQL Server.
5)The one thing I found from experience is the success of your OLAP
solution is greatly impacted by your data source that feeds the cubes.
Spend as much time as possible to design the most effective data mart
to feed your cubes. In other words, do all of your updating in the
relational database as appropriate (using SCD where necessary).
Analysis Services will handle the cube processing increntally, or full
depending on your implementation.
6) Yes, DTS can handle most ETL tasks that aren't too difficult, but
you have to be creative because it doesn't do everything
out-of-the-box. Performance doesn't compare to say, Informatica, but
it's easy to use, comes free with SQL Server, and it better than
managing just scripts to load data.
-Kory
Julie wrote:
> Hello,
> I am going to be a bit cheeky here and ask for some information on Analyst
> Services and the like that I have been asked to answer. Unfortunatly I do not
> posess the knowledge to do most of this so I was wondering if someone out
> there did, or could forward me a web site that could. Anyway here are the
> questions.
> 1. What decides or who decides what physical implementation model to use for
> OLAP? From what I've read there seems to be choices about what is pyhiscally
> stored and what is referenced either from storage (through some sort of
> cache) or from the source fact tables. ROLAP/HOLAP/MOLAP are mentioned at
> some point. I'm guessing that this could be important but something that
> could be changed as the logical view to the outside world wouldn't change
> even if the physical implementation was changed at a later date. However,
> having to go back and tune things later on creates a bit of a support
> overhead.
> 2. The beginning of the book mentions concepts such as OLAP databases and
> dimension libraries. Is an OLAP database a way of implementing a
> datawarehouse for specific contexts such as for particular departments (e.g.
> a finance database, a marketing database etc.) or is there something more
> subtle to it than that? Similarly, what is the concept of dimension libraries
> all about? More importantly, if we don't take an approach that uses these
> structures, will we get ourselves into an organisational mess in the longer
> term? Currently DNx is used by all parts of the business so the concept of
> creating seperate data marts for different purposes doesn't really exist at
> Dawson News.
> 3. Another cocept mentioned at the start of the book is "private" dimensions
> that can only be used in certain OLAP cubes or OLAP databases. Whats this all
> about? Superficially it seems to be a way of creating dimensions that only
> operate in certain contexts which on the face of it seems to be a bit of a
> cop out (ie making information meaningful to everybody but then allowing
> exceptions when the going gets tough) .
> 4. This is a simple technical question! Is the OLAP repository in Access
> (the default) or have we migrated it over to SQL Server.
> 5. This is a much bigger question. What is the best way of updating cubes?
> One of the problems we've got is that the base table(s) currently in DNx SQL
> Server (RTLSAL tables) are a summary so selective fields are incrementally
> updated each day. Any ideas how the OLAP refreshes would work in this
> specific case? Also, dimensions change from time to time so what is the best
> way to ensure that cubes are kept up to date as the "static" data changes?
> Its quite important that any data reported against is both non-volatile (ie
> update once a day) and current (ie everything in synch).
> 6. Also linked to 5. but possibly even more genreal, what is DTS capable of?
> We currently populate the RTLSAL and other tables using VB scripts which
> contain the logic to deternine when to insert new rows and when to udpate
> existing ones and what rules to apply when applying updates. Is this
> something we could replace with DTS?
> Thanks
> J
|||Thanks KoryS
"KoryS" wrote:
> I'll do my best to answer your questions:
> 1) The decision of which storage mode to use (MOLAP/HOLAP/ROLAP)
> depends on many things. A good rule of thumb: If you need real-time
> OLAP because your data changes throughout the day, you need to use
> ROLAP. However the trade-off is query performance. Microsoft
> recommends MOLAP for better query performance. The trade-off is
> storage space, which is usually cheap nowadays. HOLAP is rarely used
> from my observation.
> 2) OLAP is a term that refers to systems that deal with
> multi-dimensional data storage and related capabilities. Conformed
> dimensions are a important to an integrated data warehouse and the data
> warehouse bus architecture (read Ralph Kimballs book Data Warehouse
> Toolkit)
> 3)You should use shared dimensions whenever possible instead of Private
> ones. The drawback is shared dimensions are always in memory, but if
> they are small, it's not a problem.
> 4)Microsoft recommends you migrate your repository to SQL Server for
> all the reasons Microsoft wants everything in SQL Server.
> 5)The one thing I found from experience is the success of your OLAP
> solution is greatly impacted by your data source that feeds the cubes.
> Spend as much time as possible to design the most effective data mart
> to feed your cubes. In other words, do all of your updating in the
> relational database as appropriate (using SCD where necessary).
> Analysis Services will handle the cube processing increntally, or full
> depending on your implementation.
> 6) Yes, DTS can handle most ETL tasks that aren't too difficult, but
> you have to be creative because it doesn't do everything
> out-of-the-box. Performance doesn't compare to say, Informatica, but
> it's easy to use, comes free with SQL Server, and it better than
> managing just scripts to load data.
> -Kory
> Julie wrote:
>
|||Hello Julie,
Have you looked at SQL Server and Analysis Services 2005? There are
some features that may help with the updating of the OLAP cubes. DTS
has been replaced with a new powerful ETL tool (SQL Server Integration
Services), which comes with a slowly changing dimensions transformation
task.
Analysis Services 2005 has a new optimization engine allowing queries
to run against cubes or directly against the RDMBS.
Check out:
http://bi-on-sql-server.blogspot.com...mensional.html
If you are in the process of updating your current system in the next
few months you should look closely at SQL Server 2005. It is currently
still in beta with a set release date this November.
Myles Matheson
Data Warehouse Architect
http://bi-on-sql-server.blogspot.com/
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment