The Artima Developer Community
.NET Buzz Forum
Solving Business Problems with SQL Server 2005 Data Mining

Tim Sneath

Tim Sneath is a .NET developer for Microsoft in the UK.
Solving Business Problems with SQL Server 2005 Data Mining Posted: Jul 18, 2004 2:07 AM

This post originated from an RSS feed registered with .NET Buzz by Tim Sneath.
Original Post: Solving Business Problems with SQL Server 2005 Data Mining
Feed Title: Tim Sneath's Blog
Feed Description: Random mumblings on Microsoft, .NET, and other topics.

Data mining is about exploring your data, finding patterns and making predictions. Where querying and analysis tell you what, data mining tells you why and how. In SQL Server 2005, data mining is made accessible and easy to use through an integrated user interface, cross-product integration and familiar, standard APIs. SQL Server 2000 shipped with two algorithms; there are now a total of seven available out of the box, and it's possible to extend this further using an SDK.

Data Organisation
How can we organise data to provide the most pertinent information? OLAP provides fast queries with aggregations, but the hierarchies are schema-driven. That makes it hard to know which hierarchies are meaningful or yield actionable information. "Intelligent" OLAP provides self-organising cubes with data-driven hierarchies. You can achieve this with a data mining dimension: from a cube, you generate a mining model that is used in turn to create a new dimension.

For example, the clustering algorithm takes the input data and attempts to segment it into groups. You can then rename those groups, create a data mining dimension from them, and use the groups as dimension members for slicing data. Similarly, a decision trees algorithm could be used to generate a parent-child dimension. This enables us to determine (for example) which groups of customers are the most profitable, in a far more intelligent way than if we'd used OLAP alone without integrating data mining algorithms.
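
To make the idea concrete, here is a minimal sketch in plain Python rather than in SQL Server itself. It assumes a tiny one-dimensional k-means as a stand-in for the clustering algorithm, and the customer figures, the segment names and the helper `kmeans_1d` are all invented for illustration. It segments customers by spend, renames the resulting groups, and then slices profit by those groups much as a data mining dimension would.

```python
from collections import defaultdict


def kmeans_1d(values, k, iterations=20):
    """Tiny 1-D k-means: returns (cluster index per value, centroids)."""
    ordered = sorted(values)
    # Seed the centroids by spreading them evenly over the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assign every value to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignment) if a == c]
            if members:
                centroids[c] = sum(members) / len(members)
    return assignment, centroids


# Hypothetical customers: (name, annual spend, profit).
customers = [
    ("A", 120, 10), ("B", 150, 12), ("C", 130, 11),
    ("D", 900, 90), ("E", 950, 100), ("F", 1000, 95),
    ("G", 5000, 300), ("H", 5200, 310),
]

assignment, centroids = kmeans_1d([spend for _, spend, _ in customers], k=3)

# Rename the anonymous clusters, lowest-spending cluster first.
names = ["Occasional", "Regular", "Premium"]
by_spend = sorted(range(3), key=lambda c: centroids[c])
label = {cluster: names[rank] for rank, cluster in enumerate(by_spend)}

# Slice profit by the new "dimension member" each customer belongs to.
profit_by_segment = defaultdict(int)
for (_, _, profit), cluster in zip(customers, assignment):
    profit_by_segment[label[cluster]] += profit

print(dict(profit_by_segment))
```

The point of the sketch is that the groups come from the data, not from a hand-built hierarchy; the renaming step only attaches business-friendly labels to whatever segments the algorithm found.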

Data Cleansing
How do we clean data as we load it into the data warehouse? We can use DTS pipeline integration in combination with a data mining clustering algorithm to separate clean data from bad data. This can help us detect spurious data members (for example, a parent younger than 10 years old).

The steps are as follows:

  • Create a clustering model for the data you want to cleanse
  • Create a DTS project
  • Add a data source
  • Add a Mining Query Transform using the PredictCaseLikelihood() function
  • Do a conditional split based on the output.
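
The shape of that pipeline can be sketched in plain Python. This is an analogy only, not the DTS or DMX implementation: `fit_model`, `case_likelihood` and `conditional_split` are hypothetical names, the "model" is a single cluster described by a mean and a spread, and the score is a simple bell-shaped function rather than whatever PredictCaseLikelihood() actually computes. The ages and the 0.05 threshold are likewise made up.

```python
import math
from statistics import fmean, pstdev


def fit_model(reference_ages):
    """Stand-in for a trained clustering model: one cluster, mean and spread."""
    return fmean(reference_ages), pstdev(reference_ages)


def case_likelihood(model, age):
    """Score between 0 and 1: 1.0 at the cluster centre, lower further away."""
    mean, spread = model
    z = (age - mean) / spread
    return math.exp(-0.5 * z * z)


def conditional_split(rows, model, threshold=0.05):
    """Route each row to the clean or the suspect output based on its score."""
    clean, suspect = [], []
    for row in rows:
        score = case_likelihood(model, row["parent_age"])
        (clean if score >= threshold else suspect).append(row)
    return clean, suspect


# Hypothetical reference sample used to build the model.
model = fit_model([34, 41, 29, 38, 45, 36, 52, 31])

# Hypothetical incoming rows from the data source.
incoming = [
    {"id": 1, "parent_age": 37},
    {"id": 2, "parent_age": 8},   # a "parent" younger than 10
    {"id": 3, "parent_age": 44},
]

clean, suspect = conditional_split(incoming, model)
print("clean:", [r["id"] for r in clean])
print("suspect:", [r["id"] for r in suspect])
```

Rows that sit far from every cluster get a low score and are diverted for review instead of being loaded, which is the role the conditional split plays in the last step.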

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use