The Artima Developer Community
Sponsored Link

Agile Buzz Forum
Bliki: ReportingDatabase

0 replies on 1 page.

Welcome Guest
  Sign In

Go back to the topic listing  Back to Topic List Click to reply to this topic  Reply to this Topic Click to search messages in this forum  Search Forum Click for a threaded view of the topic  Threaded View   
Previous Topic   Next Topic
Flat View: This topic has 0 replies on 1 page
Martin Fowler

Posts: 1573
Nickname: mfowler
Registered: Nov, 2002

Martin Fowler is an author and loud mouth on software development
Bliki: ReportingDatabase Posted: Apr 2, 2014 7:02 AM
Reply to this message Reply

This post originated from an RSS feed registered with Agile Buzz by Martin Fowler.
Original Post: Bliki: ReportingDatabase
Feed Title: Martin Fowler's Bliki
Feed URL: http://martinfowler.com/feed.atom
Feed Description: A cross between a blog and wiki of my partly-formed ideas on software development
Latest Agile Buzz Posts
Latest Agile Buzz Posts by Martin Fowler
Latest Posts From Martin Fowler's Bliki

Advertisement

Most EnterpriseApplications store persistent data with a database. This database supports operational updates of the application's state, and also various reports used for decision support and analysis. The operational needs and the reporting needs are, however, often quite different - with different requirements from a schema and different data access patterns. When this happens it's often a wise idea to separate the reporting needs into a reporting database, which takes a copy of the essential operational data but represents it in a different schema.

Such a reporting database is a completely different database to the operational database. It may be a completely different database product, using PolyglotPersistence. It should be designed around the reporting needs.

A reporting database has a number of advantages:

  • The structure of the reporting database can be specifically designed to make it easier to write reports.
  • You don't need to normalize a reporting database, because it's read-only. Feel free to duplicate data as much as needed to make queries and reporting easier.
  • The development team can refactor the operational database without needing to change the reporting database.
  • Queries run against the reporting database don't add to the load on of the operational database.
  • You can store derived data in the database, making it easier to write reports that use the derived data without having to introduce a separate set of derivation logic.
  • You may have multiple reporting databases for different reporting needs.

The downside to a reporting database is that its data has to be kept up to date. The easiest case is when you do something like use an overnight run to populate the reporting database. This often works quite well since many reporting needs work perfectly well with yesterday's data. If you need more timely data you can use a messaging system so that any changes to the operational database are forwarded to the reporting database. This is more complicated, but the data can be kept fresher. Often most reports can use slightly stale data and you can produce special case reports for things that really need to have this second's data [1].

A variation on this is to use views. This encapsulates the operational data and allows you to denormalize. It doesn't allow you to separate the operational load from the reporting load. More seriously you are limited to what views can derive and you can't take advantage of derivations that are written in an in-memory programming environment.

A reporting database fits well when you have a lot of domain logic in a domain model or other in-memory code. The domain logic can be used to process updates to the operational data, but also to calculate derived data which to enrich the reporting database.

I originally wrote this entry on April 2nd 2004. I took advantage of its ten-year anniversary to update the text.

Notes

1: These days the desire seems to be for near-real time analytics. I'm skeptical of the value of this. Often when analyzing data trends you don't need to react right away, and your thinking improves when you give it time for a proper mulling. Reacting too quickly leads to a form of information hysteresis, where you react badly to data that's changing too rapidly to get a proper picture of what's going on.

Read: Bliki: ReportingDatabase

Topic: 3 Undisclosed Tips for Digital Creatives Previous Topic   Next Topic Topic: Short Overview of Estimation

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use