This post originated from an RSS feed registered with Ruby Buzz by Francis Hwang.
Original Post: JsonInference
Feed Title: Francis Hwang's site: ruby
Feed URL: http://fhwang.net/syndicate/ruby.atom
Feed Description: Author & artist Francis Hwang's personal site.
More than once, I've been asked to make sense of a document store underneath an out-of-control codebase. For my last project, I wrote JsonInference to help me see the entire data store at once, so I could look for common patterns.
Given a bunch of JSON documents that are assumed to be similar, JsonInference reports statistics on what they have in common. For example, feed a report object a bunch of JSON hashes:
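# Assumes the gem loads under this name and that huge_json is an
# already-parsed Hash with an array of documents under 'docs'.
require 'json_inference'

report = JsonInference.new_report
huge_json['docs'].each do |doc|
  report << doc
end
puts report.to_s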
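To make the idea concrete, here is a minimal sketch of the kind of tallying such a report could do. This is not JsonInference's implementation: the KeyTally class and its methods are hypothetical names made up for illustration, and it only looks at top-level keys.

```ruby
require 'json'

# Hypothetical helper, not part of JsonInference: tallies how often each
# top-level key appears across documents and which value classes it holds.
class KeyTally
  attr_reader :doc_count, :key_counts, :type_counts

  def initialize
    @doc_count = 0
    @key_counts = Hash.new(0)
    @type_counts = Hash.new { |hash, key| hash[key] = Hash.new(0) }
  end

  # Record one parsed JSON document (a Hash); returns self so calls chain.
  def <<(doc)
    @doc_count += 1
    doc.each do |key, value|
      @key_counts[key] += 1
      @type_counts[key][value.class.name] += 1
    end
    self
  end

  # One line per key: how many documents have it, and with which types.
  def to_s
    @key_counts.keys.sort.map do |key|
      types = @type_counts[key].map { |name, n| "#{name} x#{n}" }.join(', ')
      "#{key}: #{@key_counts[key]}/#{@doc_count} (#{types})"
    end.join("\n")
  end
end

# Three similar documents with an inconsistent "id" type and a missing "name".
docs = JSON.parse('[{"id": 1, "name": "a"}, {"id": "2"}, {"id": 3, "name": null}]')
tally = KeyTally.new
docs.each { |doc| tally << doc }
puts tally
```

Even on three documents, this kind of tally surfaces the things that tend to bite in a document store: a key that is sometimes missing, and a key whose values do not all share one type.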
I keep meaning to write more about document stores and the challenges they present to teams modeling data. I don't necessarily think they're worse than relational stores, but they do seem to offer lots of unfamiliar pitfalls.