Archive for March, 2007

links for 2007-03-30

Thursday, March 29th, 2007
  • How to choose which views to materialize in an OLAP cube, when it is too expensive to materialize all views. This is the next optimization for our aggregation strategies in ActiveWarehouse.
    (tags: olap database)

ActiveWarehouse Gets Some Love

Wednesday, March 28th, 2007

ActiveWarehouse, the Ruby on Rails plugin for data warehouse development, was written up by InfoQ in their article ActiveWarehouse, a New Step for Enterprise Ruby.

I’ve been writing different aggregation strategies for ActiveWarehouse, trying to find something that’s not too slow or cumbersome. ActiveWarehouse supports pluggable aggregation, or rollup, strategies, so you can use what works best for you. We have some very large data sets and very large dimensions (one of our dimensions has 215 million rows). So if ActiveWarehouse can eventually handle that, I think we’re in good shape.

I can say that ActiveWarehouse will work great if you have a smallish data set. I would say it can comfortably handle dimensions of up to a million rows. Of course, no matter how much work we put into optimizing ActiveWarehouse’s aggregation schemes, smart database tuning will always help tremendously.
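
To make "pluggable" a bit more concrete, here is a minimal sketch of the general idea in plain Ruby. Every name in it (ScanRollup, CachedRollup, Cube, rollup, total_by) is hypothetical and is not ActiveWarehouse's actual API; the sample facts are made up as well. It only shows how a cube can delegate its rollups to an interchangeable strategy object.

```ruby
# Hypothetical sketch: these class and method names are illustrative only
# and are NOT ActiveWarehouse's real API.

# Strategy 1: roll up by scanning every fact row on each request.
class ScanRollup
  def rollup(facts, dimension_key, measure)
    totals = Hash.new(0)
    facts.each { |row| totals[row[dimension_key]] += row[measure] }
    totals
  end
end

# Strategy 2: same interface, but remembers results between calls.
# The cache key ignores the facts themselves, which is a simplification
# that only holds while the fact set never changes.
class CachedRollup
  def initialize(inner)
    @inner = inner
    @cache = {}
  end

  def rollup(facts, dimension_key, measure)
    @cache[[dimension_key, measure]] ||= @inner.rollup(facts, dimension_key, measure)
  end
end

# The cube only assumes that its strategy responds to #rollup.
class Cube
  def initialize(facts, strategy)
    @facts = facts
    @strategy = strategy
  end

  def total_by(dimension_key, measure)
    @strategy.rollup(@facts, dimension_key, measure)
  end
end

# Made-up sample data.
facts = [
  { :region => "east", :sales => 10 },
  { :region => "west", :sales => 5 },
  { :region => "east", :sales => 7 }
]

cube = Cube.new(facts, CachedRollup.new(ScanRollup.new))
p cube.total_by(:region, :sales)
```

Swapping CachedRollup for ScanRollup changes how the totals are computed without touching the Cube class, which is the whole point of keeping the strategy pluggable.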

links for 2007-03-28

Tuesday, March 27th, 2007

Creating Combinations of Sets/Arrays/Things in Ruby

Tuesday, March 27th, 2007

I was looking for a way to create combinations of things in Ruby and I found an article by Uncle Bob detailing his attempt at writing a combination generator in Ruby. I modified it slightly to use an array of items, instead of simple indexes.


require 'pp'

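# Returns every k-element combination of the array `items`.
def choose(items, k)
  return [[]] if k == 0                     # one way to choose nothing
  return [] if items.nil? || items.empty?   # nothing left to choose from
  rest = items[0...-1]
  last = items.last
  # Combinations that skip the last item, plus those that include it.
  choose(rest, k) + append_all(choose(rest, k - 1), last)
end

# Appends element to each list, returning new lists.
def append_all(lists, element)
  lists.map { |list| list + [element] }
end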

all = [:a, :b, :c, :d]

pp choose(all,3)

The above code prints out:

[[:a, :b, :c], [:a, :b, :d], [:a, :c, :d], [:b, :c, :d]]

If you don’t want these types of combinations, there is a Ruby library for calculating Permutations which will give you all the different permutations, or orderings, of a set of things.
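
If you are on a newer Ruby (1.8.7 or later, as far as I know), the core Array class has combination and permutation methods built in, so a hand-rolled generator may not be needed at all. A minimal sketch, assuming one of those versions:

```ruby
items = [:a, :b, :c, :d]

# Array#combination enumerates each k-element subset; to_a collects them.
combos = items.combination(3).to_a

# Array#permutation enumerates every ordering of k elements.
perms = [:a, :b, :c].permutation(2).to_a

p combos
p perms.size
```

Here combos holds the same four combinations printed above, and perms holds the six ordered pairs that can be drawn from three items.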

Goodbye Productivity, Hello Desktop Tower Defense

Friday, March 23rd, 2007

I’m not usually one for online games or Flash games. Heck, with a newborn in the house, I’m happy to sit and eat for five minutes. But having discovered Desktop Tower Defense, I can say that I’ve found a great, fun little Flash game. Inspired by Warcraft, this Flash game has you deploying defensive towers to counter an onslaught of little gray circle guys. The more guys you kill, the more money you get and the more towers you can deploy or upgrade. Simple, fun, and you can shoot missiles. Good times.

links for 2007-03-23

Thursday, March 22nd, 2007

Oracle 11g Gains Native OWL Support

Thursday, March 22nd, 2007

Oracle 11g will gain native OWL support.

From the article:

> (2) Native OWL inferencing (for an OWL subset that includes property characteristics, class comparisons, property comparisons, individual comparisons and class expressions) [New API]

Way to go, Oracle! I’ve always had a soft spot for Oracle’s RDF support. The way that you can blend RDF data sets and traditional relational data sets in the same query helps to deploy RDF slowly but surely. Add to that the fact that Oracle has already solved all the main problems that an RDBMS should solve (like ACID compliance, backup and recovery, strong security, and a wide developer toolset), and Oracle’s RDF support (and soon OWL) becomes a strong contender among RDF data stores.

Code Comment o’ the Day

Wednesday, March 21st, 2007

Found this little gem in some code I’m working with:

if admin?
  logger.info("i'm admin lol")

lol indeed.

Why the Semantic Web Marketing Message Has Failed

Wednesday, March 21st, 2007

So some guy writes why the semantic web will fail and ends up on Slashdot. How Slashdot picks their articles, I’ll never know. The article is pure opinion and guesswork (as all predictions seem to be), and it’s perfectly OK for this guy to blog his opinions.

I’m not going to argue that the semantic web (that’s *small s* semantic) will succeed, although I think it will prove useful in a large sense in some form, even if that form isn’t RDF. I think what’s really telling about the doom and gloom post is that the marketing message of the semantic web has failed.

For example, a quote from the blog post:

> The Semantic Web will never work because it depends on businesses working together, on them cooperating

Where, in all of the W3C’s semantic web literature, does it say that companies must work together for the semantic web to succeed? I think this is one of the biggest misinterpretations of the semantic web. For some reason, people think that the semantic web requires large agreed-upon ontologies before anything useful happens. Not only is that nearly impossible (for anything but the most generic or free-form terms and definitions), but as we all know, specifications born out of committee have an awfully hard time meeting the pragmatic needs of the masses.

For the semantic web to succeed, the W3C doesn’t need more technical specifications (although a new RDF XML serialization would be nice). Instead, the W3C needs to completely revamp its marketing message. First, distance the semantic web from AI. AI, no matter how promising, leaves a bad taste in your mouth. We need to completely deny any relationship to AI. Secondly, the W3C needs to rebrand the semantic web as “Simply Putting Your Database On The Web. No More, No Less. Anything Else Is Purely Serendipity.” Thirdly, the W3C needs to really drive home that the semantic web will succeed *only if* it is not built with large top-down ontologies.

So repeat after me: “The semantic web is just an effort to help expose the database that you already have to the web as RDF. Primary keys become URIs, and the intersection of a row and a column is a triple.”

Or, to put it another way:

Problem: I have data, most likely in a relational database, that I need to get on the Web.
Solution: Expose that data as RDF. URIs are the primary keys for the data.
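
To show how little is involved, here is a minimal sketch of that mapping in Ruby. The base URI, the table name, the column names, and the row_to_triples helper are all made up for illustration, and the literal quoting is naive rather than full N-Triples escaping.

```ruby
# Hypothetical sketch: BASE, the table, and the columns are invented
# for illustration; this is not a complete RDF serializer.
BASE = "http://example.com/db"

# Turns one database row (a Hash) into N-Triples-style lines.
# The primary key becomes the subject URI, each remaining column becomes
# a predicate URI, and the cell value becomes a literal object.
def row_to_triples(table, primary_key, row)
  subject = "<#{BASE}/#{table}/#{row[primary_key]}>"
  row.reject { |column, _| column == primary_key }.map do |column, value|
    predicate = "<#{BASE}/schema/#{table}/#{column}>"
    # to_s.inspect adds double quotes; fine for simple values only.
    "#{subject} #{predicate} #{value.to_s.inspect} ."
  end
end

row = { :id => 42, :name => "Ada", :city => "London" }
triples = row_to_triples("people", :id, row)
puts triples
# Prints:
# <http://example.com/db/people/42> <http://example.com/db/schema/people/name> "Ada" .
# <http://example.com/db/people/42> <http://example.com/db/schema/people/city> "London" .
```

One row with two non-key columns yields two triples, all sharing the subject URI built from the primary key.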

16 Years of Discovery Magazine Now Online

Wednesday, March 21st, 2007

16 years of Discovery Magazine are now online for your pop science needs. It’s a fantastic resource for science reading that’s lighter and fluffier than something like Nature or Science.