<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>
<channel>
	<title>Comments on: Yes, database normalization is good</title>
	<atom:link href="http://blog.semergence.com/2007/08/13/yes-database-normalization-is-good/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.semergence.com/2007/08/13/yes-database-normalization-is-good/</link>
	<description>Seth Ladd's blog about Ruby on Rails and crunching data.</description>
	<pubDate>Fri, 09 Jan 2009 23:53:14 +0000</pubDate>
	<generator>http://wordpress.org/?v=MU</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Anthony Eden</title>
		<link>http://blog.semergence.com/2007/08/13/yes-database-normalization-is-good/#comment-231</link>
		<dc:creator>Anthony Eden</dc:creator>
		<pubDate>Sat, 18 Aug 2007 12:29:50 +0000</pubDate>
		<guid isPermaLink="false">http://204.14.242.104/?p=629#comment-231</guid>
		<description>&lt;p&gt;Ted,&lt;/p&gt;

&lt;p&gt;You seem to be talking about data cleansing more than normalization. I'm just curious if you have any references to usage of the term data normalization in the context that you are using it?&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Ted,</p>
<p>You seem to be talking about data cleansing more than normalization. I&#8217;m just curious if you have any references to usage of the term data normalization in the context that you are using it?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ted Thibodeau Jr</title>
		<link>http://blog.semergence.com/2007/08/13/yes-database-normalization-is-good/#comment-230</link>
		<dc:creator>Ted Thibodeau Jr</dc:creator>
		<pubDate>Wed, 15 Aug 2007 16:47:11 +0000</pubDate>
		<guid isPermaLink="false">http://204.14.242.104/?p=629#comment-230</guid>
		<description>&lt;p&gt;Unfortunately, none of this is really about "data normalization".  This is about relational databases, schema structures, referential integrity, and the like.&lt;/p&gt;

&lt;p&gt;"Data normalization" is much more about having standard formats for your data -- little things like always using "Street" or always using "St" (or "St.") in an address field, or storing all phone numbers as pure numerics without spaces or other punctuation vs. storing phone numbers as strings with whatever formatting the person inputting it chooses (this becomes important when you start dealing with international phone records -- which is also where you have to start thinking about country codes, in addition to area codes and local exchanges).&lt;/p&gt;

&lt;p&gt;Normalization is necessary not only if you're going to be doing joins of any kind, but also when you're doing selects based on the content of a given field, because you cannot match "(617) 555-1212" to "6175551212" or "+1-617-555-1212" without doing some major manipulation. That takes significant time when you start considering all the possible format variants, whether the leading "+1-" breaks a match or not, and whether "617-KL5-1212" also matches...&lt;/p&gt;

&lt;p&gt;Both OLAP and OLTP benefit from normalization, in different ways, but the benefits are inarguable.  Schema design and referential integrity are separate concerns, and &lt;em&gt;those&lt;/em&gt; are where the tradeoffs for OLAP vs. OLTP optimization come in.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Unfortunately, none of this is really about &#8220;data normalization&#8221;.  This is about relational databases, schema structures, referential integrity, and the like.</p>
<p>&#8220;Data normalization&#8221; is much more about having standard formats for your data &#8212; little things like always using &#8220;Street&#8221; or always using &#8220;St&#8221; (or &#8220;St.&#8221;) in an address field, or storing all phone numbers as pure numerics without spaces or other punctuation vs. storing phone numbers as strings with whatever formatting the person inputting it chooses (this becomes important when you start dealing with international phone records &#8212; which is also where you have to start thinking about country codes, in addition to area codes and local exchanges).</p>
<p>Normalization is necessary not only if you&#8217;re going to be doing joins of any kind, but also when you&#8217;re doing selects based on the content of a given field, because you cannot match &#8220;(617) 555-1212&#8221; to &#8220;6175551212&#8221; or &#8220;+1-617-555-1212&#8221; without doing some major manipulation. That takes significant time when you start considering all the possible format variants, whether the leading &#8220;+1-&#8221; breaks a match or not, and whether &#8220;617-KL5-1212&#8221; also matches&#8230;</p>
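<p>To make the phone number case concrete, here is a minimal sketch of normalizing on the way in, so that later selects and joins compare like with like. The helper name <code>normalize_phone</code>, the <code>default_country</code> parameter, and the rule that a bare 10-digit number is North American are all assumptions made for illustration; real international data needs a proper numbering-plan library.</p>

```python
import re

# Standard telephone keypad: letters mapped to the digit they share a key with.
_KEYPAD = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "22233344455566677778889999",
)


def normalize_phone(raw, default_country="1"):
    """Hypothetical helper: reduce a phone string to digits only.

    Assumption: a bare 10-digit result is a North American number,
    so the default country code is prepended to it.
    """
    # Turn exchange letters such as "KL5" into digits, then drop
    # everything that is not a digit (spaces, parentheses, "+", "-").
    digits = re.sub(r"\D", "", raw.upper().translate(_KEYPAD))
    if len(digits) == 10:
        digits = default_country + digits
    return digits


variants = ["(617) 555-1212", "6175551212", "+1-617-555-1212", "617-KL5-1212"]
normalized = {normalize_phone(v) for v in variants}
print(normalized)
```

<p>With this assumed rule, all four variants collapse to one stored value, so an equality match or an indexed join works without per-query string manipulation.</p>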
<p>Both OLAP and OLTP benefit from normalization, in different ways, but the benefits are inarguable.  Schema design and referential integrity are separate concerns, and <em>those</em> are where the tradeoffs for OLAP vs. OLTP optimization come in.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
