Recently I had to compare lots of sets of data. Determining if there are any differences at all, or if there are rows that exist in one but not the other, is easy. But I needed to compare every column, calculate match percentages, and report every difference.
I was able to take what would have been a lot of tedious query writing and create a standard way to do it. All you have to do is a little bit of prep on your two data sets, and these queries will compare, quantify, and report all mismatches. (more…)
Every solution I have seen required some control flow statements (such as looping) and often creation of a sproc or function. These are usually acceptable, but it really only requires a very simple SQL statement. (more…)
Should a table always have a surrogate key? Is it really best practice, or should we stick to natural keys? Let’s dig in to see if there’s one clear answer. (more…)
Spreading values across rows (or subsets of rows within a set of rows) was once cumbersome and difficult, but the invention of window functions (which have been around for a while now) have made it much easier. Here are some examples you can take and tweak to fit your needs. (more…)
Over my time I’ve seen many unique technical solutions in databases. And now I can share some of the worst ones with you!
Disclaimer: I may have been the architect of some of these. Sorry DBA team! (more…)
A reference for all the different types of keys used in logical and physical database design.
These tables will be used to illustrate all the different keys: (more…)
Each row contains one value, of the appropriate type, for each column, and there are no duplicate rows.
This is a simple rewording of the formal definition:
“Let relation r have attributes A1, …, An, of types T1, …, Tn, respectively. Then r is in first normal form (1NF) if and only if, for all tuples t appearing in r, the value of attribute Ai in t is of type Ti (i= 1, …, n).”
Date, C.J. Database Design & Relational Theory. Sebastopol: O,Reilly Media, Inc., 2012. Print. (Emphasis theirs)
Have you ever seen a table where each row had a ‘start date’ and an ‘end date’? I’m sure most of us have seen one or created one.
These tables tend to pose a key problem (no pun intended) if the following business logic applies: “If two or more rows represent different date ranges for a single item, the ranges cannot overlap.” In other words, the date range must be unique from the start date until the end date for any particular item represented by the table. (more…)