Tuesday, May 19, 2009

Puzzler: Which way to the runway?

Here is a fun puzzle for those looking for a break from normalcy ...

When visiting and looking at my route via a popular map utility on my phone, I came across an interesting artifact.

I'm curious if anyone can find the shadow of this, or for bonus points, the equation needed to calculate the height of said artifact, once the shadow is found!

Monday, March 2, 2009

Power of SQL: Update Replace()

Sometimes you realize after the fact that there's an issue with the way your data is being stored. In this case, we had a field that was storing names. Originally intended to be able to show an admin / show on reports the full person's name without concatenating data together, here is what was being built.
"firstname middlename lastname"
Unfortunately, when the middlename was not filled out, it contained two spaces between the first and last names. In addition to fixing the way that the the data is being updated, there's an easy fix for existing data.

update users
set fullname = replace(fullname,'  ',' ')
where fullname like '%  %'
For clarity, this finds and replaces two spaces with one space. Its easy to adapt to your specific situation and yes, its really that easy.

Wednesday, December 24, 2008

Tag Context

Update: I had a formatting issue in the code example.

I found this method handy when trying to track down the tab Context of a page's execution. This throws an error, catches it and allows you to output (or email ... or post to a log) the needed information of where a page's execution is coming from.
<cftry>
<cfthrow>
<cfcatch>
<cfset e=duplicate(cfcatch)>
<cfdump var="#e.tagContext#">
</cfcatch>
</cftry>
One word of caution though, in my experience, it can over time (several days of try/throw/catching errors) lead to server instability. In my case, it was sending an email in the catch and was left in even after I fixed the issue to monitor / make sure it was fixed. The error you will see is:
An exception occurred when setting up mail server parameters.
This exception was caused by: coldfusion.mail.MailSessionException
For some reason it loses the connection to the mail server and ANY code that tries to send an email will fail. Not good. It was handy and served its purpose, but be warned on your method of implementation.

Sunday, December 14, 2008

Power of SQL: Update Join Tables

This is one of the most common update queries I do when dealing with multiple tables that needs data updated.
update o
set o.status = 'complete'
from orders o
join order_detail od
on o.id = od.order_id
where ... (criteria here)
I often see code that will do a select and then update the subsequent table where id in (list_of_values). Remember that (almost?) anytime you see an IN clause, it can be replaced with a more efficient join statement.

Wednesday, October 8, 2008

Power of SQL: Find and Delete Duplicates

Most likely we have all had the situation, or will most likely encounter it, where we have duplicate data that needs to be removed from a system. For me this was experienced when moving data from a flat file / legacy system into SQL. This is the solution I came up with after getting it into SQL. Perhaps you have a more elegant way in dealing with this, if so, feel free to comment. The execution of this entire process against ~200,000 records took less than 15 seconds on my particular database. The creation of the index seems to be helpful if you are going to run several queries against the data while preparing the statement.

First, the background:
I have a table of notes. Those zero or more of those notes are related to a foreign key. If there is more than one note that is identical, for the same foreign key, it needs to be removed.

It is extremely difficult to compare a text field to another text field. With up to 2GB in length, it can be a daunting task. For my situation, most of the notes were less than a length of 8000, only a small number were longer. If they were longer, we could safely assume if they were not different for the first 8000 chars, they wouldn't be. You may choose to add additional constraints for your situation.

Basic Table layout:
noteId (identity field)
foreignId (integer / foreign key to other data)

noteText (text field)
... other unimportant data

ALTER Statements Before Starting: This adds a checksum and length based on the note. FYI - I did find duplicate checksums with different lengths. If you need to be precise, I suggest using the binary checksum function which is more sensitive, but understand, a checksum CAN have a collision. That means you can have the same checksum for different data.
ALTER TABLE notes ADD len_notes AS len(CAST(notetext AS Varchar(8000)))
ALTER TABLE notes ADD cs_notes AS checksum(CAST(notetext AS Varchar(8000)))

CREATE INDEX temp_index ON notes (foreignId,cs_notes,len_notes)

Now we have just added a column that represents the length of a note and the checksum of a note. If your data is changing, new records will contain NULL values and updated records will NOT recalculate the checksum, so keep that in mind. You may want to put criteria on your select statement that will only update the non-null values if your data changes, or the data that has a length <>

Select and Review part of your data that will be deleted:

SELECT distinct top 100 n2.noteId, n2.cs_notes, n2.len_notes, n2.notetext
FROM notes n1
JOIN notes n2
ON (n1.foreignId = n2.foreignId
AND n1.cs_notes = n2.cs_notes
AND n1.len_notes = n2.len_notes)
GROUP BY n2.noteId, n2.cs_notes, n2.len_notes, n2.notetext
HAVING n2.noteId > min(n1.noteId)
ORDER BY n2.noteId
Delete your duplicates: To be clear here, this will preserve the lowest noteId and delete those with higher ids where the checksum AND length AND foreignId match.
DELETE from notes where note_id in
(
SELECT distinct n2.noteId
FROM notes n1
JOIN notes n2
ON (n1.foreignId = n2.foreignId
AND n1.cs_notes = n2.cs_notes
AND n1.len_notes = n2.len_notes)
GROUP BY n2.noteId
HAVING n2.noteId > min(n1.noteId)
)
Put the table back the way it was:
DROP INDEX notes.temp_index
ALTER TABLE notes DROP column [len_notes]
ALTER TABLE notes DROP column [cs_notes]
I hope this helped.

Tuesday, September 9, 2008

We Need YOU!

As a friend or family member - or even a stranger if you are reading this here, we are are asking for your help. A relative of ours was recently diagnosed with a highly aggressive form of lymphoma at the ripe old age of 33. His doctors at Baylor are working aggressively to eliminate the cancer, and part of his treatment is a stem cell transplant to rebuild his immune system (think of it as a system reboot after that nasty blue screen).

Unfortunately, we haven't been able to find a suitable donor yet. While there are millions of donors already registered, they have not found a "good enough" match. His doctors tell us that he's mostly likely to find a compatible donor in the area where we grew up, which happens to be in the Lincoln, NE area. Our family has set up a donor drive in conjunction with the the National Marrow Donor Program to get more on the donor registry and find a donor for him. Even if you suspect you are not a match for him, there are thousands of others who are also in his position (or will be).

We would appreciate very much if you would consider participating in the donor drive - or by passing the word along to someone who may be able to participate.
The drive will be Saturday, September 27, 2008 from 8am to noon
at the Southwood Community Center (5000 Tipperary Trail).
There is a small fee ($25) for the tissue typing. Minorities are currently free due to high demand and funding that is available.
We need an appx. head count in advance, so please send an email to donordrive @ kottwitz.org if you're planning to come. There are some businesses and people who are donating to the cause by offering prizes for registering as a donor.

You can get additional information about the drive, prizes, as well as all of those questions not already answered by visiting: http://donordrive.kottwitz.org By the way, the organization heading the donation drive is a 501(c)(3) just in case there was any worry about that. If you cannot attend, you can order a "typing kit" online for $52 - all costs are tax deductible either way and they also accept monetary donations for those who do not meet the health requirements.

Again, please share this information with others who may be willing to help. An official flier is on the home page of the website that was setup. If anyone is interested in posting the information at their office or church, they can reference that image or I can send you a copy.

Again, we are grateful to anyone willing to participate. Thank you and have a great day!

If you have questions you can also contact me via Grand Central - located on the right. (Please leave a message though)

Saturday, June 7, 2008

Power of SQL: Indexing and sorting

Coldfusion Muse recently made a couple of posts about clustered indexes on SQL Server. I was commenting off the grid to the Muse that the timely post helped me in solving a data import issue.

To protect the innocent, and make things less prone to narcoleptic episodes, I had the following scenario:
- An Access Database with Categories and Products.
- The First product listed in a category was a "special" product. Problem: There wasn't any information that would indicate that product was the first one.

Example Data:
category_idproduct_idproduct_name...
12Red Jumbo Ball...
11Green Jumbo Ball...
13Blue Jumbo Ball...
24Purple People Eater...
26Yellow Submarine...
25White Christmas...

Once imported, I knew the data was at the whim of the database which meant that I needed to "preserve" the order prior to importing the data. I added another column "autonumber" on my products table in Access.


Once imported, I could set the clustered index to the "autoNum" column in SQL and know the sort order was correct! Of course, it never hurts to add an order by to your select statement. There are many factors that can affect sorting, clustered indexes are just one factor.