Sunday, February 15, 2009

The mess with Dates - And what really IS a valid date?

To add to the confusion with valid DATE values in MySQL, there is the issue that it's not really 100% clear what a valid date is. Most RDBMS systems aren't as forgiving as MySQL, so trying to insert a date in a DATE column like this '2009-02-30' will cause an error with, say Oracle (which has it's own problems, as the Oracle date datatype is really a DATETIME, but the default is to only display the DATE part. This is onle of the #1 mistakes newcomers to Oracle do:
SELECT datecolumn FROM table_with_dates;
datecol
=========
2009-01-01
SELECT * FROM table_with_dates WHERE datecol = '2009-01-01'
No rows returned;
)
The above was a long parentesis, maybe a recordbreaking such?

Anyway, getting back to my original point, is 2009-02-30 a valid date or not? Look in your calendar, and you see that it's not. On the other hand, ask someone who work with fincial instruments, and the answer will be maybe!

So what's the deal here? Well, way back, to be able to distribute, say, an interest rate, that is always set by year, into months, if we were to use, say, the right number of days per month, we would pay a different interest rate depending on the days of each month. If you have a loan, on say a house or a car, you know that this is not so, you pay the same amount of interest every month, namely 1/12 of the yearly interest! And think about it, not doing so would be complicated.

Another issue is the leap day. Now we have the complexity of certain years being "longer" than others, which has the issue of coluation, say, a value of a instrument is different in different years. So for some instruments, you use a 365 day calenday, i.e. there are no leap years!

In summary, in the finacial worlds, one uses three calendars (at least), the standard (often called a 366 day calendar, although only one year in 4 has 366 days), a 365 day calendar and a 360 day calendar. In the latter case, which is true for interest calculations for example, there are "invalid" dates.

Now, if we assume that you wanted to map the value of one instrument on another, say you take a loan (360 day calendar for interest) to pay for an instrument that uses a 366 day calenday, then calculation of, say, the current value of this investment, gets real complex here. I should now as I have worked with these systems, you often get into the situation that depending on the order of calculations, the result is different, so there are certain ways to follow here, and these ways are not always that well docmented.

I once talked to David (Axmark) around this, and according to him, one reason for the current relaxed DATE-handling in MySQL is due to just this: There was a need for storing dates for financial instruments, more specificaly for interest rates, and for computations.


Let's say you buy stock and take a loan to pay for those stocks. A month later, you want to compute the value of your investment. Now, a month later may mean a different number of days when it comes to the interest that the stock. Hey, go figure! (I have actullay forgotten which calendar stocks use, I know interests use a 360 though).


I can't say I think this is any good. Buy way back, when financial transactions were easier, and the number of instruments limited and cross instrument deals were uncommon, and so was financial risk analysis (which is what I used to work with), this was reasonable.

So all in all, even though I'm no fan of the handling of DATE values in MySQL, it actually has it's good points.

/Karlsson
Who has some knowledge of things that are best left alone.

1 comment:

rpbouman said...

Hi Anders!

very interesting post, thank you!

I must say, I think that some of Jay Pipes posts on fixing dates in drizzle had more to do with general obscurity in the code base more than with accusing 'invalid dates'.

However, I like your post because it makes the most valuable distinction between date values and the calendar. I know that in Java, date formatting is implemented through a separate calendar object which could in principle be extended to a custom one. At the same time, I have seen actually very little examples of this.

I can see some of the complications though - (implementing a calendar across the board would also mean that operators and date functions must be made calendar aware.)