Year 2000 Compliance:
Lawyers, Liars, and Perl
by Tom Christiansen
[This essay originally appeared at http://language.perl.com/news/y2k.html and is republished with permission.]
As the clock draws us relentlessly closer toward 2000, the final year
of the second millennium, doom sayers everywhere are prophesying
unprecedented computer failure in every conceivable sector. Known
popularly as the Year 2000 Problem, or the Millennium Bug,
this situation is quite easy to explain. Programs that interpret
two-digit dates in the form XX as 19XX behave unpredictably
starting in the year 2000 and beyond into the next millennium. If your
birthday is ``2/2/05'', are you 101 years old in 2006, or just one?
Cost estimates of fixing this bug range well into the billions of dollars,
with the likely threat of at least that much money again incurred in
protracted legal fees due to real and alleged damages. Because of these
devastating cost projections, corporations and government are mounting
pressure to secure legally binding statements that all software they
use is warranted to be year 2000 compliant.
As you might well guess, this pressure often stems from lawyer-wary
insurance companies, who quite reasonably fear death-by-litigation even
more than they might fear the costs of actual, incurred
damages. To defend themselves against legal perils, organizations all
over the world are rushing to secure, in all possible haste, affidavits
to the effect that such-and-such software is year 2000 compliant. They
intend to brandish these as some sort of legal shield when the inevitable
chaos strikes, righteously proclaiming themselves free of Y2K taint and
redirecting lawsuits toward the signers of the aforementioned documents.
Remember this: if someone asks you to warrant that your software is free
of year 2000 bugs, they're really just looking for an excuse to sue you
if they misuse your software, even if it should happen to be their own
fault. Probably they've already forgotten the terms of their software
licence, one which probably read something rather close to the following:
IN NO EVENT SHALL THE AUTHORS OR DISTRIBUTORS BE LIABLE TO ANY PARTY
FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES
ARISING OUT OF THE USE OF THIS SOFTWARE, ITS DOCUMENTATION, OR ANY
DERIVATIVES THEREOF, EVEN IF THE AUTHORS HAVE BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.
THE AUTHORS AND DISTRIBUTORS SPECIFICALLY DISCLAIM ANY
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND
NON-INFRINGEMENT. THIS SOFTWARE IS PROVIDED ON AN ``AS IS'' BASIS,
AND THE AUTHORS AND DISTRIBUTORS HAVE NO OBLIGATION TO PROVIDE
MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
Did you ever stop to wonder just how long an automobile manufacturer
might survive government scrutiny with such an anti-warranty? Why should
software manufacturers be any different? Somehow, though, they do appear
to be. Whether such things survive the courts in the litigational
feeding frenzy certain to ensue in a couple of years is something that
remains to be seen. Don't count on anything.
The anxiety about the Y2K situation is reaching an increasingly feverish
pitch in most large organizations. Every few days, you read something
else in the popular media forecasting certain havoc. Nearly all these
reports are peppered with vague prevarications or technical confusion.
That's not to say that there isn't a real issue here, or that we can go on
pretending there's nothing to bother ourselves with. There certainly is.
But what that problem stems from, and what can be done about it, is
something usually misunderstood at best. Three commonly repeated lies
exacerbate the situation.
Three Popular Y2K Lies
The first lie routinely recounted is that the Y2K problem historically
derived from expensive computers of yesteryear whose memory was so
dear that programmers maintained dates in a two-digit format to keep
costs under control.
This assertion is demonstrably false. Think about it. A two-digit
number requires how much storage? Two bytes, that is, sixteen bits?
No, much less: numeric data are seldom stored in text format, since a more
compact representation is readily available. A two-digit year would be a
number ranging between 00 and 99. That can be represented in just 7 bits.
What about using full years, like 1985 or 2010? Those numbers could be
represented in just 11 bits. So did those veteran programmers of old
truly gleefully rack up dramatic cost savings at the rate of 4 entire
bits per date? Surely this tremendously precious hardware could have
spared 4 bits! And even if not, one could have employed an offset from a
reasonable base instead. Rather than using just the last two digits,
years in dates could have been represented not as absolute values but
rather as the number of years since 1900. This too would still
fit in those aforementioned 7 bits, at least for a while. Add another
bit, and we're clear until 2156. End of problem. Saving memory
was not why it was done.
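The bit counts above are easy to check for yourself. The following is only a sketch: the bits_needed() helper is a made-up name for illustration, and it assumes years are stored as plain unsigned binary integers.

```perl
use strict;
use warnings;

# Hypothetical helper: the smallest number of bits able to hold
# every unsigned integer from 0 through $max.
sub bits_needed {
    my ($max) = @_;
    my $bits = 1;
    $bits++ while (2 ** $bits) - 1 < $max;
    return $bits;
}

printf "two-digit year (0..99):    %d bits\n", bits_needed(99);
printf "full year (through 2010):  %d bits\n", bits_needed(2010);

# With 8 bits of offset from 1900, the largest representable year:
printf "last year in 8 bits:       %d\n", 1900 + (2 ** 8) - 1;
```

Under those assumptions, the difference between a two-digit year and a full one really does come to four bits, and an eight-bit offset from 1900 holds out through 2155.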
Although the users of that day may have chosen a non-extensible
representation for their years, one that was year % 100
rather than year - 1900, that doesn't make it a hardware
cost issue. It was still a short-sighted, soft design mistake.
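To see how differently the two representations behave once the century turns, consider this small sketch. The function names two_digit() and since_1900() are invented here for illustration; they stand for keeping only the last two digits and for counting years since 1900, respectively.

```perl
use strict;
use warnings;

# Hypothetical helpers contrasting the two representations.
sub two_digit  { my ($year) = @_; return $year % 100 }    # wraps at each century
sub since_1900 { my ($year) = @_; return $year - 1900 }   # just keeps counting

for my $year (1985, 1999, 2000, 2005) {
    printf "%d: last two digits = %02d, years since 1900 = %d\n",
        $year, two_digit($year), since_1900($year);
}
```

Both agree through 1999. After that, the first starts over at 00 while the second carries on to 100, 101, and so forth, and nothing about the hardware made one cheaper than the other.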
The second lie is that this phenomenon is somehow brand new, that in
the year 2000, myriad systems will suddenly fail, and that this kind of
thing has never occurred before. Think about the ancient pensioner born
in 1895. A program that reads their birthday as ``95'' will not pay them
their monthly checks, since in 1998, they appear to be only three years
old. The year 2000 has not occurred in this equation. The so-called
``year-2000'' problem is in no fashion new, nor is it limited to that year.
It will simply be more evident then.
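The pensioner's predicament takes only a subtraction to reproduce. This is a sketch, not anyone's actual payroll code; naive_age() and full_age() are hypothetical names, and both ignore months and days for simplicity.

```perl
use strict;
use warnings;

# Hypothetical helpers: age by simple year subtraction.
sub naive_age { my ($birth_yy, $now_yy) = @_; return $now_yy - $birth_yy }
sub full_age  { my ($birth,    $now)    = @_; return $now    - $birth    }

printf "with two-digit years:  %d years old\n", naive_age(95, 98);
printf "with four-digit years: %d years old\n", full_age(1895, 1998);
```

Note that no year 2000 appears anywhere in the computation, yet the two-digit version is already wrong by a full century.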
The third, and by far the gravest lie about Y2K matters, is that your
company can, through the acquisition of affidavits of compliance, protect
itself against harm, whether real or litigated. It can't. This faith
in legal documents is hollow and in fact dangerous. The wisest course
of action is for you to immediately disabuse yourself of this deceit.
The insidious, underlying root cause of this entire problem is neither
the hardware nor the software. No, that would be too easy; that we
know how to fix. Just apply a few hundred billion dollars, and voilà,
it's all taken care of.
Unfortunately, that's not it. The real problem is the wetware. That's
right: the defect lies not in our computers, nor in their programming,
but rather in ourselves.
Most of the time that people think about dates, they use only the
final two digits of the year. They write it on checks. They write it in
family Bibles. You hear someone casually say, ``I remember back in '65,''
or ``the Generation of '98 had their collective consciousness shaken to
the roots by their astonishing defeat to the Americans in the Caribbean,''
and you're just supposed to know what they mean. Just which 65 is that?
Assuming a living speaker, it provably has to be 1965. But just which
98 was that? Why, it was not the current year, but rather way back
in 1898, when Spain lost the remainder of their decrepit empire to
those upstart New Worlders and subsequently succumbed to a national
soul-searching that permeated throughout their literature of that age.
In both cases, you resolve the ambiguity by inferring the full year from
the context, of course. But if you don't have that context, then you
just have to guess. And remember: computers make notoriously bad guessers.
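Here is roughly what such a guess looks like when a program makes it. This is a deliberately naive sketch: guess_year() is a hypothetical function that assumes every two-digit year is 19XX, and the table of intended years simply records what the two speakers quoted above meant.

```perl
use strict;
use warnings;

# Hypothetical guesser: with no context, assume every two-digit year is 19XX.
sub guess_year { my ($yy) = @_; return 1900 + $yy }

# The years the two speakers above actually meant.
my %intended = (65 => 1965, 98 => 1898);

for my $yy (sort { $a <=> $b } keys %intended) {
    my $guess = guess_year($yy);
    printf "'%02d guessed as %d, meant %d: %s\n",
        $yy, $guess, $intended{$yy},
        $guess == $intended{$yy} ? "ok" : "wrong";
}
```

The guess works for '65 and fails for '98. Any other fixed rule would merely move the failure somewhere else, since the missing information was never in the two digits to begin with.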
The most horrifying aspect of all this is that even with a perfectly
accurate and working computer program, one that is obviously ``Y2K
compliant'', you are still in big trouble. Take for example, the famous
Unix cal program. Let's check out the current month.
    $ cal 2 98
    Su Mo Tu We Th Fr Sa
                 1  2  3
     4  5  6  7  8  9 10
    11 12 13 14 15 16 17
    18 19 20 21 22 23 24
    25 26 27 28
Hold on. What was that? Isn't Valentine's Day supposed to fall on
Saturday, not Wednesday, this year? Oops; wrong millennium! What you
really meant to type was:
    $ cal 2 1998
    Su Mo Tu We Th Fr Sa
     1  2  3  4  5  6  7
     8  9 10 11 12 13 14
    15 16 17 18 19 20 21
    22 23 24 25 26 27 28
As you see, it doesn't matter whether the programs are compliant,
because the humans using them are not! Fixing the programs is certainly
a necessary step, but far, far from sufficient. You can certify every
single program in existence, and it still will not be enough for safety.
Until such time as the teeming billions of people in this world -- or even
just the many millions using computers -- are all similarly certified,
and warranted not to forget, there can be no safety. And that's not
going to happen.
To seek legally binding statements that a particular program cannot
be intentionally or unintentionally misused is nothing but a witch
hunt doomed to fail in its ultimate goal of protecting you and yours.
You cannot help that there will always be cluefully-challenged users
and programmers out there, or even persons of clue who occasionally
have a memory lapse. You cannot find them, you cannot blame the tool
or language, and you cannot protect yourself from them. Every time a
human being thinks about a year in terms of just two digits, the problem
reasserts itself. And no one has yet figured out how to fix the wetware.
Now, what about Perl? Is Perl ``Year 2000 Compliant''? The answer is
that Perl is every bit as Y2K compliant as is your pencil; no more, and
no less. Does that comfort you? It shouldn't. Just as you can commit
Y2K transgressions with your pencil, so too you can do so with Perl --
or with any other tool, for that matter. You don't really even have to
go very far out of your way to do so; witness the demonstration of the
perfectly compliant cal program provided above.
The date and time functions supplied with Perl are the gmtime() and
localtime() functions, which are derived from their namesakes from the C
programming language. These supply adequate information to determine
the year well beyond 2000. 2038 is when trouble strikes, but only for
those of us still stuck on 32-bit machines, a somewhat unlikely albeit
admittedly not entirely unthinkable situation.
The year returned by these functions (when used in list context) is,
contrary to popular misconception, not by definition a two-digit year.
Rather, it merely happens to be such right now. What it actually is,
is the current year minus one thousand nine hundred. For years between
1900 and 1999 this happens to be a 2-digit decimal number, but that's
not going to last long. To avoid the year 2000 problem, simply do not
treat the year as a 2-digit number. Easy to say, and easy to break.
Imagine that you want to find out what the year appears to be in five years,
so you write code like this.
    use Time::localtime;    # provides the object interface with a year() method

    $then = time() + ( 60 * 60 * 24 * 365 * 5 );    # 5 years from now
    $that_year = localtime($then) -> year;          # years since 1900
    printf("It shall be 19%d\n", $that_year);       # WRONG!  19103
    printf("It shall be %d\n", 1900 + $that_year);  # right:  2003
As you see, in the wrong hands, even a nominally year 2000 compliant tool
such as Perl or cal can be misused by the underclued or simply the forgetful.
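For those who prefer the built-in list-context form of localtime() described earlier, a minimal sketch follows. It assumes nothing beyond core Perl and the standard POSIX module; the variable names are arbitrary.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# In list context, element 5 of localtime()'s return value is the
# year minus 1900, not a two-digit year.
my $now        = time();
my $year_field = (localtime($now))[5];
my $full_year  = 1900 + $year_field;

# strftime's %Y conversion yields the full year directly.
my $formatted = strftime("%Y", localtime($now));

print "year field: $year_field\n";
print "full year:  $full_year\n";
print "strftime:   $formatted\n";
```

Either approach gives the full year without ever pasting a ``19'' in front of anything. Neither will stop someone from doing so anyway.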