Sunday, February 21, 2010

java.util.DuctTape

Overview

The proposed class, java.util.DuctTape, is designed as a general-purpose fix for a variety of commonly-observed situations in production code. It serves as a temporary patch until a permanent solution is developed and deployed.

Wednesday, February 17, 2010

Database/Code impedance mismatch

I love natural keys in database design. You have to pay attention, though: the natural impedance mismatch between a programming language representation and the database representation of the key can bite you.

Consider an object whose primary key might contain a date--say, a change log record. Oracle and DB2 both store a DATE as a time containing year, month, day, hours, minutes, and seconds. No timezone. The natural mapping for a Java tool like Hibernate is to map to a java.util.Date, which stores the Date as a time in milliseconds since the epoch GMT, and then maps it to whatever timezone is set on the machine where the code is running for display and conversion.

Now consider what might happen (especially if our change log record is attached to some parent object):

  1. We create and save the object; it is persisted. The local cached copy contains a non-zero value for milliseconds, but the database has truncated the milliseconds value and saved it.
  2. Later on in the code somewhere, we have reason to save the object again, perhaps as part of some collection operation.
  3. Hibernate looks in its cache, compares it with the database, and notes that the values of the Date don't match--so it tries to save the value again.
  4. The database dutifully tosses out the spare milliseconds, and bam! we have an attempt to re-insert an existing record, so it throws an exception.
This is all terribly confusing to the programmer, who, inspecting the objects in question, sees no difference between what's in the database and what's in her code, especially since the default display characteristics of her database browser and her debugger don't show the milliseconds.
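
To make the trap concrete, here's a small, self-contained sketch (mine, not part of the original code) of what the dirty check effectively sees: two Date values that print identically but differ by a few hundred milliseconds.

import java.util.Date;

public class MillisecondMismatch {
    public static void main(String[] args) {
        Date inMemory = new Date();                            // cached copy, e.g. ...:42.387
        long millis = inMemory.getTime();
        Date fromDatabase = new Date(millis - millis % 1000);  // DATE column truncates to ...:42.000

        System.out.println(inMemory.equals(fromDatabase));  // false (unless you hit an exact second)
        System.out.println(inMemory);                        // Date.toString() hides the milliseconds...
        System.out.println(fromDatabase);                    // ...so both print the same second
    }
}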

The easy fix in this case is to declare a class which matches the database representation--here, a subclass of java.util.Date which truncates the milliseconds. A modest example is shown below:

/**
 * Public Domain; use or extend at will.
 */
import java.util.Date;

public class DbDate extends Date {

    /** Increment if you change the state model. */
    private static final long serialVersionUID = 1L;

    /** @see java.util.Date#Date() */
    public DbDate() {
        long t = getTime();
        setTime(t - t % 1000);
    }

    /** @see java.util.Date#Date(long) */
    public DbDate(long t) {
        super(t - t % 1000);
    }

    /** @see java.util.Date#setTime(long) */
    @Override
    public void setTime(long time) {
        super.setTime(time - time % 1000);
    }
}
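
For illustration, here's a quick sketch of the class in use (the demo is mine, not from the original post): values constructed through DbDate have already lost their milliseconds, so the copy Hibernate caches and the value the DATE column hands back agree.

import java.util.Date;

public class DbDateDemo {
    public static void main(String[] args) {
        Date raw = new Date();                        // probably carries milliseconds
        DbDate forDbColumn = new DbDate(raw.getTime());

        System.out.println(forDbColumn.getTime() % 1000);  // 0 -- already truncated
        // Re-truncating the same instant yields an equal value, which is exactly
        // the comparison that failed in the scenario above.
        System.out.println(forDbColumn.equals(new DbDate(raw.getTime())));  // true
    }
}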

Also note that if you declare the database column as a TIMESTAMP, the Java and database representations more-or-less match, which avoids this particular problem. Be aware, though, that Oracle doesn't support TIMESTAMP WITH TIME ZONE in a primary key, and DB2 doesn't implement TIMESTAMP WITH TIME ZONE at all--as of the last time I had access to DB2.

Dealing with timezones is another topic entirely--one which I'll take up in a future post.

Sunday, February 14, 2010

Limiting Irreversibility

This afternoon I was reading Martin Fowler's commentary on architecture: http://www.martinfowler.com/ieeeSoftware/whoNeedsArchitect.pdf, and ran across the following:

At a fascinating talk at the XP 2002 conference (http://martinfowler.com/articles/xp2002.html), Enrico Zaninotto, an economist, analyzed the underlying thinking behind agile ideas in manufacturing and software development. One aspect I found particularly interesting was his comment that irreversibility was one of the prime drivers of complexity. He saw agile methods, in manufacturing and software development, as a shift that seeks to contain complexity by reducing irreversibility—as opposed to tackling other complexity drivers. I think that one of an architect’s most important tasks is to remove architecture by finding ways to eliminate irreversibility in software designs.

I think he's absolutely on the mark with this--in fact, I think it illuminates one of the two or three key roles of architecture in system implementation. If the architect focuses on helping to define approaches which are hard to change once the system "complexifies", and in helping developers write solid code that can easily be modified when needs change, (s)he goes a long, long way towards making the system flexible and maintainable.

Note that there are two architectural roles there: (1) helping to make key decisions early, and (2) mentoring developers around those decisions.

On most of the dev teams I've worked with, the first set of decisions is made by proposal and refinement--one of us proposes an overall approach, and the rest of us point out tweaks or whole new approaches until the whole thing gels in everyone's mind. It seems to work, if everyone is engaged. Some architects would prefer to "rule by fiat", but I've found that generally results in systems which can't be maintained, or in far more work than is needed. It's very hard to get key decisions right by yourself.

To illustrate: I once proposed a relatively modest refactoring of a large, thorny user interface, to separate business logic from display logic and generally make it easier to maintain. The reigning architect decided we needed a complete rewrite, in direct opposition to the opinion of everyone else on the team. Nobody objected strongly, though everyone quietly agreed that a rewrite probably wasn't needed--it's a lot more fun to write new code than to modify old. Nobody asked what the minimum effort needed to meet the requirements was. About two years and a couple million dollars later, the new system is, indeed, quite a bit better-structured than the old one. It's not clear whether the new UI will allow faster, cleaner updates than the old one--but it sure cost a lot to build: about twice the initial estimate, and about six times the original proposal. (By the way: the rewrite is considered a success by all involved. See "What Could Possibly Be Worse Than Failure?" by Alex Papadimoulis.)

A second role implied in minimizing irreversibility is "mentor". Writing good code is hard; writing well-structured code without someone to bounce your design off of is doubly hard. I spend a lot of time talking with the members of my teams, trying to make sure we have a design that's flexible and understandable. A lot of what we think of as "architecture" starts out as a small feature being implemented by a relatively junior developer. I try to make sure (s)he has someone to work with in the early stages of that work, or at least a trial design to start from.

I love Martin Fowler's writing: he always gives me something good to think about.

Wednesday, February 3, 2010

SEMAT and development principles

My first reaction to SEMAT was--is this practical? But as I've thought about it more, I've decided there are some principles of good software design and implementation that the SEMAT effort can probably agree upon and illuminate.

For context: I spent 15 years as a practicing nuclear engineer before becoming a practicing software developer.

"Engineering practice", as I found it in the field, is often as arbitrary as "software development practice". What is "good" is measured first by "what works", second by "what's elegant", and finally by "what's inexpensive", and is judged primarily by senior practitioners working against their own experience rather than some set of objective standards (though those exist as well). In hardware engineering, the time lapse between design review and implementation is quite often very long (especially large-scale design, e.g. power plants, where I worked). As a result, feedback loops are even longer and harder to manage. What dominates in the large scale seems to be "what worked"--and just as often, "what failed".

This seems to me to be exactly the sort of discipline we're developing now as software developers: we have lots of categories of languages for solving different types of problems, and we're developing solid tools and techniques for measuring performance, managing projects, and closing the loop between design and implementation. The world for software developers is a FAR less arbitrary place, with far better-developed tools and techniques, than in 1976 when I started coding.

The body of common practice is regularly changing in hardware engineering, as analysis tools get better. When I graduated, one of my first tasks required a stress analysis, which I did with a calculator by hand. These days, an engineer would set that up in a desktop finite-element analysis program with a nice UI, and get a better answer in less time. The principles are the same, though: determine, through an understanding of material behavior and machine design, what a specific application requires of the machine, and what combination of off-the-shelf components and custom machining can be used to implement that application. It's very much what I do today: I pick off-the-shelf components and custom components to create a design that meets certain requirements.

Can we, as a group, specify some of the principles on which those decisions get made? I suspect we can. While I'm only really fluent in one sub-part of OO design, I know of a few; consider the SOLID principles in OO design, various common design patterns, and the corpus of tools and techniques in Knuth's "The Art of Computer Programming".