Saturday, August 12, 2006

Generics

Boo doesn't have generics. This makes me sad. Since I want a strongly typed matching solution, I _really_ want generics. I also want macros. ARRRRRGGGGG! Why won't people give me EVERYTHING? One option is to use arrays instead of lists. Another option is to add generics to the language. I think I'll start with the first and then migrate to the second. That should be sufficient for my needs.

Ok, now on to the O/R mapping.

First, since this is all about types, we're going to add the type information for (most) things in the database. Why, you ask? Shouldn't we know if it is a FirstName or a LastName because we have knowledge of the table? Yes, that is correct, but you're thinking too simplistically. In true OO fashion, FirstName and LastName will be nothing more than base classes. We will end up with an AsianFirstName, AngloFirstName, HispanicFirstName, etc. The same is true for last names. Then, when we consult our statistics, we'll be able to use the frequency of the name within its ethnic culture (and also within the geographic location). Therefore, we want to be able to generate and store this extra type information. In addition, we'll want to use the type information when comparing names with the matching engine. We might use a completely different function when comparing a HispanicFirstName to an AngloFirstName than we would when comparing a HispanicFirstName to an AsianFirstName (auto-reject, anyone?). By employing multi-method dispatch on the type information, we can quickly choose the right matching logic.
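To make the dispatch idea concrete, here is a minimal sketch in Python (not Boo) of picking a comparison function from the types of both names. The registry, the `matcher` decorator, the `match` function, the equivalence table, and the scores are all hypothetical placeholders for illustration; the real matching engine may look nothing like this.

```python
from itertools import product

class FirstName:
    def __init__(self, value):
        self.value = value

class AngloFirstName(FirstName): pass
class HispanicFirstName(FirstName): pass
class AsianFirstName(FirstName): pass

# Hypothetical registry: (left type, right type) -> comparison function.
_matchers = {}

def matcher(left, right):
    """Register a comparison function for a pair of name types."""
    def register(fn):
        _matchers[(left, right)] = fn
        return fn
    return register

def match(a, b):
    """Dispatch on the types of both arguments, trying subclasses before base classes."""
    for left, right in product(type(a).__mro__, type(b).__mro__):
        fn = _matchers.get((left, right))
        if fn is not None:
            return fn(a, b)
        fn = _matchers.get((right, left))  # comparisons are treated as symmetric
        if fn is not None:
            return fn(b, a)
    raise TypeError("no matcher registered for these types")

@matcher(FirstName, FirstName)
def match_default(a, b):
    # Fallback: exact, case-insensitive comparison.
    return 1.0 if a.value.lower() == b.value.lower() else 0.0

# Made-up equivalence table and score, purely for illustration.
_EQUIVALENTS = {("juan", "john")}

@matcher(HispanicFirstName, AngloFirstName)
def match_hispanic_anglo(h, a):
    if h.value.lower() == a.value.lower():
        return 1.0
    return 0.8 if (h.value.lower(), a.value.lower()) in _EQUIVALENTS else 0.0

@matcher(HispanicFirstName, AsianFirstName)
def match_hispanic_asian(h, a):
    return 0.0  # the "auto-reject" case
```

Calling `match(AngloFirstName("John"), HispanicFirstName("Juan"))` finds the Hispanic/Anglo function even with the arguments swapped, while two names of the same culture fall through to the base-class comparison.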

But enough skipping ahead, back to the O/R mapping.

Obviously, each Element will be stored in its own field. It will have an associated type information field. It may also have a pointer into a metadata table. I'm not sure about that one yet; we'll have to see what sort of metadata we will keep that is outside of the type system.

An Entity becomes a little more complicated. Remember that an Entity is a collection of other entities, groups, and elements.

Let's look at two different Entities:

class Name(Entity):
    first_name as FirstName
    last_name as LastName
    middle_initial as Nullable(MiddleInitial)
    name_suffix as Nullable(NameSuffix)

Wow, what's that Nullable thing? Well, the type system should include whether or not the field can be blank, and Nullable is just as good a choice as any. I'm really starting to want to do this project in O'Caml. I'm getting very close to breaking open the docs on F#. Of course, now that I think about it, C# might be a good choice. I wouldn't need to modify the parser if I had introspection, which C# gives you. Plus it is strongly typed and has generics...hmmm...I had forgotten introspection...drat!

Ok, back to the task at hand.

In most cases, the Name entity will have a Name table. Each element within the entity will correspond to a field in the table. There will also be the associated type field (and perhaps metadata fields). Each record in the table will also have a unique, primary key.
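As a rough sketch of that mapping, written in Python rather than Boo, the snippet below derives a table definition from the Name entity and stores each element next to its type name. `Optional[...]` stands in for Nullable, and the helper names (`create_table_sql`, `insert`), the `_type` column suffix, and the use of SQLite are assumptions made for illustration only.

```python
import sqlite3
from typing import Optional, get_args, get_type_hints

class Element:
    def __init__(self, value):
        self.value = value

class FirstName(Element): pass
class AngloFirstName(FirstName): pass
class LastName(Element): pass
class MiddleInitial(Element): pass
class NameSuffix(Element): pass

class Entity: pass

class Name(Entity):
    first_name: FirstName
    last_name: LastName
    middle_initial: Optional[MiddleInitial]
    name_suffix: Optional[NameSuffix]

def is_nullable(hint):
    # Optional[X] is Union[X, None], so NoneType shows up in its type arguments.
    return type(None) in get_args(hint)

def create_table_sql(entity):
    """One value column and one type column per element, plus a primary key."""
    cols = ["id INTEGER PRIMARY KEY"]
    for field, hint in get_type_hints(entity).items():
        null = "" if is_nullable(hint) else " NOT NULL"
        cols.append(f"{field} TEXT{null}")
        cols.append(f"{field}_type TEXT{null}")
    return f"CREATE TABLE {entity.__name__} ({', '.join(cols)})"

def insert(conn, entity, **elements):
    """Store each element's value together with the name of its concrete type."""
    cols, params = [], []
    for field in get_type_hints(entity):
        element = elements.get(field)
        cols += [field, f"{field}_type"]
        if element is None:
            params += [None, None]
        else:
            params += [element.value, type(element).__name__]
    marks = ", ".join("?" for _ in cols)
    sql = f"INSERT INTO {entity.__name__} ({', '.join(cols)}) VALUES ({marks})"
    return conn.execute(sql, params).lastrowid

conn = sqlite3.connect(":memory:")
conn.execute(create_table_sql(Name))
row_id = insert(conn, Name,
                first_name=AngloFirstName("John"),
                last_name=LastName("Smith"))
```

The stored type name (here `AngloFirstName`) is what would later let the matching engine pick the right comparison function.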

That was easy...

Now, for a more complicated example:

class Person(Entity):
    name as Name
    address as Address
    ssn as Nullable(SSN)
    birthday as Nullable(Date)

For a person we will have a unique primary key, but we will also store the primary key of both the name and address information. The ssn and birthday elements will be stored "in-line" like the name elements were in the previous example.
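A minimal Python sketch (again, not Boo) of how the Person columns could be derived: fields whose type is another Entity become a key pointing at that entity's table, and elements are stored in-line with their type column. The `columns` and `unwrap` helpers, the `_id` and `_type` suffixes, and the single `street` field on Address are hypothetical, since Address's fields aren't spelled out here.

```python
from typing import Optional, get_args, get_type_hints

class Element: pass
class Entity: pass

class FirstName(Element): pass
class LastName(Element): pass
class Street(Element): pass   # hypothetical Address field
class SSN(Element): pass
class Date(Element): pass

class Name(Entity):
    first_name: FirstName
    last_name: LastName

class Address(Entity):
    street: Street

class Person(Entity):
    name: Name
    address: Address
    ssn: Optional[SSN]
    birthday: Optional[Date]

def unwrap(hint):
    """Return (type, nullable) for a plain or Optional[...] annotation."""
    args = get_args(hint)
    if type(None) in args:
        inner = [a for a in args if a is not type(None)][0]
        return inner, True
    return hint, False

def columns(entity):
    """Column definitions for an entity's table."""
    cols = ["id INTEGER PRIMARY KEY"]
    for field, hint in get_type_hints(entity).items():
        target, nullable = unwrap(hint)
        null = "" if nullable else " NOT NULL"
        if issubclass(target, Entity):
            # Nested entity: store the primary key of its own table.
            cols.append(f"{field}_id INTEGER{null} REFERENCES {target.__name__}(id)")
        else:
            # Element: store the value in-line, next to its type information.
            cols.append(f"{field} TEXT{null}")
            cols.append(f"{field}_type TEXT{null}")
    return cols
```

For Person this yields `name_id` and `address_id` key columns, followed by the in-line `ssn` and `birthday` columns with their type fields.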

Of course, we might want to force denormalization of the table. We could try something like:

class Person(Entity):
    [inline]
    name as Name
    address as Address
    ...

Now, name has an inline attribute and will not go in a separate table. Instead, the name fields will be placed in the Person table. However, when we extract the Person object from the table, we'll extract a Name object as well, so you can't tell the difference from the user side.
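Here is a small Python sketch of that round trip, assuming the inline marker is just a class-level tuple named `INLINE` (a stand-in for the [inline] attribute) and that inlined columns are prefixed with the field name. Type columns are left out to keep it short, and `to_row` and `from_row` are hypothetical helpers.

```python
from dataclasses import dataclass, fields

@dataclass
class Name:
    first_name: str
    last_name: str

@dataclass
class Person:
    name: Name
    ssn: str
    # Stand-in for [inline]; unannotated, so it is not a dataclass field.
    INLINE = ("name",)

def to_row(obj):
    """Flatten an object into one table row, expanding inlined entities."""
    row = {}
    for f in fields(obj):
        value = getattr(obj, f.name)
        if f.name in getattr(obj, "INLINE", ()):
            for sub in fields(value):
                row[f"{f.name}_{sub.name}"] = getattr(value, sub.name)
        else:
            row[f.name] = value
    return row

def from_row(cls, row):
    """Rebuild the object, including the nested entity, from a flat row."""
    kwargs = {}
    for f in fields(cls):
        if f.name in getattr(cls, "INLINE", ()):
            prefix = f.name + "_"
            nested = {s.name: row[prefix + s.name] for s in fields(f.type)}
            kwargs[f.name] = f.type(**nested)
        else:
            kwargs[f.name] = row[f.name]
    return cls(**kwargs)

person = Person(Name("Ada", "Lovelace"), "123-45-6789")
row = to_row(person)
restored = from_row(Person, row)
```

The caller only ever sees a Person holding a Name; whether the name lived in its own table or in the Person row is hidden behind the two helpers.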

I'm hesitant to allow an [inline] attribute, because you end up in the same place as C++ with its inline modifier: you're forced to make decisions that the compiler should be able to make. Therefore, if we have an [inline] modifier, it will be more like "register" in C++: a hint, but the compiler can do what it wishes with regard to inlining. Hopefully, its usage will vanish just like register's.

Ok, next time we'll look at the O/R mapping of groups. I'm still not using the mailing list from SF because I sent things to it that I never got back, so I'm waiting until I have a successful test run before I move there for good.
