By W H Inmon

The traditional method of designing a data base is to determine which fields are needed for processing and then to define those fields in the data structure. Depending on the application, typical fields of data might include customer name, customer identification, address, date account applied for, item purchased, and so forth.

The practices of classical data base design have been around since the first data base made its appearance.

But designing a data base for residential real estate activities is a different story. There are some very basic problems when it comes to designing the data base for residential real estate transactions. The first problem is that there are MANY, MANY diverse fields of data found in a real estate transaction. A residential real estate transaction might have fields of data such as purchaser, purchase price, date of purchase, seller, date of conveyance, and so forth. Then some real estate transactions have information as to the mortgage, the mortgage company, the terms of the mortgage, and so forth. Or there might be information about outstanding liens on the property. Or there might be information about co-mortgagees. Or there might be information about taxes owed. In short, it is almost an accident if any two recorded real estate transactions have the same fields of information. There is a wide variety of information found in real estate transactions, and trying to identify and capture all of the information that might be contained in a deed is difficult.

Furthermore, when the real estate data base is created across county or state lines, there is almost a guarantee that there will be different data in the deeds of trust.

This profusion and confusion of data elements found in real estate transactions leaves the data base designer in a dilemma. Trying to use the classical method of data base design for a real estate data base just does not work very well.

An alternative design is to not put elements of data in the data base but to have classes of elements in the data base. And each of the classes of data in the data base has a contextual identifier. As a simple example, there might be a general classification of METHOD OF PAYMENT in the data base. One contextual identifier is cash. Another contextual identifier is cashier’s check. Another contextual identifier might be money order.

When a new form of payment is encountered, a new contextual identifier is added to the data base. No redefinition of the data base is required. The data analyst simply inserts (at the application level) a new contextual entry into the data base. In using this approach, there is no need for the data base analyst to try to capture all forms of payment and to place each of those forms of payment in the data base as its own unique data element.

Furthermore, one deed of trust can have one form of payment and the next deed of trust another, and there is no conflict in recording this information in the data base, at least insofar as the data analyst (and the data base management system) is concerned.
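As a rough illustration of the class and contextual identifier idea, here is a minimal sketch in Python using the standard sqlite3 module. The table name deed_entry, its column names, the deed numbers, and the "wire transfer" entry are all assumptions made up for this example; they are not taken from any particular product or from the design of an actual real estate data base.

```python
import sqlite3

# In-memory data base, used here only for illustration.
conn = sqlite3.connect(":memory:")

# One generic table. Each row pairs a transaction with a class of data
# and the contextual identifier found for that class in the deed.
# (Table and column names are hypothetical.)
conn.execute(
    """
    CREATE TABLE deed_entry (
        transaction_id        TEXT NOT NULL,
        data_class            TEXT NOT NULL,
        contextual_identifier TEXT NOT NULL
    )
    """
)


def record(transaction_id, data_class, contextual_identifier):
    """Insert one entry at the application level; the table is never redefined."""
    conn.execute(
        "INSERT INTO deed_entry VALUES (?, ?, ?)",
        (transaction_id, data_class, contextual_identifier),
    )


# Different deeds carry different forms of payment, all in the same structure.
record("deed-001", "METHOD OF PAYMENT", "cash")
record("deed-002", "METHOD OF PAYMENT", "cashier's check")
record("deed-003", "METHOD OF PAYMENT", "money order")

# A form of payment not seen before (hypothetical) is simply another row.
record("deed-004", "METHOD OF PAYMENT", "wire transfer")


def identifiers_for(data_class):
    """Return the distinct contextual identifiers recorded for a class."""
    rows = conn.execute(
        "SELECT DISTINCT contextual_identifier FROM deed_entry "
        "WHERE data_class = ? ORDER BY contextual_identifier",
        (data_class,),
    )
    return [row[0] for row in rows]


def column_names():
    """Return the table's column names, to show the definition did not change."""
    return [row[1] for row in conn.execute("PRAGMA table_info(deed_entry)")]
```

In this sketch the fourth form of payment arrives as data, not as a new column, so the list of columns is the same before and after it is recorded.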

By slightly bending the rules of classical data base design, free form text such as that found in the deeds of trust in a real estate application can be accommodated.

And by slightly bending the rules of classical data base design, the data analyst can greatly simplify day to day maintenance activities.

There is great value in placing a volume of real estate transactions into the form of a standard data base. By creating a standard data base, the analyst can now handle a significantly larger volume of transactions. In order to illustrate the difference, try giving 100,000 real estate transactions to an analyst who has to read each transaction to understand what is in the transaction. See how long the analyst takes to manually process the transactions. (And see how accurate the analysis is, as well.)

Now give the same proposition to an analyst who has taken the time to build a data base. Ask the computer analyst to read and process the records.

Now compare how long it took the person reading the records manually versus how long it took the computer analyst. It will not be a fair comparison.

Furthermore, now change the question of what needs to be analyzed and see which analyst complains more: the analyst who has to manually read the transactions or the analyst who uses the computer to read the transactions.
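To make the point about changing the question concrete, here is a small sketch in Python with the standard sqlite3 module. It assumes the transactions have already been placed in a single table of classes and contextual identifiers; the table name deed_entry, the column names, the LIEN class, and the sample rows are all invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table of entries already extracted from the deeds.
conn.execute(
    "CREATE TABLE deed_entry ("
    "transaction_id TEXT, data_class TEXT, contextual_identifier TEXT)"
)
conn.executemany(
    "INSERT INTO deed_entry VALUES (?, ?, ?)",
    [
        ("deed-001", "METHOD OF PAYMENT", "cash"),
        ("deed-001", "LIEN", "tax lien"),
        ("deed-002", "METHOD OF PAYMENT", "money order"),
        ("deed-003", "METHOD OF PAYMENT", "cash"),
        ("deed-003", "LIEN", "tax lien"),
    ],
)


def count_by_identifier(data_class):
    """Count how often each contextual identifier occurs within one class."""
    rows = conn.execute(
        "SELECT contextual_identifier, COUNT(*) FROM deed_entry "
        "WHERE data_class = ? GROUP BY contextual_identifier",
        (data_class,),
    )
    return dict(rows)


# The first question: how were the properties paid for?
payments = count_by_identifier("METHOD OF PAYMENT")

# The question changes: what liens are outstanding? Only the argument changes.
liens = count_by_identifier("LIEN")
```

Under these assumptions the analyst with the data base answers the new question by passing a different class name, while the analyst reading by hand starts over with the first transaction.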

Fortunately, with Forest Rim Technology’s textual ETL you can read and process textual data and place it into a data base, using simple techniques such as those described in this article. Now raw text can be read and converted into a standard data base. Now you can conveniently, quickly and accurately create a real estate data base that can then be handled by any standard data base technology.

Forest Rim Technology is a Bill Inmon company located in Denver, Colorado. Forest Rim has the technology needed to build a residential real estate data base. Bill can be reached at