By W H Inmon

Different data bases have different personalities. Take a banking transaction processing data base. In a banking transaction processing data base there is great emphasis on response time and on the integrity of the transactions that are being processed by the bank. Neither the bank nor the customer wants the daily banking activities lost or applied incorrectly once they enter the bank’s computerized system. If banking transactions are lost or misplaced, either the bank or the customer loses money. More importantly, the public loses its confidence in the bank’s ability to properly handle financial transactions.

Or take archival data bases. An archival data base is charged with holding large volumes of data. There usually isn’t much activity in an archival data base. However, the data in the archival data base needs to be held for a lengthy period of time – in some cases indefinitely. And there may be severe consequences in the loss of some archival data.

Now suppose you wanted to build a data base of residential real estate activities. Response time is probably not going to be much of an issue with a residential real estate data base. And while there will be volumes of data, a residential real estate data base is not nearly as big as most archival data bases.

However, real estate data bases have their own set of unique issues. The first issue is that the input going into a residential real estate data base is textual. The textual nature of a real estate data base presents its own set of challenges. The computer has never been very good at handling text. Text is simply too erose. One person uses 10 words to say what another person uses 1000 words to say. When people are speaking there are no referees or umpires to tell the person what to say or how to say it. So the very nature of text is lack of structure and lack of uniformity.

But the lack of structure of text is not the only problem with text. By its very nature text can be confusing. The same word can mean very different things depending on the context of the word. Take the word “bridge”. We all think we know what the word “bridge” means. But our understanding of the word depends entirely on the context in which the word is used. If you are discussing Omar Sharif, bridge probably refers to a card game played by four people. If you are talking about a conference call, bridge refers to the way the callers will be linked. If you are talking about Brooklyn, bridge probably refers to a structure that spans a large body of water. If you are talking about a dentist, bridge refers to a structure built inside the mouth of a patient to hold teeth in place.

So there are very different meanings of “bridge” even though the word is exactly the same. The correct interpretation of the word depends entirely on the context in which the word is used.

And the word “bridge” is hardly the only word that depends on context for its correct interpretation. The English language is FILLED with many such words. In many cases the context of a word is as important as the word itself.
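The role of context can be illustrated with a small sketch. It is not how any particular product resolves meaning; it simply counts cue words that appear near “bridge” and picks the sense with the most matches. The cue lists and the function name guess_sense are hypothetical and chosen only for this illustration.

```python
import re

# Hypothetical cue words for each sense of "bridge" (an assumption for this
# sketch; a real vocabulary would be far larger and carefully curated).
SENSE_CUES = {
    "card game": {"cards", "trump", "bid", "omar", "sharif", "partner"},
    "conference call": {"call", "dial", "conference", "callers", "phone"},
    "river crossing": {"brooklyn", "river", "traffic", "span", "toll"},
    "dental appliance": {"dentist", "teeth", "tooth", "crown", "mouth"},
}


def guess_sense(sentence: str) -> str:
    """Pick the sense of 'bridge' whose cue words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    # With no cue words present there is no context to go on.
    return best if scores[best] > 0 else "unknown"


print(guess_sense("The dentist fitted a bridge to replace two missing teeth"))
print(guess_sense("Dial into the bridge for the conference call"))
```

Even this toy version shows why the word alone is not enough: without the surrounding words, every sentence containing “bridge” looks the same to the computer.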

For these reasons (and more!) a residential real estate data base has challenges that other data bases do not have.

Fortunately, there is technology today that can handle the creation of a data base from text. That technology is known as Forest Rim Technology’s textual ETL. Today it is as simple as reading a real estate transaction and converting that transaction into a standard data base.

Once the standard data base is built, it can be handled by any standard data base technology – Oracle, SQL Server, DB2, Teradata, etc. And once the text is cast into the form of a standard data base, the kinds of processing that can be done against the data are transformed.
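To make the idea of turning text into a standard data base concrete, here is a deliberately naive sketch. It is not textual ETL and does not represent Forest Rim's product; it uses a single regular expression against one made-up sentence format, which real transaction text would not follow so neatly. The sample note, the pattern, the table layout, and the names parse_note and load_note are all assumptions for illustration.

```python
import re
import sqlite3

# A made-up transaction note in an assumed, unusually tidy format.
NOTE = "Sold: 12 Elm Street, closed on 2015-03-14 for $1,250,000 to Acme Holdings LLC."

# Hypothetical pattern that only matches notes written exactly like NOTE.
PATTERN = re.compile(
    r"Sold: (?P<address>.+?), closed on (?P<closed>\d{4}-\d{2}-\d{2}) "
    r"for \$(?P<price>[\d,]+) to (?P<buyer>.+)\."
)


def parse_note(note):
    """Return the note's fields as a dict, or None if the note does not match."""
    match = PATTERN.search(note)
    if match is None:
        return None
    return {
        "address": match.group("address"),
        "closed": match.group("closed"),
        "price": int(match.group("price").replace(",", "")),
        "buyer": match.group("buyer"),
    }


def load_note(conn, note):
    """Parse one note and store it as a row in an ordinary relational table."""
    row = parse_note(note)
    if row is not None:
        conn.execute(
            "INSERT INTO transactions (address, closed, price, buyer) "
            "VALUES (:address, :closed, :price, :buyer)",
            row,
        )
    return row


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (address TEXT, closed TEXT, price INTEGER, buyer TEXT)"
)
print(load_note(conn, NOTE))
```

The point of the sketch is the end state: once the fields sit in rows and columns, any standard data base technology can work with them. The hard part, which this example sidesteps, is text that does not follow a predictable format.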

In order to illustrate the power of having a data base handle large volumes of data, suppose there are 1,000,000 real estate transactions that need to be examined. Suppose management has a simple question – how many of the real estate transactions have closed for more than $1,000,000? In one room a small army of people starts to read the real estate transactions manually. It takes the small army of people a month to read and analyze the real estate transactions. In another room a data base reflecting the real estate transactions has been built. It takes an analyst 5 minutes to formulate the query. It takes the machine another 5 minutes to process the data. The analyst has an answer in 10 minutes. Furthermore, it took only one analyst to do the analysis, not a small army.

After management gets its answer, management then rephrases the question – how many transactions were there over a million dollars where the buyer was a corporation, not an individual?

A groan comes out of the room where the small army of analysts has to manually reread and reprocess the transactions. More overtime. More long weekends. More missed family events. The small army of analysts asks why management could not have asked the question correctly the first time.

In the next room the analyst using the data base has the results for management in ten minutes.
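The two management questions can be sketched as queries against a tiny in-memory data base. The table name, the column names, and the four sample rows are assumptions made up for illustration; a real data base would hold the 1,000,000 transactions and would likely carry many more columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, price INTEGER, buyer_type TEXT)"
)

# Four made-up rows standing in for the full set of transactions.
conn.executemany(
    "INSERT INTO transactions (price, buyer_type) VALUES (?, ?)",
    [
        (850_000, "individual"),
        (1_250_000, "corporation"),
        (2_400_000, "individual"),
        (1_000_000, "corporation"),
    ],
)

# First question: how many transactions closed for more than $1,000,000?
over_million = conn.execute(
    "SELECT COUNT(*) FROM transactions WHERE price > 1000000"
).fetchone()[0]

# Rephrased question: the same, but only where the buyer was a corporation.
corporate_over_million = conn.execute(
    "SELECT COUNT(*) FROM transactions "
    "WHERE price > 1000000 AND buyer_type = 'corporation'"
).fetchone()[0]

print(over_million, corporate_over_million)
```

The rephrased question costs one extra condition in the query. Nothing has to be reread.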

Building a real estate data base is an extremely good idea for businesses that have to understand the real estate marketplace.

Forest Rim Technology is a Bill Inmon company located in Denver, Colorado. Forest Rim has the technology needed to build a residential real estate data base. Bill can be reached at