Regression analysis can be an incredibly useful analytical tool, but one should not blindly trust the results. “Trust but verify” is a phrase made famous by President Ronald Reagan in 1987 at the signing of a nuclear weapons treaty with Soviet leader Mikhail Gorbachev. In statistical analysis, a commonly used phrase is: “Correlation does not mean causation.” This statement definitely applies in the appraisal of real property.
Did that cause that?
An example from a statistics textbook is that there appears to be a strong correlation in younger folks between reading ability test scores and shoe sizes; the larger the shoe size, the higher the reading ability score of the individual. Does that mean that either one causes the other? Of course not. Yes, there is a correlation. But the real issue in the example is that reading skills increase with age, and coincidentally, so does shoe size.
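The shoe-size example can be sketched in a few lines of Python. This is only an illustration with simulated data: the age range, the coefficients, and the noise levels are made-up assumptions, not figures from any textbook or study. It shows that shoe size and reading score correlate strongly in the raw data, and that the correlation nearly vanishes once age is held constant (a partial correlation).

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical children aged 6 to 12. Both traits depend on age only;
# neither one influences the other.
ages = [rng.uniform(6, 12) for _ in range(2000)]
shoe_sizes = [0.5 * age + rng.gauss(0, 0.5) for age in ages]
reading_scores = [10 * age + rng.gauss(0, 8) for age in ages]

raw = pearson(shoe_sizes, reading_scores)
controlled = partial_corr(shoe_sizes, reading_scores, ages)

print(f"shoe size vs. reading score:            {raw:.2f}")
print(f"same, holding age constant (partial r): {controlled:.2f}")
```

The same trap exists in appraisal data: two property characteristics can move together simply because both follow a third factor, such as the age of the home or the neighborhood it sits in.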
Certainly many factors in appraising have a causal relationship. For instance, adding a bathroom to a house would typically be expected to increase the price that a typical market participant would be willing to pay for that property.
Is it always dollar for dollar?
Sometimes it is. But more often than not, the effect is measured in terms of the “market’s reaction” and not the actual dollars expended. For example, replacing a roof might cost $10,000, but the market expects a functioning roof, and therefore might very well pay extra for it being new, but most likely will not pay the full cost of replacement. Extracting a market reaction for how much the market will pay can become challenging for even the most experienced appraiser.
Could regression analysis tools solve the problem?
Sometimes they can, but often regression analysis is not the answer. Many times, there are too few cases or properties for any semblance of statistical confidence or reliability. Small towns or neighborhoods, unique properties, or unusual characteristics often don’t have enough examples in the market for regression tools to function effectively. Classic examples I’ve written about before are small towns like mine. It is an old historic town with homes of a wide variety of styles, sizes, ages, etc. In typical times, finding more than one or two reasonably compatible recent comparable sales in town is challenging if not downright impossible. Appraisers often must carefully search further back in time and adjust to current market conditions, or go outside the city limits and account for the appropriate locational differences.
McKissock has some excellent courses that include guidance on both techniques.
Automated Valuation Models (AVMs) often expand their search area to find enough sold properties for their statistical program to function. For example, in some areas, such as where I live, the AVM program often uses sales from modern subdivisions with HOAs, rural properties with acreage and animals, or waterfront properties, each of which is a distinct market of its own. As a result, the AVM results are unreliable, since none of these properties appeal to the same buyers as homes within the city limits.
As an example, my $48,500 rental house purchase that I spent $6,000 remodeling showed up on one of the top real estate websites using an AVM as being valued in the low to mid-$300,000s. Another property that I paid $30,000 for and fixed up for $7,000 was on a competing site valued at $575,000. I sure wish I was that shrewd of an investor!
How did the above two examples happen?
My initial response was that I can’t even begin to imagine how they came up with such numbers. In fact, when I sat down and really studied in depth the market data available at that time, I still couldn’t figure it out. Obviously, their algorithms, modeling, data collection parameters, or overall number crunching had no relationship to reality. Hence another classic statistical quote: “Wrong input equals wrong answer.”
Is their information current?
Another related issue is outdated information. The $48,500 property listed above shows today on one of the top websites as being a house with a rental cottage on a double lot. In reality, the property was subdivided seven years ago and the parcels sold to different parties. I bought the lot with the house and someone else bought the lot with the cottage. Six years ago, the cottage was torn down and a new home was built on that lot. Yet, the website still shows my property as being a house with a cottage on a double lot.
Is regression analysis any good?
Absolutely, I’m a believer. However, this blog post is about trust and the level of trust one should put in algorithms, modeling, and regression analysis tools. Taking any data or results without performing your own verification and analysis is dangerous. Those two top websites today differ significantly on another of my rentals. One gives a figure of $98,900 and the other $175,700; which one should be trusted?
Regression analysis is a fantastic tool for many uses, especially when large volumes of data/cases are available. But when only minimal information is available for analysis, it can yield insufficient or extremely inaccurate and unreliable results. In large compatible suburban tract housing developments, or sizable condominium developments, regression analysis results can typically be relied on with a high degree of confidence. However, the further one gets away from that ideal, compatible, data-rich environment, the weaker the accuracy and reliability of regression analysis results can become.
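To make the point about data volume concrete, here is a small simulation sketch. Everything in it is hypothetical: I assume a market where each square foot truly adds $100 to price, with random noise of about $40,000 per sale, and then fit a simple one-variable regression many times, first with only 6 sales and then with 300. The spread of the estimated price per square foot across repeated samples shows how much less stable the small-sample answer is.

```python
import random
import statistics

def ols_slope(x, y):
    """Least-squares slope of y on x (simple one-variable regression)."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx

def slope_spread(n_sales, trials, rng):
    """Standard deviation of the fitted slope across repeated samples."""
    slopes = []
    for _ in range(trials):
        # Hypothetical market: $50,000 base plus $100 per square foot,
        # with roughly $40,000 of unexplained noise per sale.
        sqft = [rng.uniform(1000, 3000) for _ in range(n_sales)]
        price = [50_000 + 100 * s + rng.gauss(0, 40_000) for s in sqft]
        slopes.append(ols_slope(sqft, price))
    return statistics.stdev(slopes)

rng = random.Random(42)  # fixed seed so the run is repeatable
small_market = slope_spread(n_sales=6, trials=500, rng=rng)
large_market = slope_spread(n_sales=300, trials=500, rng=rng)

print(f"spread of $/sq ft estimate with   6 sales: {small_market:.1f}")
print(f"spread of $/sq ft estimate with 300 sales: {large_market:.1f}")
```

Note that this sketch still assumes every sale comes from the same market. When a tool widens its search to reach a workable sample size, it may gain stability only by mixing in properties that do not compete with the subject.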
Algorithm, modeling, and regression analysis weaknesses typically result from a lack of:
information gathered by a live, boots-on-the-ground examination of the subject property and its surroundings (seeing, hearing, smelling, and touching);
an in-depth investigation into the current market conditions, and determining what is true market competition as of the effective date;
geographical knowledge and experience in that area regarding historical, environmental, and economic characteristics; and
verification of the raw data to enhance accuracy and applicability.
USPAP AO-37 offers very important guidance. Pay special attention to the sub-sections titled “Data” and “Use of Computer Assisted Valuation Tools.”
Remember, just because there is a correlation between given properties and/or their characteristics does not mean that correlation creates or causes any impact on any property’s value. As the saying goes, “Trust but verify.”
Written by Steven W. Vehmeier. Steve resides in Florida where he is a state-certified general real estate appraiser and a licensed real estate broker. He has taught appraisal qualifying and continuing education courses for multiple colleges, professional appraisal organizations, his own school, and McKissock Learning since the mid-90s, often spending over 100 days a year traveling and teaching. He has authored dozens of appraisal courses and textbooks, including several for McKissock, and has been a member or affiliate of eight national appraisal organizations, and national director of two.