The purpose of the assignment is to provide us with experience calculating Z-Scores and Probability for a given data set. Additionally, the assignment will provide experience relating the calculated information (Z-Scores and Probability) to a given scenario for pattern analysis.
Terms
Z-Score: A Z-Score is the precise number of standard deviations that a specific observation lies from the mean on the standard deviation curve. If a numeric value lies between 1 and 2 standard deviations from the mean, the Z-Score formula (Fig. 1) will calculate its exact location.
(Fig. 1) Z-Score formula. Zi: Z-Score, Xi: observation value, u: mean of the data, S: standard deviation of the data.
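Based on the symbols listed in the caption for Fig. 1, the calculation can be sketched in a few lines of Python. The function name `z_score` and the example numbers are illustrative assumptions, not values from the Dane County data.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations the observation x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical example: an observation of 15, a mean of 10, a standard deviation of 5
print(z_score(15, 10, 5))  # 1.0
```

A positive result means the observation is above the mean, and a negative result means it is below.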
Probability: Probability can be described as the likelihood, expressed as a percentage, that an event with a numeric value will occur. The probability is obtained from the Z-Score via a probability chart (Fig. 2). Z-Scores can also be calculated in Microsoft Excel and a host of other programs. More generally, probability is calculated from the frequency of a specific event occurring compared to all of the events in the data.
(Fig. 2) Probability chart based on Z-Scores.
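As an alternative to reading a printed chart, the lookup can be sketched with Python's standard library. This assumes the data follow a normal distribution, which is also what a Z-Score probability chart assumes; the helper names `prob_below` and `prob_above` are hypothetical.

```python
from statistics import NormalDist

def prob_below(z):
    """Probability that a value falls below the given Z-Score (standard normal)."""
    return NormalDist().cdf(z)

def prob_above(z):
    """Probability that a value exceeds the given Z-Score (standard normal)."""
    return 1.0 - NormalDist().cdf(z)

# Example: a Z-Score of 1.0
print(round(prob_below(1.0), 4))  # 0.8413
print(round(prob_above(1.0), 4))  # 0.1587
```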
The Scenario
You have been hired by an independent research consortium to study the geography of foreclosures in Dane County, Wisconsin. County officials are worried about the increase in foreclosures from 2011 to 2012. As an independent researcher you have been given the addresses of all foreclosures in Dane County for 2011 and 2012, and they have been geocoded and then added to the Census Tracts for Dane County. While you realize that you cannot determine the reasons for foreclosures occurring, you do have the tools to analyze them spatially. Specifically, you are interested to see how the patterns of these foreclosures have changed from one year to the next. Explain what the patterns are and also provide some understanding as to the chance foreclosures will increase by 2013. A second question is to be answered after calculating the Z-Score for three specific tracts located in Dane County.
If these patterns for 2012 hold next year in Dane County, based on this data, what number of foreclosures for all of Dane County will be exceeded 70% of the time? Exceeded only 20% of the time?
Methods
The first step was to create a map displaying the change between 2011 and 2012 for Dane County using ArcMap. I added a field to the attribute table and subtracted the 2012 foreclosure value from the 2011 value for each tract. The result was displayed using standard deviation classification (Fig. 4).
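The change field and a simplified standard deviation classification could be sketched as follows. Everything here is an assumption made for illustration: the tract labels and counts are made up, the sign convention is 2012 minus 2011 (so a positive value means an increase), the population standard deviation is used, and the three bands with breaks at half a standard deviation are a simplification rather than the exact classes ArcMap produces.

```python
from statistics import mean, pstdev

# Hypothetical counts keyed by tract label; NOT the actual Dane County values
foreclosures_2011 = {"A": 4, "B": 10, "C": 7, "D": 15}
foreclosures_2012 = {"A": 6, "B": 9, "C": 12, "D": 15}

# Assumed sign convention: 2012 minus 2011, so positive means an increase
change = {t: foreclosures_2012[t] - foreclosures_2011[t] for t in foreclosures_2011}

values = list(change.values())
avg = mean(values)
sd = pstdev(values)  # population standard deviation (an assumption)

def sd_band(value, avg, sd):
    """Place a value in a simplified band by its distance from the mean."""
    z = (value - avg) / sd
    if z < -0.5:
        return "below average"
    if z > 0.5:
        return "above average"
    return "near average"

for tract, diff in change.items():
    print(tract, diff, sd_band(diff, avg, sd))
```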
(Fig. 3) Display of locations of selected Census tracts 114.01, 122.01, and 31.
The next step in the instructions was to calculate the Z-Score for three selected tracts in the data (Fig. 3). I used ArcMap to extract the mean and the standard deviation for both years of data. I then extracted the values for the specific tracts from both years and entered all of the values in Microsoft Excel. The Z-Score was then calculated using Excel (Fig. 5).
Results
(Fig. 4) Display of the foreclosure change between 2011-2012 by Census Tract in Dane County.
(Fig. 5) Excel spreadsheet with the Z-Score calculation data and results.
(Fig. 6) Display of 2011 foreclosures by standard deviation classification.
(Fig. 7) Display of 2012 foreclosures by standard deviation classification.
Finally, I will answer the following question:
If these patterns for 2012 hold next year in Dane County, based on this data, what number of foreclosures for all of Dane County will be exceeded 70% of the time? Exceeded only 20% of the time?
A Z-Score of -0.52 marks the value that will be exceeded 70% of the time. This equates to approximately 7 foreclosures: 70% of the time, the foreclosures for a given tract will exceed approximately 7.
A Z-Score of 0.84 marks the value that will be exceeded 20% of the time. This equates to approximately 20 foreclosures: 20% of the time, the foreclosures for a given tract will exceed approximately 20.
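The Z-Scores for these exceedance probabilities can be checked, and converted back into a foreclosure count, with a short sketch. It assumes a normal distribution. The function name `value_exceeded` is hypothetical, and the mean of 10 and standard deviation of 5 are placeholders only; the real 2012 values come from the ArcMap statistics and are not reproduced here.

```python
from statistics import NormalDist

def value_exceeded(p_exceed, mean, sd):
    """Return (z, value) where value is exceeded with probability p_exceed."""
    # A value exceeded p_exceed of the time sits at the (1 - p_exceed) quantile
    z = NormalDist().inv_cdf(1.0 - p_exceed)
    return z, mean + z * sd

# Placeholder mean and standard deviation; NOT the Dane County statistics
z70, count70 = value_exceeded(0.70, 10.0, 5.0)
z20, count20 = value_exceeded(0.20, 10.0, 5.0)

print(round(z70, 2), round(z20, 2))  # -0.52 0.84
```

Rearranging the Z-Score formula gives the count directly: the observation value equals the mean plus the Z-Score times the standard deviation.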
Conclusion
A pattern of higher foreclosures seems to fall outside of the downtown/capital area of Dane County. While more information is needed to fully analyze why this pattern appears, I have my own assumptions. There was a significant housing market crash around these years due to a declining economy. People who moved from the inner city to more posh suburbs bought houses that they could no longer afford when they lost their jobs. Thus many homes in these areas went into foreclosure. Again, this is merely a guess, and more research would be required to verify that claim.
Analyzing the change between years doesn't give you the full picture of what is going on with the data. Calculating the Z-Scores or creating a standard deviation classified map provides additional information that is critical when attempting to interpret data. One example is the central, northernmost tract in Dane County, which showed a decrease in foreclosures but was still above the average for the county. That observation tells you there is more to investigate in the area to gain a full understanding of what is going on. These observations, combined with further data from more recent years, would be very beneficial to many government agencies in Dane County.