A universal concern with all information systems must be the quality of the data contained within them; hence the well-known adage of the computer age: garbage in, garbage out. Nevertheless, it should be recognized that “errors and uncertainty are facts of life in all information systems” (Openshaw, 1989). The process of describing aspects of reality as a file structure on storage media requires a high level of abstraction, as was illustrated in Chapter 2. Thus, any attempt to completely represent reality in GIS, while no doubt resulting in robust and flexible data sets, would also result in large, complex, and costly data sets that would require a higher order of technology to handle them.
Historically, a detailed consideration of data quality issues in GIS lagged considerably behind the mainstream of GIS development and application. This is evident from the growth of the relevant literature, which underscores a sudden vogue in spatial data quality research from 1987 onward, some 25 years after the introduction of GIS (Figure 8.1). This lag in concern for spatial data quality may be attributed to:
- The inherent trust most users have in computer output, particularly after some complex analysis.
- The possible lack of awareness among operators and managers from nonspatial disciplines of the sources of uncertainty in spatial datasets and the consequences of propagating them through analyses, other than the need to correct blunders.
- The growing desire in the late 1980s for remote sensing (RS) and GIS data integration, there already being a body of research on accuracy assessment of RS data.
- The growth of GIS through stages of inventory, analysis, and management (Crain and Macdonald, 1984), such that a need to consider the consequences of uncertainty in outcomes on decision making may only become apparent after some years of system development.
Data are usually collected within a specific context, and the design for any primary data collection is usually specified within that context. Surveyor and user may be the same individual, part of the same team, or linked by contract. Thus, the chances for misinterpretation of outcomes or misconceptions concerning accuracy of the data should, in theory, be quite small. But data are likely to have a life span (shelf life) well beyond the original context and may well be used as secondary data on other projects. Those who collected the data may be unaware of subsequent uses (or misuses) to which their data are put.
Most of the early literature on GIS data quality was concerned with the accuracy of data sets or, more specifically, the recognition and avoidance of error. We will be taking a wider view of this issue by considering the level of uncertainty that exists in the use of spatial data and the fitness-for-use of GIS outputs.