p's because estimates of x̄ and s² will vary over months. The parameter p and its
estimate p' will be close if x̄ and s are close to μ and σ. In the simple case with
constant mean and independent errors, the CFD estimated by conditional simulation 
will better approximate the true CFD because both are based on binomial distributions
with the same N and approximately the same p.
Now consider the same sequence of distributions where the assumption of independence
is relaxed and interpolation of the data is used to estimate the proportion
of noncompliance. The introduction of spatial covariance in the base simulation 
changes the distribution of the true p's to a dependent binomial. The dependent binomial
will have variance similar to an independent binomial with N < 300. The sample size that
approximates the variance of the dependent binomial is termed Nb. The variance of
the p's estimated from spatially dependent data is approximated by p(1-p)/Nb,
where Nb < 300, and thus the CFD from the independent case will be steeper than that
from the dependent case. The degree to which Nb is less than N will depend on the 
strength of the spatial correlation. 
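
The relationship between spatial correlation and Nb can be illustrated with a small simulation. The sketch below is not the base simulation used in this appendix. It assumes a one-dimensional AR(1) Gaussian series over 300 cells as a stand-in for the spatial field, and the names simulate_p and effective_n, the threshold of 1.0, and the correlation of 0.8 are illustrative choices only. It estimates Nb by solving Var(p') = p(1-p)/Nb for Nb.

```python
import numpy as np

def simulate_p(n_cells, rho, threshold, n_months, rng):
    """Return the fraction of cells exceeding `threshold` for each month.

    Assumption: each month is a stationary AR(1) Gaussian series with unit
    variance and lag-1 correlation `rho` (rho = 0 gives independent cells).
    """
    z = rng.standard_normal((n_months, n_cells))
    x = np.empty_like(z)
    x[:, 0] = z[:, 0]
    scale = np.sqrt(1.0 - rho ** 2)  # keeps the marginal variance at 1
    for j in range(1, n_cells):
        x[:, j] = rho * x[:, j - 1] + scale * z[:, j]
    return (x > threshold).mean(axis=1)

def effective_n(p_values):
    """Sample size of an independent binomial with the same variance of p."""
    p_bar = p_values.mean()
    return p_bar * (1.0 - p_bar) / p_values.var(ddof=1)

rng = np.random.default_rng(0)
N = 300  # number of grid cells

p_indep = simulate_p(N, rho=0.0, threshold=1.0, n_months=5000, rng=rng)
p_dep = simulate_p(N, rho=0.8, threshold=1.0, n_months=5000, rng=rng)

nb_indep = effective_n(p_indep)  # expected to be close to N
nb_dep = effective_n(p_dep)      # expected to be well below N

print(f"independent cells: Nb ~ {nb_indep:.0f}")
print(f"correlated cells:  Nb ~ {nb_dep:.0f}")
```

With independent cells the estimate should fall near N = 300. With correlated cells it should be clearly smaller, and it shrinks further as rho increases, which corresponds to the flatter CFD described above.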
Next consider the effect of dependent data and interpolation on the distribution of the 
p’s. When we interpolate the sample of 40 onto the grid of 300, the interpolated 
surface is smooth relative to the original data (compare curves 1 and 4 in Figure 5.2). 
Because of this increased dependence in the kriged estimates, the estimates of p
computed from the interpolated data behave more like binomial data with N=Ns (the 
sample size) than like binomial data with N=Nb (the number of grid cells). Because 
Ns is smaller than Nb, the variance of the population of p's computed from interpolated
data will be greater. The greater variance explains why curve 1 in Figure 5.6 is
much flatter than line 1.
Finally consider the effect of conditional simulation on the distribution of the p's. 
When data are conditionally simulated and the mean and variogram estimated from 
the sampled data are accurate, then the character of the simulated data will be similar 
to that of the true data (compare line 1 with line 3 in Figure 5.7). As in the simple
independent case, the population of p's computed from the conditionally simulated
data will have a binomial variance that is similar to a binomial with sample size Nb. 
The simulation experiment shows that the CFD computed from these conditionally 
simulated p's will have a shape similar to the true CFD. This effect is illustrated in 
Figure 5.6 where the median of the conditionally simulated CFDs (blue line) is more 
similar to the true CFD (line 1) than is the CFD estimate based on kriging (red line).
Additional analytical work is needed to formalize the heuristic concepts presented 
here, but this finding indicates a productive direction in developing statistical inference
procedures in the CFD approach.
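
The summary of an ensemble of simulated CFDs by a median curve can be sketched as follows. This is a minimal illustration, not the procedure used to produce Figure 5.6. It assumes that a CFD is formed by sorting the monthly p's in descending order against a cumulative frequency with plotting position k/(m+1), and it replaces the conditional simulation with a placeholder ensemble of independent binomial draws. The names cfd_curve and summarize_ensemble, and the values of n_sims, n_months, and the probability 0.2, are hypothetical.

```python
import numpy as np

def cfd_curve(p_values):
    """Pair sorted exceedance fractions with a cumulative frequency.

    Assumption: plotting position k/(m+1) for the k-th largest of m values.
    """
    p_sorted = np.sort(np.asarray(p_values, dtype=float))[::-1]
    m = p_sorted.size
    freq = np.arange(1, m + 1) / (m + 1.0)
    return freq, p_sorted

def summarize_ensemble(p_ensemble):
    """Pointwise median and 2.5/97.5 percentile envelope of simulated CFDs.

    p_ensemble has shape (n_sims, n_months): one row of monthly p's per
    simulated realization.
    """
    curves = np.sort(p_ensemble, axis=1)[:, ::-1]  # each row sorted descending
    median = np.median(curves, axis=0)
    lower, upper = np.percentile(curves, [2.5, 97.5], axis=0)
    return median, lower, upper

rng = np.random.default_rng(1)
n_sims, n_months, n_cells = 1000, 36, 300

# Placeholder ensemble (assumption): independent binomial p's stand in for
# p's computed from conditionally simulated fields.
p_ensemble = rng.binomial(n_cells, 0.2, size=(n_sims, n_months)) / n_cells

freq, _ = cfd_curve(p_ensemble[0])
median, lower, upper = summarize_ensemble(p_ensemble)
```

Plotting median, lower, and upper against freq gives a central curve and a pointwise envelope. A pointwise envelope of this kind is one simple choice, and it does not by itself guarantee simultaneous coverage of the whole curve.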
Confidence Intervals 
The most successful technique for computing confidence bounds for the CFD was
conditional simulation based on kriging interpolation of the sample
data. The 95% confidence bands (lines 2, Figure 5.6) are well centered over the true 
CFD (line 1) for the simplistic case where the true data have spatial dependence but 
no spatial or temporal trends. When these simplistic assumptions are relaxed (Figure 
5.8) and the true data are simulated to have spatial dependence and temporal and 
appendix a 
The Cumulative Frequency Diagram Method for Determining Water Quality Attainment 
