referred to as the capture-recapture estimator (CR): 
$$\hat{N}_{CR} = \sum_{i \in F} \frac{\hat{p}_{CFC,i}}{\hat{\pi}_{C,i}}$$
where F is the set of all CML records classified as 
farms based on their responses to the census 
questionnaire. 
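
As an illustration only, the following minimal sketch shows how
an estimator of this kind could be computed, assuming it sums,
over the CML records classified as farms, the estimated
probability of correct farm classification divided by the
estimated capture probability. The function name, the argument
layout, and the three example probabilities are hypothetical and
do not come from the census.

```python
from typing import Iterable, Tuple


def capture_recapture_estimate(records: Iterable[Tuple[float, float]]) -> float:
    """Sum p_cfc / pi_c over records classified as farms.

    Each record is a pair (p_cfc, pi_c):
      p_cfc -- estimated probability the farm classification is correct
      pi_c  -- estimated probability the record was captured by the census
    Both names are assumptions made for this sketch.
    """
    total = 0.0
    for p_cfc, pi_c in records:
        if not 0.0 <= p_cfc <= 1.0:
            raise ValueError("p_cfc must lie in [0, 1]")
        if not 0.0 < pi_c <= 1.0:
            raise ValueError("pi_c must lie in (0, 1]")
        total += p_cfc / pi_c
    return total


# Three made-up records, for illustration only.
example_records = [(0.95, 0.80), (0.90, 0.60), (1.00, 0.50)]
estimate = capture_recapture_estimate(example_records)
```

Under this reading, each record classified as a farm contributes
more than one farm to the total when its capture probability is
low, and less when its classification is uncertain.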
To estimate the capture and correct census farm 
classification probabilities, a matched dataset 
consisting of JAS records and census records was 
created. Records in the 2012 JAS sample were 
matched to the 2012 census using probabilistic 
record linkage. The CML records that matched with 
JAS tracts represent the Census Sample. Note: The 
Census Sample is a subset of the CML records and 
includes only those records matching a JAS tract. 
Both agricultural and non-agricultural tracts were 
included in the matched dataset. (This differs from 
the 2007 process, which considered only the 
agricultural tracts and the non-agricultural tracts 
with agricultural potential or with unknown 
potential.) The matched dataset also included CML 
records that responded to the census as a farm or 
nonfarm and CML records that did not respond to the 
census. 
Resolving Farm Status 
In most cases, the farm status based on the census 
response (through either the CML or the NML data 
collection) agreed with the farm status from the JAS; 
these records are referred to as having resolved farm 
status. In other cases, however, a record was 
identified as a farm (nonfarm) on the JAS and as a 
nonfarm (farm) by the census through either the CML 
or the NML. Such records are said to have conflicting 
or unresolved farm status. An operation identified as 
a farm is referred to as in-scope; one identified as a 
nonfarm is referred to as out-of-scope. From the set 
of matched records, three groups with conflicting 
farm status were identified: (1) in-scope JAS records 
that were out-of-scope on the census, (2) census 
in-scope and JAS out-of-scope records, and (3) 
in-scope JAS records that did not have a census 
response. The records 
with conflicting farm status were sent to regional 
field offices for review. In each case, efforts were 
made to determine whether (1) the status had 
changed between June and December when the 
census was conducted, (2) the JAS farm status was 
correct, (3) the census farm status was correct, (4) 
the records were incorrectly matched, or (5) the farm 
status could not be resolved. Not all of the records 
with conflicting farm status could be resolved. In 
2012, 11.6 percent of the records in the Census 
Sample had unresolved farm status. Of these, 18.9 
percent were due to nonresponse to the census report 
form. 
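
The grouping of matched records described above can be sketched
as follows. This is a minimal illustration, not the procedure
NASS used: the function name, the status labels, and the group
labels are hypothetical, and the case of a JAS nonfarm with no
census response is not addressed in the text, so it is returned
under a separate label.

```python
from typing import Optional


def farm_status_group(jas_status: str, census_status: Optional[str]) -> str:
    """Classify a matched record by agreement of JAS and census farm status.

    jas_status is "farm" (in-scope) or "nonfarm" (out-of-scope).
    census_status is "farm", "nonfarm", or None for no census response.
    """
    if jas_status not in ("farm", "nonfarm"):
        raise ValueError("jas_status must be 'farm' or 'nonfarm'")
    if census_status not in ("farm", "nonfarm", None):
        raise ValueError("census_status must be 'farm', 'nonfarm', or None")

    if census_status is None:
        if jas_status == "farm":
            # Group 3: in-scope on the JAS, no census response.
            return "conflict_3_jas_farm_no_census_response"
        # Not one of the three groups named in the text.
        return "not_covered_in_text"
    if jas_status == census_status:
        return "resolved"
    if jas_status == "farm":
        # Group 1: in-scope on the JAS, out-of-scope on the census.
        return "conflict_1_jas_farm_census_nonfarm"
    # Group 2: in-scope on the census, out-of-scope on the JAS.
    return "conflict_2_census_farm_jas_nonfarm"
```

Records falling in any of the three conflict groups would be the
ones referred for field office review.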
The probability an operation is a farm was estimated 
for the records with unresolved farm status. Using 
the 2012 matched dataset, a logistic model of the 
probability an operation is a farm based on the 
records with resolved farm status was developed; 
that is, the operations where the farm (or nonfarm) 
status agreed between the JAS and the census were 
used to develop a missing data model, which was 
then used to resolve farm status. The final missing 
data model was used to impute the probability that 
each of the agricultural operations with unresolved 
farm status is a farm. For the resolved farms and 
nonfarms, the probability of the operation being a 
farm was set to 1 and 0, respectively. Five-fold 
cross-validation was used to develop and compare 
competing models, so that model accuracy was not 
overstated by fitting and evaluating the model on 
the same set of data. To ensure that 
each of the cross-validation samples covered the 
U.S., the five cross-validation samples of JAS 
segments were drawn within State-stratum 
combinations. Characteristics of the JAS tracts were 
considered as potential covariates in the model. 
Because limited information is available for JAS 
nonfarm tracts, county-level socio-demographic 
variables from the most recent U.S. population 
census were also considered. The sample weight 
associated with each JAS tract was multiplied by the 
probability of being a farm. This adjusted weight 
was used in all subsequent modeling. 
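
The steps in this paragraph can be sketched as follows, under
stated assumptions. The sketch takes the logistic coefficients as
given rather than fitting them, assigns JAS segments to five
folds within each State-stratum combination, and multiplies the
sample weight by the probability of being a farm. All function
names, covariate names, and argument layouts are hypothetical;
the actual covariates and fitted model are not given in the text.

```python
import math
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def farm_probability(status: str, covariates: Dict[str, float],
                     coefficients: Dict[str, float], intercept: float) -> float:
    """Return 1 or 0 for resolved records, else a logistic-model probability."""
    if status == "farm":
        return 1.0
    if status == "nonfarm":
        return 0.0
    # Unresolved record: apply the (already fitted) logistic model.
    linear = intercept + sum(coefficients[name] * value
                             for name, value in covariates.items())
    return 1.0 / (1.0 + math.exp(-linear))


def adjusted_weight(sample_weight: float, p_farm: float) -> float:
    """Multiply the JAS tract sample weight by the probability of being a farm."""
    return sample_weight * p_farm


def assign_folds(segments: List[Tuple[str, str, str]],
                 n_folds: int = 5, seed: int = 0) -> Dict[str, int]:
    """Assign segments to folds separately within each (state, stratum) group.

    segments holds (segment_id, state, stratum) triples.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for segment_id, state, stratum in segments:
        groups[(state, stratum)].append(segment_id)
    folds = {}
    for key in sorted(groups):
        ids = groups[key]
        rng.shuffle(ids)
        # Deal shuffled segments round-robin so each fold covers the group.
        for position, segment_id in enumerate(ids):
            folds[segment_id] = position % n_folds
    return folds
```

Dealing segments to folds within each State-stratum combination
is one simple way to keep every fold spread across the U.S.; other
allocation schemes would serve the same purpose.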
Capture Probabilities 
Recall that, for a farm to be identified as a farm, and 
thus captured, by the census, it must be on the CML, 
respond to the census report form and, based on the 
census response, be classified as a farm. These 
adjustments are dependent so that the probability of 
capture $\pi_C$ may be written as 
