Yield Map Management and Statistical Mapping
Raw yield map point locations
The following graphic shows four consecutive seasons of yield maps for the same field; the locations of the yield points vary from season to season. To organize the data into one file, yield for the different seasons can be associated with the same points. To do this, the nearest yield points from a clean yield map with one-meter spacing are joined to the points; on average, a joined point is about one foot from its target point. In this example, the 2005 soybeans had missing data that was filled in during the interpolation step of yield data cleaning. Also, there is an electrical installation near the center of the field that the combine has to be driven around.
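The nearest-point join described above can be sketched in a few lines of Python. This is a minimal illustration, not the GIS tool used for the maps here: the function name `join_nearest`, the tuple layout, and the sample coordinates are all hypothetical, and coordinates are assumed to be in a projected system (meters), not latitude/longitude.

```python
import math

def join_nearest(target_points, clean_points):
    """For each target (x, y), return (yield, distance) of the nearest clean point.

    clean_points is a list of (x, y, yield_value) tuples; all coordinates are
    assumed to be projected (e.g. meters), which is an assumption of this sketch.
    """
    joined = []
    for tx, ty in target_points:
        # Brute-force search: fine for illustration, slow for full-size maps.
        nearest = min(clean_points, key=lambda p: math.hypot(p[0] - tx, p[1] - ty))
        distance = math.hypot(nearest[0] - tx, nearest[1] - ty)
        joined.append((nearest[2], distance))
    return joined

# Hypothetical data: a tiny one-meter clean grid and two target point locations.
clean = [(0.0, 0.0, 150.0), (1.0, 0.0, 155.0), (0.0, 1.0, 160.0), (1.0, 1.0, 165.0)]
targets = [(0.2, 0.1), (0.9, 0.8)]
result = join_nearest(targets, clean)
```

For real yield maps with many thousands of points, a spatial index (for example a k-d tree) or the nearest-join tool of a GIS package would be the practical choice; the brute-force loop only shows the idea.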
Evenly spaced grid of points with optional yield point boundary included
To help manage the different data, a single map file can be made in GIS so that yield amounts from all seasons are associated with the same set of points. The points from a yield map that delineate the boundary of the field well enough can be merged with an evenly spaced grid of points to make a master file for a field, or an evenly spaced grid alone can be used (each method has its advantages). All subsequent yield maps can have values associated with the master file so there is just one map per field for all seasons. The graphic below shows the points that represent the boundary of the field (left; points are from the 2007 season above) and the merged file of boundary points and evenly spaced points (4-meter spacing for this example) that fall within the boundary points (right). It is not necessary to harvest the entire field to include yield data in a master file; if only part of the field is planted or harvested, the points with yield have values and the other points do not. (The grid of points is symbolized smaller than the boundary points so the boundary points are more visible; a zoomed-in image of points is also shown.)
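As a rough sketch of how such a master file could be assembled, the Python below merges boundary points with an evenly spaced grid clipped to the boundary. It assumes the boundary points are ordered around the field so they can be treated as a polygon ring, which the article does not state; the function names and the rectangular sample field are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test; ring is an ordered list of (x, y) boundary vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal ray extending right from (x, y)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def build_master_points(ring, spacing):
    """Return the boundary points plus the grid points that fall inside the ring."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    grid = []
    y = min(ys) + spacing
    while y < max(ys):
        x = min(xs) + spacing
        while x < max(xs):
            if point_in_polygon(x, y, ring):
                grid.append((x, y))
            x += spacing
        y += spacing
    return list(ring) + grid

# Hypothetical 20 m x 12 m rectangular field with 4-meter grid spacing.
boundary = [(0.0, 0.0), (20.0, 0.0), (20.0, 12.0), (0.0, 12.0)]
master = build_master_points(boundary, 4.0)
```

Points in the master list that receive no yield in a given season (for example, an unplanted part of the field) would simply carry an empty value for that season.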
Northwest corner of yield points from above right
Yield data from all seasons can be associated with the same master yield file, so there is just one map for each field for the history of the field. This makes data analysis effective because you can statistically compare and map yield values at the same location.
Yield maps for the four different seasons with same point locations
(maps are all classified with natural breaks with yield in order of highest to lowest as: dark green, green, yellow, orange, and red [all points, boundary and grid, are same size])
If yield points all have the same location over the seasons, statistics can be better calculated and mapped for a field. A common method to compare different crops for the same field is to normalize to the mean (average) or maximum (calculated by dividing by the mean or maximum, respectively). This can be helpful if you are certain that your yield monitor has been properly calibrated or your yield data has had the field average and variability post-calibrated correctly. However, keeping yield monitors properly calibrated is a challenge; this is understandable considering that calibration can require multiple loads (harvested at different travel speeds) to be weighed on a certified scale many times during a season. Normalizing by incorrect values will create erroneous data. For example, normalizing by a mean that is incorrectly high will result in values that have lower variability than they should, while normalizing by a mean that is incorrectly low will result in values that have higher variability than they should.
Unless you are certain that your yield monitor has been properly calibrated, it is better to derive maps based on normalized differences in yield amounts. Differences are calculated by subtracting the lowest clean yield value from all the values so the range of values starts at zero. The maximum differences in bushels per acre for the 2004 corn, 2005 soybean, 2006 corn, and 2007 soybean yields are 78.9, 26.2, 87.6, and 29.2, respectively. The values can then be normalized to the maximum difference, which results in a range of values from 0 to 1 for any crop or season.
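The difference-based normalization can be written out as a short Python sketch: subtract the lowest clean yield, then divide by the maximum difference. The function name and the sample values are hypothetical; the sample range was chosen to equal 78.9 bu/ac only to echo the 2004 corn figure, and `None` is used here as an assumed marker for points with no yield in a season.

```python
def normalized_difference(yields):
    """Scale clean yields to 0-1: subtract the minimum, divide by the max difference.

    None marks a point with no yield that season and is passed through unchanged.
    """
    valid = [v for v in yields if v is not None]
    low = min(valid)
    max_diff = max(valid) - low
    if max_diff == 0:
        raise ValueError("all yield values are identical; cannot normalize")
    return [None if v is None else (v - low) / max_diff for v in yields]

# Hypothetical clean corn yields in bushels per acre (range = 78.9).
corn = [120.0, 150.0, 198.9, None, 160.0]
norm = normalized_difference(corn)
```

Because each season is scaled by its own minimum and range, the result depends on the relative pattern within the field and not on the absolute level reported by the yield monitor.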
Statistical Mapping
Maps below show average and median yield difference normalized to the maximum difference and corresponding standard deviation and coefficient of variation (variability) maps. (Normalized average and median yield difference maps are classified with natural breaks in order of highest to lowest values as: dark green, green, yellow, orange, and red; standard deviation and coefficient of variation maps are classified with natural breaks in order of highest to lowest values as: red, orange, yellow, green, and dark green).
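A minimal sketch of the per-point statistics behind such maps is given below, using Python's standard `statistics` module. The function name and sample values are hypothetical, and the choice of the sample standard deviation (`statistics.stdev`) over the population form is an assumption, since the article does not say which was used.

```python
import statistics

def point_statistics(season_values):
    """Summarize one point's normalized yield differences across seasons.

    season_values holds one value per season; None marks a season with no yield.
    Returns None if the point has no values at all.
    """
    values = [v for v in season_values if v is not None]
    if not values:
        return None
    mean = statistics.mean(values)
    # Sample standard deviation (assumption); needs at least two seasons.
    sd = statistics.stdev(values) if len(values) > 1 else 0.0
    return {
        "mean": mean,
        "median": statistics.median(values),
        "stdev": sd,
        "cv": sd / mean if mean else None,  # coefficient of variation
    }

# Hypothetical normalized differences for one point over four seasons.
stats = point_statistics([0.2, 0.4, 0.6, 0.8])
```

Running this for every point in the master file gives four values per point, each of which can then be classified and mapped.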
To make the most valid statistical maps, only areas that have valid yield data should be included. This excludes the gap between the headlands and the rest of the field, because there is no data there, and the area where the combine had to be steered around the electrical installation (near the center of the field), because this type of operation causes erroneous data. The headland areas should also be removed for this analysis. The shaded part below shows the valid area for all seasons; the area that has missing data for the 2005 soybean season is included for the purposes of this example.
The maps below represent the same statistics as shown above and are symbolized the same way, but the extent is limited to the more valid data. Kleinjan et al. (2006) state that "the preferred method for explaining yield variability used a combination of average yields and standard deviation to delineate productivity zones" and that by "comparing yield and standard deviation maps, the potential yield losses associated with not implementing a corrective treatment can be determined". In this example, the average and standard deviation maps show that parts of the higher-yielding areas are relatively stable (associated with lower standard deviations [green shades]) and that the low-yielding area in the northwest (north is ↑) is also relatively stable.