ESRI ArcView Extension: Point Stat Calc
U.S. Geological Survey Open-File Report 00-302
Online Release 1.0
Online Only
By Matthew Dombroski
Introduction
This discussion assumes that the reader is familiar with
the ArcView 3.x commands and usage. Point Stat Calc is an ArcView extension
that can be used to calculate summary statistics of data points based on their
distribution within a set of polygons. Point Stat Calc can calculate the mean,
median, maximum, minimum, count, or nth percentile for any numeric
attribute(s) of the points inside each polygon. The results of these
calculations are stored in a new table that is linked to the polygon shapefile's
attribute table using the ArcView join command.
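The idea of summarizing points by polygon can be illustrated with a short Python sketch. This is not the extension's Avenue code. It assumes a simplified, hypothetical data layout (each polygon is a single ring of (x, y) vertices, each point is a dictionary) and uses a standard ray-casting test to decide which points fall inside each polygon.

```python
from statistics import mean, median

def point_in_polygon(x, y, ring):
    """Ray-casting test. `ring` is a list of (x, y) vertices (assumed layout)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does a horizontal ray cast to the right of (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def summarize(points, polygons, field):
    """Return {polygon_id: statistics of `field` for the points inside it}."""
    results = {}
    for poly_id, ring in polygons.items():
        values = [p[field] for p in points
                  if point_in_polygon(p["x"], p["y"], ring)]
        if values:
            results[poly_id] = {
                "count": len(values),
                "mean": mean(values),
                "median": median(values),
                "min": min(values),
                "max": max(values),
            }
        else:
            results[poly_id] = {"count": 0}
    return results

# Made-up data: two adjacent square polygons and four sample points.
polygons = {
    "A": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "B": [(10, 0), (20, 0), (20, 10), (10, 10)],
}
points = [
    {"x": 2, "y": 3, "Pb_ppm": 12.0},
    {"x": 5, "y": 5, "Pb_ppm": 20.0},
    {"x": 8, "y": 1, "Pb_ppm": 16.0},
    {"x": 15, "y": 5, "Pb_ppm": 7.0},
]
table = summarize(points, polygons, "Pb_ppm")
```

The resulting dictionary plays the role of the results table: one record per polygon, keyed by a unique polygon identifier, which is what makes a later join to the polygon attribute table possible.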
Compatibility
Point Stat Calc was developed and tested using Esri's ArcView versions 3.1 and 3.2
on Windows NT 4.0, Windows 98, and Windows 95 systems. It has not been tested on
the Mac or Unix versions of ArcView. It was written using ArcView's Avenue
scripting language.
Installation
Once downloaded, place the pntstatcalc.avx file into the
\Av_gis30\ArcView\Ext32 subdirectory of your ArcView installation's
root directory (usually c:\Esri). Start ArcView, then load the
extension by choosing File -> Extensions and then checking the box beside
"Point Stat Calc." This places a new button on the button bar for each view.
Use
- Open ArcView, open or create a project, and open or create a view with at
least one point theme and one polygon theme. Then load the Point Stat Calc extension
by clicking File -> Extensions... -> Point Stat Calc.
- In the view, select one point theme and one polygon theme by highlighting their
legends (the second one will have to be "shift-selected").
All points in the point-theme will be used in the calculations, regardless
of any selections that may be active (moreover, selections will be lost
after the calculations are complete). However, statistics will only
be gathered for currently selected polygons in the polygon-theme.
- Click the button for Point Stat Calc on the button bar.
- Choose one or more variables from the Values field. These are the
numeric attributes of the point shapefile. All statistics selected
in the Calculations field will be calculated separately for each variable selected here.
- Choose one or more selections from the Calculations field. Each
selection here will result in one output variable being calculated for each
variable selected in the Values field. The "random" selection simply chooses
one value at random from the set of points falling within each polygon.
- Check the box to include zeros if desired. Be sure that this is not
selected for datasets having zeros in place of missing values.
- Check the box to include negative values if desired, choosing whether they will
be included as negatives or positives. This allows one to use datasets in
which a negative value stands for "less than" (the symbol "<"). If negative
values are to be used, several options become available for processing them,
including using the values unchanged or replacing them with an arbitrary
multiple of their absolute values.
- Check the box to include dummy values if your data set includes a value
for "no data." Many types of geophysical datasets and data derived from
certain types of grids contain a large, negative number to signify that data
are absent.
- Click OK.
- If you chose to have an Nth Percentile calculated, enter up to five numeric
values into the available fields or, alternatively, select a number using
the slider. All numbers must be between 0 and 100 and will be rounded to
the nearest 0.5.
- Provide a unique name for the new table that will contain the results
of the calculations. Point Stat Calc provides a default name that is the polygon
shapefile's name with "-dat" appended to the end. This table will be joined to
the polygon shapefile's attribute table after the calculations have been completed.
- Point Stat Calc must join its results table to the attribute table of the polygon
shapefile based on a common field. This requires a field in the
polygon attribute table that contains a unique value (an index) for each polygon.
Click one of the two radio buttons to either (a) use an existing field within
your polygon attribute table or (b) create a new index field in the original
polygon attribute table. Choice (b) results in the only modification that Point
Stat Calc may make to the original shapefiles used in the calculations.
- Click OK.
- Click Yes to proceed, No to cancel.
- Once the calculations have completed, open the polygon shapefile's attribute
table to view the results. The calculated fields are furthest to the right. They
have been named using a combination of the source field (from the point shapefile)
and the calculation performed. Reminder: the results are not permanently
attached to the polygon shapefile's attribute table; they are only temporarily joined.
To remove the join, open the polygon shapefile's attribute table
and click Table -> Remove all joins. To permanently join the calculated
data to the polygon theme's attribute table, select the polygon theme while it
is still joined to the results table, and click Theme -> Convert To
Shapefile to create a new shapefile.
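The options for zeros, negative values, and dummy values described in the steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the extension's Avenue code: the function names, the option names ("exclude", "unchanged", "scaled_abs"), and the choice to round ties upward when rounding a percentile to the nearest 0.5 are all hypothetical. The sketch also assumes that a value equal to the dummy value is dropped before any other rule is applied.

```python
import math

def prepare_values(values, include_zeros=False, negatives="exclude",
                   multiplier=0.5, dummy=None):
    """Filter and transform raw values before statistics are computed.

    negatives: "exclude", "unchanged", or "scaled_abs" (hypothetical names).
    """
    prepared = []
    for v in values:
        if dummy is not None and v == dummy:
            continue  # "no data" marker, checked first because it is negative
        if v == 0 and not include_zeros:
            continue  # zeros standing in for missing values
        if v < 0:
            if negatives == "exclude":
                continue
            if negatives == "scaled_abs":
                v = multiplier * abs(v)  # e.g. -5 with multiplier 0.5 gives 2.5
            # "unchanged": keep the negative value as it is
        prepared.append(v)
    return prepared

def round_percentile(p):
    """Check that p lies in 0..100 and round it to the nearest 0.5."""
    if not 0 <= p <= 100:
        raise ValueError("percentile must be between 0 and 100")
    return math.floor(p * 2 + 0.5) / 2

raw = [12.0, -5.0, 0.0, -9999.0, 8.0]
cleaned = prepare_values(raw, include_zeros=False, negatives="scaled_abs",
                         multiplier=0.5, dummy=-9999.0)
print(cleaned)                 # [12.0, 2.5, 8.0]
print(round_percentile(94.8))  # 95.0
```

Checking the dummy value before the negative-value rule matters here, because a marker such as -9999 would otherwise be converted into a large positive number and distort every statistic.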
Example
You have one polygon shapefile (Midwest.shp) of 4 Midwest states (Illinois, Wisconsin,
Iowa, and Minnesota), and one point shapefile (Geochem.shp) containing arsenic,
mercury, and lead values in rock samples within these states (data are randomly
generated for this example only). You would like to know the average, median,
and 95th percentile values as well as the counts for all three elements by
state. To do this, complete the following steps:
- In the active view, select the Midwest.shp and Geochem.shp themes.
- Click the Point Stat Calc button on the button bar.
- Select Pb_ppm_, As_ppm_, and Hg_ppm_ from the Values field.
- Select Average, Median, Count and Nth Percentile from the Calculations field.
- Zeros are not valid values in this data set, so make sure the box to include
zeros is not checked.
- In this data set, negative values mean "less than." For example, a value
of -5 means <5. Check the box to include negative values, and then select the
radio button to treat negative values as n multiplied by the absolute
value. Enter 0.5 for n, so a value of <5 would be treated as 2.5
for the calculations.
- Check the box to ignore dummy values. In this data set, -9999 means "no data."
Enter -9999 into the Dummy Value field.
- In the Choose Percentiles window, click OK to calculate the 95th percentile.
- Accept the default table name of Midwest-dat.shp.
- Click on the "Create new index item in tables" radio button. Click OK.
- Click Yes.
- To view the data, open up Midwest.shp's attribute table. The calculations will
be in the fields beginning with Hg, As, and Pb. Your results should match those displayed
in the table below.
Results:
Known Issues
Several factors have been shown to lengthen the processing time of Point Stat
Calc on large data sets. Ensure that the point theme is a shapefile rather
than an event theme. If it is an event theme, convert it to a shapefile. Also,
be sure to place both themes’ associated files on the hard disk rather than
running them off of a temporary storage device such as a Jaz or Zip disk.
Exceptionally large data sets (point themes with greater than 100,000 records)
may take a considerable amount of time to process. A practical way of shortening
the processing time is to remove unnecessary records from [copies of] each
shapefile. For example, if you are using a U.S. counties theme for your polygon
theme and you do not need calculations performed for any counties with areas
greater than 1,800 square miles, do a query on a copy of the data set to select
these counties and any other counties that may be removed. Then, use
Select By Theme in the Theme menu to select all points
from a copy of your points theme that fall within these counties. Finally,
delete all selected records from both the polygon and point shapefile copies.
An alternate means of excluding unneeded polygons from a data set is to select
only those polygons that are needed. Point Stat Calc will ignore all unselected
polygons if any are selected, but will include all polygons if none are selected.
Point Stat Calc will use all points regardless of whether or not they are
selected. The only way to exclude points from the calculations is to delete
them.
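The selection rules in the preceding paragraph can be summarized in a small Python sketch. The function names and the use of plain ID lists are hypothetical. The sketch only restates the behavior described above and is not the extension's own code.

```python
def polygons_to_process(polygon_ids, selected_ids):
    """Use only the selected polygons if any are selected, otherwise all."""
    selected = set(selected_ids)
    if selected:
        return [pid for pid in polygon_ids if pid in selected]
    return list(polygon_ids)

def points_to_process(point_ids, selected_ids):
    """Point selections are ignored, so every point is always used."""
    return list(point_ids)

states = ["IL", "WI", "IA", "MN"]
print(polygons_to_process(states, ["WI", "MN"]))  # ['WI', 'MN']
print(polygons_to_process(states, []))            # ['IL', 'WI', 'IA', 'MN']
print(points_to_process([1, 2, 3], [2]))          # [1, 2, 3]
```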
Contact
Please send any questions or comments to:
Jeffrey Grossman
US Geological Survey
954 National Center
Reston, VA 20192-0001
or
jgrossman@usgs.gov
Acknowledgments
Thanks to Yew Yuan, Jeffrey Grossman, Andrew Grosz, and Joseph Duval for help
with developing and testing Point Stat Calc.
Download
Download Point Stat Calc
Download Sample Data
Disclaimers
This report is preliminary and has not been reviewed
for conformity with U.S. Geological Survey editorial
standards or with the North American Stratigraphic code.
Any use of trade, product, or firm names is for
descriptive purposes only and does not imply
endorsement by the U.S. Government.
Although all data and software released with this open file
have been used by the USGS, no warranty, expressed or
implied, is made by the USGS as to the accuracy of the
data and related materials and (or) the functioning of
the software.
Eastern Mineral Resources Team
USGS Geologic Information
This page is https://pubs.usgs.gov/openfile/of00-302/
Maintained by Eastern Publications Group
Last modified July 26, 2000 (jmw)