Identification_Information: Citation: Citation_Information: Originator: Cynthia Wallace, Miguel Villarreal and Laura Norman Publication_Date: Unknown Title: Santa Cruz River vegetation map of riparian corridor and surrounding watershed Geospatial_Data_Presentation_Form: raster digital data Series_Information: Series_Name: Open-File Report Online_Linkage: http://pubs.usgs.gov/of/2011/1143/ Larger_Work_Citation: Citation_Information: Originator: Wallace, Cynthia S.A., Villarreal, Miguel L., and Norman, Laura M. Publication_Date: July 19, 2011 Title: Developing a high-resolution binational vegetation map of the Santa Cruz River riparian corridor and surrounding watershed Geospatial_Data_Presentation_Form: raster digital data Series_Information: Series_Name: Open File Report 2011-1143 Publication_Information: Publisher: U.S. Geological Survey Description: Abstract: This is a detailed vegetation map of the Santa Cruz Watershed, based on TES units, that seamlessly crosses the international border. Purpose: This dataset was created to be used as input to the Santa Cruz Watershed Ecosystem Portfolio Model (SWEPM) for supporting ecosystem services evaluation and wildlife habitat modeling. Time_Period_of_Content: Time_Period_Information: Single_Date/Time: Calendar_Date: 1999-2000 Currentness_Reference: ground condition Status: Progress: Complete Maintenance_and_Update_Frequency: None planned Spatial_Domain: Bounding_Coordinates: West_Bounding_Coordinate: 457985 East_Bounding_Coordinate: 574025 North_Bounding_Coordinate: 3620015 South_Bounding_Coordinate: 3429995 Keywords: Theme: Theme_Keyword_Thesaurus: REQUIRED: Reference to a formally registered thesaurus or a similar authoritative source of theme keywords.
Theme_Keyword: Vegetation Map Theme_Keyword: CART Model Theme_Keyword: Terrestrial Ecosystem Units Theme_Keyword: SWReGAP Place: Place_Keyword: Santa Cruz Watershed Place_Keyword: Mexico Place_Keyword: Arizona Stratum: Temporal: Temporal_Keyword: Landsat TM data 1999, 2000 Access_Constraints: Downloadable data. Use_Constraints: There is no guarantee concerning the accuracy of the data. Users should be aware that temporal changes may have occurred since this data set was collected and that some parts of this data may no longer represent actual surface conditions. Users should not use this data for critical applications without a full awareness of its limitations. Acknowledgement of the originating agencies would be appreciated in products derived from these data. Any user who modifies the data is obligated to describe the types of modifications they perform. User specifically agrees not to misrepresent the data, nor to imply that changes made were approved or endorsed by the U.S. Geological Survey. Please refer to for the USGS disclaimer. Point_of_Contact: Contact_Information: Contact_Person_Primary: Contact_Person: Cynthia Wallace Contact_Organization: US Geological Survey Contact_Position: Research Scientist Contact_Address: Address_Type: mailing and physical address Address: 520 N. Park Ave., Ste #355 City: Tucson State_or_Province: AZ Postal_Code: 85719 Country: USA Contact_Voice_Telephone: 5206705589 Contact_Electronic_Mail_Address: cwallace@usgs.gov Native_Data_Set_Environment: Microsoft Windows XP Version 5.1 (Build 2600) Service Pack 3; ESRI ArcCatalog 9.3.1.4000 Data_Quality_Information: Attribute_Accuracy: Attribute_Accuracy_Report: For accuracy assessment we created a classification based on 80% of the training data and evaluated the result with the remaining 20% withheld as "testing" data. A random sample for training and testing was generated for each data provenance (NM-GAP, Villarreal, SWReGAP Cores). 
A total of 641 points were withheld for testing: 48 from Villarreal, 89 from NM-GAP, and 504 from SWReGAP Cores. The final vegetation classification, however, was created from the full set of testing and training sites. The overall accuracy of all vegetation classes modeled with the random 80% subset of the training points and evaluated with the withheld 20% test points was 77%. Accuracy for only NM-GAP test points was 76% and for only Villarreal test points was 60%. The highest accuracies for Villarreal test points are those for North American Warm Desert Riparian Mesquite Bosque (100%) and Riparian Forest (75%), which mapped into the classification scheme one-to-one. The lowest accuracies are for the North American Warm Desert Riparian Woodland and Shrubland (53%) and the Chihuahuan-Sonoran Desert Bottomland and Swale Grassland (50%), both of which were many-to-one mappings, suggesting that some of the Villarreal classes that were collapsed into single SWReGAP classes may not have been representative of the TES unit as defined by the NM-GAP mappers. The mapping accuracy may be improved by reevaluating the classification cross-walk and by eliminating Villarreal map units that are uncharacteristic of the NatureServe-defined TES units. The combined accuracy for only the test points based on field data (NM-GAP and Villarreal) is 71%. Quantitative_Attribute_Accuracy_Assessment: Attribute_Accuracy_Value: 71 to 77% Attribute_Accuracy_Explanation: Predicts the correct classification for 71 to 77% of the data withheld from classification.
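As a rough cross-check of the figures above, the number of withheld test points and the combined field-data accuracy can be recomputed from the reported per-source values. The sketch below assumes the combined figure is a point-weighted average of the rounded NM-GAP and Villarreal accuracies; the variable names are illustrative and are not part of the dataset.

```python
# Withheld test points per data provenance, as reported in the accuracy report.
test_points = {"Villarreal": 48, "NM-GAP": 89, "SWReGAP Cores": 504}

# Reported (rounded) accuracies for the two field-data sources.
reported_accuracy = {"Villarreal": 0.60, "NM-GAP": 0.76}

total_withheld = sum(test_points.values())

# Assumption: the combined field-data accuracy is weighted by point count.
field_sources = ["NM-GAP", "Villarreal"]
field_points = sum(test_points[s] for s in field_sources)
approx_correct = sum(test_points[s] * reported_accuracy[s] for s in field_sources)
combined_field_accuracy = approx_correct / field_points

print(total_withheld)
print(round(combined_field_accuracy, 3))
```

Because the inputs are rounded percentages, the weighted result (about 0.70) only approximates the reported 71%; the small difference is consistent with rounding of the per-source values.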
Lineage: Source_Information: Source_Citation: Citation_Information: Online_Linkage: http://GloVis.usgs.gov/ Source_Time_Period_of_Content: Time_Period_Information: Single_Date/Time: Calendar_Date: 1999-2000 Source_Currentness_Reference: ground condition Source_Citation_Abbreviation: Landsat TM images from USGS-GLOVIS website (http://glovis.usgs.gov/) Source_Contribution: The Landsat images were downloaded from the USGS-GLOVIS website (http://glovis.usgs.gov/) and radiometrically corrected to top of atmosphere (TOA) reflectance using models created in Erdas Imagine 9.1. These images are from 1999 and 2000, as follows: P35R38: April 13, 2000; June 16, 2000; September 12, 2000 and November 13, 2000. P36R38: April 12, 2000; June 15, 2000; October 19, 2000. P36R37: April 12, 2000; June 15, 2000; October 19, 2000. Process_Step: Process_Description: Using a CART modeling environment, we produced a binational vegetation classification of the Santa Cruz River riparian habitat and watershed vegetation based on NatureServe TES units. Environmental layers used as independent predictor data were derived from a seasonal set of Landsat TM images (spring, summer, and fall) and from a 30-meter DEM grid. Training data were compiled from existing field data collected for a recent map produced by Villarreal for the Santa Cruz riparian corridor and from data collected by the NM-GAP team for the original SWReGAP modeling effort. Additional training data were collected from "core" areas of the SWReGAP classification itself, allowing the extrapolation of the SWReGAP mapping into the Mexican portion of the watershed without collecting additional training data. Processing details are published as Open File Report (same title). Source_Used_Citation_Abbreviation: SWReGAP classification and training data (http://fws-nmcfwru.nmsu.edu/swregap/) Source_Used_Citation_Abbreviation: Riparian Corridor training data: Villarreal, M. L. 2009.
Land use and disturbance interactions in dynamic arid systems: multiscale remote sensing approaches for monitoring and analyzing riparian vegetation change. Doctoral dissertation, University of Arizona. Source_Used_Citation_Abbreviation: Landsat TM from USGS-GLOVIS website (http://glovis.usgs.gov/) Process_Date: October 2010 Process_Step: Process_Description: The Landsat images used by the SWReGAP modelers (http://fws-nmcfwru.nmsu.edu/swregap/, last accessed October 21, 2010) were downloaded from the USGS-GLOVIS website (http://glovis.usgs.gov/) and radiometrically corrected to top of atmosphere (TOA) reflectance using models created in Erdas Imagine 9.1. These images are from 1999 and 2000, coincident with the field data collected by the New Mexico GAP (NM-GAP) modeling team. Once radiometrically corrected, the three contemporaneous Landsat images (P35R38, P36R37, and P36R38) for each of three seasons (Spring, Summer, and Fall) were mosaiced in Erdas Imagine and clipped to the AOI. Source_Used_Citation_Abbreviation: Landsat TM data from GLOVIS website Process_Step: Process_Description: Compiled Landsat images and created derivatives to use as environmental variables for modeling, as follows: 1. Spring Landsat Bands (6 bands: B,G,R,NIR,MIR,SWIR; Figure 3), 2. Summer Landsat Bands (6 bands: B,G,R,NIR,MIR,SWIR; Figure 4), 3. Fall Landsat Bands (6 bands: B,G,R,NIR,MIR,SWIR; Figure 5): The seasonal mosaics are the original Landsat images, corrected to TOA, mosaiced using Erdas Imagine (specifying P36R38 as the reference image with no color-balancing and no feathering) and clipped to the AOI.
4. Spring Tasseled Cap (3 bands: Brightness, Greenness, and Wetness), 5. Summer Tasseled Cap (3 bands: Brightness, Greenness, and Wetness), 6. Fall Tasseled Cap (3 bands: Brightness, Greenness, and Wetness): The Tasseled Cap algorithm transforms the original multi-band TM data onto 3 orthogonal axes that represent overall "Brightness," "Greenness," and "Wetness" (Crist and Kauth 1986). The seasonal tasseled cap images were calculated in Erdas Imagine (specifying the output stretched to 8-bit) from the mosaiced images and then clipped to the AOI. 7. Spring Normalized Difference Vegetation Index (NDVI), 8. Summer NDVI, 9. Fall NDVI: The NDVI is an index derived from reflectance values of the Red (R) and Near-Infrared (NIR) regions of the electromagnetic spectrum and is sensitive to various biophysical vegetation characteristics, such as biomass and percent cover (Duncan and others 1993, Huete and Jackson 1987). The formula is: NDVI = (NIR - R) / (NIR + R). NDVI values range from -1 to 1, with non-land surfaces (such as water and snow) typically assuming negative values and land surfaces typically assuming positive values. As landscapes become more densely vegetated, the NDVI tends toward 1. For this modeling, NDVI images were calculated in Erdas Imagine (specifying the output stretched to 8-bit) from the mosaiced images and then clipped to the AOI. 10. April NDVI Texture: This derivative was included to capture information about the pattern of vegetation on the landscape. It was calculated in Erdas Imagine (Interpreter, Spatial Enhancement, Texture, 3x3 Variance) from the mosaiced images and then clipped to the AOI. 11. October Red Band Texture: This derivative was included to capture information about the pattern of brightness on the landscape, especially to highlight urban/non-urban areas. It was calculated in Erdas Imagine (Interpreter, Spatial Enhancement, Texture, 3x3 Variance) from the mosaiced images and then clipped to the AOI.
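The NDVI and 3x3 variance texture derivatives described above were produced in Erdas Imagine. For readers without that software, the following NumPy sketch illustrates the same kinds of calculation. The function names, the linear 8-bit stretch, and the edge handling are assumptions made for illustration and may differ from what Erdas Imagine does.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def ndvi(nir, red):
    """NDVI = (NIR - R) / (NIR + R); pixels where NIR + R == 0 are set to 0."""
    nir = np.asarray(nir, dtype=np.float64)
    red = np.asarray(red, dtype=np.float64)
    total = nir + red
    out = np.zeros_like(total)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


def stretch_to_8bit(index):
    """Assumed linear stretch of [-1, 1] onto [0, 255]; the Erdas stretch may differ."""
    return np.round((np.clip(index, -1.0, 1.0) + 1.0) * 127.5).astype(np.uint8)


def variance_3x3(image):
    """Population variance in a 3x3 moving window, with edge pixels replicated."""
    padded = np.pad(np.asarray(image, dtype=np.float64), 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))
    return windows.var(axis=(-2, -1))


# Tiny synthetic reflectance arrays, for illustration only.
nir = np.array([[0.5, 0.4], [0.3, 0.2]])
red = np.array([[0.1, 0.1], [0.1, 0.2]])
texture = variance_3x3(stretch_to_8bit(ndvi(nir, red)))
```

The texture output has the same shape as its input, so it can be stacked with the other predictor layers.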
Source_Used_Citation_Abbreviation: Landsat images and derivatives used for modeling are as follows: Process_Date: October 2010 Process_Step: Process_Description: Another suite of independent variables was created from topographic data for the region. A rectangular subset of the 30-meter resolution DEM containing the Santa Cruz study area was obtained through EROS Data Center (http://edc.usgs.gov/; figure 6). The DEM-based rasters used for modeling and their derivations are as follows: 12. DEM: Original data. 13. Aspect: Derived in Erdas Imagine (Interpreter, Topographic Analysis, Aspect) from the mosaiced images and then clipped to the AOI. 14. Slope: Derived in Erdas Imagine (Interpreter, Topographic Analysis, Slope) from the mosaiced images and then clipped to the AOI. 15. Flow Accumulation: Derived using ArcGIS Spatial Analyst Hydrology tools. The DEM was first "Filled", then "Flow Direction" was calculated, and finally "Flow Accumulation" was derived. Although the Flow Accumulation was not used directly in the modeling, it was used to derive the next layer. 16. Flow Accumulation Difference of Focal Max and Mean: This was calculated in Erdas Imagine using the Model Maker from the mosaiced images and then clipped to the AOI. We experimented with several variations on this type of neighborhood calculation. The calculation that seemed to produce the desired result subtracts the mean Flow Accumulation value within a 25x25 window from the maximum Flow Accumulation value within a 5x5 window. This difference was designed to pull out the areas within and near the topographic drainages to effectively constrain classification of riparian vegetation types. 17. Landform: A stratification of the DEM into 10 landform classes as developed by Jenness (2008).
The algorithm that produced the Landform layer was accessed in the ArcGIS toolbox (http://arcscripts.esri.com/details.asp?dbid=15996, last accessed October 21, 2010). Source_Used_Citation_Abbreviation: EROS Data Center (http://edc.usgs.gov) Process_Date: October 2010 Process_Step: Process_Description: Prepared Training Data: To simplify the classification process and enhance processing speed, we converted all training data into points. For the NM-GAP training sites, random points were gathered within the polygons using a stratified-random sampling algorithm in Erdas Imagine, specifying a minimum of 10 points per class. The points were then inspected on a 2006 DOQQ and a few were deleted if they appeared to be unrepresentative of their labeled land cover type (e.g., the random point was located near a polygon edge). All of the random riparian points were deleted because another data set (Villarreal 2009) provided points within the Santa Cruz riparian corridor. A total of 447 training points were generated in this manner. The training data Villarreal collected for 2006 (field data) and 1996 (interpreted aerial photos) were subset to those that were mapped identically for both time periods, to identify field data that existed as labeled in our intermediate time period of 1999-2000. These training site polygon arcs were buffered by 60 meters to exclude mixed-pixel edge areas, and random points were gathered within the reduced polygons using a stratified-random sampling algorithm in Erdas Imagine, again specifying a minimum of 10 points per class. The points were then inspected on a 2006 DOQQ and a few were deleted if they appeared to be unrepresentative of their labeled land cover type. A total of 237 points were generated in this manner. Mapped vegetation classes for the Santa Cruz riparian corridor were cross-walked to the SWReGAP TES units, requiring generalization of some classes.
For example, "Riparian Woodland," "Shrub Savanna," "Shrubland" and "Tree Savanna" were all merged into the SWReGAP "North American Warm Desert Riparian Woodland and Shrubland". To preserve ecological detail we considered important for wildlife habitat, we added a "Riparian Forest" class to the existing SWReGAP TES units. Additional training points were collected within the core areas of the SWReGAP polygons using a stratified-random sampling algorithm in Erdas Imagine, again specifying a minimum of 10 points per class. The core areas of the polygons were extracted using the Erdas Imagine Model Maker by first calculating a Focal Diversity within a 3x3 window for the thematic image, and then eliminating (setting to zero) all resulting values greater than 1. A total of 2,912 points were generated in this manner. We removed all "Agriculture" points from the SWReGAP Core data for three reasons. First, SWReGAP did not use CART to model agriculture; it had been captured through heads-up digitizing or from other agency GIS data. Second, the only agriculture evidenced in the Santa Cruz Watershed occurs along the riparian corridor, whereas the SWReGAP includes this agriculture as well as upland center-pivot agriculture; these are distinctly different types of agriculture and would likely confound the CART classifier. Third, the Villarreal data include field points for the Santa Cruz riparian corridor agriculture, which are the data we preserved for training purposes. Source_Used_Citation_Abbreviation: SWReGAP classification and training data (http://fws-nmcfwru.nmsu.edu/swregap/) Source_Used_Citation_Abbreviation: Training Data for Riparian Corridor: Villarreal, M. L. 2009. Land use and disturbance interactions in dynamic arid systems: multiscale remote sensing approaches for monitoring and analyzing riparian vegetation change. Doctoral dissertation, University of Arizona.
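The core-area extraction and point sampling described in this process step were carried out in Erdas Imagine. The following NumPy sketch shows one way the same idea could be expressed, assuming a thematic raster of integer class codes in which 0 means "no class". The function names, the edge handling, and the fixed draw of up to 10 cells per class are illustrative simplifications (the source specifies a minimum of 10 points per class), not the authors' implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def core_areas(classes):
    """Keep a pixel's class only where its 3x3 neighborhood holds a single class."""
    classes = np.asarray(classes)
    padded = np.pad(classes, 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))
    # A focal diversity of 1 means the window minimum equals the window maximum.
    homogeneous = windows.min(axis=(-2, -1)) == windows.max(axis=(-2, -1))
    return np.where(homogeneous, classes, 0)


def sample_points(core, per_class=10, seed=0):
    """Draw up to `per_class` random (row, col) cells from each non-zero class."""
    rng = np.random.default_rng(seed)
    samples = {}
    for code in np.unique(core[core != 0]):
        cells = np.argwhere(core == code)
        size = min(per_class, len(cells))
        picked = rng.choice(len(cells), size=size, replace=False)
        samples[int(code)] = cells[picked]
    return samples


# Synthetic 6x6 thematic image: class 1 on the left, class 2 on the right.
thematic = np.array([[1, 1, 1, 2, 2, 2]] * 6)
core = core_areas(thematic)
points = sample_points(core)
```

In the synthetic example, the two columns that touch the class boundary are set to zero, so sampled points fall only in the homogeneous interior of each class.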
Process_Date: October 2010 Process_Step: Process_Description: The vegetation classification of the Santa Cruz Watershed was accomplished using Classification and Regression Tree (CART) software (Quinlan 1993, 1996). We accessed the CART tool developed by the USGS-NLCD for Erdas Imagine that is available free for download at . For each modeling run, we first used the "NLCD sampling tool" to create a ".data" file by inputting the independent data layers (all clipped to the AOI) and specifying the "training data" as a tab-delimited text file containing the three columns X, Y, and CODE (input to the tool as a .txt file without a header). The See5 program (installed as a stand-alone program outside of the Erdas toolbar) is then run, with the ".data" file specified as input and "construct classifier" executed to produce the output ".names" file. Several options are offered in the "construct classifier" menu. After some trial and error, we chose to "boost" 10 trials and chose Global Pruning (specifying Pruning CF 25%, minimum 2 cases) for the final classification runs. Once See5 is run, the NLCD tool "See5 classifier" is activated, choosing the See5-created ".names" file as input and specifying a "tree" classifier. The output is a classified image. The final classification was the result of an iterative process. All TES classes were initially modeled, including anthropogenic and disturbed landscapes, such as Developed (urban) and Recently Mined or Quarried. Many of these non-vegetation classes tend to be quite heterogeneous and, although the CART environment produced reasonable results with the suite of independent layers we included, small patches of these types were identified by the CART classifier in improbable landscapes. Because of this, we followed the protocol of the original SWReGAP modelers, who captured these non-vegetation classes separately and applied them as an overlay on the CART-modeled vegetation classification.
The SWReGAP modelers used a variety of methods, such as heads-up digitizing of Urban and Agriculture from high-resolution imagery and accessing other agency GIS data. For our product, we accessed and overlaid Developed Classes (high, medium and low density) from the Land Cover product for 1999 produced under a parallel effort for this project (Villarreal and others 2010). In addition, the Recently Mined or Quarried class was accessed from the original SWReGAP product. Inspection of our model that included all TES classes revealed no obvious Recently Mined or Quarried landscapes in the Mexican portion of the watershed, so this overlay of the US side only was considered reasonable. In summary, the CART tool was used to model only natural and vegetated landscapes. A 3x3 majority filter was applied to the output classification to eliminate small isolated pixels that are generally considered noise. The Developed and Mined/Quarried classes were then overlain and the final product was clipped to the Santa Cruz Watershed study area. Process_Date: October 2010 Process_Step: Process_Description: Following the lead of the SWReGAP team, for accuracy assessment we created a classification based on 80% of the training data and evaluated the result with the remaining 20% withheld as "testing" data. A random sample for training and testing was generated for each data provenance (NM-GAP, Villarreal, SWReGAP Cores). A total of 641 points were withheld for testing: 48 from Villarreal, 89 from NM-GAP, and 504 from SWReGAP Cores. The final vegetation classification, however, was created from the full set of testing and training sites. The overall accuracy of all vegetation classes modeled with the random 80% subset of the training points and evaluated with the withheld 20% test points was 77%. Accuracy for only NM-GAP test points was 76% and for only Villarreal test points was 60%. The combined accuracy for only the test points based on field data (NM-GAP and Villarreal) is 71%.
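The 80%/20% split by data provenance described in this process step can be sketched as follows. This is a minimal illustration, not the procedure actually used: the point records (dictionaries with a hypothetical "source" key), the rounding rule for the withheld share, and the function names are all assumptions.

```python
import random


def split_by_provenance(points, test_fraction=0.2, seed=0):
    """Randomly withhold a share of points from each provenance group."""
    rng = random.Random(seed)
    groups = {}
    for point in points:
        groups.setdefault(point["source"], []).append(point)
    train, test = [], []
    for source in sorted(groups):
        members = groups[source][:]
        rng.shuffle(members)
        # Assumed rounding rule; the source does not state how 20% was rounded.
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def overall_accuracy(predicted, observed):
    """Share of test points whose predicted class matches the observed class."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical points: 10 from one provenance and 5 from another.
demo_points = [{"source": "NM-GAP", "code": i} for i in range(10)]
demo_points += [{"source": "Villarreal", "code": i} for i in range(5)]
train_set, test_set = split_by_provenance(demo_points)
```

Splitting within each provenance, rather than across the pooled points, keeps every data source represented in the withheld set, which is what allows accuracy to be reported separately for NM-GAP and Villarreal test points.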
Process_Date: October 2010 Source_Produced_Citation_Abbreviation: santacruz_tes Spatial_Data_Organization_Information: Direct_Spatial_Reference_Method: Raster Raster_Object_Information: Raster_Object_Type: Grid Cell Row_Count: 6334 Column_Count: 3868 Vertical_Count: 1 Spatial_Reference_Information: Horizontal_Coordinate_System_Definition: Planar: Grid_Coordinate_System: Grid_Coordinate_System_Name: Universal Transverse Mercator Universal_Transverse_Mercator: UTM_Zone_Number: 12 Transverse_Mercator: Scale_Factor_at_Central_Meridian: 0.999600 Longitude_of_Central_Meridian: -111.000000 Latitude_of_Projection_Origin: 0.000000 False_Easting: 500000.000000 False_Northing: 0.000000 Planar_Coordinate_Information: Planar_Coordinate_Encoding_Method: row and column Coordinate_Representation: Abscissa_Resolution: 30.000000 Ordinate_Resolution: 30.000000 Planar_Distance_Units: meters Geodetic_Model: Horizontal_Datum_Name: North American Datum of 1983 Ellipsoid_Name: Geodetic Reference System 80 Semi-major_Axis: 6378137.000000 Denominator_of_Flattening_Ratio: 298.257222 Entity_and_Attribute_Information: Detailed_Description: Entity_Type: Entity_Type_Label: santacruz_tes.vat Attribute: Attribute_Label: Rowid Attribute_Definition: Internal feature number. Attribute_Definition_Source: ESRI Attribute_Domain_Values: Unrepresentable_Domain: Sequential unique whole numbers that are automatically generated. Attribute: Attribute_Label: VALUE Attribute: Attribute_Label: COUNT Attribute: Attribute_Label: RED Attribute: Attribute_Label: GREEN Attribute: Attribute_Label: BLUE Attribute: Attribute_Label: SCODE Attribute: Attribute_Label: DESCRIPTION Attribute: Attribute_Label: OPACITY Distribution_Information: Resource_Description: Downloadable Data Standard_Order_Process: Digital_Form: Digital_Transfer_Information: Transfer_Size: 3.388 Metadata_Reference_Information: Metadata_Date: 20101102 Metadata_Contact: Contact_Information: Contact_Person_Primary: Contact_Person: Laura M. 
Norman Contact_Organization: US Geological Survey Contact_Address: Address_Type: mailing and physical address Address: 520 N. Park Ave, Ste #355 City: Tucson State_or_Province: AZ Postal_Code: 85719 Contact_Voice_Telephone: 5206705510 Metadata_Standard_Name: FGDC Content Standards for Digital Geospatial Metadata Metadata_Standard_Version: FGDC-STD-001-1998 Metadata_Time_Convention: local time Metadata_Extensions: Online_Linkage: http://www.esri.com/metadata/esriprof80.html Profile_Name: ESRI Metadata Profile
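The bounding coordinates in this record appear to be expressed in UTM zone 12 meters rather than in degrees, which makes it possible to check them against the reported raster dimensions and 30-meter cell size. The sketch below performs that check and converts a UTM coordinate to a row and column. It assumes the bounding coordinates are the outer edges of the grid and that rows are counted from the north edge; the constant and function names are illustrative.

```python
# Values copied from the Bounding_Coordinates and raster sections of this record.
WEST, EAST = 457985, 574025
SOUTH, NORTH = 3429995, 3620015
CELL_SIZE = 30  # meters (Abscissa_Resolution and Ordinate_Resolution)
ROW_COUNT, COLUMN_COUNT = 6334, 3868


def expected_shape():
    """Rows and columns implied by the extent and the cell size."""
    rows = (NORTH - SOUTH) // CELL_SIZE
    cols = (EAST - WEST) // CELL_SIZE
    return rows, cols


def utm_to_row_col(easting, northing):
    """Row/column of the cell containing a UTM point, counted from the top-left.

    Assumes the bounding coordinates are the outer edges of the grid.
    """
    if not (WEST <= easting < EAST and SOUTH < northing <= NORTH):
        raise ValueError("point lies outside the raster extent")
    col = int((easting - WEST) // CELL_SIZE)
    row = int((NORTH - northing) // CELL_SIZE)
    return row, col


print(expected_shape() == (ROW_COUNT, COLUMN_COUNT))
```

Under these assumptions the extent divided by the cell size reproduces the reported Row_Count and Column_Count exactly.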