Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud

GIScience and Remote Sensing
By: , and 

Links

Abstract

The South Asia (India, Pakistan, Bangladesh, Nepal, Sri Lanka and Bhutan) has a staggering 900 million people (~43% of the population) who face food insecurity or severe food insecurity as per United Nations, Food and Agriculture Organization’s (FAO) the Food Insecurity Experience Scale (FIES). The existing coarse-resolution (>250-m) cropland maps lack precision in geo-location of individual farms and have low map accuracies. This also results in uncertainties in cropland areas calculated from such products. Thereby, the overarching goal of this study was to develop high spatial resolution (30-m or better) baseline cropland extent product of South Asia for the year 2015 using Landsat satellite time-series big-data and machine learning algorithms (MLAs) on the Google Earth Engine (GEE) cloud computing platform. To eliminate the impact of clouds, ten time-composited Landsat bands (blue, green, red, NIR, SWIR1, SWIR2, Thermal, EVI, NDVI, NDWI) were derived for each of the 3 time-periods over 12 months (monsoon: Julian days 151-300; winter: Julian days 301-365 plus 1-60; and summer: Julian days 61-150), taking the every 8-day data from Landsat-8 and 7 for the years 2013-2015, for a total of 30-bands plus global digital elevation model (GDEM) derived slope band. This 31-band mega-file big data-cube was composed for each of the 5 agro-ecological zones (AEZ’s) of South Asia and formed a baseline data for image classification and analysis. Knowledge-base for the Random Forest (RF) MLAs were developed using spatially well spread-out reference training data (N=2179) in 5 AEZs. Classification was performed on GEE for each of the 5 AEZs using well-established knowledge-based and RF MLAs on the cloud. Map accuracies were measured using independent validation data (N=1185). The survey showed that the South Asia cropland product had a producer’s accuracy of 89.9% (errors of omissions of 10.1%), user’s accuracy of 95.3% (errors of commission of 4.7%) and an overall accuracy of 88.7%. The National and sub-national (districts) areas computed from this cropland extent product explained 80-96% variability when compared with the National statistics of the South Asian Countries. The full resolution imagery can be viewed at full-resolution, by zooming-in to any location in South Asia or the world, at www.croplands.org and the cropland products of South Asia downloaded from The Land Processes Distributed Active Archive Center (LP DAAC) of National Aeronautics and Space Administration (NASA) and the United States Geological Survey (USGS): https://lpdaac.usgs.gov/products/gfsad30saafgircev001/

Study Area

Publication type Article
Publication Subtype Journal Article
Title Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud
Series title GIScience and Remote Sensing
DOI 10.1080/15481603.2019.1690780
Volume 57
Issue 3
Year Published 2020
Language English
Publisher Taylor & Francis
Contributing office(s) Western Geographic Science Center
Description 21 p.
First page 302
Last page 322
Country India, Pakistan, Bangladesh, Nepal, Sri Lanka, Bhutan
Google Analytic Metrics Metrics page
Additional publication details