
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 5, NO. 6, DECEMBER 2012

Building a Web-Services Based Geospatial Online Analysis System

Peisheng Zhao, Liping Di, Senior Member, IEEE, Weiguo Han, Member, IEEE, and Xiaoyan Li

Abstract—Recent advances in geospatial Web services and Service-Oriented Architecture (SOA) are shifting geospatial data and analysis from the everything-locally-owned-and-operated paradigm to the everything-shared-over-the-Web paradigm. By embracing geospatial content and capabilities within the context of SOA, we have developed a Geospatial Online Analysis System (GeOnAS), a fully extensible system designed for discovery, retrieval, analysis, and visualization of geospatial and other network data based on Web services. GeOnAS is also built as an open and collaborative system capable of integrating distributed services to implement all required functions. The more users are involved, the more powerful the system becomes. GeOnAS provides value through its overall efficiency in integrating and analyzing distributed geospatial data over the Web.

Index Terms—Service-Oriented Architecture, catalogue service, processing service, web service.

Manuscript received October 26, 2011; revised December 29, 2011; accepted April 10, 2012. Date of publication September 28, 2012; date of current version December 28, 2012. This work is supported by a grant from the National Aeronautics and Space Administration (NASA) GeoBrain project (NNG04GE61A, PI: Dr. L. Di). P. Zhao is with the Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA 22030 USA, and is also with NASA/GSFC, Greenbelt, MD 20771 USA (e-mail: [email protected]). L. Di is with the Center for Spatial Information Science and Systems, George Mason University, Greenbelt, MD 20770 USA (e-mail: [email protected]). W. Han and X. Li are with the Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA 22030 USA (e-mail: whan@gmu.edu; [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/JSTARS.2012.2197372

I. INTRODUCTION

GEOSCIENCE research and applications often involve analysis of a large volume of distributed geospatial data. Traditionally, scientists have spent a lot of time installing and learning a variety of software on local machines, searching for and collecting the data from various sources, and preprocessing and analyzing the data on local machines. This “everything-locally-owned-and-operated” paradigm makes the analysis of geospatial data very expensive and time-consuming.

As Web technology has matured in recent years, an increasing amount of geospatial content and capabilities has become available online. This increase is shifting geospatial resources from the “everything-locally-owned-and-operated” paradigm to the “everything-shared-over-the-Web” paradigm. For instance, the Geospatial One-Stop (GOS, http://www.geodata.gov) links users to different levels of data sets over the Web, and provides them with scientific knowledge in reports, maps, models and applications. The Global Earth Observation System of Systems (GEOSS, http://www.earthobservations.org) provides GEO Web Portals (GWPs) for searching and exploring the data, information, imagery, services, and applications across organizations. The National Remote Sensing Data Library (NRSDL) illustrates the concept of interoperable geospatial service infrastructure for Earth observation data management [1]. These systems greatly improve the efficiency and effectiveness of geospatial data discovery, access and visualization over the Web, but still remain weak in online data analysis. To meet the needs of Earth science research, the GES-DISC Interactive Online Visualization and Analysis Infrastructure (Giovanni, http://disc.sci.gsfc.nasa.gov/giovanni/) provides a series of Web portals for analysis of vast amounts of Earth science remote sensing data directly on the Internet [2]. But it is a proprietary system that has no open and standardized architecture and interfaces for processing interoperability and user collaboration.

Since Earth and space research and applications are often multi-scale and multi-disciplinary, geospatial analysis involves a number and variety of data and computations in a distributed and heterogeneous environment. An open and interoperable infrastructure is strongly required for integrating different geospatial datasets and operations and making them work together. References [3] and [4] discuss the benefits, implementations and challenges of distributed and collaborative geospatial data processing. A new scalable Service-Oriented Architecture (SOA) is emerging as the basis for distributed computing and collaborative applications. By embracing geospatial content and capabilities within the context of SOA, we have developed a Web services based Geospatial Online Analysis System (GeOnAS, http://geobrain.laits.gmu.edu/OnAS/).
The distinguishing characteristic of this system is that it is designed to provide an open environment for the use of distributed Web services with the following properties: 1) the system uses standardized architecture and interfaces to achieve data and processing interoperability and collaboration; 2) the system provides a single point of entry to the geospatial data from any Open Geospatial Consortium (OGC) compliant data service; 3) all analysis functions are provided by integrating distributed Web services, including built-in system services and users’ external services.

The remainder of this paper is organized as follows. In Section II, GeOnAS architecture and components are discussed. Section III illustrates how to search and retrieve distributed geospatial data. Sections IV and V highlight how to implement geospatial processing services and integrate distributed Web services to analyze data. Finally, Section VI presents the conclusions and plans for future work.

II. GEONAS ARCHITECTURE

“A Web service is a software system designed to support interoperable machine-to-machine interaction over a network”

1939-1404/$31.00 © 2012 IEEE

ZHAO et al.: BUILDING A WEB-SERVICES BASED GEOSPATIAL ONLINE ANALYSIS SYSTEM


Fig. 1. GeOnAS Architecture.

[5]. To achieve interoperability, a Web service usually encapsulates a set of discrete functionalities with explicit interfaces described in a machine-processable format, such as the Web Service Description Language (WSDL), and it is programmatically network-accessible using standard Internet protocols, such as the Simple Object Access Protocol (SOAP). SOA is a framework for services and service-based application development [6]. Within the context of SOA, a system consists of a collection of loosely coupled services that communicate with each other by passing data from one service to another to coordinate an activity. New systems can be created dynamically by combining new application-specific services with existing services [7], [8]. SOA empowers systems with the following advanced features to offer an innovative and flexible approach for the design and development of the Web:
• Consistency: a system is broken down into common and repeatable services with explicit interfaces.
• Interoperability: self-described services are machine-to-machine discoverable and executable through standard protocols.
• Orchestration: services can be assembled into a service chain to solve a more complicated problem.
• Efficiency: reusable services allow improved integration and collaboration to build new capabilities.
• Flexibility: services can be distributed over the network, run on different platforms, and implemented in different programming languages.
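The orchestration feature listed above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every service accepts a data URL and returns the URL of its result; the two step functions and the example.org address are hypothetical stand-ins for remote SOAP or WPS calls and are not part of GeOnAS.

```python
from typing import Callable, List

# A "service" is modelled as any callable that takes a data URL and returns
# the URL of its result. Real services would be remote endpoints; the local
# functions below are hypothetical stand-ins.
Service = Callable[[str], str]


def chain(services: List[Service]) -> Service:
    """Compose services so that the output URL of one feeds the next."""
    def run(source_url: str) -> str:
        url = source_url
        for service in services:
            url = service(url)
        return url
    return run


def reproject(url: str) -> str:
    # Hypothetical step: pretend to reproject the data and publish a result.
    return url + "?step=reproject"


def extract_streams(url: str) -> str:
    # Hypothetical step: pretend to derive a stream network from the input.
    return url + "&step=streams"


if __name__ == "__main__":
    workflow = chain([reproject, extract_streams])
    print(workflow("http://example.org/data/dem.tif"))
```

Passing results by URL rather than by value keeps each step loosely coupled: a step needs to know only where its input lives, not which service produced it.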

The GeOnAS architecture shown in Fig. 1 relies on SOA as its fundamental design principle to link distributed computational resources to support geospatial analysis. Its design and implementation are based on standards ranging from service interface and service description to service invocation.

Users do not need to download and install any new software or plug-in to use GeOnAS. The client is a standard Web browser. The use of Asynchronous JavaScript and XML (AJAX) allows rich, responsive, intuitive, and interactive user interfaces and enables dynamic features and functionalities inside the browser. Moreover, instead of the traditional “click, wait, and refresh” user interaction, AJAX gives the GeOnAS client better performance and a better Web experience by allowing users’ requests to be added or retrieved asynchronously without reloading Web pages.

The GeOnAS Server integrates and manages distributed Web services to coordinate functional activities and provide clients with application logic. It mainly includes five modules:
• Context Management: each user can create, configure, and store context information to record the working environment in an OGC Web Map Context (WMC) document file. These context files contain the list of all preferred data layers and relevant processing services to help users build and restore their own systems. The context information can also be exported into a Keyhole Markup Language (KML) file in order to share, visualize, and validate data in Google Earth.
• Data and Service Discovery: GeOnAS provides an integrated and intuitive interface to enable users to register and discover data and services from distributed catalogue services using the OGC Catalogue Services for Web (CSW) protocol [9].
• Data Analysis: GeOnAS coordinates all distributed geospatial processing services and provides users with an interactive environment to discover and invoke services for data analysis. This module provides users with an efficient way to implement analysis tasks asynchronously in a cluster-based network environment.
• Data Visualization: GeOnAS can render both raster and vector geospatial data on the fly based on users’ requirements. Users can set their own preferences on how to organize and display data, such as overlay sequence, data subsetting, image palette, thematic classification, and statistical charts.
• Workflow Management: If an analysis task is too complex to be performed by an individual service, GeOnAS allows the user to integrate a set of related services to build a service chain to implement the task. This module provides users with an easy way to manipulate and coordinate distributed services.

Geospatial contents and capabilities are increasingly available. Different agencies have developed their own geospatial catalogues to facilitate discovery, access, and sharing of large volumes of geospatial data and processing capabilities. But the discovery of multi-discipline geospatial content is time-consuming and tedious, especially when the catalogues use different metadata models and access protocols. A catalogue that presents a single, well-known metadata model and interface protocol to users and hides the complexity and diversity of the affiliated catalogues is therefore very desirable [10]. The Catalogue Federation is a federation service that connects to distributed geospatial catalogue services at the back end, and provides an integrated OGC CSW access point at the front end.
It facilitates distributed geospatial resource discovery from the CSISS Catalogue, the National Aeronautics and Space Administration (NASA) Earth Observing System Clearinghouse (ECHO) and the Global Earth Observation System of Systems (GEOSS) Clearinghouse.

The Data Services component provides users with a common data environment to retrieve and integrate geospatial data from distributed data archives in an interoperable manner. The OGC Web Coverage Service (WCS) provides intact multi-dimensional and multi-temporal geospatial data as a “coverage” to meet the requirements of client-side rendering, input for scientific models, and other clients beyond simple viewers [11]. The OGC Web Feature Service (WFS) supports the networked interchange of geographical vector data as “features” encoded in Geography Markup Language (GML) [12]. The OGC Web Map Service (WMS) provides geospatial data as a “map”, which is generally rendered dynamically from real geographical data in a spatially referenced pictorial image format such as PNG, GIF or JPEG [13].

The Processing Services component provides a domain-specific computational model to enable users to do data analysis over the network. All these services have explicit standardized self-descriptions, such as a WSDL document or an OGC Web Processing Service (WPS) [14] capability description, for machine accessibility. Thus, the processing services from different domains can be easily integrated into GeOnAS, and be chained together to perform more complex analysis tasks dynamically.

III. DATA DISCOVERY AND RETRIEVAL

In recent decades, different agencies have developed their own geospatial catalogue systems to facilitate geospatial data discovery. To enable users to find multi-discipline data across different catalogues using a single entry point instead of working with each catalogue individually, GeOnAS implements the Catalogue Federation. Previously proposed federation mechanisms, especially their query interfaces and query result assembly, are not standards-based and are not very suitable for data discovery with geospatial metadata models. GeOnAS adopts the OGC CSW ebRIM profile [15] to provide a general and flexible way of querying and accessing distributed catalogue systems:
• NASA ECHO: as a spatial and temporal metadata registry, NASA ECHO enables different users to search and access a large amount and variety of the Earth Observing System (EOS) data at the NASA Distributed Active Archive Centers (DAACs). ECHO’s Earth Science Metadata Conceptual Model (EESMCM) covers most conceptual descriptions at two levels for data granule and collection, such as campaign, processing level, online access, sensor name, and spatial and temporal attributes. ECHO’s Application Programmers Interface (API) supports Web-based protocols (SOAP and REST) and messaging formats to accommodate a wide range of clients to search for and order Earth science granules, or retrieve and manipulate granule metadata.
• ESRI GEOSS Clearinghouse: GEOSS is becoming a global, coordinated, comprehensive, and sustained network of Earth observing systems that cover areas of critical importance to people and society. The ESRI GEOSS Clearinghouse provides access to a distributed network of GEOSS community catalogues and services through harvesting or distributed search.
It provides an OGC CSW 2.0.2 baseline-compliant query interface to clients for the discovery of existing data, metadata, services and pre-defined common products.
• CSISS Catalogue: The GeoBrain project conducted by GMU CSISS provides online sharing of 10 terabytes of geospatial data, including global Landsat Thematic Mapper (TM) and Enhanced Thematic Mapper, global WindSat surface soil moisture, and global Digital Elevation Model (DEM) data. By following the ebRIM profile of OGC CSW, the CSISS Catalogue combines the metadata models of ISO 19115 [16], [17] and the service rules of ISO 19119 [18] with the metadata implementation of ISO 19139 [19], and provides users with a standardized way to discover and access GeoBrain data.

To deal with the heterogeneity in metadata information models among the aforementioned catalogue systems, the Catalogue Federation captures the core elements of each model to provide a global and logical information schema for browsing, querying, and translating metadata. This global information model is developed based on ebRIM 2.5 [20], and combines a few extension elements from EOSDIS Core System (ECS),


Fig. 2. Global view of data metadata in catalogue federation.

ISO 19115, and ISO 19119. Fig. 2 shows how metadata is organized and stored in the Catalogue Federation. A new class “DataGranule”, a subclass of the ebRIM class “CSWExtrinsicObject”, is developed to use the following metadata to describe and query a real data file resource:
• dataCollectionID: a unique name of the data collection to which the data belongs.
• responsibleParty: organization or individual with responsibility for the data granule.
• beginDateTime: the date and time when temporal data granules begin.
• endDateTime: the date and time when temporal data granules end.
• dayNightFlag: attribute for identifying if a data granule is collected during the day, night or both.
• orderable: indication of whether the data granule is orderable.
• dataFormat: file format of the raw data.
• sizeMBDataGranule: file size of the data granule in Megabytes.
• processingLevelID: classification of science data processing level for the source.
• topicKeyword: general topic keyword of the data granule.
• disciplineKeyword: general discipline keyword of the data granule.
• variableKeyword: keyword used to describe the scientific parameter content.
• ParameterKeyword: keyword used to describe specific characteristics at a high level.
• termKeyword: keyword used to describe the area of the scientific parameters.
• platformName: name of the platform collecting the dataset.
• instrumentName: name of the instrument collecting the dataset.
• sensorName: the name of the sensor collecting the dataset.
• bbox: bounding box of data granule with spatial reference system.
• onlineAccessURL: online URL for data access.

Fig. 3 shows the graphical user interface for data discovery. Its design uses four sets of queryable properties. “Catalog” allows the user to choose which affiliated catalogue will be searched. “Spatial” enables the user to express spatial constraints on data discovery. Google Maps is integrated to give the user a convenient way to find the desired location. The user can also select states, counties, and cities within the United States or input a country name or code to specify the spatial bounding box of the area of interest. “Temporal” allows the user to query time series data, in which “ProductTime” specifies the date and time of data generation and “CollectionRange” specifies the date and time range that the data cover. “Others” allows the user to discover geospatial data through the physical data attributes described in class “DataGranule”.

Since the OGC CSW is an OGC-recommended standard for geospatial catalogue interoperability, and many applications and clients have been developed with this specification, the Catalogue Federation adopts a mediator-based architecture with OGC CSW query interfaces. When it receives a user query in the OGC CSW format, the system assigns the query a unique identifier for tracking, then performs a syntactic analysis to retrieve the query logic, and then performs a semantic analysis to decide which catalogue will be queried. The data query can be sent to the ESRI GEOSS Clearinghouse and CSISS Catalogue directly, but needs to be translated into a specific format through an adapter before querying NASA ECHO. This adapter is equipped with model mapping rules and knowledge about how to transform the metadata terms, query format and query language functionality. When results are returned from multiple underlying catalogue services, all information is transformed into the global


Fig. 3. User interface for data search.

Fig. 4. Data search result.

information model, and is listed and viewed in a Web page. The original metadata information is still kept for user reference. As shown in Fig. 4, the query result information, including data preview, description, size, format, and spatial and temporal attributes, is shown to enable users to easily find and select the data of interest for online analysis or download for future use. To import data into GeOnAS, each data item is associated with a service for online access in the Catalogue Federation.
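The mediator flow described above (tag the query, pick the catalogues, translate the query where an adapter is needed, then map every result into the global model) can be sketched as follows. This is a simplified illustration under stated assumptions, not the GeOnAS implementation: the `Catalogue` structure, the native field names such as `dataset_id`, the stub search functions, and the example.org URLs are all hypothetical, and only three of the “DataGranule” attributes are modelled.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DataGranule:
    """A small subset of the DataGranule attributes (global model)."""
    dataCollectionID: str
    dataFormat: str
    onlineAccessURL: str


def identity(query: dict) -> dict:
    """Default query translation for catalogues that accept the query as is."""
    return query


@dataclass
class Catalogue:
    """One affiliated catalogue behind the mediator (hypothetical structure)."""
    search: Callable[[dict], List[dict]]       # runs a query, returns native records
    to_global: Callable[[dict], DataGranule]   # maps a native record to the global model
    translate_query: Callable[[dict], dict] = identity


def federated_search(query: dict, catalogues: Dict[str, Catalogue]) -> dict:
    """Route one query to the selected catalogues and merge the results."""
    query_id = str(uuid.uuid4())                    # unique identifier for tracking
    selected = query.get("catalogs") or list(catalogues)
    results: List[DataGranule] = []
    for name in selected:
        cat = catalogues[name]
        native_query = cat.translate_query(query)   # adapter step, e.g. for ECHO
        for record in cat.search(native_query):
            results.append(cat.to_global(record))   # normalise to the global model
    return {"query_id": query_id, "results": results}


def echo_query(query: dict) -> dict:
    # Hypothetical mapping from a global keyword name to an ECHO-style one.
    return {"dataset_keyword": query.get("topicKeyword")}


# Stub catalogues that return canned records instead of calling a real service.
SAMPLE_CATALOGUES = {
    "CSISS": Catalogue(
        search=lambda q: [{"dataCollectionID": "DEM", "dataFormat": "GeoTIFF",
                           "onlineAccessURL": "http://example.org/wcs"}],
        to_global=lambda r: DataGranule(**r),
    ),
    "ECHO": Catalogue(
        search=lambda q: [{"dataset_id": "MOD09", "format": "HDF",
                           "url": "http://example.org/echo/1"}],
        to_global=lambda r: DataGranule(r["dataset_id"], r["format"], r["url"]),
        translate_query=echo_query,
    ),
}

if __name__ == "__main__":
    print(federated_search({"topicKeyword": "elevation"}, SAMPLE_CATALOGUES))
```

Keeping translation and normalisation as per-catalogue functions means a new catalogue can be added by supplying three callables, without touching the routing loop.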


Fig. 5. XML query for an associated service for a specific data.

Fig. 6. Service query response.

As shown in Fig. 2, the class “Association” has “sourceObject” and “targetObject” attributes that specify the source, a service instance, and target, a data instance, respectively. Its other attribute, “associationType”, specifies the type of association, “operateOn”, between data and service. With this “Association”, data can be discovered by either using data properties

or exploring service metadata. For example, searching NASA DAAC WCS enables users to discover not only the WCS provided by NASA DAAC, but also the data information hosted by this WCS. Because OGC data services provide well-known interfaces for data retrieval, the Catalogue Federation searches data that are associated with OGC services with high priority


Fig. 7. WCS getCoverage request.

Fig. 8. GRASS processing workflow.

and provides them directly to users. If users try to retrieve online data not associated with an OGC service, GeOnAS downloads the data at the server side and associates them directly with a virtual OGC service. Therefore, the process of data retrieval involves the following steps:
1. Find an associated service for a specific data item. Fig. 5 depicts one example of a service query with its constraints.
2. Parse service information. Fig. 6 shows the result of a service query that includes information about service name, description, type, connect point, and binding.
3. Use the service binding information to invoke the resulting service to get data. Fig. 7 describes an example of a getCoverage request that is used to call an OGC WCS to get user customized data with constraints on the data identifier, domain subset, range subset provided, and output format.

IV. PROCESSING SERVICE IMPLEMENTATION

GeOnAS is designed to use interoperable Web services for scientific analysis of geospatial data. However, redesign of an entire application from scratch as a set of Web services is clearly an arduous task. A better approach is to wrap legacy software modules or geospatial algorithms into loosely coupled Web services so that full functionalities of the underlying software are reused in an interoperable way.

The Geographic Resources Analysis Support System (GRASS) is an open source geographic information system (GIS) with more than 350 modules for management, processing, analysis and visualization of geospatial data [21]. It is being widely used in many areas by universities, government offices, and non-profit or commercial organizations throughout the world. Most GRASS GIS functionalities and image processing capabilities have been converted into Web services and imported into GeOnAS.

The basic idea of developing GRASS services is to create self-contained and loosely coupled services based on GRASS commands [22], [23]. However, some GRASS commands are tightly coupled with each other in contradiction to the Web service design, and some of them have no explicit processing meaning unless they are combined with others. For example, such GRASS commands as “r.in.gdal”, “g.region”, “r.watershed”, “r.mapcalc,” and “r.out.gdal” are used to derive a stream network from DEM data. Therefore, a GRASS service is designed to involve a sequence of GRASS commands to


Fig. 9. Cluster environment for processing services.

Fig. 10. Average latency of stream extraction service.

execute a processing workflow. As Fig. 8 shows, a GRASS processing workflow generally comprises the following functional modules:
• Environment Setting: This module specifies GRASS and environment variables. The “g.gisenv” command, for example, can be executed to set where GRASS is located in the system, and where the GRASS database, the GRASS location and the GRASS mapset are located in the current process.
• Data Import: GRASS supports many geospatial data formats. The “r.in.gdal” and “v.in.ogr” commands import and convert raster and vector data into the GRASS internal format.
• Data Analysis: This module performs many types of data analysis, ranging from map algebra, map statistics, image classification, and network analysis to hydrologic modeling and vegetation indices.
• Data Output: GRASS can export data in many formats. For example, the “r.out.png” command exports the resulting data in PNG format, and the “r.to.vect” and “v.out.ogr” commands export the result in “GML” or “ESRI Shapefile” format.

Such a workflow is wrapped as a standard Web service by using a “top-down” approach, i.e., from interface description to service encoding. A WSDL file is developed for describing the service, service operations, and relevant input and output parameters. The key operational parameters of the GRASS commands are kept, and other control and optional parameters are set to default values or ignored for simplicity. Since the data for analysis are retrieved from OGC data services, the source of data input is designed to be “anyURI”. Building workflows from distributed services requires that analysis results be stored at a temporary and online accessible place. The service output is therefore also designed to be “anyURI” for direct retrieval over the Web. The service code for service binding and the server skeleton is generated based on the WSDL file, using a toolkit such as Apache Axis. The execution of GRASS commands, as a sub-process with the specified environment and working


Fig. 11. User interface for service registration.

directory, is embedded in the service code. The detailed description of the GRASS service is available at http://geobrain.laits.gmu.edu/grassweb/manuals/index.html. To adhere to interoperability specifications, the GRASS functionalities are also exposed as interoperable OGC WPS processes [14], [24].

The clustering environment uses a group of linked servers to provide more capacity and higher availability than a single server. Since our GRASS services are stateless and do not need session management, they can easily be distributed in the clustering environment. Therefore, all processing services are deployed into a clustering environment in order to balance the workload among multiple component servers for high performance. The operational cluster, shown in Fig. 9, includes four machines, all of which have the same operating system, run in the same environment, reside on the same subnet, and share the same RAID disk. Apache runs as a load balancer at the front end in order to get optimal resource utilization, maximize throughput, minimize response time, and avoid overload. When clients invoke services, the load balancer splits up incoming requests and then distributes them to one of the four backend nodes following a load balancing policy. Fig. 10 shows the average response time for invoking a stream extraction service from a single server (dashed line) and a clustering server (solid line). With an increasing number of client threads, the response time of a single server grows more sharply than that of a clustering server. This clustering environment is shown to be faster and more reliable than the corresponding single server environment when handling multiple user requests.

V. DYNAMIC SERVICE INTEGRATION

GeOnAS is designed to be an open and fully extensible system. It can integrate new Web services dynamically.
Users who have their own geospatial processing services and would like to use them to perform data analysis can integrate those services into GeOnAS to build a unique system. If a service is registered in the service catalogue, other users will benefit from it. Hence, as more users are involved, GeOnAS becomes more powerful.

Fig. 11 shows the user interface for service registration. The “Name”, “Description”, “Version”, and “Keywords” specify the basic service properties. The “WSDL Address” indicates the URL of the service WSDL file that includes all the information necessary for invoking services. However, there is no relevant WSDL for some OGC services. The “Service Address” is used to specify the service access point directly in such a case.

Different geospatial services have different application scopes. Service classification uses standard or proprietary taxonomies to assign a class to a service in order to indicate service functionalities explicitly and facilitate service discovery. GeOnAS supports a variety of classification methods so that the service publisher can indicate the domain to which a service belongs at publication time:
• OGC: the classification schema of the OGC services, such as WCS, WFS, WMS, and WPS.
• ISO19119: the hierarchical classes of services based on the semantic type of computation defined in ISO19119: human interaction services, model/information management services, workflow services, processing services, communication services, and system management services.
• Global Change Master Directory (GCMD): hierarchical classes of services defined especially for Earth science, such as an environmental advisory service, a hazards management service, and a model service.
• LAITS_VDP: a proprietary classification schema of services for virtual data products.

To import a new service into GeOnAS, the service’s WSDL file is essential. GeOnAS can read the WSDL to determine the operations available from the service, the structure of message types, the binding protocols, and how to invoke the service. The other requirement for integrating a new service is that the service output must be an accessible network point, a Uniform Resource Locator (URL).
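The WSDL-reading step described above can be sketched with Python's standard XML parser. The sketch assumes a WSDL 1.1 document and lists only the operation names under each portType; the sample document below is hand-written for illustration (its operation names are those of the stream extraction service discussed in this section, but its structure, portType name, and namespace are assumptions, not the service's real WSDL).

```python
import xml.etree.ElementTree as ET

# Namespace of WSDL 1.1 elements.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hand-written, heavily trimmed WSDL used only for illustration.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             name="StreamExtraction"
             targetNamespace="http://example.org/stream">
  <portType name="StreamExtractionPortType">
    <operation name="flowDirectionBasedMethod_GRASS"/>
    <operation name="curvatureBasedMethod"/>
    <operation name="simplifiedCurvatureMethod"/>
  </portType>
</definitions>
"""


def list_operations(wsdl_text: str) -> dict:
    """Return a mapping of portType name to the operation names it declares."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        names = [op.get("name")
                 for op in port_type.findall("wsdl:operation", WSDL_NS)]
        operations[port_type.get("name")] = names
    return operations


if __name__ == "__main__":
    for port, names in list_operations(SAMPLE_WSDL).items():
        print(port, names)
```

A full client would also read the message, binding, and service elements to learn parameter types and the endpoint address; this sketch stops at the operation list that a user interface would display.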
Because the service output is a URL, GeOnAS can parse it to import the output data into the system for further use. The following steps use a stream extraction service, which is designed


Fig. 12. User interface for service discovery.

Fig. 13. Service operation list.

to derive a stream network from DEM data using a morphology-based algorithm [25], as an example to show how to dynamically integrate a service into the system:
1. Register a stream extraction service with its WSDL file.
2. Select the DEM data to be analyzed.
3. Find a service that can handle the data selected in step 2. Fig. 12 shows the user interface for service discovery. The user can use a keyword, “stream extraction”, or a classification schema, “Processing/Thematic/GeospatialAnalysis”, to discover a stream extraction service.


Fig. 14. User interface for operation inputs.

Fig. 15. Output result of service invocation.

4. Select a proper service operation from the operation list. When users chose a stream extraction service, the service WSDL is imported and parsed in order to show service operations for user interaction. The selected stream extraction service includes three operations, as shown in Fig. 13: • flowDirectionBasedMethod_GRASS: operation that uses hydrological module in GRASS to compute flow accumulation for deriving stream network based on flow direction. • curvatureBasedMethod: operation that uses morphology based algorithm to compute curvature and flow accumulation for deriving stream network. • simplifiedCurvatureMethod: operation that uses morphology based algorithm to compute curvature and flow accumulation for deriving stream network. To improve the speed of calculation, curvature is approximately derived by convoluting DEM with a 9 9 Laplacian filter.

5. Input the parameter values of the selected operation. Fig. 14 shows the user interface for the inputs of the “curvatureBasedMethod” operation, in which the user can specify the source data, the threshold of tangential curvature, the threshold of flow accumulation, the output format, and the output format type. The service can then be invoked with these parameter values.
6. Add the service output into the system. Fig. 15 shows the result of the service invocation: the service name, port name, operation name, and the URL and format of the result data. Users can add the result data into the system for further use.

VI. CONCLUSIONS

GeOnAS takes advantage of open and standardized Web services and architecture to provide interoperable online analysis of geospatial data. It provides an open data platform by which users are able to discover and access distributed geospatial information using the latest common protocols. Within the context


of SOA, GeOnAS provides an interoperable application platform by which different users are able to create customized data analysis systems with a collection of loosely coupled Web services that communicate with each other through standard languages, interfaces, and protocols. Moreover, GeOnAS provides a collaboration platform that allows different users to contribute geospatial processes and data products for sharing, exchange, and reuse. Thus, new solutions, especially those that require collaboration across geoscience domains, can be created dynamically by composing new application-specific services and existing recombinant services.

Cloud computing, as an increasingly promising platform for providing on-demand network access to computational power and storage, is extending its reach into the geospatial domain [26], [27]. It is a natural evolution of SOA and Web services, offering Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Our future work will focus on how to transfer large volumes of distributed geospatial data, and how to migrate GRASS services onto Cloud computing platforms such as Amazon EC2 and Google App Engine. The paradigm shift, including standards, programming languages, and service invocation and collaboration, will be investigated.

ACKNOWLEDGMENT

Special thanks to Dr. B. Schlesinger for his detailed comments on this paper.

REFERENCES

[1] T. Heinen, S. Kiemle, B. Buckl, E. Mikusch, and R. Loyola, “The geospatial service infrastructure for DLR’s National Remote Sensing Data Library,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. (JSTARS), vol. 2, pp. 260–269, 2009.
[2] A. Prados, G. Leptoukh, C. Lynnes, J. Johnson, H. Rui, A. Chen, and R. Husar, “Access, visualization, and interoperability of air quality remote sensing data sets via the Giovanni online tool,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. (JSTARS), vol. 3, pp. 359–370, 2010.
[3] D. Brunner, G. Lemoine, F. Thoorens, and L. Bruzzone, “Distributed geospatial data processing functionality to support collaborative and rapid emergency response,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. (JSTARS), vol. 2, pp. 33–46, 2009.
[4] G. Yu, L. Di, B. Zhang, and H. Wang, “Coordination through geospatial web service workflow in the sensor web environment,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens. (JSTARS), vol. 3, pp. 433–441, 2010.
[5] D. Booth, H. Haas, F. McCabe, E. Newcomer, M. Champion, C. Ferris, and D. Orchard, Web Service Architecture. Cambridge, MA: W3C Working Group, 2004.
[6] C. Harding, Definition of SOA, 2006 [Online]. Available: http://www.opengroup.org/projects/soa/doc.tpl?CALLER=doc.tpl&gdid=10632
[7] T. Erl, Service-Oriented Architecture: Concepts, Technology, and Design. Upper Saddle River, NJ: Prentice-Hall, 2005.
[8] E. Newcomer and G. Lomow, Understanding SOA With Web Services. London, UK: Addison Wesley, 2005.
[9] D. Nebert, A. Whiteside, and P. Vretanos, OpenGIS Catalogue Services Specification OGC 07-006r1. Wayland, USA, 2007.
[10] Y. Bai, L. Di, A. Chen, Y. Liu, and Y. Wei, “Towards a geospatial catalogue federation service,” Photogrammetric Engineering and Remote Sensing (PE&RS), vol. 73, pp. 699–709, 2007.
[11] A. Whiteside and J. Evans, Web Coverage Service (WCS) Implementation Standard OGC 07-067r5, Open Geospatial Consortium Inc., Wayland, USA, 2008 [Online]. Available: http://portal.opengeospatial.org/files/index.php?artifact_id=27297
[12] P. Vretanos, Web Feature Service Implementation Specification OGC 04-094, Open Geospatial Consortium Inc., Wayland, USA, 2005 [Online]. Available: http://portal.opengeospatial.org/files/index.php?artifact_id=8339
[13] J. Beaujardiere, OpenGIS Web Map Server Implementation Specification OGC 06-042. Wayland, USA, 2006.
[14] P. Schut, OpenGIS Web Processing Service OGC 05-007r7, Open Geospatial Consortium Inc., Wayland, USA, 2007 [Online]. Available: http://portal.opengeospatial.org/files/index.php?artifact_id=24151
[15] R. Martell, CSW-ebRIM Registry Service – Part 1: ebRIM Profile of CSW. Wayland, USA: Open Geospatial Consortium Inc., 2009.
[16] ISO 19115:2003: Geographic Information – Metadata, International Organization for Standardization, Geneva, Switzerland, 2003.
[17] ISO 19115-2:2009: Geographic Information – Metadata – Part 2: Extensions for Imagery and Gridded Data, International Organization for Standardization, Geneva, Switzerland, 2009.
[18] ISO 19119:2005: Geographic Information – Services, International Organization for Standardization, Geneva, Switzerland, 2005.
[19] ISO/TS 19139:2007: Geographic Information – XML Schema Implementation, International Organization for Standardization, Geneva, Switzerland, 2007.
[20] OASIS/ebXML Registry Information Model v2.5, OASIS, Burlington, USA, 2003.
[21] M. Neteler and H. Mitasova, Open Source GIS: A GRASS GIS Approach, 3rd ed. New York: Springer, 2008.
[22] X. Li, L. Di, W. Han, P. Zhao, and U. Dadi, “Sharing geoscience algorithms in a web service-oriented environment,” Computers & Geosciences, vol. 36, pp. 1060–1068, 2010.
[23] U. Dadi and L. Di, Data Independence and Geospatial Web Services, 2007 [Online]. Available: http://gsa.confex.com/gsa/2007GE/finalprogram/abstract_122248.htm
[24] X. Li, L. Di, W. Han, P. Zhao, and U. Dadi, “Sharing and reuse of service-based geospatial processing through a web processing service,” in 17th Int. Conf. Geoinformatics, Fairfax, VA, 2009, pp. 1–5.
[25] W. Luo, X. Li, I. Molloy, L. Di, and T. Stepinski, “Web service for extracting stream networks from DEM data,” GeoJournal, to be published.
[26] C. Gerber, Computing Clouds Cast Geospatial Vision, 2010 [Online]. Available: http://www.geospatial-intelligence-forum.com/mgtarchives/94-mgt-2009-volume-7-issue-1/716-computing-clouds-castgeospatial-vision.html
[27] B. Schäffer, B. Baranski, and T. Foerster, “Towards spatial data infrastructures in the clouds,” in Geospatial Thinking, M. Painho, M. Santos, and H. Pundt, Eds. New York: Springer Verlag, 2010, pp. 399–418.

Peisheng Zhao received the B.S. degree in geophysics from China University of Geosciences in 1994, and the M.S. degree and Ph.D. degree in cartography and geographic information system from Chinese Academy of Sciences, China, in 1997 and 2000, respectively. He is a Research Associate Professor at the Center for Spatial Information Science and Systems, George Mason University. His main research interests include geospatial information interoperability, geospatial Web service and processing workflow, geospatial semantic Web, and geospatial knowledge discovery.

Liping Di (M’01–SM’06) is a Professor and the founding Director of the Center for Spatial Information Science and Systems (CSISS) and a professor of the Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA. He received the Ph.D. degree in remote sensing/GIS (geography) from the University of Nebraska-Lincoln in 1991. He has engaged in geoinformatics and remote sensing research for more than 25 years and has published over 300 publications. He has served as the principal investigator (PI) for more than $30 million research grants and as co-PI for more than $8 million research grants/contracts awarded by U.S. federal agencies and international organizations. His current research activities are mainly in the following three areas: remote sensing standards, web-based geospatial information and knowledge systems, and remote sensing applications. Dr. Di has actively participated in the activities of a number of professional societies and international organizations, such as IEEE GRSS, ISPRS, CEOS,


ISO TC 211, OGC, INCITS, and GEO. He served as the co-chair of the Data Archiving and Distribution Technical Committee (DAD TC) of IEEE GRSS from 2002 to 2005 and as the chair of DAD TC from 2005 to 2009. He currently chairs INCITS/L1, a U.S. national committee responsible for setting U.S. national standards on geographic information and representing the U.S. at ISO Technical Committee 211 (ISO TC 211).

Weiguo Han (M’09) received the B.S. degree in applied mathematics from Tianjin University, China, in 1996, the M.Eng. degree in computer science and engineering from Huazhong University of Science and Technology, China, in 2002, and the Ph.D. degree in cartography and geographic information system from Chinese Academy of Sciences, China, in 2005.

He is a Research Assistant Professor at the Center for Spatial Information Science and Systems, George Mason University. His research activities encompass geospatial Web services, geospatial data sharing and interoperability, semantic Web, geospatial Web portal, and geospatial cyber-infrastructure.

Xiaoyan Li received the Ph.D. degree in thermophysics engineering and the M.S. and B.S. degrees in power mechanical engineering. She possesses a strong knowledge of GIS toolkits and Web service technologies. She has been contributing to the development of interoperable, Web-executable geospatial service modules and models. Her research efforts are primarily focused on providing Web service-enabled solutions to support extensive use of geoscientific algorithms and geospatial information.