OpenTopography 2.0 System Architecture

The figure below depicts the architecture of the OpenTopography system. A brief summary of the various cyberinfrastructure components is given below the architecture diagram. The OpenTopography 2.0 system was released on June 14, 2010 and represented a complete re-architecting of the previous system (OpenTopography 1.0).

[Figure: OpenTopography 2.0 system architecture diagram]


The OpenTopography system is a service-oriented architecture composed of various software services and hardware resources. Important system components:

Gridsphere Portal Framework

The Gridsphere portal framework provides an advanced open-source, portlet-based Web portal. The portlet model gives users a customized and flexible user interface and allows each portlet to be deployed on its own if necessary. In OpenTopography, Gridsphere is used to manage all data access and processing aspects of the system.
More information: http://www.gridsphere.org

Google Fusion Tables: Within Gridsphere, the OpenTopography data discovery page, which displays available data and provides metadata and links to data products based on user selection, is driven by Google Fusion Tables and the Google Maps API. Fusion Tables handles the display of available dataset extents and the spatial (bounding box) queries that retrieve metadata and data products.
More information: http://www.google.com/fusiontables/public/tour/index.html
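
The bounding box query described above can be illustrated with a minimal sketch. This is not the Fusion Tables API; it only shows the overlap test that a spatial query of this kind performs. The dataset names and coordinates are hypothetical, and boxes that cross the antimeridian are not handled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box in decimal degrees."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: BBox) -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.east < other.west or other.east < self.west
            or self.north < other.south or other.north < self.south
        )


# Hypothetical dataset extents; names and coordinates are illustrative only.
DATASETS = {
    "dataset_a": BBox(-120.0, 35.0, -119.0, 36.0),
    "dataset_b": BBox(-118.5, 33.5, -117.5, 34.5),
    "dataset_c": BBox(-119.5, 35.5, -118.0, 36.5),
}


def find_datasets(selection: BBox) -> list[str]:
    """Return the names of datasets whose extent overlaps the selection."""
    return sorted(
        name for name, extent in DATASETS.items() if extent.intersects(selection)
    )


if __name__ == "__main__":
    print(find_datasets(BBox(-119.8, 35.2, -119.2, 35.8)))
```

In a real deployment the overlap test would run inside the hosting service rather than in application code; the comparison logic is the same.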

Expression Engine

Expression Engine is a content management system written in PHP. It is highly flexible and supports plugins, modules and extensions for added customization and functionality. In OpenTopography, Expression Engine manages all of the standard website content such as the news items, blog, and other pages.
More Information: http://www.expressionengine.com/

LiDAR Point Cloud Management Systems

OpenTopography employs two systems for managing LiDAR point cloud data:

Multi-Node Partitioned IBM DB2 Database: Currently, all LiDAR point cloud data are hosted in a multi-partitioned IBM DB2 database. In the multiple-partition configuration, the LiDAR database tables are partitioned across multiple machines, or “nodes”. Each node is a dual-processor machine with 8 GB of memory and approximately 2 TB of attached local disk (the exact amount varies by node). Each partition is managed by an independent database manager with its own data, configuration files, indexes, and transaction logs. This architecture improves scalability and performance: new machines can be added to the cluster and the database can be expanded across them. See Nandigam et al., 2010 for more information on this system.
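
As a rough illustration of how table rows can be spread across nodes, the sketch below assigns each row to a node by hashing a distribution key. The source does not say how OpenTopography's tables are keyed, so the tile-style key, the CRC32 hash, and the node count are all assumptions made for the example.

```python
import zlib
from collections import defaultdict


def node_for(key, num_nodes):
    """Pick a node for a distribution key using a stable hash.

    CRC32 is only a stand-in for whatever hash the database uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition_rows(rows, num_nodes):
    """Group (key, x, y, z) rows by the node that would own them."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[node_for(row[0], num_nodes)].append(row)
    return dict(nodes)


if __name__ == "__main__":
    # Hypothetical rows: a tile identifier followed by x, y, z.
    rows = [(f"tile_{i:03d}", float(i), float(i), 100.0 + i) for i in range(12)]
    for node, owned in sorted(partition_rows(rows, 4).items()):
        print(node, [r[0] for r in owned])
```

With a plain modulo scheme like this one, changing the node count remaps most keys, so expanding the cluster implies redistributing existing rows.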

OpenTopography LAS Server: For LiDAR data delivered to OpenTopography in the LAS binary point cloud format, we use a hybrid database and flat file system for data storage and retrieval. The open source libLAS libraries are used to read, write, and convert the LAS files. This approach has several advantages, including significantly shorter data ingestion times and a smaller storage footprint when compared to the DB2 system above.
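
One way to picture a hybrid of database and flat files is a small index of per-file extents that points back at the LAS files on disk. The sketch below is an assumption about how such an index could look, not the OpenTopography implementation: it reads the extent from a LAS 1.2 public header with Python's struct module (standing in for libLAS), stores it in an in-memory SQLite table (standing in for the production database), and uses a hypothetical make_header helper to fabricate headers for demonstration.

```python
import sqlite3
import struct

# Assumed layout of the LAS 1.2 public header block (little-endian, 227 bytes).
LAS_HEADER = struct.Struct("<4sHHIHH8sBB32s32sHHHIIBHI5I12d")


def read_extent(header_bytes):
    """Return (min_x, min_y, max_x, max_y, n_points) from a LAS header."""
    fields = LAS_HEADER.unpack(header_bytes[:LAS_HEADER.size])
    if fields[0] != b"LASF":
        raise ValueError("missing LASF signature")
    n_points = fields[18]
    # The last six doubles are max_x, min_x, max_y, min_y, max_z, min_z.
    max_x, min_x, max_y, min_y = fields[-6:-2]
    return min_x, min_y, max_x, max_y, n_points


def build_index(headers):
    """Index a {path: header_bytes} mapping in an in-memory SQLite table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE las_files (path TEXT PRIMARY KEY, min_x REAL, min_y REAL,"
        " max_x REAL, max_y REAL, n_points INTEGER)"
    )
    for path, header in headers.items():
        db.execute(
            "INSERT INTO las_files VALUES (?, ?, ?, ?, ?, ?)",
            (path, *read_extent(header)),
        )
    return db


def files_for_bbox(db, min_x, min_y, max_x, max_y):
    """Return paths of the flat files whose extent overlaps the query box."""
    rows = db.execute(
        "SELECT path FROM las_files"
        " WHERE max_x >= ? AND min_x <= ? AND max_y >= ? AND min_y <= ?"
        " ORDER BY path",
        (min_x, max_x, min_y, max_y),
    )
    return [row[0] for row in rows]


def make_header(min_x, min_y, max_x, max_y, n_points):
    """Hypothetical helper: build a synthetic header for demonstration only."""
    return LAS_HEADER.pack(
        b"LASF", 0, 0, 0, 0, 0, bytes(8), 1, 2, b"", b"",
        1, 2010, LAS_HEADER.size, LAS_HEADER.size, 0, 1, 28, n_points,
        0, 0, 0, 0, 0,
        0.01, 0.01, 0.01, 0.0, 0.0, 0.0,
        max_x, min_x, max_y, min_y, 0.0, 0.0,
    )
```

Only the small extent records live in the database, while the points stay in the original files. This is consistent with the shorter ingestion times and smaller footprint noted above, although the actual schema is not described in the source.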

Processing Server

The LiDAR point cloud data extracted from the database are processed on a high-performance server (Bebop: 32 CPUs, 172 GB RAM) provided by SDSC. This machine allows us to allocate significant amounts of memory to each job while also running multiple jobs simultaneously. Bebop runs the services listed below.
More Information: http://www.sdsc.edu/us/resources/bebop/

DEM Generation: Currently, digital elevation model (DEM) generation in OpenTopography is performed using a local binning algorithm (Points2Grid) developed by OpenTopography specifically for this task. The algorithm uses the elevation information from LiDAR returns contained within a circular search area (radius defined by the user) to calculate the DEM elevation at each grid node (resolution specified by the user). Five values are computed for each node in the grid: 1) the minimum, 2) the maximum, 3) the mean, and 4) the inverse distance weighted mean of the local points, and 5) the number of points in the search area. If the number of points in the search radius is 0, the grid node is assigned a null value. The user can choose to fill null values in the grid by applying a moving window over the DEM after it has been generated.
More Information: See Documents page under the OpenTopography Resources tab
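
The local binning step can be sketched in a few lines. This is an illustrative brute-force version, not the Points2Grid source. The function name, the node placement at x_min + col * cell, the weighting power of 2, and the treatment of a point that coincides with a node are all assumptions.

```python
import math


def points_to_grid(points, x_min, y_min, n_cols, n_rows, cell, radius, power=2.0):
    """Compute min, max, mean, IDW mean, and count at each grid node.

    points is an iterable of (x, y, z). Nodes with no points inside the
    search radius keep None (null) for the four elevation grids.
    """
    points = list(points)
    names = ("min", "max", "mean", "idw", "count")
    grids = {name: [[None] * n_cols for _ in range(n_rows)] for name in names}
    for row in range(n_rows):
        for col in range(n_cols):
            node_x = x_min + col * cell
            node_y = y_min + row * cell
            zs = []
            weight_sum = 0.0
            weighted_z = 0.0
            exact_hit = None
            for x, y, z in points:
                dist = math.hypot(x - node_x, y - node_y)
                if dist > radius:
                    continue  # outside the circular search area
                zs.append(z)
                if dist == 0.0:
                    # Assumption: a point on the node defines the IDW value.
                    exact_hit = z
                else:
                    weight = 1.0 / dist ** power
                    weight_sum += weight
                    weighted_z += weight * z
            grids["count"][row][col] = len(zs)
            if not zs:
                continue  # null node
            grids["min"][row][col] = min(zs)
            grids["max"][row][col] = max(zs)
            grids["mean"][row][col] = sum(zs) / len(zs)
            grids["idw"][row][col] = (
                exact_hit if exact_hit is not None else weighted_z / weight_sum
            )
    return grids


if __name__ == "__main__":
    demo = points_to_grid(
        [(1.0, 0.0, 10.0), (2.0, 0.0, 20.0)],
        x_min=0.0, y_min=0.0, n_cols=2, n_rows=1, cell=10.0, radius=2.0,
    )
    print(demo["count"][0], demo["mean"][0], demo["idw"][0])
```

The loop visits every point for every node, which is fine for a demonstration but far too slow for real point clouds; a production implementation would bin points spatially first.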

Format Translation and Derivative Products: OpenTopography has created a number of web services based on the Geospatial Data Abstraction Library (GDAL) to handle tasks such as grid format generation and the production of derivatives. These services also run on Bebop, permitting high-performance processing. A format translation service generates DEMs in the format requested by the user (GeoTIFF, IMG, Arc ASCII). The derivative products service is based on the gdaldem utility and generates hillshade and slope grids when requested by the user.
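
The GDAL command lines that such services might issue can be sketched as follows. The wrapper functions and file names are hypothetical and are not the OpenTopography services themselves; the gdal_translate and gdaldem invocations use the standard GDAL driver names GTiff, HFA (IMG), and AAIGrid (Arc ASCII).

```python
import subprocess

# Map the user-facing format names to GDAL driver short names.
FORMAT_DRIVERS = {"GeoTIFF": "GTiff", "IMG": "HFA", "Arc ASCII": "AAIGrid"}


def translate_command(src, dst, fmt):
    """Build a gdal_translate command that converts src to the requested format."""
    if fmt not in FORMAT_DRIVERS:
        raise ValueError(f"unsupported format: {fmt}")
    return ["gdal_translate", "-of", FORMAT_DRIVERS[fmt], src, dst]


def derivative_command(kind, src, dst):
    """Build a gdaldem command for a hillshade or slope grid."""
    if kind not in ("hillshade", "slope"):
        raise ValueError(f"unsupported derivative: {kind}")
    return ["gdaldem", kind, src, dst]


def run(command):
    """Execute a command; requires the GDAL utilities to be on the PATH."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names, printed rather than executed.
    print(" ".join(translate_command("dem.tif", "dem.asc", "Arc ASCII")))
    print(" ".join(derivative_command("hillshade", "dem.tif", "hillshade.tif")))
```

Building the argument list separately from running it keeps the command construction easy to check without GDAL installed.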

Opal Toolkit

Opal is an open source toolkit built by the National Biomedical Computation Resource project at SDSC for wrapping scientific applications as Web services in a matter of hours. Opal provides features such as scheduling, standards-based Grid security and data management in an easy-to-use and configurable manner. Opal is production quality software, and is being used by several cyberinfrastructure projects around the world. The OpenTopography Facility is leveraging the Opal toolkit to wrap its core functionality as Web services, utilizing its features such as scheduling, logging, and automatic interface generation.
More Information: http://www.nbcr.net/software/opal/

Backend Databases

In addition to the main partitioned DB2 LiDAR database, there are a number of backend databases: PostgreSQL databases that support Gridsphere, JForum, and integrated user authentication; a MySQL database that supports Expression Engine; and an IBM DB2 database that supports LiDAR point cloud monitoring, user job statistics, and the personal workspace.

File Servers

Standard DEM Tile Servers: These servers are hosted at Arizona State University and deliver the tiled DEM data that are available via OpenTopography. The DEMs are typically delivered as bare earth (ground) and full feature (all returns) surfaces organized into tiles (e.g. 1 km²). The DEMs are in a GIS-compatible format and are compressed (zipped) to reduce their size. Custom logging software also runs on these machines, allowing OpenTopography to track data access.

Streaming Google Earth Hillshades Image File Server: This machine hosts a cache of pre-computed hillshade and slope-shade imagery that is accessed via the network-linked KMZ files that OpenTopography provides for hosted data.
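
A network-linked KMZ is a zip archive whose doc.kml contains a KML NetworkLink element pointing at remotely hosted content. The sketch below shows a minimal version of such a file; the function names, the link name, and the URL are hypothetical, and the actual OpenTopography files likely carry more elements.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def network_link_kml(name, href):
    """Return a KML document containing a single NetworkLink."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <NetworkLink>\n"
        f"    <name>{escape(name)}</name>\n"
        "    <Link>\n"
        f"      <href>{escape(href)}</href>\n"
        "    </Link>\n"
        "  </NetworkLink>\n"
        "</kml>\n"
    )


def build_kmz(name, href):
    """Package the NetworkLink document as KMZ bytes (a zip with doc.kml)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("doc.kml", network_link_kml(name, href))
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical server URL for illustration only.
    data = build_kmz("Hillshade imagery", "http://example.org/hillshade/root.kml")
    print(len(data), "bytes")
```

Because the KMZ holds only a link, the file a user downloads stays small while the imagery itself is streamed from the file server on demand.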

LiDAR Point Cloud Bulk Download File Server: The bulk download feature is designed for advanced users seeking to access large amounts of data quickly and who have the bandwidth, expertise, and software necessary to manage the gigabytes of data this method delivers.

Visualization Web Services

For every OpenTopography DEM generation job, the system produces browse images and Google Earth-compatible KMZ files so that users can easily view the results of their jobs. OpenTopography uses a Global Mapper-based web service to generate these images and KMZs. We have also enabled the Google Earth API to allow users to view the KMZ products without leaving the web browser.

INCA

The Inca system provides user-level Grid monitoring with periodic, automated user-level testing of the software and services required to support Grid operations. OpenTopography uses Inca to monitor system "liveness" and report outages in real time.
More Information: http://inca.sdsc.edu/

JForum

JForum is a powerful and robust discussion board system implemented in Java and freely available under the BSD open source license. Built around an MVC framework, it can be deployed on any servlet container or application server.
More Information: http://www.jforum.net/



Previous OpenTopography Architecture Diagrams