ESIP Testbed Semantic Search

1. Overview

Semantic Web technology is driving the next generation of the Web, in which semantics enable automated approaches to exploiting Web resources (Sheth et al., 2005). The idea was first proposed by Tim Berners-Lee (Berners-Lee et al., 2001) and has become an active research topic in AI and related fields. Semantic Web research centers on two widely recognized enabling capabilities: (1) ontology generation (Maedche and Staab, 2001; Omelayenko, 2001) and (2) automated resource annotation, reasoning, and integration (Hammond et al., 2002; Dill et al., 2003; Patil et al., 2004). The Semantic Web has inspired domain experts: scientists in medicine (Rector and Horrocks, 1997), biology (Ashburner et al., 2000), chemistry (Gordon, 1988), and other fields rely on these techniques to solve problems in their domains. In the Earth science domain, existing uses of reasoning engines, such as Pellet and NOESIS, are limited to search query expansion.

The ESIP semantic search testbed aims to use the full knowledge encoded in the ontologies and to provide a better search experience for end users. In this project, our effort goes into making the ontology encode detailed and rich Earth science knowledge. Fully exploiting this encoded knowledge requires extending inference engines such as Pellet, Racer, or FaCT++ with new components for query parsing and for ranking results based on a matrix that is yet to be developed. A new reasoning engine is being developed with the capability to automatically assemble shared services into applications that answer questions on the fly, in support of better decision making.

2. Activities

2.1 Build-up of Domain Knowledge Base

A knowledge base (KB) contains assertions and facts about a specific domain. It is the basis for logical reasoning and the other activities conducted in this semantic search study. The KB used in this project is built on the upper-level ontology Semantic Web for Earth and Environmental Terminology (SWEET 2.0). The modular design of SWEET 2.0 lets the ontology engineer populate the knowledge base subject by subject, for example hydrology or atmosphere. The KB can also be enriched by application area, such as air quality or natural disasters.
Currently, our work focuses on the 4D visualization of massive scientific data, supported by an integrated server. Beyond the computing environment, clients are able to view the visual analytics results in a timely fashion.

Figure 1: SWEET 2.0 Ontology
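
As a rough illustration of what "assertions and facts" in such a knowledge base can look like, the sketch below stores a handful of subject-predicate-object triples and answers a query by following subclass links. The class names, the `hasSubject` predicate, and the dataset identifier are hypothetical and are not taken from SWEET 2.0; the real KB is an OWL ontology handled by a reasoner, not a Python list.

```python
from collections import defaultdict

# Hypothetical terms for illustration only; not copied from SWEET 2.0.
TRIPLES = [
    ("Precipitation", "subClassOf", "HydrologicPhenomenon"),
    ("Rainfall", "subClassOf", "Precipitation"),
    ("Snowfall", "subClassOf", "Precipitation"),
    ("Ozone", "subClassOf", "AirPollutant"),      # an application-area term
    ("dataset_42", "hasSubject", "Rainfall"),     # an assertion about an instance
]


def build_index(triples):
    """Index subClassOf edges as parent -> set of direct children."""
    children = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "subClassOf":
            children[obj].add(subject)
    return children


def descendants(term, children):
    """Return every class reachable below `term` via subClassOf."""
    found, stack = set(), [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def datasets_about(term, triples):
    """Find instances whose subject is `term` or any of its subclasses."""
    wanted = {term} | descendants(term, build_index(triples))
    return sorted(s for s, p, o in triples if p == "hasSubject" and o in wanted)
```

A search for the broad term `HydrologicPhenomenon` reaches `dataset_42` even though that dataset is only annotated with `Rainfall`, which is the kind of inference a plain keyword match would miss.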

2.2 Reasoning engine

The kernel of a semantic search engine is its knowledge-based reasoning engine. Building on our experience with NOESIS and its support of the ESIP Earth Information Exchange (EIE) and WECHO, we will develop a new reasoning engine (Figure 2) that includes an ontology-supported knowledge base and model, Semantic Web enabled resources, a semantic reasoner, a syntax analyzer, and a ranking model based on a matrix to be developed.

Figure 2. The reasoning engine components include an ontology base and matrix, a water science knowledge base and model, a semantic reasoner, semantic matching/indexing, data retrieval, and a ranking model. Courtesy of Dr. Rob Raskin, NASA JPL.

Domain knowledge is captured and encoded as ontologies and a knowledge base model. The domain ontology will capture the relevant information about terminologies and taxonomies. These ontologies are interlinked with each other to leverage and extend existing ontologies, especially the SWEET core modules. The functionality of an existing semantic reasoner such as Pellet will be expanded to support detailed inference based on the knowledge base models. A syntax analyzer will be a component of this engine; it will parse user queries and reformulate them as semantic queries to the reasoner. The results from the reasoner will be mapped to instances of data and services/workflows. Similarity measures will be developed to rank the results for users. The engine will also accept user feedback by allowing users to rerank the returned results, and it will use this feedback to adjust the similarity measures.
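
The flow described above (parse the query, reformulate it with related terms, rank the matching resources, adjust the weights from feedback) can be sketched roughly as follows. This is a toy under stated assumptions and not the project's engine: the `related` table stands in for reasoner output, and the stopword list, weights, resource names, and all function names are hypothetical.

```python
import re

# Hypothetical stopword list for the toy syntax analysis step.
STOPWORDS = {"the", "of", "for", "in", "a", "an", "data", "find"}


def parse_query(text):
    """Minimal syntax analysis: lowercase, tokenize, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def expand(terms, related):
    """Reformulate the terms as a weighted semantic query (term -> weight)."""
    weights = {}
    for term in terms:
        weights[term] = 1.0  # terms typed by the user get full weight
        for rel, w in related.get(term, {}).items():
            weights[rel] = max(weights.get(rel, 0.0), w)
    return weights


def rank(weights, resources):
    """Score each resource by the summed weight of its matching keywords."""
    scored = []
    for name, keywords in resources.items():
        score = sum(weights.get(k, 0.0) for k in keywords)
        if score > 0:
            scored.append((name, score))
    # Highest score first; ties are broken alphabetically by name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def apply_feedback(related, query_term, promoted_keyword, step=0.1):
    """Raise a relation weight when a user promotes a matching result."""
    entry = related.setdefault(query_term, {})
    entry[promoted_keyword] = min(1.0, entry.get(promoted_keyword, 0.0) + step)


# Usage with made-up relation weights and resources.
related = {"precipitation": {"rainfall": 0.8, "snowfall": 0.8}}
resources = {
    "gauge_rainfall": ["rainfall"],
    "snow_depth": ["snowfall", "depth"],
    "ozone_map": ["ozone"],
}
terms = parse_query("Find precipitation data")
before = rank(expand(terms, related), resources)
apply_feedback(related, "precipitation", "snowfall")
after = rank(expand(terms, related), resources)
```

In this sketch the feedback only nudges a single relation weight; a real similarity measure would likely combine several signals, which the text leaves open.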

2.3 Semantic Interoperability

Another major activity of this project is enabling semantic interoperability among external data resources. The interoperability work in this project supports data discovery across multiple data centers, including:
  1. FGDC's Geospatial One Stop (GOS)
  2. NASA's Global Change Master Directory (GCMD)
  3. NOAA's National Climatic Data Center (NCDC)
  4. NASA's Earth Observing System Clearinghouse (ECHO)
  5. NASA's Earth Science Gateway (ESG)

The open standards employed in this project closely relate to and build on existing Open Geospatial Consortium (OGC) standards, such as the Web Map Service (WMS) and Catalogue Service for the Web (CSW), as well as the W3C standards Resource Description Framework (RDF) and Web Ontology Language (OWL). These standards enable a semantic search engine to link automatically to various web catalogues and make it possible to build a comprehensive virtual repository in which end users can discover the information they need.
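
To make the catalogue linkage more concrete, the sketch below builds CSW 2.0.2 GetRecords requests in key-value (GET) form for several catalogues from the same list of search terms. It is an illustration, not the project's code: the endpoint URLs are placeholders, the function names are hypothetical, and individual catalogue servers may support different queryables or require POST/XML requests instead.

```python
from urllib.parse import urlencode

# Placeholder endpoints; the real catalogue URLs are not given in this document.
CATALOGUES = {
    "catalogue_a": "https://example.org/catalogue-a/csw",
    "catalogue_b": "https://example.org/catalogue-b/csw",
}


def get_records_url(endpoint, keyword, max_records=10):
    """Build a CSW 2.0.2 GetRecords request URL with a CQL full-text filter."""
    safe = keyword.replace("'", "''")  # escape single quotes inside the CQL literal
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "brief",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"AnyText like '%{safe}%'",
        "maxRecords": str(max_records),
    }
    return f"{endpoint}?{urlencode(params)}"


def fan_out(keywords, catalogues):
    """Map each catalogue name to one request URL per search keyword."""
    return {
        name: [get_records_url(url, keyword) for keyword in keywords]
        for name, url in catalogues.items()
    }


# Usage: the keywords could come from an ontology-expanded query.
requests_by_catalogue = fan_out(["precipitation", "rainfall"], CATALOGUES)
```

Sending the same expanded term list to every catalogue is the simplest design; merging and de-duplicating the returned records is left out of this sketch.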

3. Accomplishment

As an accomplishment of this project, we developed and deployed a semantic search prototype for Earth science data discovery. The prototype is integrated into the ESIP Earth Information Exchange platform. The following figure is a screenshot of the prototype. The search engine is also connected to other CISC research outcomes, such as the 2D WMS viewer and the 3D WMS viewer.

4. Publications

P. Yang, W. Li, J. Xie and B. Zhou. Distributed Geospatial Information Processing - Sharing Distributed Geospatial Resources to Support the Digital Earth. International Journal of Digital Earth, Vol. 1(3), 259-278.

W. Li, C. Yang and R. Raskin. A Semantic Enhanced Model for Searching in Spatial Web Portals. In: Proceedings of AAAI Spring Symposium on Scientific Knowledge Integration (AAAI/SSKI'08), Stanford U., Palo Alto, CA, US, 2008.

W. Li and P. Yang. A Semantic Search Engine for Spatial Web Portal. In: Proceedings of IEEE International Geoscience and Remote Sensing Symposium (IGARSS08), Boston, US, 2008.

W. Li, C. Yang and B. Zhou. Internet-based Geographical Information Retrieval. Encyclopedia of GIS, 2008: 596-599.

For All Inquiries: Research Building 1, 4400 University Drive, Fairfax, Virginia 22030. Phone: 703-993-9341
Copyright © 2006-2013 George Mason University