Semantic Technologies Explained

Semantic technologies are based on a set of standards, developed by the W3C. These standards offer an entirely new method for accessing, combining, using and sharing data from disparate information sources, regardless of variations in underlying data structures. Creating systems that put these standards to work is extremely complex.

In their raw form, semantic technologies are well out of reach of most IT professionals and virtually all non-technical users. Cambridge Semantics is the only company that provides a set of practical tools that allow any enterprise to put this new method into practice and to implement, quickly and cost-effectively, enterprise-scale solutions that provide immediate business value.

Semantic technologies provide a mechanism to build a standards-based conceptual layer, distinct from the data itself, that describes all data in the same way: the way humans think about it. This conceptual model is one level of indirection away from the underlying data and is therefore agnostic to variations in data formats or locations. Because of this, a user or system that connects through the concept layer can simply describe what information it needs at the time it needs it, and all the data that corresponds to those concepts can be made accessible immediately, regardless of how the data is structured in the various contributing source systems. This sets up the potential for on-the-fly integration using real-time federated queries, as well as rules and inferencing capabilities that can run across multiple disparate systems.
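The level of indirection described above can be illustrated with a minimal sketch. It is not how any particular product implements the concept layer; the source names, field names and the `query_concept` helper are all hypothetical, and plain Python dictionaries stand in for real source systems.

```python
# Two hypothetical source systems that store the same kind of data
# in different shapes (flat rows versus nested documents).
CRM_ROWS = [{"cust_nm": "Ada Lovelace", "loan_amt": 350000}]
LEDGER_DOCS = [{"borrower": {"fullName": "Alan Turing"}, "principal": {"usd": 120000}}]

SOURCES = {"crm": CRM_ROWS, "ledger": LEDGER_DOCS}

# The concept layer: for each source, how to reach each concept property.
# Callers only ever see the concept property names on the left.
CONCEPT_MAP = {
    "crm": {
        "borrower_name": lambda row: row["cust_nm"],
        "amount": lambda row: row["loan_amt"],
    },
    "ledger": {
        "borrower_name": lambda doc: doc["borrower"]["fullName"],
        "amount": lambda doc: doc["principal"]["usd"],
    },
}


def query_concept(properties):
    """Return one dict per record from every source, keyed by concept property."""
    results = []
    for source_name, records in SOURCES.items():
        accessors = CONCEPT_MAP[source_name]
        for record in records:
            results.append({prop: accessors[prop](record) for prop in properties})
    return results


# The caller describes what it needs in concept terms only.
loans = query_concept(["borrower_name", "amount"])
```

The caller never mentions `cust_nm` or `principal`; only the mapping knows about them, which is why a change in a source's structure would touch the mapping rather than every consumer.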

To facilitate this, the W3C standards for semantic technologies include:

• Ontologies define concepts that describe data regardless of the source systems’ location or data structures. Ontologies use properties of the data to describe it: for example, a sub-prime mortgage is a loan (concept) that has an associated amount, borrower’s income, interest rate, etc. (properties) that match certain criteria (amount larger than $300,000, interest rate higher than 8.5%, etc.). There are lots of loans out there, but only loans with these properties that meet these criteria are sub-prime mortgages.

• RDF (Resource Description Framework) allows data to be represented and/or stored using the concepts embodied in ontologies.

• SPARQL is a standard query language that allows one query to run across multiple sources using concepts embodied in ontologies to locate the requested data.
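As a rough illustration of the RDF and SPARQL items above, the sketch below stores data as (subject, predicate, object) triples, the shape RDF uses, and runs one pattern query across two sources. It is plain Python rather than a real RDF store or SPARQL engine; the identifiers `loan1`, `amount`, `match` and `loans_with` are invented for the example.

```python
# Two hypothetical sources, each exposing its data as
# (subject, predicate, object) triples.
SOURCE_A = [
    ("loan1", "type", "Loan"),
    ("loan1", "amount", 350000),
    ("loan1", "interestRate", 9.1),
]
SOURCE_B = [
    ("loan2", "type", "Loan"),
    ("loan2", "amount", 120000),
    ("loan2", "interestRate", 4.2),
]


def match(graphs, s=None, p=None, o=None):
    """Yield triples from every graph that fit the pattern; None is a wildcard."""
    for graph in graphs:
        for triple in graph:
            if all(want is None or want == got
                   for want, got in zip((s, p, o), triple)):
                yield triple


def loans_with(graphs, predicate):
    """Join two patterns: subjects typed as Loan, paired with one property value."""
    rows = []
    for subject, _, _ in match(graphs, p="type", o="Loan"):
        for _, _, value in match(graphs, s=subject, p=predicate):
            rows.append((subject, value))
    return rows


# One query, phrased in concept terms, answered from both sources.
rows = loans_with([SOURCE_A, SOURCE_B], "amount")
```

A SPARQL query expresses the same idea declaratively: the caller states the triple patterns and the engine finds the bindings, instead of the nested loops written out here.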

Once data is accessible and consumable via its conceptual description, previously complex, cross-system actions become simple and straightforward.

• Information integration can happen just in time, at the point of decision making.
• Rules and inference run at the same concept level and can thus be applied to any connected data in the enterprise.
• New data sources and information types can be added on the fly by adding new concepts and linking data to them.
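The last two points can be sketched together: a rule is written once against concept properties, and a source linked later is covered by it without changing the rule. This is a simplified illustration, not a real inference engine. The `ConceptLayer` class, the source names and the field names are hypothetical, and the thresholds are the example sub-prime criteria given earlier.

```python
def is_sub_prime(props):
    """Rule over concept properties, using the example criteria from the text."""
    return props.get("amount", 0) > 300000 and props.get("interest_rate", 0) > 8.5


class ConceptLayer:
    def __init__(self):
        self.sources = {}  # source name -> (records, field-to-property mapping)
        self.rules = {}    # inferred concept name -> predicate over properties

    def add_source(self, name, records, mapping):
        """Link a source by mapping its field names to concept property names."""
        self.sources[name] = (records, mapping)

    def add_rule(self, concept, predicate):
        """Register a rule that decides membership in an inferred concept."""
        self.rules[concept] = predicate

    def infer(self, concept):
        """Apply the concept's rule to every linked source at query time."""
        rule = self.rules[concept]
        hits = []
        for name, (records, mapping) in self.sources.items():
            for record in records:
                props = {prop: record[field] for field, prop in mapping.items()}
                if rule(props):
                    hits.append((name, props))
        return hits


layer = ConceptLayer()
layer.add_rule("SubPrimeMortgage", is_sub_prime)
layer.add_source("bank_a", [{"amt": 350000, "rate": 9.1}],
                 {"amt": "amount", "rate": "interest_rate"})
before = layer.infer("SubPrimeMortgage")

# A new source with different field names is linked on the fly;
# the existing rule applies to it unchanged.
layer.add_source("bank_b",
                 [{"principal": 400000, "apr": 9.0}, {"principal": 90000, "apr": 3.5}],
                 {"principal": "amount", "apr": "interest_rate"})
after = layer.infer("SubPrimeMortgage")
```

Because the rule only sees `amount` and `interest_rate`, linking `bank_b` requires a mapping and nothing else.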

The real value of semantic technologies is that, by applying them, non-technical users who understand the concepts can quickly find, combine, visualize and act upon whatever information they need, whenever they need it.