Getting Started with Semantic Technologies
If you’re brand new to semantic technologies, the topic can be overwhelming. Different sites and people will talk about everything from artificial intelligence to natural language processing to linked data and the Semantic Web. What are they all? How do they relate to each other? How do they relate to you?
This set of lessons aims to ground you in the basics. The lessons give you basic definitions and goals that form the foundation of a solid, unconfused understanding of semantics.
Introduction to the Semantic Web
The Semantic Web, Web 3.0, the Linked Data Web, the Web of Data…whatever you call it, the Semantic Web represents the next major evolution in connecting information. It enables data to be linked from a source to any other source and to be understood by computers so that they can perform increasingly sophisticated tasks on our behalf.
This lesson introduces the Semantic Web, placing it in the context of both the evolution of the World Wide Web as we know it today and data management in general, particularly in large corporations.
After completing this lesson, you will know:
- How Semantic Web technology fits into the past, present, and future evolution of the Internet.
- How Semantic Web technology differs from existing data-sharing technologies, such as relational databases and the current state of the World Wide Web.
- The three primary international standards that help define the Semantic Web.
The World Wide Web was invented by Sir Tim Berners-Lee in 1989, a surprisingly short time ago. The key technology of the original web—from an end user’s point of view, anyway—was the hyperlink. A user could click on a link and immediately (well, back then, almost immediately) go to the document identified in that link.
In summary, the great advantage of Web 1.0 was that it abstracted away the physical storage and networking layers involved in information exchange between two machines. This breakthrough enabled documents to appear to be directly connected to one another. Click a link and you’re there—even if that link goes to a different document on a different machine on another network on another continent!
In the same way that Web 1.0 abstracted away the network and physical layers, the Semantic Web abstracts away the document and application layers involved in the exchange of information. The Semantic Web connects facts, so that rather than linking to a specific document or application, you can instead refer to a specific piece of information contained in that document or application. If that information is ever updated, you can automatically take advantage of the update.
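As a rough illustration of referring to a fact rather than to a document, here is a minimal Python sketch. It is not how any real Semantic Web system is implemented; the identifiers under http://example.org/ and the function names are hypothetical and exist only for this example.

```python
# A tiny in-memory store of facts. Each fact is identified by a subject and a
# predicate and holds one value. All identifiers here are made up for illustration.
ACME = "http://example.org/acme"
HEADQUARTERS = "http://example.org/headquarters"

facts = {
    (ACME, HEADQUARTERS): "Boston",
}

def lookup(subject, predicate):
    """Return the current value of the fact identified by subject and predicate."""
    return facts[(subject, predicate)]

def render_report():
    """A 'document' that refers to the fact instead of embedding a copy of it."""
    return f"Acme is headquartered in {lookup(ACME, HEADQUARTERS)}."

before = render_report()

# The fact is updated once, at its source...
facts[(ACME, HEADQUARTERS)] = "Chicago"

# ...and everything that refers to it picks up the new value.
after = render_report()

print(before)
print(after)
```

The report never stores the city itself. It stores only a reference to the fact, which is why the update shows up without the report being edited.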
This may appear at first to be a very subtle advantage, but it is one that will be illustrated in detail in the various lessons here at Semantic University.
How is the “Semantic Web” Different?
The word semantic itself implies meaning or understanding. As such, the fundamental difference between Semantic Web technologies and other technologies related to data (such as relational databases or the World Wide Web itself) is that the Semantic Web is concerned with the meaning of data rather than its structure. Note: Other semantic technologies include Natural Language Processing (NLP) and Semantic Search. We will compare these technologies in separate lessons.
This fundamental difference leads to a completely different approach to storing, querying, and displaying information. Some applications, such as those that draw on large amounts of data from many different sources, benefit enormously from this focus on meaning. Others, such as the storage of high volumes of highly structured transactional data, do not. Understanding when to apply Semantic Web technologies, and when not to, is one of the primary objectives of Semantic University. These topics will be addressed in much more detail in future lessons.
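To make the many-sources case a little more concrete, the sketch below combines two differently shaped sources by reducing each to simple statements about a shared identifier. The data, the identifier "ex:acme", and the helper names are all assumptions made up for this example.

```python
# Source A looks like a relational row; Source B is a nested record with a
# different shape. Both are hypothetical and share the identifier "ex:acme".
source_a = [{"id": "ex:acme", "name": "Acme Corp", "city": "Boston"}]
source_b = {"ex:acme": {"employees": 1200, "industry": "Manufacturing"}}

def rows_to_statements(rows):
    """Turn each non-id column of each row into a (subject, property, value) statement."""
    return {
        (row["id"], key, value)
        for row in rows
        for key, value in row.items()
        if key != "id"
    }

def records_to_statements(records):
    """Turn each property of each nested record into a (subject, property, value) statement."""
    return {
        (subject, key, value)
        for subject, props in records.items()
        for key, value in props.items()
    }

# Once both sources are plain statements, merging them is just a set union.
merged = rows_to_statements(source_a) | records_to_statements(source_b)

def describe(subject):
    """Collect everything known about one subject, regardless of which source said it."""
    return {prop: value for s, prop, value in merged if s == subject}

print(describe("ex:acme"))
```

Neither source had to adopt the other's structure. What they needed in common was an agreed identifier for the thing being described.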
What Standards Apply to the Semantic Web?
From a technical point of view, the Semantic Web consists primarily of three technical standards:
- RDF (Resource Description Framework): The data modeling language for the Semantic Web. All Semantic Web information is stored and represented in RDF.
- SPARQL (SPARQL Protocol and RDF Query Language): The query language of the Semantic Web. It is specifically designed to query data across various systems.
- OWL (Web Ontology Language): The schema language, or knowledge representation (KR) language, of the Semantic Web. OWL enables you to define concepts composably so that these concepts can be reused as much and as often as possible. Composability means that each concept is carefully defined so that it can be selected and assembled in various combinations with other concepts as needed for many different applications and purposes.
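The roles of the three standards can be sketched in a few lines of plain Python. This is a deliberately simplified stand-in, not real RDF, SPARQL, or OWL tooling: facts are stored as subject-predicate-object triples (the shape RDF uses), a pattern with wildcards plays the part of a query, and a subclass lookup hints at the kind of inference a schema makes possible. The "ex:" names and all function names are hypothetical.

```python
# Data: a set of (subject, predicate, object) triples, the shape RDF uses.
# "rdf:type" and "rdfs:subClassOf" are used here only as readable labels.
triples = {
    ("ex:rex", "rdf:type", "ex:Dog"),
    ("ex:tom", "rdf:type", "ex:Cat"),
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Cat", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
}

def match(pattern, data):
    """Query: return triples that fit the pattern, where None matches anything."""
    return [
        triple
        for triple in data
        if all(want is None or want == have for want, have in zip(pattern, triple))
    ]

def superclasses(cls, data):
    """Schema: follow subclass links upward and return every ancestor class."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for _, _, parent in match((current, "rdfs:subClassOf", None), data):
            if parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found

def instances_of(cls, data):
    """Return subjects typed as cls directly or through one of its subclasses."""
    result = set()
    for subject, _, declared in match((None, "rdf:type", None), data):
        if declared == cls or cls in superclasses(declared, data):
            result.add(subject)
    return result

print(sorted(instances_of("ex:Mammal", triples)))
```

No triple says that ex:rex is a mammal. That answer comes from combining the data with the class definitions, which is the division of labor between a data model and a schema language that the list above describes.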
One way to distinguish a Semantic Web application from any other application is its use of these three technologies. However, the Semantic Web has been called many things, such as Web 3.0 or the Linked Data Web. Some of these names carry great significance, even with regard to the technology stack, so we’ll cover this topic in a separate lesson.
Semantic Web technologies as a whole have made tremendous strides in the last decade. Some highlights include:
- The Linked Open Data movement has grown substantially every year, and its combined datasets contain far more information than any single resource anywhere on the Web.
- Massive organizations—such as Merck, Johnson & Johnson, Chevron, Staples, GE, the US Department of Defense, NASA, and others—now rely on Semantic Web technologies to run critical daily operations.
- The Semantic Web standards (RDF, SPARQL, OWL, and others) were merely drafts in 2001, but they have since been formalized and ratified as W3C Recommendations.
Truly, an entire industry has been born in the past ten years, complete with multiple trade shows on several continents, a growing user community, and active standards bodies.
That said, significant room for growth remains.
- In spite of recent huge strides on the part of Schema.org, Facebook’s Open Graph, and others, the vision of an entire Web of interoperable data has not yet been realized.
- In spite of significant early corporate adoption by a select few frontrunners, most companies have not yet started using Semantic Web technologies, and many are unaware that they exist.
- The learning curve for using Semantic Web technologies is steep because few educational resources currently exist for users new to the concepts, and fewer still discuss when and how to apply the technologies to real-world scenarios.
Here at Semantic University, we’re focusing on that last point.