Technology Introduction

Five Steps to Accelerated Insight

Whether you are running a focused project or an enterprise-wide service, the process for creating a durable reservoir of insight is the same:

  1. Identify the data sources you want to encompass, regardless of format or structure.
  2. Create or expand the smart data lake - pull in the data, smarten it up, make it securely accessible.
  3. Browse, explore, analyze and ask any question of it, as authorized.
  4. Get immediate answers drawn from all related facts.
  5. Repeat steps 3 and 4 as needed for trusted insight.

Easier Said Than Done

The steps seem simple; in fact, the last three steps are simple in practice.

But how you implement step one and, especially, step two makes a universe of difference in what you can get out of a data lake and how fast.

Some factors to consider:

  • What the data lake can include
  • The quality of the data catalog that enables you to find and use that data
  • The speed and efficiency of loading and querying data, and
  • Whether you want a smart data lake or a more limited conventional one


Anzo Smart Data Lakes incorporate both the structured data held in databases and the unstructured data found in text, documents, and the like. No other data lake solution does this; all others incorporate only structured data, leaving volumes of valuable and relevant data unconsidered and out of reach.

Further, Anzo Smart Data Lakes can organize both conventional big-data resources, such as Hadoop databases, and emerging cloud-based big-data resources. They are able to keep pace as the state of the art advances.

Smarter Lakes, Smarter Catalog

Data lakes themselves are simply conglomerations of data. The data catalog is what enables you to find and use that data. The essential characteristics of both the lake and the catalog critically affect the quality of insight that you can draw.

Two crucial differences underlie Anzo Smart Data Lake's superiority to the conventional approaches of other data lake solutions: semantic graph technology and speed.

Traditional lake/catalog structure relies on describing an entity - an account, a product, a person, etc. - through the use of tables... By contrast, semantic graph technology describes an entity through very rich metadata that enables the rapid linking of multiple entities for analysis.

Conceptually, you could think of interlinked tables that describe an entity as a particular stack of cards. An entity in semantic graph technology conceptually looks more like a circle with multiple arrows on its circumference, each representing a linkable attribute.

Describing an entity through interlinked tables rapidly reaches a point where the complexity and sheer number of tables exceed practical usefulness. Anzo's conceptual "circle and arrows" representation of an entity has no practical limits because its attributes, however many there are, are all held in metadata rather than in discrete tables.
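
As a rough illustration of the contrast above, the sketch below models an entity as a set of subject-predicate-object facts rather than as rows in fixed tables. The names (Triple, attributes_of, linked_entities) and the sample data are hypothetical and chosen only for illustration; this is not Anzo's API or its internal representation.

```python
from collections import namedtuple

# Hypothetical illustration only: each fact is a (subject, predicate, object) triple.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Made-up sample facts; every attribute of an entity is just another triple.
facts = [
    Triple("account:1001", "type", "Account"),
    Triple("account:1001", "ownedBy", "person:alice"),
    Triple("person:alice", "type", "Person"),
    Triple("person:alice", "takes", "drug:aspirin"),
    Triple("drug:aspirin", "type", "Drug"),
]


def attributes_of(entity, triples):
    """Return every (predicate, object) pair recorded for an entity."""
    return [(t.predicate, t.obj) for t in triples if t.subject == entity]


def linked_entities(entity, triples):
    """Return objects one hop away that are themselves described as subjects."""
    subjects = {t.subject for t in triples}
    return [t.obj for t in triples if t.subject == entity and t.obj in subjects]


# Adding a new kind of attribute is one more fact; no table schema has to change.
facts.append(Triple("person:alice", "livesIn", "city:boston"))
```

In this toy model, following a link from an account to its owner and on to a drug is a matter of matching subjects and objects, which is the "arrows on a circle" idea in miniature.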

The Result: Faster Access, Wider Visibility, Deeper Insight

Quality information is the most important contributor to valid insight, but timing runs a close second. Anzo delivers on both counts.

Anzo Smart Data Lake incorporates the broadest scope of data types and formats in the industry - both structured and unstructured - to give you wider visibility into the factors that influence decisions and insights.

And it gives you that visibility at record-setting pace.

How fast? A well-established benchmark test, the Lehigh University Benchmark (LUBM), evaluates performance of semantic web knowledge base systems. Anzo executed the benchmark more than 111 times faster than the previous comparable best: it processed a trillion facts (called triples) in under two hours vs Oracle's former record of 220 hours.
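
The speedup figure follows from simple division, sketched below. Any run that finishes in under two hours is more than 110 times faster than a 220-hour run, and a factor of 111 corresponds to a run of roughly 1.98 hours. The function and constant names are illustrative, and the exact Anzo run time is not stated here, so the last value is only what the quoted figures imply.

```python
def speedup(baseline_hours, new_hours):
    """Return how many times faster the new run is than the baseline."""
    return baseline_hours / new_hours


ORACLE_HOURS = 220.0  # former record, as quoted in the text

# Lower bound on the speedup for any run that takes less than two hours.
ratio_at_two_hours = speedup(ORACLE_HOURS, 2.0)

# Run time implied by a 111x speedup (an inference, not a published figure).
implied_hours = ORACLE_HOURS / 111.0
```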

What Anzo can do right now, nothing else can do today or in the foreseeable future. Yet its market-proven, enterprise-scale platform is built on open-source technology and runs on economical commodity servers and resources.

Backing the actual technology is a dedicated company. Among our customers, we've built a reputation for being "there" every step of the way, and for taking accountability for the overall success of the program, even when integrating third-party technologies.

The proof? Most of our customers consider Anzo to be such an advantage that they prefer not to talk publicly about it. They'd rather keep their competitors in the dark.

"The rest of the world has to break everything down into tables because they think about an entity as a series of interlinked tables. We don't have to do that because we think of an entity as a thing - a drug, a person, an account, etc. - with metadata attributes. We're not constrained by a finite table structure, so we're able to create much richer metadata that makes it possible to connect entities across unlimited vectors, just the way our brains do."

— Sean Martin, CTO