The best thing about the Big Data hype in pharma is how effectively it's shed light on all of the Small Data problems the industry is facing...
There's a good reason for this. While it's true that voluminous Big Data problems are sexy and grab headlines easily with exotic talk of petabytes and exabytes, the number of people across a pharma company who actually deal with these volumes of information as part of their day-to-day job is vanishingly small. Put another way, while Big Data is a real problem, it's not a Big Problem. What is a Big Problem, on the other hand, is the challenge of dealing with the diverse variety of (small) data that's needed for decision-making throughout the drug discovery, development, and commercialization life cycles.
In this interview, published in advance of the 2013 Enterprise Data World conference, Cambridge Semantics co-founder Lee Feigenbaum talks about the company's solutions across industries, what's unique about the Anzo software, what's coming in the next couple of years, and more.
It's been about a year since I really talked with someone about the status of Semantic Web technology, so during a recent phone conversation, I asked Feigenbaum to provide a "state of Semantic Web technology." He explained it's definitely moving out of the prototype stage and into full production, particularly in industries such as financial services and pharmaceuticals.
It's ironic, but the term "semantic technology" isn't well defined and can be used by different vendors to mean completely different things. Lee Feigenbaum explains to IT Business Edge's Loraine Lawson the different meanings and how these technologies all relate — but he also explains the advances in what he considers the one that matters most: Semantic Web technology... In part two of this interview, Feigenbaum explains how Semantic Web technology is changing data integration.
Semantic Web technology makes data integration more flexible, reducing much of the work previously required for new integrations and for revamping databases, according to Lee Feigenbaum, vice president of marketing and technology for Cambridge Semantics. Feigenbaum also co-chairs the W3C SPARQL Working Group; SPARQL is an RDF query language and is considered a key technology for the Semantic Web. He also explains who some of the main players are in this space, and why you should buy rather than build. In part one of this interview, he explained the differences in semantic technologies and what's changed in the past year.
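To make the flexibility claim concrete, here is a minimal sketch of the underlying idea: data stored as subject-predicate-object statements, as in RDF, can be combined from several sources without first agreeing on a shared table layout. This is a plain-Python stand-in, not the Anzo software, an RDF library, or SPARQL syntax; the `merge` and `match` helpers and the `ex:` identifiers are hypothetical names chosen for illustration.

```python
from typing import Iterable, List, Optional, Set, Tuple

# One statement: (subject, predicate, object), loosely modeled on an RDF triple.
Triple = Tuple[str, str, str]


def merge(*sources: Iterable[Triple]) -> Set[Triple]:
    """Combine statements from any number of sources; no shared schema is needed."""
    merged: Set[Triple] = set()
    for source in sources:
        merged.update(source)
    return merged


def match(triples: Iterable[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return statements matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


if __name__ == "__main__":
    # Hypothetical data: one source is a CRM export, the other a spreadsheet.
    crm = [("ex:acme", "ex:name", "Acme Corp")]
    spreadsheet = [("ex:acme", "ex:region", "EMEA")]
    everything = merge(crm, spreadsheet)
    # Everything known about one subject, regardless of where it came from.
    for triple in match(everything, s="ex:acme"):
        print(triple)
```

In this sketch, adding a new kind of fact means adding statements with a new predicate; neither existing source has to be restructured. A real deployment would use an RDF store and SPARQL queries rather than in-memory tuples.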
For a third year we bring you our list of up and coming vendors on our radar that are doing their part to shape the groundswell in information management technology in the 21st Century.
In this article, we'll take a high-level look at what the core Semantic Web technologies are, why they're different from conventional technology approaches and how they deliver tangible benefits for enterprise information management.
Together, these Semantic Web technologies are a Swiss Army knife for enterprise information management. They provide a cohesive foundation for agile data integration, for evolving applications as business requirements change, and for delivering to knowledge workers a greater understanding of the information in front of them.
The web is a lot of things corporate IT systems are not. Anyone can access any kind of information at any time—videos, documents, tables, your own emails, blog posts, customer profiles. If you want information on a customer, you go to LinkedIn—not your in-house customer relationship management system. If you want information on a competitor you go to their website—not long-form analyst reports. It's all right there, a couple clicks away. This information gentrification enables unparalleled flexibility and speed of action.
I wanted to bring this same power and flexibility to all corporate IT systems. A new world of Unified Information Applications in which all data is interlinked and available at the click of a button, to everyday business people, and where IT systems are as flexible as the web.
"Enterprises continue to struggle to know what data they actually have and what it really means," said Sean Martin, Cambridge Semantics' chief technology officer. "This is increasingly difficult given the myriad of distinct—and growing number of—data sources, the endless diversity of data models and formats, and the relative inflexibility of the software tools," he said.
Semantic Web technologies can help solve these problems, Martin said.
Every day the body of information on scientific discoveries and pharma advances from around the globe swells across many different sources, presenting a deluge of data for biopharma outfits to sift through to find potential licensing opportunities. Some Big Pharma and biotech companies have turned to analytics and data-mining technologies to scour disparate Big Data sources and deliver the exact information they seek.
Cambridge Semantics in Boston has won over some of the world's largest pharma groups with semantic web technology that enables business development groups to automatically pull together key data from a wide variety of sources, such as scientific publications, websites, subscription databases and internal troves of knowledge.
Consider some enterprise information trends over the past five years:
- Key enterprise data assets are no longer confined to predictably structured transactional databases and data warehouses. Decision makers are relying on data buried in spreadsheets, in emails, in Access databases and in documents.
- Companies are increasingly drawing on data from outside their organization. They're pulling together information from supply chain partners, from customers, from social media sites, from websites and public web databases.
- The information needed today is different from the information needed tomorrow, next week or next month. Changes come from everywhere: internal strategy changes, competitive pressures, new regulatory requirements.
Big data is everywhere today. It fills IT headlines and keynotes technology conferences. It's become a favorite topic for both industry analysts and technology investors. With lots of computing power and better database storage techniques, big data makes it practical to store and analyze petabytes and petabytes of detailed transactional and media data. But despite the headlines, "big data" is not the most compelling data need that the majority of business end users have. A far bigger challenge for most people is getting access to the "right data" to help them do their jobs better.
Databases of the petabyte size mostly represent billions of individual transactions, such as individual telephone calls or ATM transactions. No one would argue against analyzing that data to look for nuggets of insights that can only be found at that detailed transaction level. However, this kind of analysis is not easy. It requires sophisticated models and statistical techniques, and in the wrong hands can lead to all the classic errors of statistical analysis (e.g., correlation is not causation, and 5% of the time random events will be statistically significant at the 95% level). In general, analyzing truly big data needs to be left to the professional analysts.
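The point about random events appearing significant can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not part of the quoted article: it runs many experiments on pure noise (normally distributed, true mean 0, known standard deviation 1) and applies a two-sided z-test at the 95% level. The function name, trial count, sample size, and seed are arbitrary choices.

```python
import math
import random


def false_positive_rate(trials: int = 4000, sample_size: int = 50,
                        z_crit: float = 1.96, seed: int = 0) -> float:
    """Fraction of noise-only experiments flagged significant by a two-sided z-test."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Pure noise: there is no real effect to find in this sample.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        mean = sum(sample) / sample_size
        # z-statistic for the null hypothesis "mean is 0" with known sigma = 1.
        z = mean * math.sqrt(sample_size)
        if abs(z) > z_crit:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rate = false_positive_rate()
    print(f"share of noise-only tests flagged significant: {rate:.3f}")
```

The printed share should land near 0.05, which is why screening thousands of variables in a large dataset will surface some apparently significant patterns by chance alone.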
Lee was interviewed for Slashdot TV. The interview goes into the basic benefits of Semantic Web technologies, particularly how they relate to current data problems in the enterprise.
Cambridge Semantics has a new way for users to get access to its Anzo solutions: Next week at the Semantic Technology & Business Conference in San Francisco it will announce a packaging of the technology, dubbed the Anzo Express Starter Edition, that can be downloaded for free by anyone. "This lets anyone really easily start with semantics without having to invest a lot of time and without learning every fundamental detail," says Rob Gonzalez, Director of Product Management & Marketing and a frequent contributor to this blog.
The Best of Show Awards offer exhibitors at the Bio-IT World Conference and Expo an exclusive opportunity to distinguish their products, whether an innovative application, technology, tool, or solution, from the competition. Judged by a team of leading industry experts and Bio-IT World editors, the awards program identifies exceptional innovation in technologies used by life science professionals today.
Cambridge Semantics' new Competitive Intelligence Solution has been named Best in Show at the 2012 Bio-IT World Conference. The company reports, "Based on the flagship Anzo Software Suite, the Cambridge Semantics Competitive Intelligence Solution was recognized for its unique approach to helping biopharma companies handle their diverse and ever-changing information needs."
To the rank-and-file biopharma worker, the Semantic Web can be a confounding technology. Cambridge Semantics makes the technology accessible (first, perhaps, with straightforward explanations on its website) to almost anyone in the industry with a basic understanding of Excel. Lee Feigenbaum says that the Boston-based company has thus far found the greatest demand for its Anzo software in biopharma from business development groups, but it can be used to aid R&D and supply chain management too.
The software lets a person who has no coding skills design a spreadsheet that pulls in unstructured data from the web and documents as well as structured information from internal sources to, say, evaluate early-stage compounds for in-licensing. It's all about connecting information from myriad sources to make smart decisions. And Johnson & Johnson ($JNJ), Biogen Idec ($BIIB), Merck ($MRK) and Novartis ($NVS) have all tapped Cambridge Semantics' technology.
As part of their Semantic University, Cambridge Semantics has published a number of helpful "lessons" covering concepts related to the Semantic Web. Since we last checked in with this excellent tutorial series, they have added several lessons...
Any data used by a system must have structure of some kind, hence the term "unstructured" falls short of describing much of Big Data. However, determining or defining the structure of many data types can be quite challenging. Nonetheless, Big Data is now being harnessed far and wide, with all kinds of significant benefits. Of course, we must not forget about the other data, 'small' as it may be! Register for this episode of DM Radio to learn how some of the savviest companies are finding ways to "deconstruct" unwieldy data types. Hosts Eric Kavanagh and Jim Ericson will interview Analyst Jaime Fitzgerald, plus Suresh Chandrasekaran of Denodo, Sean Martin of Cambridge Semantics, and Grant Ingersoll of Lucid Imagination.
What does the compliance lifecycle look like at your company? In globally-operating industries such as finance, there's likely a herd of people charged with monitoring rules and regulations across countries, drafting policies and procedures for individual geographies or business units, and working to ensure controls are in place to prevent and detect violations. And that herd of individuals in some respects may be trying to herd cats, given how often aspects of compliance regulations change....
In addition to announcing its compliance solution, Cambridge yesterday launched a new web site called Semantic University. It aims to be the education spot for semantic technology, taking a vendor-agnostic approach to getting semantic web wanna-bes up to speed.
Cambridge Semantics is flourishing. The company recently announced "that due to demand for its enterprise semantic data management offering, the company has increased headcount by 30 percent and has relocated to new office space on the Boston Common. In the heart of downtown Boston, the new 5,500 square foot office will provide the organization with room to grow as they focus on R&D and marketing activities for the company's flagship Anzo software suite."
The growing product portfolio of Biogen Idec (Weston, MA) was resulting in rapid domestic and international expansion, leading to an increasingly complex pharmaceutical supply chain. Harmonized data reporting and real-time information were needed in order to move toward a risk-based model for contractor assessments and batch release. To meet this challenge, the company chose Anzo software from Cambridge Semantics (Boston, MA), which allows business users to search for, virtualize, analyze, act on, and make decisions with any internal or external, structured, or unstructured data. Based on the revolutionary flexibility of semantic web technologies, the software provides operational business process integration for just about any formal or informal business activity.
Jennifer Zaino of SemanticWeb.com interviewed Steve Kludt, VP Marketing, about the benefits of semantic technology as applied to enterprise software.
Recently, I had a chance to talk to Sean Martin, CTO of Cambridge Semantics, a Massachusetts-based company that has developed a semantic middleware stack, complete with ESB. We largely discussed how companies are applying the company's solution to enterprise data that has thus far been unmanaged and unintegrated: spreadsheet data.