Collibra tackles the ‘shadow AI’ problem with new governance application

Join us in returning to NYC on June 5th to collaborate with executive leaders in exploring comprehensive methods for auditing AI models regarding bias, performance, and ethical compliance across diverse organizations. Find out how you can attend here.

As companies rush to adopt artificial intelligence, many are finding that the hardest part isn’t building the models — it’s ensuring the data going into them is reliable and compliant. Without proper AI governance, businesses risk making flawed decisions, violating privacy regulations, or worse.

Enterprise data intelligence provider Collibra is aiming to solve this challenge with a suite of new AI tools announced today. The capabilities span governance, automation and democratization of data, reflecting the multi-pronged approach needed to instill trust in AI.

“Our customers are excited about AI, but are realizing that a lot more than just the algorithm goes into successfully deploying these use cases,” Collibra CEO Felix Van de Maele said in an interview with VentureBeat. “Data is of course central in AI, so strong data governance is a must.”

AI governance brings oversight to the Wild West of enterprise AI

Chief among the new offerings is an AI Governance application that provides a command center for the scattered AI initiatives across an enterprise. The tool aims to close the gap between data science teams prototyping models and the risk and compliance stakeholders who need to sign off before those models reach production.

“We help organizations deliver trusted AI by helping them better govern the AI use cases,” said Van de Maele. “There’s something like ‘shadow AI,’ where your organization is prototyping and exploring a lot but you don’t really have visibility around what’s happening. A big challenge is, how do you make sure the data scientists or engineers find the right data that is appropriate, that they’re allowed to use, that is compliant.”

“You need to put in place the change management, the processes, to bring AI use cases all the way from inception to production,” he added. “And there’s a lot of stakeholders that need to get involved. It’s not just your data scientists — it’s legal, risk, compliance. There are already regulations today, and there will only be more coming.”

AI Governance sits on top of Collibra’s data governance platform, which the company has been building for more than a decade. Van de Maele sees AI governance as a natural extension of data governance, noting that many of the fundamental principles are the same.

With capabilities for defining policies, roles and responsibilities, the tool aims to provide a standard workflow for registering, approving, documenting, and monitoring AI use cases across an organization. The goal is to give stakeholders visibility into how AI is being applied and confidence that it’s being done responsibly.
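As a rough illustration of that kind of workflow, here is a minimal Python sketch. It is not Collibra's implementation or API: the `AIUseCase` class, the status names, and the rule that legal, risk and compliance must all sign off before promotion are assumptions made for the example, loosely based on the stakeholders Van de Maele names.

```python
from dataclasses import dataclass, field

# Assumption: the reviewer roles mirror the stakeholders named in the article.
REQUIRED_REVIEWERS = {"legal", "risk", "compliance"}


@dataclass
class AIUseCase:
    """Hypothetical record for one registered AI use case."""
    name: str
    owner: str
    datasets: list
    approvals: set = field(default_factory=set)
    status: str = "registered"

    def approve(self, reviewer):
        """Record a sign-off; mark the use case approved once all roles agree."""
        if reviewer not in REQUIRED_REVIEWERS:
            raise ValueError(f"unknown reviewer role: {reviewer}")
        self.approvals.add(reviewer)
        if self.approvals == REQUIRED_REVIEWERS:
            self.status = "approved"

    def promote(self):
        """Move to production only if every required sign-off is present."""
        if self.status != "approved":
            missing = sorted(REQUIRED_REVIEWERS - self.approvals)
            raise PermissionError(f"cannot promote, missing sign-off from: {missing}")
        self.status = "in_production"


use_case = AIUseCase("churn-model", owner="data-science", datasets=["crm.customers"])
use_case.approve("legal")
try:
    use_case.promote()  # blocked: risk and compliance have not signed off
except PermissionError as error:
    blocked_reason = str(error)

use_case.approve("risk")
use_case.approve("compliance")
use_case.promote()
```

The point of the sketch is that promotion is gated by recorded approvals rather than by convention, which is what gives stakeholders visibility into what reaches production.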

Collibra AI automates curation and stewardship tasks

Also included in the release is Collibra AI, which leverages large language models (LLMs) to automate many of the tedious data management tasks that have historically required painstaking manual effort.

One such capability is automatically generating descriptions and definitions for data assets, a key part of data cataloging and governance. When looking at cryptic database tables and column names, users often have no idea what the data means. Collibra AI can infer and generate this metadata by analyzing the data values.

“It’s kind of a copilot approach,” explained Van de Maele. “The system generates a draft and then the user can review and approve it. Having those descriptions and definitions is critical for trusting and understanding data.”

Collibra AI can also automatically generate data quality rules from a natural language request. For example, a user could specify that country codes need to adhere to a certain ISO standard, and the system will create the corresponding technical rule to enforce that requirement.
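To make the country-code example concrete, the following Python sketch shows what a generated rule could look like once translated from the natural language request. The article does not show Collibra's actual rule format, so everything here is hypothetical: the function name, the `country_code` column, and the small sample of ISO 3166-1 alpha-2 codes standing in for the full list.

```python
# Illustrative subset only; a real rule would check against the full
# ISO 3166-1 alpha-2 code list.
ISO_ALPHA2_SAMPLE = {"BE", "DE", "FR", "GB", "JP", "US"}


def find_invalid_country_codes(rows, column="country_code"):
    """Return (row_index, value) pairs whose code is not in the allowed set."""
    violations = []
    for index, row in enumerate(rows):
        value = row.get(column)
        if value not in ISO_ALPHA2_SAMPLE:
            violations.append((index, value))
    return violations


rows = [
    {"customer": "a", "country_code": "US"},
    {"customer": "b", "country_code": "usa"},  # wrong format
    {"customer": "c", "country_code": None},   # missing value
    {"customer": "d", "country_code": "BE"},
]

violations = find_invalid_country_codes(rows)
```

Here `violations` contains the second and third rows, which is the kind of result a data steward would review.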

Finally, the system can automatically parse SQL queries and business intelligence reports to generate lineage graphs that show how data flows through an organization’s systems. This observability is important for pinpointing the source of data issues and analyzing the downstream impact of data changes.
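The idea behind SQL-derived lineage can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how Collibra does it: the regular expressions only handle plain `INSERT INTO` and `CREATE TABLE ... AS` statements with `FROM` and `JOIN` clauses, the table names are invented, and a production tool would need a full SQL parser.

```python
import re
from collections import defaultdict

# Toy patterns: real SQL needs a proper parser (subqueries, CTEs, quoting).
TARGET_RE = re.compile(r"\b(?:insert\s+into|create\s+table)\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def build_lineage(statements):
    """Map each source table to the set of tables written from it."""
    downstream = defaultdict(set)
    for sql in statements:
        target = TARGET_RE.search(sql)
        if target is None:
            continue  # statement does not write to a table
        for source in SOURCE_RE.findall(sql):
            downstream[source.lower()].add(target.group(1).lower())
    return downstream


def impacted_tables(downstream, table):
    """Return every table reachable downstream of `table`."""
    seen, stack = set(), [table]
    while stack:
        for child in downstream.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


statements = [
    "INSERT INTO staging.orders SELECT * FROM raw.orders",
    "CREATE TABLE mart.revenue AS SELECT o.id, c.region "
    "FROM staging.orders o JOIN ref.customers c ON o.cid = c.id",
]

lineage = build_lineage(statements)
impact = impacted_tables(lineage, "raw.orders")
```

Walking the graph from `raw.orders` reaches both `staging.orders` and `mart.revenue`, which is the downstream impact analysis described above. Reversing the edges would answer the opposite question of where a dashboard number came from.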

“A problem we often see is a customer looking at a dashboard or a number and wanting to understand where it came from, how it was calculated, so they can trust it,” said Van de Maele. “That lineage traceability is really powerful.”

Democratizing data access with notebooks

Beyond governance, Collibra announced a new Data Notebook to help democratize access to trusted data assets. Building on the computational notebooks popular in the data science community, Collibra Data Notebook provides an interface for business users to search for and provision datasets.

“We’re giving them that one-click ability to find a data set in the catalog and then immediately get access to start exploring and deriving insights,” said Van de Maele. “We’re able to provide that convenience while in the background, invoking all the right governance workflows to maintain security and compliance.”

The Collibra Data Notebook is built into the company’s data intelligence platform and integrates with its data governance and catalog capabilities. Van de Maele compared it to the Amazon shopping experience, where users can easily find products and have them seamlessly delivered.

Prospects and challenges

Collibra is betting that its deep experience with data governance and stewardship will give it an edge as companies seek to apply those practices to the burgeoning world of enterprise AI.

“We’ve been doing data governance for 15 years before most people even knew what data catalogs were,” said Van de Maele. “So I think we’re really well positioned to help companies get their arms around AI governance.”

However, Collibra will face increased competition from larger technology providers also building AI governance tools. Firms like Microsoft, IBM, AWS and Google have all released offerings aimed at the “responsible AI” trend.

Collibra, whose investors include Sequoia Capital, ICONIQ Capital and Google’s CapitalG, has raised over $590 million to date. The company will likely need to continue innovating and acquiring new capabilities to stay ahead of these deep-pocketed rivals.

Van de Maele seems unfazed, noting that Collibra already counts many of those same tech titans as both customers and partners. He believes Collibra’s focus on data governance as a specific discipline remains a key differentiator.

“Our mission is to change the way organizations use data, with the ultimate goal of changing the world with data,” said Van de Maele. “If we can become that essential piece of the enterprise AI stack, the trusted steward of the data fueling our AI future, then the opportunity ahead of us is immense.”