Databricks acquires AI-centric data governance platform startup Okera
San Francisco-based Databricks announced today that it has acquired Okera, the world’s first AI-centric data governance platform. The two companies did not disclose the terms of the acquisition. A spokesperson for Databricks said the completion of the acquisition is “imminent.” Databricks said the acquisition will enable the company to address the challenges of data governance in the age of AI and large language models.
The news of the acquisition comes two months after Databricks released its own AI-powered chatbot called Dolly to take on OpenAI’s ChatGPT. However, unlike ChatGPT, Dolly is an open-source chatbot tool developed using large language models (LLMs) from projects at Stanford and Meta.
“Okera solves data privacy and governance challenges across the spectrum of data and AI. It simplifies data visibility and transparency, helping organizations understand their data, which is essential in the age of LLMs and addressing concerns about their biases,” Databricks said in a news release.
“As data continues to grow in volume, velocity, and variety across different applications, CIOs, CDOs, and CEOs across the board have to balance those two often conflicting initiatives – not to mention that historically, managing access policies across multiple clouds has been painful and time-consuming,” writes Okera co-founder Nong Li in today’s announcement. “Many organizations don’t have enough technical talent to manage access policies at scale, especially with the explosion of LLMs. What they need is a modern, AI-centric governance solution. We could not be more excited to join the Databricks team and to bring our expertise in building secure, scalable and simple governance solutions for some of the world’s most forward-thinking enterprises.”
According to the announcement, Databricks intends to integrate Okera’s technology into its Lakehouse Platform to expand its data governance capabilities, an essential need for AI, machine learning, and LLMs, all of which rely on vast amounts of data.
“Our customers will benefit from being able to use AI to discover, classify and govern all their data, analytics, and AI assets (including ML models and model features) with attribute-based and intent-based access policies. Additionally, they will benefit from end-to-end data observability on the lakehouse that allows them to centrally audit and report sensitive data usage across analytics and AI applications, and automatically trace data lineage down to the column level,” Databricks said.
We covered Databricks a year ago after the data lake company acquired machine learning operations (MLOps) startup Cortex Labs. The total amount of the deal was not disclosed. Headquartered in San Francisco, Cortex helps data scientists and engineering teams deploy native machine learning models in the cloud.
Founded in 2016 by Amandeep Khurana, Nong Li, and Sriram Subramanian, Okera offers secure data access at scale to enable data teams to leverage the power of their data for growth. Its platform also ensures appropriate data access controls are in place to comply with evolving data privacy regulations. Okera has raised nearly $40 million since its inception seven years ago. The startup was valued at $100 million as of its latest 2021 round, according to PitchBook. Its backers include VCs like Bessemer Venture Partners and Felicis Ventures.
Databricks was founded in 2013 by Ali Ghodsi (CEO), Andy Konwinski, Ion Stoica, Matei Zaharia, Patrick Wendell, Reynold Xin, and Scott Shenker. The startup provides an enterprise software platform that helps its customers unify their analytics across the business, data science, and data engineering. More than 5,000 organizations worldwide, including Comcast, Condé Nast, Nationwide, H&M, and over 40% of the Fortune 500, use Databricks’ unified data platform for data engineering, machine learning, and analytics.