Intro & Methodology
A few years ago, I spent a ton of time immersed in the data privacy space. Recently, I became curious about what happened in the intervening years, so I decided to analyze exit activity. I built a Notion database (access it ➠ HERE) of data privacy companies, to explore growth and consolidation in the market. The database is not an exhaustive list of companies. It is focused on companies built upon privacy-preserving technology such as encryption, differential privacy, federated learning, and secure multiparty computation (see a full list of these technologies in the “Approaches” section below).
Effective data protection will unlock opportunities in healthcare, government, finance, and almost every other industry. Regulation has significant potential to affect market dynamics in this space.
Trends Report
Messaging - A few years ago, messaging in data privacy largely centered on the technology approach companies were taking. It has since evolved to focus on hero use cases: finance, healthcare (secure sharing of medical data is a big one), securing AI/ML workloads, and so on.
Acquisitions - There have been some acquisitions, but not as many as I’d expected when I started this market research. Notable ones include LeapYear (acquired by Snowflake), DataFleets (acquired by LiveRamp), Statice (acquired by Anonos), Okera (acquired by Databricks), and Privitar (acquired by Informatica). I could not find any acquisition terms for these transactions.
Similar Profiles - Many companies had similar fundraising histories, employee counts, and GTM approaches, and were founded within a year or two of one another (the average founding year of the companies I looked at was 2018).
Hubs - Many companies were based in NY, SF, and Europe, with a handful in Toronto, Canada.
Escape Velocity - I think it is still too early to determine what “escape velocity” for privacy companies looks like. It is possible that strategic investors will become increasingly important in enabling partnerships with significant financial and technology components (like the Owkin / Sanofi partnership). The escape velocity picture will remain murky until technology and regulation hit additional inflection points.
Privacy by Design - A potential inflection point is the shift towards an elegant user experience, where privacy is integrated into the workflow. Like AI/ML, I think data privacy will be embedded in everything we do, potentially making it less of a defined “category” over time.
Strategic Investors - Strategic investors in healthcare (Bristol Myers Squibb, Mayo Clinic), financial services (JP Morgan, Wells Fargo, Capital One), and data warehouses / lakes (Databricks and Snowflake) are paying attention to and investing in privacy companies. For example, Snowflake’s fingerprints are all over this space - companies are building on top of Snowflake (Privacy Dynamics¹), selling through its data marketplace (Samooha), and getting acquired by Snowflake (LeapYear).
Institutional Investors - The most active data privacy investors are firms that are highly active in general, including General Catalyst, Insight, Index, Lightspeed, and Tiger Global.
No “Best” Approach - I don’t believe we’ve seen a core technology approach “pop” quite yet. Encryption seems to be the trusted approach, but we’re seeing a variety of technology pairings (encryption + trusted execution environments, federated learning + differential privacy, and more). Companies need multiple privacy-preserving technologies to deliver a complete solution.
Takeaway: The landscape shows us that there are dozens of startups, all at a similar stage, headcount, and age, hard at work on their respective technology approaches. As we will see below, all of the core approaches to data protection have their pros and cons, a tension known as the “privacy / utility trade off.” This analysis didn’t convince me of the best approach, but it did convince me that ease of use is paramount.
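To make the “privacy / utility trade off” a little more concrete, here is a minimal sketch using differential privacy, one of the approaches covered in this piece. It adds Laplace noise to a simple count and measures how far the noisy answer drifts from the truth as the privacy parameter epsilon changes. The function names, the count of 1000, and the epsilon values are my own illustrative assumptions, not taken from any company or product mentioned here.

```python
import random
import statistics


def laplace_noise(scale, rng):
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(true_count, epsilon, rng):
    # A counting query changes by at most 1 when one person is added or
    # removed (sensitivity 1), so the Laplace mechanism uses scale 1/epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=2000, seed=0):
    # Average distance between the noisy answer and the true answer:
    # a rough stand-in for "utility lost" at a given privacy level.
    rng = random.Random(seed)
    errors = [
        abs(private_count(true_count, epsilon, rng) - true_count)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)


if __name__ == "__main__":
    # Hypothetical values: smaller epsilon means stronger privacy.
    for eps in (0.1, 1.0, 10.0):
        print(f"epsilon={eps}: mean abs error ~ {mean_abs_error(1000, eps):.2f}")
```

Smaller epsilon means stronger privacy and a noisier, less useful answer; larger epsilon means the reverse. Real deployments also have to track a privacy budget across many queries, which this sketch ignores.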
Approaches
The White House’s amazing National Strategy to Advance Privacy-Preserving Data Sharing and Analytics includes the below chart of privacy-preserving techniques, which are the ones we focus on in this piece. I’d encourage you to check out the report if you want to learn more about these approaches in depth (start on page 15). Note - use of these technologies is not binary; the following approaches can and should be used in a complementary way.
My friend put together a post on commercial implementation of these techniques, Developer-first secure computation: “Unfortunately, thus far the promise of these techniques has significantly outweighed their impact. Adoption has been bottlenecked by poor performance, terrible developer experiences, statistical inaccuracies that limit trust and utility, and a general lack of scalability across use cases. More broadly, almost all of these techniques have traditionally been extremely complex to implement, configure, and use, often requiring completely rewriting your code to fit into their paradigms.”
“We felt that to build a truly foundational company in the secure compute space, you needed a solution that combined developer friendliness, performance, correctness, and support for a wide array of computational techniques…”
Today, architecting a great privacy technical stack is getting simpler and cheaper. In The Threat Show!, Graham Thompson (CEO and Founder of Privacy Dynamics) explains his suggested approach:
Use a tool like Okteto to spin up and down remote development environments as needed.
Select a source database, anonymize the data in it, and put it in your remote development environment.
When you’re done with that database environment, break it down.
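The three steps above can be sketched in a few lines of Python. This is a toy illustration under my own assumptions, not Graham’s or Privacy Dynamics’ actual implementation: a temporary SQLite file stands in for the remote development environment, the `users` table and its rows are made up, and a salted hash stands in for the anonymization step (strictly speaking that is pseudonymization, which is weaker than what a dedicated anonymization tool would do).

```python
import hashlib
import os
import sqlite3
import tempfile
from contextlib import contextmanager


def pseudonymize(value, salt):
    # Replace a direct identifier with a salted hash prefix (illustrative only).
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:12]


def build_source():
    # Hypothetical "production" database with made-up rows.
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE users (email TEXT, plan TEXT)")
    src.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("ada@example.com", "pro"), ("bob@example.com", "free")],
    )
    src.commit()
    return src


@contextmanager
def ephemeral_dev_db(source, salt):
    # Step 1: spin up a throwaway environment (here, a temp SQLite file).
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    dev = sqlite3.connect(path)
    try:
        # Step 2: copy the source data across, masking emails on the way.
        dev.execute("CREATE TABLE users (email TEXT, plan TEXT)")
        rows = [
            (pseudonymize(email, salt), plan)
            for email, plan in source.execute("SELECT email, plan FROM users")
        ]
        dev.executemany("INSERT INTO users VALUES (?, ?)", rows)
        dev.commit()
        yield dev, path
    finally:
        # Step 3: break the environment down when the work is done.
        dev.close()
        os.remove(path)


if __name__ == "__main__":
    source = build_source()
    with ephemeral_dev_db(source, salt="demo-salt") as (dev, path):
        print(dev.execute("SELECT email, plan FROM users").fetchall())
```

The point of the context manager is that teardown is not optional: the masked copy disappears as soon as the block exits, even if the code inside it fails.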
Other important privacy workflow trends I noticed include the increasing importance of specialized hardware, such as the secure enclaves and chips offered by AWS (Nitro Enclaves), Intel (SGX), Azure, and Google (Titan chip for isolated computing). More on this in a future post!
Regulatory Environment
The regulatory environment will influence data privacy technology, but for now, regulations and technology feel somewhat disconnected.
In California, the California Privacy Rights Act (CPRA) amended the 2018 California Consumer Privacy Act (CCPA), and enforcement of CPRA begins in July 2023. There is now a dedicated agency in place to enforce the CPRA. This infographic provides a good overview of additional differences between CPRA and CCPA. Perhaps CPRA and similar efforts will elevate the importance of privacy-preserving technology solutions.
Predictions
Gartner’s August 2022 Hype Cycle for Data Security provides a useful framework for predictions.
I expect acquisitions of companies in the “innovation trigger” and “peak of inflated expectations” phases will create further consolidation in the data privacy market.
Again, according to Developer-first secure computation, the holy grail in data privacy is the company / project / technology that “facilitates secure cross-party data sharing in industries like finance, healthcare, and security.” It will be a combination of approaches that moves us towards this holy grail.
A huge thanks to Davis Treybig, Zoe Weinberg, and Tommy Jones for reviewing this piece! 🙏🏽
¹ I am an investor in Privacy Dynamics!