The desire to extract value from enterprise data has only grown as the pandemic prompts organizations to digitize their operations. But organizations often lack the right set of tools to do so. According to a recent Fivetran survey, 82% of companies are making decisions based on stale information, and 85% of those say it is leading to incorrect decisions and lost revenue.
With the popularization of real-time database technologies, stale data and the problems surrounding it might soon become a thing of the past — if vendors’ sales pitches are to be believed. Real-time databases can deliver insights immediately, in theory, enabling companies to address line-of-business issues as they come up and act on short-term changes.
As evidenced by the larger and larger funding tranches, investors believe that there’s a sizeable market for real-time databases. Last September, SingleStore, which provides a platform to help enterprises integrate, monitor and query their data as a single entity, raised $80 million in a financing round. Just today, another real-time database vendor, Imply Data, announced that it closed a $100 million Series D round that values the startup at $1.1 billion post-money.
Thoma Bravo led Imply’s round with participation from OMERS Growth Equity, Bessemer Venture Funds, Andreessen Horowitz, and Khosla Ventures. Co-founder and CEO Fangjin Yang says that the new money will be put toward product development and expanding the company’s workforce from 200 employees to 300 by the end of the year.
“The industry at large is upon the next wave of technical hurdles for analytics based on how organizations want to derive value from data. The first wave in the 2000s was trying to solve the large-scale data processing and storage challenges, which [tools like] HDFS, MapReduce, and Spark addressed,” Yang told TechCrunch in an email interview. “The second wave in the 2010s was then trying to solve the problem of large-scale query processing, which created the emergence of cloud data warehouses (e.g., Snowflake, Redshift, and BigQuery) and distributed SQL engines (e.g., Impala, Presto, Athena). Now, the challenge organizations are trying to solve are large scale analytics applications enabling interactive data experiences. That’s where … Imply comes in.”
Burlingame, California-based Imply was founded in 2015 by Yang, Gian Merlino, and Vadim Ogievetsky. Yang was an R&D engineer at Cisco focusing on optimization algorithms for networking, while Merlino was a server software developer at Yahoo! (full disclosure: TechCrunch’s parent company).
Yang, Merlino, and Ogievetsky met at Metamarkets, an analytics platform for programmatic advertising that was acquired by Snap in 2017. While at Metamarkets, they developed Druid, an open source, distributed data store written in Java that uses a cluster of specialized processes to analyze high volumes of real-time and historical data.
Druid — which is currently used in production by companies including Netflix, Salesforce, and Confluent — moved to an Apache license in 2015.
“What’s consistent across companies using Druid … is their ability to unlock more value from data, specifically real-time, streaming events,” Yang said. “Where cloud data warehouses are for reports and dashboards described by infrequent queries by a handful of analysts, Druid enables analytics applications that power interactive, live conversations with data with a limitless number of internal stakeholders and external customers at instant query response times.”
Imply builds fully managed databases using Druid. Yang positions Imply as the “enterprise version” of Druid, ostensibly providing more capabilities and services than the standalone Druid project. For example, Imply recently launched Polaris, a cloud database service designed to simplify streaming, visualization, and other aspects of real-time analytics.
“From a business standpoint, the need for data-driven insights and applications has increased due [to] changes brought about by the pandemic, which in turn has increased the need for Imply’s technology,” Yang said. “[I]nsights are no longer confined to a predefined set of queries in a report. Organizations can now take real-time and historical events at tremendous scale (e.g., clickstream data, cloud application and services metrics, and internet of things telemetry) for operational intelligence, for real-time recommendations, and for extending insights to their customers.”
Imply stands to gain as enterprises increasingly favor the cloud for database management. Gartner once predicted that, by this year, 75% of all databases would be deployed or migrated to a cloud platform. Imply has competition in alternative databases and data warehouses like PostgreSQL, Snowflake, Elastic, and ClickHouse, but Yang asserts that the startup is sufficiently differentiated by its high concurrency and “value,” particularly in the areas of streaming and batch data.
Absent thorough comparisons of each platform, we’ll have to take Yang’s word for it. But for what it’s worth, Imply claims to have over 150 customers, including Atlassian, Cisco ThousandEyes, Intercontinental Exchange, and Reddit.
“Applications built on Imply’s database are accessed by many thousands of users,” said Yang, who demurred when asked about Imply’s current revenue. “Druid is unlocking a whole new world of analytics use cases, and it’s creating tremendous value for organizations as they undergo digital transformations.”
To date, Imply has raised $215 million in venture capital.