A deep dive.
Welp, we failed. DeepDive is dead.
Though there are a few reasons I could articulate as to why, there’s only one that matters.
We quit working on DeepDive because we didn’t believe it solved a real problem.
DeepDive was an AI-powered tool for data analysis and visualization.
It enabled users to interact with data in natural language.
Given a question (e.g., “get average salary by department”), it would generate a SQL query, execute it, and visualize the result.
And while natural language was the primary mode of interaction, it wasn’t the only one.
Users could edit the resulting visualizations through our UI and correct any mistakes our models made. Those edits would then serve as training data for our models, allowing us to learn directly from user feedback.
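(To make that concrete, the core loop was simple enough to sketch in a handful of lines. Everything below is illustrative - the model name, prompt, and helper names included; it’s the shape of the thing, not our actual code.)

```python
import sqlite3

import matplotlib.pyplot as plt
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def generate_sql(question: str, schema: str) -> str:
    # One LLM call: schema + question in, a SQL query out.
    # (A real system validates, strips formatting, retries; ours did much more.)
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative; pick your model
        messages=[
            {"role": "system",
             "content": "Reply with a single SQL query for this schema:\n" + schema},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


def answer(question: str, conn: sqlite3.Connection) -> None:
    # Ground the model in the schema, then generate, execute, visualize.
    schema = "\n".join(
        row[0] for row in
        conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")
    )
    rows = conn.execute(generate_sql(question, schema)).fetchall()
    labels, values = zip(*rows)  # assumes a two-column result, as above
    plt.bar(labels, values)
    plt.title(question)
    plt.show()
```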
We worked on DeepDive for about 5 months.
We started with a problem: data analysis for machine learning.
We both come from data-driven companies and spent a lot of time debugging large, complex machine learning models. Most of that took place in notebooks: web-based IDEs where we wrote one-off scripts in Python and SQL to fetch and visualize data. That process is painful - and we knew it well.
So our initial goal was to build a tool that would solve that problem.
We did so with a technology we believed to be revolutionary: LLMs.
We hacked up a prototype, tested it on a few data sources, and soon came to a rather rude awakening.
How was this different from a notebook with a ChatGPT plugin?
At the time, GPTs hadn’t launched, but many others had the same idea we had, and we knew it. It worked well, surprisingly well, for how little effort it took to build.
And that was the issue.
If our target audience were other tech companies building machine learning models, why wouldn’t they just build an internal tool to do the same? Would they be willing to pay per-seat subscription fees, share private data, and take a bet on an unproven startup?
We didn’t think so.
At least, not for our tool as we had defined it.
For a few weeks, DeepDive lived as an ambiguously defined business intelligence platform.
We began to think: isn’t the real value in our tool to people who don’t know SQL?
Engineers already know how to write SQL.
Product managers don’t.
By enabling them to analyze data themselves, we’d be empowering them to do something they couldn’t before. But there was a big problem with that: LLMs are inherently fallible.
While we could easily generate a SQL query, how could we guarantee its correctness?
Engineers can read SQL and point out mistakes.
Product managers can’t.
Our answer was to build.
We built a visual editor that exposed SQL through a series of dropdowns and drag-and-drop gestures. We wrote a SQL decompiler, came up with an intermediate language for a “viz spec,” and created intuitive UIs around it. Those UIs both explained the underlying SQL query and let users edit it as needed.
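(To give a flavor of that intermediate layer, here’s a toy version. The field names are hypothetical, not DeepDive’s actual spec - that’s in the repo linked at the end - and the decompiler went the other direction, parsing SQL into a spec the UI could render.)

```python
from dataclasses import dataclass, field


# A toy "viz spec": a structured middle layer between SQL and the UI.
# Every dropdown in the editor maps to one of these fields.
@dataclass
class VizSpec:
    table: str                     # FROM clause
    dimension: str                 # GROUP BY column -> chart axis
    measure: str                   # aggregated column -> chart values
    aggregate: str = "SUM"         # SUM, AVG, COUNT, ...
    filters: list[str] = field(default_factory=list)  # WHERE predicates
    chart: str = "bar"             # how the UI renders the result

    def to_sql(self) -> str:
        where = f" WHERE {' AND '.join(self.filters)}" if self.filters else ""
        return (
            f"SELECT {self.dimension}, {self.aggregate}({self.measure}) "
            f"FROM {self.table}{where} GROUP BY {self.dimension}"
        )


# "get average salary by department"
spec = VizSpec(table="employees", dimension="department",
               measure="salary", aggregate="AVG")
print(spec.to_sql())
# SELECT department, AVG(salary) FROM employees GROUP BY department
```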
That not only made it okay to be incorrect; it gave us a long-term plan for how to become correct.
That is, if a user asks “top market segments,” then clicks a few times to indicate that his definition of “top” really means highest revenue and the “market segments” he cares about are limited to automobiles, he’s really just given us a training sample. He’s given us a question (i.e., “top market segments”) and a SQL query that he’s satisfied with.
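(Each edit session distills into a record like this; the shape and schema are made up for illustration.)

```python
# What one edit session distills into: a (question, SQL) supervision pair.
sample = {
    "question": "top market segments",
    "schema": "orders(segment TEXT, product TEXT, revenue REAL)",
    "sql": (
        "SELECT segment, SUM(revenue) AS revenue FROM orders "
        "WHERE product = 'automobile' "
        "GROUP BY segment ORDER BY revenue DESC"
    ),
}
```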
That data is rare; the state-of-the-art dataset on natural language to SQL only has ~10K samples. We thought we could get millions. And our plan was to use that data to train fine-tuned models for SQL generation that would be unequivocally superior.
That plan failed on its first step: getting customers.
Our go-to-market was nebulously defined.
We’re two introverted engineers. We don’t have many friends, don’t have sales experience, and found it awkward (and daunting) to cold-call strangers.
Yet we tried.
We paid for a LinkedIn sales account, compiled spreadsheets of leads, and started sending dozens of emails daily with no reply. We talked a lot and we learned a lot.
In that process, we realized our tool wasn’t built for business intelligence.
Business intelligence wasn’t what we thought
We thought business intelligence was data analysis (e.g., “why did our ranking models change behavior on June 16th?”).
We learned business intelligence was really reporting.
Most users ask the same questions (“revenue by year,” “KPIs by month”), and the real need is to surface a report that answers those questions in a consistent way.
It’s hard to differentiate BI platforms
Most BI platforms are the same.
They all have dashboards, they all have reports, they all have hundreds of visualizations.
They’re more mature than we are - and our one differentiator, natural language, they were building too.
It’s hard to sell to companies
We didn’t know what it meant to be a B2B company.
We had the notion that it was a safer business than selling directly to customers, but didn’t know how enterprise sales worked.
We didn’t have the connections, the experience, or the motivation to pursue it.
But, unexpectedly, we still thought our tool was useful as-is.
We were in the habit of testing DeepDive on local CSV and Excel files.
We would download sample datasets off Kaggle (e.g., 120 years of Olympic history), and then use DeepDive to quickly explore and create reports from those datasets.
And in conversations with data analysts, we learned how much of their day-to-day still revolved around Excel. They were asked to create reports off spreadsheets on both an ad-hoc and a regular basis. One of them mentioned that he’d never be rid of it, despite his company’s BI platforms and modern tooling.
So we thought: why don’t we focus on Excel?
Excel users had many of the same needs that BI users did. They had data and wanted visualizations. Our product worked well for their use case; in fact, even better than it did for BI.
Our natural language models struggled with many tables, but did very well on a few. Latency was now a non-concern, since we could cache the entire spreadsheet in memory. And we’d be able to go to market quicker by targeting the end user, rather than attempting to upsell companies on their existing BI platforms.
And so we built.
A few weeks later, we went to market again.
Our target customer was: “anyone who creates graphs in Excel.”
That was a lot of people: marketers, accountants, consultants, brand managers, insurers, even an e-sports team manager.
But in talking to them, we found that most didn’t have a real need.
They found the product cool and thought of how it could be used, but ultimately their real needs were different: whether that was substantiating their pitch deck or coming up with a League of Legends draft.
But we did find one that was real: lab experiment analysis.
We found a few material science PhD students who were running experiments day-in and day-out. The lab machines they used outputted CSV files, and a significant portion of their day (~30%) was dedicated to importing these CSV files into Excel, formatting them, applying a few pivot columns, and then creating visualizations.
We thought this was it. This was a real problem, a real need for data analysis, and our product solved it.
We were initially hesitant to pursue it: we knew it was a smaller market and an industry we knew little about. But we still chose to do so. We thought it was a real problem, and that motivated us more than anything else.
We tried for a while.
We built a PowerQuery-esque data pipeline to sanitize CSVs. We implemented visualizations our customers asked for: Seurat violin plots, RNA heatmaps, spiral luminescence charts. We cold-called PhD students and material science professors, and read papers to understand what exactly our customers did.
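(The manual loop they described - import, clean, pivot, chart - is the kind of thing our pipeline automated. Roughly, in pandas terms, with made-up file and column names:)

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical instrument export: one row per well per timepoint,
# with a few lines of machine metadata at the top.
df = pd.read_csv("plate_reader_output.csv", skiprows=3)
df = df.dropna(axis=1, how="all")  # drop blank padding columns
df["luminescence"] = pd.to_numeric(df["luminescence"], errors="coerce")

# The "pivot columns" step: conditions across, timepoints down.
pivot = df.pivot_table(
    index="timepoint",
    columns="condition",
    values="luminescence",
    aggfunc="mean",
)

pivot.plot()  # one line per experimental condition
plt.ylabel("mean luminescence")
plt.show()
```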
At last, we found a problem and had a solution that worked.
But slowly, we realized that we weren’t the right people to solve this problem.
We didn’t know much about experimental science.
And that hampered us. It made us unsure whether we were doing the right thing and made us reliant on our customers for our vision.
So we quit.
Failure is the right word for DeepDive.
I don’t like the word pivot.
In each of our attempts, in each of our efforts to build something great, if we saw some success, we would’ve kept going. The reality is: we did not.
So we failed. And that’s something I’m not ashamed to admit.
We had a lot of hubris to swallow, coming as two engineers from big tech.
We looked down on writing front-end code. But we did.
We were reluctant to cold-call people and hated rejection. But we did.
We were scared, we were anxious, we were tired.
We didn’t want to work so hard for nothing but an idea.
But we did.
We failed and I’m proud that we did.
We’re trying something different now.
DeepDive was a tech-first approach where we started with the technology and tried to find a problem for it. What we’re doing now is the opposite.
We’re working backwards from a problem that we know is real.
We’re trying to solve my problem: physical therapy.
I have a long history with physical pain, and though I won’t use this as an opportunity to pitch, it is a problem that I’ve tried desperately to solve on my own terms. And now - we’re building a product to solve it for all.
Thank you for reading my eulogy.
I am sure, in some sense, this may read as trite.
But regardless, the lessons here are those we’ve taken to heart.
(p.s., if anyone would like to learn more about DeepDive, we’ve decided to open-source it here. please do feel free to reach out: pybbae@gmail.com)
(p.p.s., i’ll be making an effort to write and publish regularly on here, please do subscribe if you’d like)