How Clearbit’s team uses AI

I’m exploring how to implement AI in business and today I’m excited to show you how Clearbit does it.

Clearbit provides company data to sales and marketing teams to find and engage their ideal customers. They were acquired by HubSpot in November 2023 for $150M.

Thank you, João Moura, Director of AI Engineering at Clearbit, for spending time with me and opening up behind the scenes. He also shares his AI agent framework, CrewAI.

What stood out to me from this conversation:

exploring use cases with an AI champion in charge
rolling it out to the team by forming small, cross-functional teams and speaking with external experts
the challenges you’ll face when doubling down on AI implementation in your product and team
which tools they use and why
leveraging AI agents to automate your life and get time back
how much AI has changed the way different teams work
steps for AI adoption in your team

The exploration phase

When João first brought AI into Clearbit, the opportunity seemed obvious: the company holds data on a huge number of entities, so what could it do with it?

So Clearbit began experimenting with AI, and João developed a predictive model for the likelihood of winning sales opportunities, which could help customers forecast their deal pipelines using their own and Clearbit’s data.

They also started building a GPT plugin and using LLMs for improved data parsing, as much of their work involves extracting internet data gems.

It basically integrated with some of our core APIs, it would allow Clearbit customers to pull data on the website visitor, search for similar companies and do some early prospecting. The good thing was that by getting that data into the Chat you could easily translate from checking visitors to finding similar ones and to asking it to help you write emails.

This quickly evolved into complex use cases with RAG systems and large-scale embeddings.

Retrieval-augmented generation (RAG) is like a smart assistant that can look up information and talk about it naturally, while Regular Expressions (Regex) is a tool for finding specific words or patterns in a text.

RAG models are really good at figuring out the context to answer questions or create text, using lots of information beyond what Regex can do with its basic pattern finding, making them more suitable for complicated language tasks.

Before there were only so many cases you could cover with Regex, with RAG we can tap into data points we wouldn't be able to easily extract otherwise.
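To make the Regex limitation concrete, here is a minimal sketch. The snippets, the pattern, and the function name are made up for illustration and are not Clearbit's pipeline. One pattern catches one phrasing and misses the rest, which is the gap a retrieval-plus-LLM approach is meant to close.

```python
import re
from typing import Optional

# Hypothetical snippets describing company size; not real Clearbit data.
snippets = [
    "Acme Corp has 250 employees across three offices.",
    "Globex employs roughly two hundred people worldwide.",
    "Initech: a team of 40 building workflow software.",
]

# One pattern for one phrasing: "<number> employees".
EMPLOYEES = re.compile(r"(\d[\d,]*)\s+employees", re.IGNORECASE)


def extract_employee_count(text: str) -> Optional[int]:
    """Return the employee count if the text matches the pattern, else None."""
    match = EMPLOYEES.search(text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


results = [extract_employee_count(s) for s in snippets]
print(results)
```

Only the first snippet matches. Covering the other two means writing and maintaining more patterns, and there will always be another phrasing you didn't plan for.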

Embeddings turn words or items into numbers so that computers can understand how similar or different they are to each other.

Embeddings were crucial to achieving effective RAG because they capture the meaning of text, which the retrieval step relies on to find relevant context for generation. Clearbit also used them directly for tagging and classification.

Embeddings were intrinsical to get some good RAG, but we also have leveraged them to do actual tagging and classification in a scale and level of precision we wouldn't be able to before by relying only on ingesting data from sources
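To show what "similar or different" means in practice, here is a minimal sketch of tagging by embedding similarity. The three-dimensional vectors, the tag names, and the `rank_tags` helper are assumptions for illustration. Real embeddings come from a model and have hundreds or thousands of dimensions, and this is not Clearbit's implementation.

```python
import math

# Toy vectors standing in for real embeddings (assumption: hand-written values).
TAG_VECTORS = {
    "fintech": [0.9, 0.1, 0.0],
    "healthcare": [0.0, 0.9, 0.2],
    "developer tools": [0.1, 0.0, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_tags(vector):
    """Return (tag, similarity) pairs, most similar first."""
    scored = [(tag, cosine_similarity(vector, v)) for tag, v in TAG_VECTORS.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical embedding of a description such as "API for online payments".
company_vector = [0.8, 0.0, 0.3]
ranked = rank_tags(company_vector)
best_tag = ranked[0][0]
print(best_tag, ranked)
```

The same "find the nearest vectors" step is what retrieval in a RAG system does, except the candidates are document chunks instead of tags.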

AI unlocked new use cases, data sources, and processes, improving data coverage and quality and strengthening Clearbit's company data against alternatives.

As of today we now have 100% coverage across three of the most important company data points. Every single company domain requested by a customer within the last 3 months (~4M) has a clear english description, accurate industry categorization, and detailed company tags.

Matt Sornson - GM & VP of Product

They were able to infer data and tag companies with higher precision than they could have in the past.

Take a look at the % increases (in green) below for their metrics:

Rolling out AI to the team

João led the rollout top-down by demoing his prediction model to the team—he has kept the recording.

AI offers so much potential that the question often becomes 'Where do we start?' Having a champion on your team to demo their build helps the team understand capabilities and spark new ideas.

João formed a tiger team to explore further with a product lead and select others.

Many companies, including Zapier, have established small teams to explore applications of AI for customer-facing and internal tools—I was part of that team.

João contacted industry experts, tested approaches, and got on calls to find the best solution.

Relying on others' expertise can reveal what you may overlook because you are focused on your own company. It brings new perspectives, mistakes you can avoid, and approaches you can copy.

I would really recommend for others to chat with people that have built more complex AI systems, like we now did ourselves, to understand some of the gotchas around it, we saved a lot of time and avoid using tools that were not necessary by learning from others, that in the end resulted on a way simpler design making it easier to maintain. I wanna make sure you watch for not getting locked into vendors unnecessarily and don’t put your business into a weak position depending on someone else's system entirely if you don’t have any control over it. We are very mindful of that.

The Engineering and Product teams quickly embraced AI tools like ChatGPT and GitHub Copilot for writing code, debugging, and analysing data.

I personally can’t envision myself not doing my best work fast without AI nowadays and I do think that teams embracing it faster have an unfair advantage towards others.

Empower your teams to embrace AI tools. Allow free exploration without a fixed agenda by setting less concrete goals; interesting ideas may surface in areas you hadn't previously considered for AI.

That said, small, focused experiments help kick off team-wide exploration.

Also, being open to AI changing your processes or product features can be hugely beneficial. For example, Zapier now offers an AI-powered ‘create a Zap’ function that allows you to write in natural language what you want to automate, and AI sets it up for you, replacing the need to manually find and connect the right apps and actions.

Doubling down on AI

Given their promising exploration of AI, Clearbit embraced it across the board—all departments have access to ChatGPT. They were among the first 10 OpenAI business customers.

Rolling out ChatGPT access to the team has been positive, with individuals discovering diverse applications for it.

One example is internal communication: it helps people get messages across to teammates and serves as a sounding board for ideas.

Sharing how team members use AI has been a focus too, so that others can learn from it and adapt it for their own purposes.

A significant challenge lies in disseminating this knowledge among a wider audience. For instance, if someone devises a compelling and innovative prompt or use case, you wanna make sure others learn from it and adapt it for other use cases.

Focusing on data, they kept a tight hold on business-related metrics with historical values, such as data coverage and quality. This allowed Clearbit to objectively assess the impact of its AI initiatives and to question its existing processes.

we decided to double down on AI efforts with the clear goal in mind of improving our data, that meant rethinking a lot of our systems and pipelines with now AI in mind.

The initial challenge is exploring AI use cases; once you find a successful application, ensure you track and measure its success over time.

João ensured the team had logs to measure performance as they deployed the systems into production and iterated over time.
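As a rough illustration of tracking a metric like data coverage from logs, here is a minimal sketch. The `EnrichmentLog` fields, the example records, and the `coverage` helper are hypothetical; the post doesn't describe Clearbit's actual logging schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EnrichmentLog:
    """One hypothetical log record per processed company domain."""
    domain: str
    description: Optional[str]
    industry: Optional[str]
    tags: List[str]
    latency_ms: float


def coverage(logs, field):
    """Fraction of logged records where `field` is non-empty."""
    if not logs:
        return 0.0
    filled = sum(1 for log in logs if getattr(log, field))
    return filled / len(logs)


# Made-up records for illustration only.
logs = [
    EnrichmentLog("example.com", "Sells widgets online.", "Retail", ["ecommerce"], 820.0),
    EnrichmentLog("example.org", "Nonprofit directory.", None, [], 640.0),
    EnrichmentLog("example.net", None, "Software", ["saas"], 910.0),
    EnrichmentLog("example.edu", "Online courses.", "Education", ["edtech"], 700.0),
]

report = {field: coverage(logs, field) for field in ("description", "industry", "tags")}
avg_latency = sum(log.latency_ms for log in logs) / len(logs)
print(report, avg_latency)
```

Computing the same report before and after each change gives you the historical values to compare against.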

Another challenge is scaling—off-the-shelf AI models can get expensive and slow. This made Clearbit look into open-source models and fine-tuning.

Clearbit handles very large datasets, and processing them through API calls to closed models with rate limits could take months.

you can run OSS fine tuned models that will perform close to GPT performance for a fraction of the cost given you now pay per uptime vs tokens and you don't need to deal with rate limits necessarily. That said, OpenAI models are amazing and we use them a lot.
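The trade-off between paying per token under a rate limit and paying for uptime can be sketched as back-of-the-envelope arithmetic. Every number below is a hypothetical placeholder I picked for illustration, not Clearbit's or any vendor's real figure; swap in your own volumes, limits, and prices.

```python
# All values are assumptions for illustration only.
RECORDS = 50_000_000                   # records to process
REQUESTS_PER_MINUTE = 500              # assumed rate limit of a hosted API
TOKENS_PER_RECORD = 1_500              # assumed prompt + completion size
PRICE_PER_1K_TOKENS = 0.01             # assumed pay-per-token price, USD
GPU_HOURLY_RATE = 2.00                 # assumed self-hosted GPU price, USD/hour
SELF_HOSTED_RECORDS_PER_HOUR = 20_000  # assumed throughput of a fine-tuned model


def hosted_api_days(records, requests_per_minute):
    """Days needed if every record is one request and the rate limit is the bottleneck."""
    return records / requests_per_minute / 60 / 24


def hosted_api_cost(records, tokens_per_record, price_per_1k):
    """Pay-per-token cost in USD."""
    return records * tokens_per_record / 1000 * price_per_1k


def self_hosted_cost(records, records_per_hour, hourly_rate):
    """Pay-per-uptime cost in USD: hours of GPU time times the hourly rate."""
    return records / records_per_hour * hourly_rate


days = hosted_api_days(RECORDS, REQUESTS_PER_MINUTE)
api_cost = hosted_api_cost(RECORDS, TOKENS_PER_RECORD, PRICE_PER_1K_TOKENS)
oss_cost = self_hosted_cost(RECORDS, SELF_HOSTED_RECORDS_PER_HOUR, GPU_HOURLY_RATE)
print(f"hosted API: {days:.1f} days, ${api_cost:,.0f}")
print(f"self-hosted: ${oss_cost:,.0f}")
```

The sketch ignores engineering time, fine-tuning cost, and quality differences, all of which belong in a real comparison.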

Exploring or implementing AI in your company involves trying things, monitoring performance, and iterating over time. There is no ‘plug in AI here’ solution that works out of the box and is truly impactful.

“A lot of the AI world right now is new for many people but in the end the same engineering principles prevail, you wanna be curious, plan and test things.”

The tools Clearbit uses

Clearbit uses the following tools:

GitHub Copilot: An AI-powered coding assistant.
- Used by the engineering team for writing code
ChatGPT: A conversational AI model.
- Used by all teams for general AI use: asking questions, communication, analysis, etc.
Hugging Face: A platform for building, training, and deploying machine learning models.
- Used by the engineering team to explore fine-tuning
LlamaIndex: A framework for connecting LLMs to external data sources, improving their access to relevant information.
- Used by the engineering team for their RAG logic
LangChain: A framework for building LLM applications that combine prompts, models, tools, and retrieval.
- Used by the engineering team for experimentation
Weights and Biases: A platform for visualizing and tracking machine learning experiments.
- Used to track experiments
Pinecone: A vector database service for large-scale similarity search applications.
- Used for specific semantic search use cases

On ChatGPT usage:

the team overall across departments were pretty quick to adopt ChatGPT, we have used it for many things, from helping with sales, to helping analyze calls and support tickets.

João also put together a custom GPT for people on support rotations. It helps them debug issues, run log queries, and, if they get blocked, find out who on the team to contact about that specific bug.

On GitHub Copilot usage:

Copilot got more of an uneven adoption where it really clicked for a portion of our team but not for another, it became clear its quality varies depending on the language you are programming in and how the code is structured, that said it has had a big impact on how many on our team write code.

How AI has impacted work

AI is significantly impacting João’s work (and his life).

my wife jokingly (?) said GPT is now my new best friend because of how much time I spend talking with it.

It simplifies learning—it’s like having a dedicated teacher who can explain things in your preferred format or style, draw parallels to familiar topics, and allow you to not worry about asking ‘wrong’ questions.

João uses GitHub Copilot and now gets frustrated without it, as it’s profoundly impacted his workflow and changed how he codes.

The funny thing to me is how I've changed the way I code to leverage more comments and to strategically have files open in a way that Copilot can use their content when suggesting changes.

He also has about 10 AI agents working for him, handling tasks from social media posts to writing simpler code. This gives him back time and the agents can do stuff that he usually wouldn’t.

Because of how helpful these agents have been, João built a side project called CrewAI, a framework for orchestrating role-playing, autonomous AI agents.

I started putting agents together for myself back in October and since then have started automating my life away with other agents, grouping all my learning in this framework called crewAI

Having a mix of management and technical work means João isn’t scared of AI threatening his job, but he feels that engineering will change.

it has become increasingly clear that engineering as we know today is changing very quickly and it’s still unclear what that will look like a couple years from now.

What is clear to me is that a lot will get automated away, so being able to control these automations and steer these system will be a valuable skill, also being good at defining the “why” and “what” to build will gain an even higher relevance than the “how”.

We have way less control over many aspects of our life that most would like to believe. That said, there are things I think engineers and people managers, like myself, could do to take better advantage of these AI tools and prepare ourselves for the future, the main one being actually using them.

Implementing AI adoption in your team

Here are a few takeaways to encourage your team to explore and adopt AI:

Start with small, focused AI experiments
Demo early AI prototypes to get broader buy-in. Make it easy for other teams to see the potential
Form a cross-functional "AI tiger team" to explore applications
Provide access to leading AI models and tools like GPT-4, Copilot, etc. so teams can self-experiment
Develop in-house tools e.g. a custom GPT for team members to debug issues and answer questions
Implement guardrails to prevent misuse or overspending
Show the impact on business metrics to justify further investment
Celebrate "quick wins" from AI experiments to build momentum
Plan for the future by evaluating necessary skills and identifying opportunities to streamline workflows
Support adoption with training and support resources, e.g. outside experts, internal experts (the AI champion), courses, and educational material.

If you found this post valuable, share it with a friend, and consider subscribing. Feel free to suggest new topics for me to cover.

Cheers,
