If you’re an AE or SDR in compliance consulting or regtech, you know the drill. You have your target. You need qualified leads in financial services. An approach I’ve heard quite a lot recently is to catch companies early on in their journey that might not have the know-how to be directly authorised by the FCA or the solutions in place to stay compliant.
So the trick is to spot newly incorporated companies that are on the cusp of requiring FCA authorisation before your competitors find them. I’ve spoken to some reps who resort to painfully manual methods to find these companies, such as trawling Companies House one by one or spraying and praying at new company pages on LinkedIn.
Not scalable. Not repeatable. Definitely not fun.
But here’s a less painful alternative: use Companies House bulk data and some Python (don’t worry, I’ve written it for you) to find new UK firms poised to need regulatory support. No gimmicks, just data.
The hidden goldmine: Companies House bulk data
Companies House provides a regularly updated bulk dataset, absolutely free:
Companies House: Free Company Data Product
Fair warning, it’s nearly 500 MB zipped and expands into a hefty 2.8 GB CSV. But don’t panic—Python’s got your back.
Finding your ideal firms: targeted SIC codes
To zero in on potential FCA authorisation candidates, start by targeting specific Standard Industrial Classification (SIC) codes relevant to financial services. Think investment trusts, fund managers, brokerage firms—the good stuff.
Not sure where to start? Companies House has a searchable SIC list, or you could jump straight into Section K (Financial and insurance activities).
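If you would rather cast a wide net first, a small helper can flag anything in Section K, which covers SIC divisions 64 to 66. This is a minimal sketch, not part of the main script: the function names are made up for illustration, and it assumes each SIC value in the bulk file looks like "64301 - Activities of investment trusts" (code first, then a description).

```python
import re

# Section K (Financial and insurance activities) spans SIC divisions 64, 65 and 66
SECTION_K_DIVISIONS = ("64", "65", "66")

def sic_code(sic_text):
    """Return the leading 5-digit code from a SIC text value, or None if there isn't one.

    Assumes values shaped like '64301 - Activities of investment trusts'.
    """
    match = re.match(r"\s*(\d{5})", str(sic_text))
    return match.group(1) if match else None

def in_section_k(sic_text):
    """True when the value's code starts with a Section K division."""
    code = sic_code(sic_text)
    return code is not None and code.startswith(SECTION_K_DIVISIONS)

# Example call (hypothetical helper names)
print(in_section_k("64301 - Activities of investment trusts"))
```

Section K is broad, so treat this as a first pass and trim it down to the handful of codes that match the firms you actually sell to.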
Enough talk—show me the code
You’re going to need an environment to run this Python code in. I use Jupyter Notebook via the Anaconda Distribution. You could also get access to a similar environment via Google Colab. It’s beyond the scope of this guide to go into detail about getting an environment set up, but there are plenty of tutorials out there, and ChatGPT is ready and willing to be the world’s most patient guide.
Here’s your step-by-step for extracting the leads from the Companies House bulk data:
- Load your dataset into a Pandas DataFrame (think Excel but for large datasets):
import pandas as pd
from datetime import datetime
from dateutil.relativedelta import relativedelta
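# Read everything as text: avoids mixed-type warnings and keeps leading zeros in company numbers
df = pd.read_csv("BasicCompanyDataAsOneFile-2025-03-01.csv", dtype=str)
# Headers in the file may carry stray leading spaces, so tidy them up
df.columns = df.columns.str.strip()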
- Specify your SIC codes:
sic_list = ["64301", "64302"] # Example: Investment trusts and unit trusts
- Filter efficiently—no need to crunch the entire dataset if you can narrow it down early:
# Define your timeframe (e.g., last 6 months) - change the 6 to whatever you like
today = datetime.today()
x_months_ago = today - relativedelta(months=6)
# Speedy first filter: active UK firms incorporated within the timeframe
# (incorporation dates are DD/MM/YYYY strings, converted to datetime on the fly)
filtered_df = df[
    (df['CountryOfOrigin'] == 'United Kingdom') &
    (df['CompanyStatus'] == 'Active') &
    (pd.to_datetime(df['IncorporationDate'], format='%d/%m/%Y', errors='coerce') >= x_months_ago)
]
- Narrow down by SIC code:
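sic_cols = ['SICCode.SicText_1', 'SICCode.SicText_2', 'SICCode.SicText_3', 'SICCode.SicText_4']
# Compare the first five characters of each SIC value (the code itself) against your list
sic_match = filtered_df[sic_cols].apply(
    lambda col: col.astype(str).str.strip().str[:5].isin(sic_list)
).any(axis=1)
filtered_df = filtered_df[sic_match]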
- Review and export your targeted leads:
# Preview the first few results
print(filtered_df.head())
# How many leads did you uncover?
print(filtered_df.shape)
# Export for CRM import or further prospecting
filtered_df.to_csv("newly_incorporated_firms.csv", index=False)
Just like that, you’ve got a CSV with fresh leads including:
- Legal name
- Companies House registration number
- Full primary address
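If loading the whole 2.8 GB file in one go makes your laptop wheeze, one option is to stream it in chunks and keep only the rows you care about. The sketch below is a variation on the steps above, not a replacement for them. The function name is hypothetical, and the `CompanyName` and `CompanyNumber` column names are my assumption about the file's headers, so check them against your download. The in-memory sample at the end is only a stand-in for the real CSV.

```python
import io
import pandas as pd

# Columns to keep; CompanyName and CompanyNumber are assumed header names
KEEP_COLS = ["CompanyName", "CompanyNumber", "CompanyStatus",
             "CountryOfOrigin", "IncorporationDate"]

def load_recent_active_uk(source, cutoff, chunksize=200_000, keep_cols=KEEP_COLS):
    """Stream the CSV in chunks, keeping only active UK firms incorporated on or after cutoff."""
    kept = []
    reader = pd.read_csv(
        source,
        usecols=lambda c: c.strip() in keep_cols,  # tolerate stray spaces in headers
        dtype=str,                                 # keep leading zeros in company numbers
        chunksize=chunksize,
    )
    for chunk in reader:
        chunk.columns = chunk.columns.str.strip()
        incorporated = pd.to_datetime(
            chunk["IncorporationDate"], format="%d/%m/%Y", errors="coerce"
        )
        mask = (
            (chunk["CountryOfOrigin"] == "United Kingdom")
            & (chunk["CompanyStatus"] == "Active")
            & (incorporated >= cutoff)
        )
        if mask.any():
            kept.append(chunk[mask])
    if not kept:
        return pd.DataFrame(columns=keep_cols)
    return pd.concat(kept, ignore_index=True)

# Tiny in-memory stand-in for the real file, just to show the call shape
sample = io.StringIO(
    "CompanyName, CompanyNumber,CompanyStatus,CountryOfOrigin,IncorporationDate\n"
    "A LTD,00000001,Active,United Kingdom,01/02/2025\n"
    "B LTD,00000002,Dissolved,United Kingdom,01/02/2025\n"
    "C LTD,00000003,Active,United Kingdom,01/02/2010\n"
)
recent = load_recent_active_uk(sample, cutoff=pd.Timestamp("2024-09-01"), chunksize=2)
print(recent)
```

With the real file you would pass the CSV path instead of `sample` and add the SIC columns to `keep_cols` so the SIC filter still has something to work with.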
What’s next?
You’ve found the firms. Now, to avoid chasing ghosts, quickly cross-reference these with the FCA register to see if they’ve already got authorisation or are still ripe opportunities. Hackford can help streamline this part—we’ve already got the FCA data sorted, updated, and ready to roll.
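If you want to try the cross-reference yourself, the rough shape is a left join on the Companies House number. Treat this as a sketch under assumptions: the function name and the `companies_house_number` column are hypothetical, and it presumes you already have an extract of FCA-registered firms that includes a Companies House number for each firm. The two small DataFrames are made-up examples, not real firms.

```python
import pandas as pd

def split_by_fca_listing(leads, fca_firms,
                         leads_key="CompanyNumber",
                         fca_key="companies_house_number"):
    """Split leads into (not on the FCA extract, already on the FCA extract)."""
    def normalise(series):
        # Company numbers are 8 characters; pad any that lost their leading zeros
        return series.astype(str).str.strip().str.upper().str.zfill(8)

    leads = leads.assign(_key=normalise(leads[leads_key]))
    fca_keys = (
        normalise(fca_firms[fca_key].dropna())
        .drop_duplicates()
        .rename("_key")
        .to_frame()
    )
    # indicator=True adds a _merge column: 'both' means the firm appears in the FCA extract
    merged = leads.merge(fca_keys, on="_key", how="left", indicator=True)
    on_register = merged["_merge"] == "both"
    helper_cols = ["_key", "_merge"]
    return (
        merged[~on_register].drop(columns=helper_cols),
        merged[on_register].drop(columns=helper_cols),
    )

# Made-up example data to show the call shape
leads = pd.DataFrame({
    "CompanyName": ["A LTD", "B LTD"],
    "CompanyNumber": ["01234567", "SC123456"],
})
fca_extract = pd.DataFrame({"companies_house_number": ["1234567", None]})

open_leads, already_listed = split_by_fca_listing(leads, fca_extract)
print(open_leads)
```

A missing match only tells you the firm is absent from your extract, so a quick manual check on the register is still worth it before you pick up the phone.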
Less manual prospecting means more selling time. Happy hunting!