Beyond Compliance — How to Find Product-Market Fit for Data Startups

Gilbert Hill
Apr 6, 2023

I’ve been part of a UK government project scoping out data portability, personal data stores and payment rails, with the goal of unlocking the estimated 5% of GDP (roughly $150Bn a year) languishing in data silos.

It struck me that most of the products and projects offering such solutions in the Privacy Tech space were the same ones I had seen five years ago, and that many of them are still at the early startup/seed stage.

This isn’t a UK-specific observation. Spending time in Europe last year at gatherings celebrating new EU data laws, I saw a similar landscape: small projects and teams, some limited pilots and huge ambitions. They just needed the rest of the world to fall into step…

The challenge is that such ambition and novelty are often disliked by customers who are used to a certain way of doing things; “If I had asked people what they wanted, they would have said faster horses” (Henry Ford). How can we avoid the pitfalls and be among the 10% of start-ups who don’t fail, but scale?

Such thoughts took me back to the early days of my own dive into PrivTech; back in 2015 I was running a digital agency which built websites, mobile apps and data-driven marketing solutions.

To me and my team, cookies were simply tools, small snippets of code which ‘make the web work’ in terms of personalisation. We were alerted to the challenge around EU cookie compliance by our existing clients — they said, “this is a problem, and as you’re responsible for executing our web strategy, you need to do something about it!”

A problem can be a great place to start when building a new product — the one here being that a new compliance requirement was unleashed upon an out-of-control business practice (web tracking).

When this landed on our plate as developers, the product requirements — discover, delineate and disclose cookies — had been predefined. Teams of lawyers (or associates at professional service firms) had manually audited websites for cookies, and created some legal text generic enough to translate for local markets.

We picked up this process, did our own audits for clients, and could then rely on developers’ innate laziness (I mean this in a loving way) to build an automated crawler for the audit, of the kind we’re familiar with from search engines.
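
To make the idea concrete, here is a minimal sketch of what such an audit crawler might look like, assuming the Python `requests` library. It only captures cookies set over HTTP on the landing page, uses a crude domain check for “third party”, and is not the tool we actually built; a production crawler would also render JavaScript and follow internal links.

```python
# Minimal sketch of an automated cookie audit (illustrative, not the real crawler).
import requests

def audit_cookies(url: str) -> list[dict]:
    """Fetch a page and report the cookies the response sets."""
    session = requests.Session()
    session.get(url, timeout=10)
    findings = []
    for cookie in session.cookies:
        findings.append({
            "name": cookie.name,
            "domain": cookie.domain,
            # Crude heuristic: a cookie whose domain isn't part of the audited URL
            # is treated as third party.
            "third_party": not url.endswith(cookie.domain.lstrip(".")),
            "expires": cookie.expires,   # None means a session cookie
        })
    return findings

if __name__ == "__main__":
    for finding in audit_cookies("https://example.com"):
        print(finding)
```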

We found the same cookies popped up consistently in different contexts, and created our own taxonomy and risk profiles in collaboration with the marketing/UX teams who needed to integrate all this with their existing view of the adtech world.
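
The taxonomy itself can be as simple as a lookup table mapping known cookie names to a category and risk level, with anything unrecognised flagged for manual review. The categories and risk ratings below are illustrative examples, not the actual classification we built.

```python
# Illustrative cookie taxonomy — example categories and risk levels only.
COOKIE_TAXONOMY = {
    "_ga":       {"category": "analytics",          "risk": "medium"},
    "_gid":      {"category": "analytics",          "risk": "medium"},
    "IDE":       {"category": "advertising",        "risk": "high"},
    "PHPSESSID": {"category": "strictly necessary", "risk": "low"},
}

def classify(cookie_name: str) -> dict:
    # Fall back to "unknown" so new cookies surface for manual review.
    return COOKIE_TAXONOMY.get(cookie_name, {"category": "unknown", "risk": "review"})
```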

Soon we had audited hundreds of sites using the tool and came up with Cookiepedia, a sales funnel where you could enter the address of any website and get a free assessment of what cookies it ran, along with a rough idea of its risk exposure.

The software indexed these sites and grew our knowledge centre for cookies until it became a reference point for Google and others. Most importantly, it gave our potential customers what they needed most, information and insight into their new problem.

Cookiepedia design

The fact we were plugged into the ecosystem already as a supplier, but slightly down the food chain as ‘developers’ (someone who executes rather than sets strategy) meant we weren’t seen as a threat to other actors and our solution could spread quickly.

Offering a premium service with branding and localisation meant the phones started ringing and we quadrupled orders almost overnight; adding a turnkey ecommerce solution meant that by the time we sold the business to OneTrust, there were 150 international clients and a SaaS product which makes money while you sleep.

Like all stories involving success, ours benefited from more lucky timing than we admitted to, and the cleverness of the product was its lack of technical innovation. Our cookie solution, ironically, worked by dropping its own cookie: those few lines of JavaScript added to the code of an existing site left a frictionless, lightweight footprint on existing web estates and metrics operations.

When we designed our cookie product I never thought it would survive more than a few years before evolving into something closer to the GDPR principles of control and fairness. Instead, we’re stuck at the ‘dial-up’ stage of user experience for consumer control of data: the cookie banner, which has proliferated to the point where many sites are now unusable because of the real estate the banners take up and the forbidding complexity of interacting with them.

It’s been equally difficult to get new data on the how and why of tracking, to build on what we learned in the early days, so people haven’t been in a position to provide genuine, informed consent…

BE CAREFUL WHAT YOU WISH FOR…

We now all click on cookie banners to simply make them go away, without any real expectation that our wishes are being respected along the complex supply chains of consent, even if we’re aware of what’s going on with our personally identifiable info.

Moving from cookies to the various elements of the Enterprise Privacy Tech stack, their development has tended to follow the existing structure, needs and priorities of the actors I mentioned earlier.

While the demand for consent management, data mapping, DPIA workflows and privacy dashboarding fuelled practical technology development, it hasn’t led to innovation in terms of changing behaviour, at least not in the way that I and others had imagined.

We wanted to look ahead to the new opportunity for a killer product, to where we thought the puck was heading, and that’s where the risk is. You find yourself holding onto a rising balloon, outside a recognised role in the existing ecosystem — often an exposed and lonely place…

What attracted me was the largely untapped privacy product market for consumers and what are perceived as their top problems.

DMA research showed growing concern amongst consumers around online privacy and the ‘transaction’ that takes place when data is captured, shared and used. In particular, 78% of people believed businesses get the best deal from data exchange, and a majority felt they had lost control of their personal data.

Providing tools to reset the balance and put people back in the driving seat seemed the perfect product fit, and a MUCH larger Total Addressable Market — cookies were just the dress rehearsal for the real deal, or so I believed…

Launching Tapmydata as a startup, we had the opportunity, merch and some seed funding to build a consumer-focused PrivTech product and platform using Privacy by Design principles from the ground up, without legacy architecture or similar baggage.


We built a mobile app for citizens to discover which organisations hold records on them, make subject access requests, repatriate their data and store it securely in a wallet.

The other element was to build a secure channel for people and organisations to engage in dialogue around their data and rights. So all good, yes?

Well actually, the first problem here was that we needed to rethink our basic approach to building apps and systems. For technologists, developers and data entrepreneurs, the default has long been to capture all the data, store it and work out what to do with it later. We wanted the opposite, to stay true to our ideals and keep our own risk profile low.

With a clean sheet, we took the 7 principles of Privacy by Design and boiled them down to 2 simple operating practices…

1: Collect as little data as possible from the user in the first instance
2: Don’t hold anything personal that isn’t encrypted

So, the first challenge with our app — users generally need profiles, and user profiles need data. Common practice says we need to collect an email and password. This immediately violated principle 1 and also leaned towards violating principle 2.

Our solution? We generate a user profile on the fly, without your intervention: we assign you a random email address and create a super-secure password you’ll never know or need. You haven’t given us any data. You have an account. Your app can log in. We have no idea who you are. We don’t want your data!
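
A minimal sketch of that “zero-knowledge onboarding” idea is shown below; the helper name, placeholder domain and storage choices are illustrative, not Tapmydata’s actual code.

```python
# Sketch: the app generates credentials itself, so the user never hands over
# an email or password. Names and domain below are invented for illustration.
import secrets
import uuid

def create_anonymous_profile() -> dict:
    user_id = uuid.uuid4().hex
    return {
        "user_id": user_id,
        # A routable-looking but random address the user never sees or uses.
        "email": f"{user_id}@users.example.invalid",
        # A high-entropy password held only on the device (e.g. in its keychain).
        "password": secrets.token_urlsafe(32),
    }
```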

So having built this foundation, how to embrace new tech safely, in this case blockchain? After all, there has been a lot of contentious debate along the lines that blockchain and privacy don’t mix. But when we use blockchain not to store data, but rather as the record of data transactions, it starts to make sense.

We chose Stellar, which is an open-source, decentralised blockchain. It provides a public ledger that allows transactions between two or more parties. When a user makes a data request to an organisation, a record of this transaction (but no personal data) is recorded on the public ledger, and another when they respond.
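
In practice, what goes on the ledger is only a fingerprint of the request. The sketch below shows the idea under some assumptions: the field names are illustrative, and actually submitting the digest to Stellar (for example in a transaction memo via the Stellar SDK) is left out.

```python
# Sketch: hash the request metadata so the ledger proves the request happened
# without revealing what it contained. Field names are illustrative.
import hashlib
import json
import time

def request_fingerprint(requester_id: str, organisation: str, request_type: str) -> str:
    record = {
        "requester": requester_id,     # an opaque app identifier, not a name
        "organisation": organisation,
        "type": request_type,          # e.g. "subject_access_request"
        "timestamp": int(time.time()),
    }
    canonical = json.dumps(record, sort_keys=True).encode()
    # SHA-256 gives a 32-byte digest, the same size as a Stellar hash memo.
    return hashlib.sha256(canonical).hexdigest()

print(request_fingerprint("user-7f3a", "Example Org Ltd", "subject_access_request"))
```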

One current challenge with data governance is that each organisation effectively ‘marks their own homework’ in terms of responses to data subject rights. By using the blockchain for transacting the requests and responses (in hashed form), we could build up a central, permanent record which can be accessed and interrogated by third parties like regulators, without compromising privacy or security. We also set up a secure P2P channel for data dialogue between individuals and organisations, using a process of public key cryptography.

How Tapmydata uses public key cryptography
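
For the channel itself, the principle is standard public-key encryption: each side holds a keypair, only public keys are exchanged, and messages can only be read by the intended recipient. The sketch below uses the PyNaCl library as an assumed stand-in, not necessarily the library Tapmydata used.

```python
# Sketch of a public-key encrypted data dialogue, using PyNaCl (an assumption).
from nacl.public import PrivateKey, Box

# Each side generates a keypair; only public keys are ever exchanged.
citizen_key = PrivateKey.generate()
org_key = PrivateKey.generate()

# The citizen encrypts a message that only the organisation can read.
sending_box = Box(citizen_key, org_key.public_key)
ciphertext = sending_box.encrypt(b"Please confirm what data you hold on me.")

# The organisation decrypts with its private key and the citizen's public key.
receiving_box = Box(org_key, citizen_key.public_key)
print(receiving_box.decrypt(ciphertext).decode())
```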

Security now largely came down to key management. And what we discovered when our product met the real world is that key management is hard! With around 70 organisations using the channel on free trial, our largest overhead in terms of support was dealing with situations where they had lost their secure key. They found it hard to understand that we couldn’t tell them what their key passphrase was, had no admin password and certainly no ‘god view’ of users of the system!

Our answer to organisations’ frustration was (naively) that we all just need to sacrifice some convenience, at least in the short term, to do the right thing and move forward. We just needed time for prospective customers to get their heads round this new tech, which surely couldn’t be too long, right? What we found is yes, it did take a long time, and most organisations’ processes stayed the same, for good reason.

Our phones kept ringing but the people on the other end were pissed off, from all sides of the privacy and political spectrum. We got a lot of attention and some mega fans, but sadly that doesn’t pay the bills.

So while the innovative use of Privacy by Design and blockchain by talented developers like Tom Holder inspired me to lead Tapmydata, the real challenge was finding a viable business model which didn’t rely on data monetisation or government subsidy. It was hard to sell the value of better technology in a climate where companies are required only to provide a basic, but well understood level of service.

COVID provided new use cases for the tech we’d built with a compliance focus. Because we had a secure, dedicated channel for data dialogue, this lent itself well to being used for COVID check-in at venues — in particular, the Rugby Football Union and Church of Scotland became clients precisely because they handle a lot of children’s and special category data, and didn’t, in their own words, want to use a “system which had been knocked up opportunistically to handle people visiting the pub”.

Those organisations which took up the COVID service found it easy and convenient to then use the data rights solution, but were only willing to pay for the former. Unlike cookies, COVID check-in apps were not a long-term business opportunity — so what might be?

Until now, it’s been the case, firstly, that the burden of operational and technical compliance and best practice has fallen on each business to ensure they are up to scratch. This means a lot of small and medium-sized companies are either not in compliance or simply ignorant of their responsibilities.

Secondly, there is a lack of consumer-grade tech for people to exercise their choice and rights over their data. Emails, web forms and drop-box style methods are still the norm, which breaks the customer experience in the mobile space.

This is now starting to be addressed: as more core providers bake Privacy by Design into their tech stack, things like proper, granular consent management gain traction. There are efforts to reduce the 99 articles of GDPR to a single API, even a line of code for developers, or a smart contract in a blockchain context.
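
As a purely hypothetical illustration of that “one line of code” ambition — the consent-checking function and the stub data below are invented for this article, not a real library or product:

```python
# Hypothetical sketch: what a single consent-check call might look like for a developer.
def has_consent(user_id: str, purpose: str) -> bool:
    """Ask a (hypothetical) central consent service whether this purpose is permitted."""
    granted_purposes = {"user-7f3a": {"strictly_necessary", "analytics"}}  # stub data
    return purpose in granted_purposes.get(user_id, set())

if has_consent("user-7f3a", purpose="analytics"):
    print("load analytics tag")  # only fires when granular consent is recorded
```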

This will reduce the pressure on individual companies and grow the privacy stack, while consumers can start to make choices based on privacy and ‘sustainable data’. We’re seeing this with the likes of Brave, Signal, DuckDuckGo and Swash.

We’ve also seen the arrival of what I call ‘nuisance tech’ — attempts to bombard companies with requests for deletion or right to be forgotten as a prelude to legal action, or a cute pivot to providing tools for data controllers to manage them, but it’s a long game.

Privacy-enhancing technologies (PETs) are a wide field, designed to extract data value and unleash its full commercial, scientific and social potential without risk to privacy and security. In this area, pseudonymisation and anonymisation are techniques for de-risking data which already existed but became very popular post-GDPR. But there’s no silver bullet when it comes to data security: whatever measures you put in place, pseudonymous data can be re-identified precisely because pseudonymisation is a reversible process. Neither is data anonymisation a failsafe option.

However, both pseudonymisation and anonymisation have their uses, if implemented well. They should be used hand in hand with a data minimisation approach to both new and existing tech and processes.

Statistical techniques also play an increasing role: differential privacy protects individuals when organisations share aggregate information about them. It adds a layer of “statistical noise” to personally identifiable information (PII), which enables us to describe patterns of groups within a dataset while maintaining the privacy of individuals.
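
A minimal sketch of that “statistical noise” idea is the classic Laplace mechanism applied to a count query; the epsilon value below is an illustrative choice, not a recommendation.

```python
# Sketch of the Laplace mechanism for a differentially private count.
import random

def dp_count(true_count: int, epsilon: float = 0.5, sensitivity: float = 1.0) -> float:
    """Return a noisy count; one person joining or leaving shifts the true count by at most `sensitivity`."""
    scale = sensitivity / epsilon
    # The difference of two exponential samples is a Laplace(0, scale) sample.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_count + noise

# The published figure describes the group while blurring any individual's contribution.
print(f"Users who opted in: about {dp_count(1283):.0f}")
```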

If you are interested in privacy-enhancing technologies because you need to move your data into an environment where third-party users have access, generating synthetic data with the same statistical characteristics is a great option. This means the data can be put to collective use and yield new insights, without the owner’s efforts in gathering it being commercially devalued.
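
As a toy sketch of the idea, assuming NumPy: fit the simple statistics of a sensitive column and sample a synthetic one with the same mean and spread. Real synthetic-data tools model the joint distribution of many columns, not one column in isolation.

```python
# Toy sketch: synthetic column with the same mean and spread as the real one.
import numpy as np

rng = np.random.default_rng(42)
real_ages = np.array([23, 31, 35, 41, 44, 52, 58, 63])  # the sensitive column

synthetic_ages = rng.normal(loc=real_ages.mean(),   # same mean
                            scale=real_ages.std(),  # same spread
                            size=1000).round()

# Third parties can analyse the synthetic column without seeing any real record.
print(real_ages.mean(), synthetic_ages.mean())
```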

The EU wants to fund development in these technologies in its mission to rein in gatekeeper platforms’ use of proprietary data and create a level playing field, so regulation is certain to play a role in accelerating these trends and exposing new product opportunities.

So where does this leave us? In conclusion, the only other area I can think of with a similar level of self-belief, not yet backed up by user numbers and real-world adoption, is Web3.

Much of the opprobrium directed towards crypto and the Web3 space in recent times has centred on the ‘Wild West’ nature of its governance, or lack thereof. But the space was successful in attracting billions in funding and capturing the public imagination, in a way which has so far eluded Privacy Tech.

For every daft cartoon NFT and metaverse la-la-land there are Web3 products which fulfil a real, useful purpose, be that decentralised marketplaces for data, wallets, messaging and community apps, or oracles, which ingest data from real world sources like pricing into on-chain smart contracts.


Among the projects I’ve advised in this space, there was a real willingness to embrace the principles of Privacy by Design, even if they weren’t sure where to start or how to go about it. This is because such practices fit with the Web3 belief system, not because projects are obliged to by regulations (yet).

If we can help projects operationalise and scale privacy processes in a new area of growth and public recognition like Web3, maybe we’d get closer to the product-market fit and commercial scale that data privacy so clearly deserves.

What do you think? Let me know here or reach out…


Gilbert Hill

Privacy Technologist, Strategy, Policy & AI data governance, Senior Tutor @theidm, lapsed Archaeologist, SE London, bass & guitar muso