
Hands-on: Use Web Search Data to Build and Market Products Better

In this article, you will learn how to derive quantitative evidence for the unique value propositions of your product: understand the respective levels of demand and competition, and more.

Hey indiehackers! I will speak in the Product section of a large IT event in Russia. I took my presentation along with the speaker notes and laid it out as this post with slides and comments. The talk is 35 minutes long; the read will take you about 15.

The perspective: ‘if you’re a founder or a product person, this is how you can get demand and competition stats for your product from web search data.’ I’ve recently posted about demand from the Jobs-to-be-Done perspective, and this ended up as a hands-on follow-up 😌


New products often fail to deliver on expectations

Here’s a common problem: 72% of new products fail, and 42% of them fail due to a lack of demand.

I know right, if you’re a product person, you totally know this. I want to talk about this from a different perspective.

We can lower the failure rate by surveying the market to assess demand before launching a product or pouring budget into expanding its market share.


Customer Development may help

To discover and measure demand, we use CustDev

Product folks know that sociological research splits into qualitative and quantitative methods, yet professionals still often resort to qualitative methods alone to validate demand. This is wrong by design.


CustDev could help, yet everybody lies

In-depth interviews are subjective situations where respondents often provide wrong answers out of so-called shared thinking. Simply put, everybody lies.

People lie in interviews. Intentionally or not, they inject wrong information into your product development process.

There are ways to minimize the margin of error. Yet, if the qualitative foundation is built on wrong information, every next level multiplies the lies exponentially.


Digital traces are more honest than people

Digital traces are less likely to lie. Web search is one of the largest sources of digital traces. For decades, people have been telling search engines everything about themselves in the form of search queries.

Seth Stephens-Davidowitz also reflects on things like racism and sexism. If we look at facts from web search as pure data, those things are integral to our society.

Looking for ways to make passive income in 'digital shared thinking,' we encountered asking boyfriends for money. Is it sexist? Yes. However, it exists.

And, in a subjective in-depth interview situation, you won’t be given such an answer to a ‘ways-to-make-passive-income’ kind of question.


Digital traces? What? Like Googling things?

SERP looks like neither qualitative nor quantitative sociology.


While Google hides figures, others don't

Search engines hide most of the important figures. However, you can get to them using an instrument like Ahrefs. You give it a URL, ask for a report on organic keywords, and get the keywords associated with that URL.

Keyword attributes are well known by every SEO specialist out there, but not necessarily by Product specialists. We use Ahrefs to derive those, but you can use the tool you trust.

Keyword Volume stands for the approximate count of (arguably) unique monthly searches. Keyword Difficulty is an index representing the level of ‘organic’ competition. CPCs are cost-per-clicks in Paid Search and can be used to assess the level of paid competition.


Captain, can we fast forward, please

Organic keywords are a great source of evidence for your Customer Development.

'Economists and other social scientists are always hunting for new sources of data, so let me be blunt: I am now convinced that Google searches are the most important dataset ever collected on the human psyche.' Seth Stephens-Davidowitz.

Thing is, we can today build and employ datasets to help better design and market our products.


The tools of the trade

Two frameworks we'll use: Value Proposition Canvas and Awareness Ladder

We will use Value Proposition Canvas (or VPC) to form a semantic description of a product and its value propositions.

The awareness ladder will help us segment VPC items (audience) by the level of buying intent.


A closer look at the ladder

Awareness ladder is a hierarchical model of the human cognitive process. In practice, it allows segmenting audiences by the level of their buying intent.

One thing is to discover the volume of existing demand, and another is to understand the consumer behavior within this demand.

Employing the Awareness Ladder, we can assess to which extent consumers are ready to purchase a product depending on the value proposition we convey.


How will all this change my CustDev?

The process of Customer Development through web search data

The idea is to formulate a product-related search query as if it were asked at a certain level of the Awareness Ladder.

Then, it’s about fetching related keywords and attributing them to that query followed by analyzing the keyword attributes of a derived ‘cluster.’


Now, let's move step-by-step

Example company: SaaS platform automating cryptocurrency trading. Hypothesis: if we convey the value propositions to English-speaking audiences unaware of crypto, we’ll bring a bunch of new customers.

Let’s take the two Jobs-to-be-Done for a crypto trading platform:

  • make passive income
  • trade cryptocurrencies

While the latter is about a crypto-aware segment of the audience, the first one corresponds to a ‘higher-level’ job.

Simon Sinek once said that ‘if you focus on money, you make money; if you focus on impact, you make impact.’ That’s the case with our first Job-to-be-Done. It’s full of impact, and to get it done, one could employ an algotrading platform we’re talking about.


Search and discover

Search engines were created to answer human questions. But their answers are special: they are URLs with associated semantics that helps relate human questions to content.

We formed the search queries so that the ‘passive income’ one sits around the ‘solution aware’ step of the Awareness Ladder and the ‘crypto trading’ one refers to the ‘coldest’ audience asking their ‘what’.


Take URLs that best fit the queries

Google is built to provide the best answers first. Our goal is to collect SERP results in positions 1-5, counting organic results only: no paid or featured ones.

Depending on the volume of data we want to fetch, we can aim for wider SERP ranges. This will also reduce the factor of ‘putting the most SEO optimized things first.’
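A minimal sketch of that filtering step, assuming you've already collected the SERP into a list. The `SerpResult` shape and the "organic" / "paid" / "featured" labels are my own placeholders, not the output of any particular tool:

```python
from dataclasses import dataclass


@dataclass
class SerpResult:
    position: int  # rank on the results page, 1 = top
    url: str
    kind: str      # hypothetical labels: "organic", "paid", "featured"


def top_organic(results, max_results=5):
    """Drop paid and featured results, then keep the first max_results organic URLs."""
    organic = sorted(
        (r for r in results if r.kind == "organic"),
        key=lambda r: r.position,
    )
    return [r.url for r in organic[:max_results]]


# Illustrative data only, the URLs are made up.
serp = [
    SerpResult(1, "https://example.com/ad", "paid"),
    SerpResult(2, "https://example.com/snippet", "featured"),
    SerpResult(3, "https://example.com/guide-a", "organic"),
    SerpResult(4, "https://example.com/guide-b", "organic"),
    SerpResult(5, "https://example.com/guide-c", "organic"),
]

urls = top_organic(serp, max_results=2)
```

Raising `max_results` is the 'wider SERP range' knob: more data, less bias towards the most SEO-optimized pages.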


Fetch organic keywords for the URLs

Let's discover keywords associated with each URL. While each URL has its unique metadata, we need something that aggregates behaviors of many search subjects at a time: keywords are a good fit.

At this step, we put each collected URL into the Ahrefs Site Explorer, derive the related keywords, and export them as CSVs to use with Google Spreadsheets.
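If you'd rather read the exports with a script than a spreadsheet, here's a rough sketch. The column names (`Keyword`, `Volume`, `KD`, `CPC`, `URL`) are an assumption for illustration; check the header row of your own export and adjust, since the real file may name or encode things differently:

```python
import csv
import io

# Stand-in for an exported file; the header names are assumed, the rows are made up.
SAMPLE_EXPORT = """Keyword,Volume,KD,CPC,URL
Passive Income Ideas,1000,40,1.50,https://example.com/guide-a
how to invest,,55,,https://example.com/guide-a
"""


def to_number(value, default=0.0):
    """Convert a CSV cell to float, falling back to default for empty cells."""
    value = (value or "").strip()
    return float(value) if value else default


def load_keywords(file_obj):
    """Read keyword rows into plain dicts with normalized keys and numeric attributes."""
    rows = []
    for row in csv.DictReader(file_obj):
        rows.append({
            "keyword": row["Keyword"].strip().lower(),
            "volume": to_number(row["Volume"]),
            "kd": to_number(row["KD"]),
            "cpc": to_number(row["CPC"]),
            "url": row["URL"].strip(),
        })
    return rows


keywords = load_keywords(io.StringIO(SAMPLE_EXPORT))
```

Note that empty cells become 0 here, which drags averages down; you may prefer to skip such rows instead.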

While the whole pipeline we’re discussing can be automated, it’s important to describe the step-by-step manual process, so that every product owner could run the manual assessment.


Match keywords and queries: build clusters

Combining VPC-based queries and keywords. Query: passive income guide; let’s take a look at the keyword cluster.

We used Google Spreadsheets and VLOOKUP, with URLs acting as the index, to build the clusters.
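The same lookup can be sketched in a few lines of Python: a dict from URL to query plays the role of the VLOOKUP table. All names and rows below are made up for illustration:

```python
from collections import defaultdict

# Which query each collected URL was found for (the lookup table).
query_by_url = {
    "https://example.com/guide-a": "passive income guide",
    "https://example.com/crypto-101": "what is crypto trading",
}

# Keyword rows as they come out of the per-URL exports.
keyword_rows = [
    {"keyword": "passive income ideas", "url": "https://example.com/guide-a"},
    {"keyword": "income generating assets", "url": "https://example.com/guide-a"},
    {"keyword": "what is crypto", "url": "https://example.com/crypto-101"},
    {"keyword": "unrelated keyword", "url": "https://example.com/other"},
]


def build_clusters(rows, lookup):
    """Group keywords by the query whose SERP produced their URL."""
    clusters = defaultdict(list)
    for row in rows:
        query = lookup.get(row["url"])
        if query is not None:  # rows with unknown URLs are left out
            clusters[query].append(row["keyword"])
    return dict(clusters)


clusters = build_clusters(keyword_rows, query_by_url)
```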

We see that in the ‘digital shared thinking’ passive income is associated with investments, ease of getting the income, and income-generating assets.

Another good thing is that there’s no direct sign of the crypto industry: its penetration into the passive income space is fairly low.

However, the concepts positioned in the minds of this web search audience segment correlate well with the value propositions of a crypto trading platform. That makes the ‘passive income’ job worth looking into to acquire new audiences.


What's in the crypto land?

Combining VPC-based queries and keywords. Query: what is crypto trading; let’s take a look at the keyword cluster.

When it comes to the ‘crypto trading’ job, we immediately see the presence of branded keywords in the cluster.

From the perspective of Latent Semantic Indexing, this would mean that within the web search context, the Binance brand is connected with the very idea of crypto trading.

This indicates a higher overall level of competition with what-level keywords having their keyword difficulty rating at over 80 and CPCs way above zero.

It means that other brands on the market are already paying for ‘cold’ search traffic, meaning that they’ve already paid for ‘warmer’ audiences.


A conclusion from the two data pieces

Competition and market insights we can get looking at the data.


What else is there?

How can we use the derived evidence: 1) assess the levels of demand and competition; 2) understand the exact wording audience uses: fetch the context of ‘real shared thinking’; 3) the fetched keywords are a naturally filtered semantic kernel.

Note, the cluster related to the ‘passive income’ job also has lower average keyword difficulties compared to the ‘trade crypto’ cluster. It’s also a signal of a generally lower level of competition in the segment.
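Here's a small sketch of how such a cluster comparison could be computed. The numbers are invented purely to show the mechanics (they are not the figures from the talk), and the field names are my own:

```python
from statistics import mean


def cluster_stats(rows):
    """Summarize one keyword cluster: size, total volume, average KD and CPC."""
    if not rows:
        raise ValueError("cluster is empty")
    return {
        "keywords": len(rows),
        "total_volume": sum(r["volume"] for r in rows),
        "avg_kd": mean(r["kd"] for r in rows),
        "avg_cpc": mean(r["cpc"] for r in rows),
    }


# Made-up illustrative rows.
passive_income = [
    {"volume": 1000.0, "kd": 20.0, "cpc": 0.5},
    {"volume": 3000.0, "kd": 30.0, "cpc": 1.5},
]
crypto_trading = [
    {"volume": 5000.0, "kd": 82.0, "cpc": 3.0},
    {"volume": 7000.0, "kd": 88.0, "cpc": 5.0},
]

passive_stats = cluster_stats(passive_income)
crypto_stats = cluster_stats(crypto_trading)
less_competitive = passive_stats["avg_kd"] < crypto_stats["avg_kd"]
```

Total volume speaks to demand, while average KD and CPC speak to organic and paid competition respectively.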

To assess the demand for a Job-to-be-Done in full, we should look at the aggregate stats for all the levels of the Awareness Ladder.

To go for the demand, a crypto trading platform could roll out a dedicated landing landscape to test this acquisition hypothesis.

It brings us to the point where you’d need a budget to really test the hypothesis. Yet there’s a ‘lean’ version too: run tests on the copy of your main and product pages.

To set up the experiment properly, you’d need a testing framework (a spreadsheet would do) and Google Optimize.


Yeah, the trading platform did use all this

Crypto SaaS applied the evidence to build a new information product based on expert content. The product is useful as both TOFU and MOFU content: it works to better acquire and retain customers.

The case of the crypto trading platform we discussed is about rolling out a vast landing landscape for search traffic. I’ll update you on how this turned out exactly.


How many 'jobs' do I really need?

Two Jobs-to-be-Done aren’t enough. The minimum acceptable VPC contains about 24 elements. Then, it’s still doable by hand.

Laying out aggregate statistics is useful to see the bigger picture.

Here, we see that large brands are simply buying the ‘most aware’ traffic with the highest level of buying intent.

We also see that the organic difficulty there is lower than the average across the entire spectrum of awareness levels. That’s low-hanging fruit for content and SEO efforts.
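To make the 'lower than average' reading concrete, here's a hedged sketch that groups keyword rows by awareness level and flags the levels whose average difficulty sits below the overall average. The level labels and every number are assumptions for illustration only:

```python
from collections import defaultdict
from statistics import mean

# Each row is a keyword tagged with the awareness level of the query it came from.
# Labels and numbers are made up.
rows = [
    {"level": "problem aware", "kd": 60.0, "cpc": 0.5},
    {"level": "solution aware", "kd": 70.0, "cpc": 1.0},
    {"level": "most aware", "kd": 20.0, "cpc": 6.0},
    {"level": "most aware", "kd": 30.0, "cpc": 4.0},
]


def stats_by_level(rows):
    """Average KD and CPC per awareness level."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["level"]].append(row)
    return {
        level: {
            "avg_kd": mean(r["kd"] for r in items),
            "avg_cpc": mean(r["cpc"] for r in items),
        }
        for level, items in grouped.items()
    }


def below_average_kd(rows):
    """Levels whose average KD is below the average over all keywords."""
    overall = mean(r["kd"] for r in rows)
    per_level = stats_by_level(rows)
    return sorted(lvl for lvl, s in per_level.items() if s["avg_kd"] < overall)


per_level = stats_by_level(rows)
easy_levels = below_average_kd(rows)
```

In this toy data the 'most aware' level combines a high average CPC with a below-average KD, which is the pattern described above: paid competition is there, organic competition is comparatively weak.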


What if I got many VPC items?

Devil is in the details: 1) if your VPC holds tens of items, you’re entering the field of Big Data; 2) to further filter the semantics, you will need to employ Natural Language Processing algorithms; 3) to compute all that, you’ll need infrastructure and $$$.

There’s a catch though. Once your VPC contains over 24 elements, it’s painful to process everything by hand.

Also, what we compiled is a semantic kernel. And semantic kernels love to be filtered.

When you operate in the ‘sense’ domain, filtering involves heavy use of Natural Language Processing to handle the semantics.


Deeper down the rabbit hole

A bit of magic: discover keywords that will bring qualified traffic to your product.

There’s a next step.

Having your domain's semantic kernel, stats from product analytics, and the 'product semantic kernel' we're talking about, you can discover keywords that will bring qualified traffic to your product.

The traffic that propagates along your product funnel the same or better way than the existing one.


Forecasting keyword purchasing power

A bit of magic 2: define your product-market fit by combining Scalekarma's and your own data.

That part is optional and involves some coding. The approach is to:

  • Understand which keywords bring in target customer groups.
  • Employ semantic similarity algorithms to discover keywords (in the derived dataset) that are closely related to the ones that best work for you.
  • Either optimize your existing organic landing landscape for the discovered keywords or deploy new content pages.
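A toy sketch of the second step. To keep it dependency-free, I'm swapping in plain word overlap (Jaccard similarity) as a stand-in for a real semantic similarity algorithm; in practice you'd likely want embeddings or another NLP model. Function names and sample keywords are hypothetical:

```python
def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; a crude proxy for semantic closeness."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def similar_keywords(seeds, candidates, threshold=0.5):
    """Candidates whose best similarity to any seed keyword reaches the threshold.

    Assumes seeds is non-empty. Returns (keyword, score) pairs, best first.
    """
    scored = []
    for cand in candidates:
        best = max(jaccard(cand, seed) for seed in seeds)
        if best >= threshold and cand not in seeds:
            scored.append((cand, best))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Seeds: keywords that already bring in your target customers (made up here).
seeds = ["crypto trading bot"]
# Candidates: keywords from the derived dataset (made up here).
candidates = ["best crypto trading bot", "crypto trading", "passive income ideas"]

matches = similar_keywords(seeds, candidates)
```

Word overlap misses synonyms ('earn' vs. 'make'), which is exactly where proper semantic models pay off.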

Qualifying traffic based on the combination of your 'old' data and 'new data' is about forecasting the purchasing power of newly acquired traffic and enhancing your product and marketing strategies.


That's it, folks

May the force be with you!

I’m thinking of gathering a group of 5-6 founders or product specialists to set up a practice session. Hit me up in the comments if you're in.


To dive deeper into the methodology, here’s a whitepaper in PDF 🧘

on May 6, 2021
