Measurly


Understanding your customers is key to business success. But let’s face it, talking to each individual customer can be a challenge when you’ve got a customer base of more than just a few people and are focused on growth. That’s where user profiling comes in – grouping your customers into meaningful categories so you can understand them better. There are two main approaches to clustering users: a top-down approach and a bottom-up approach. In this article, we’ll explore the pros and cons of each approach and how you can use them to better understand your customers.

 

The Top-Down Approach: Personas

 

One popular method for grouping users is creating personas. These are fictional characters that represent different types of users, complete with descriptions and demographics such as age, income, and marital status. The idea is to use UX research, including in-depth interviews with a small group of users, to create a clear picture of who your users are and why they use your product.

 

The advantage of this approach is that it allows you to develop empathy for your users and understand their problems and perspectives. But let’s be real, there are several drawbacks to starting with personas.

 

  • Small Sample: It’s tough to know if the groups you create are representative of your entire user base when you’re only talking to a small number of users. 
  • Customer Distribution: It’s challenging to know how many of your users fall into each group when you create personas. Are there equal numbers in each group, or is one group much larger than the others? Without this information, it’s difficult to make informed product and marketing decisions. 
  • Confirmation Bias: There’s a risk of confirmation bias when creating personas, as the small sample size can be influenced by what the company believes to be its target market. Even skilled UX researchers may struggle to avoid this bias, and it can be even harder for senior management to do so. It’s like trying to convince your grandma that the moon landing was real – no matter how much evidence you present, she’s still going to believe it was faked.

The Bottom-Up Approach: Unsupervised Learning

 

An alternative approach is unsupervised learning, which uses the data your users generate to find patterns and sort users into distinct groups, without you defining those groups up front. One of the most common methods for this kind of clustering is K-means. It’s like a giant sorting machine that organizes your customers into neat little categories.

 

This approach has the advantage of using your entire user base to create groups, rather than relying on a small sample. It also avoids the distribution problem: because every user is assigned to a group, you can simply count how many users fall into each one.
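To make the sorting machine a little more concrete, here is a minimal sketch of K-means in plain Python. The seven users and their two features (already rescaled to 0-1) are invented for illustration, and seeding the centroids with the first k rows is a simplification to keep the run repeatable; in practice you would more likely reach for a library such as scikit-learn.

```python
import math
from collections import Counter

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded with the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each user joins the nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels

# Hypothetical users: (visit frequency, spend), both already scaled to 0-1.
users = [
    (0.10, 0.20), (0.90, 0.80), (0.15, 0.10), (0.20, 0.15),
    (0.85, 0.90), (0.05, 0.25), (0.12, 0.18),
]

labels = kmeans(users, k=2)
sizes = Counter(labels)  # how many users landed in each group
print(sizes)
```

Because every user gets a label, the group sizes fall straight out of the result, which is exactly the distribution information personas cannot give you.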

 

However, it’s important to be mindful that unsupervised learning is not immune to confirmation bias. The data you feed into the model will influence the groups that are created, so it’s important to carefully consider the data points you are using to ensure that they accurately reflect the diversity of your customer base. For example, if you only use data from customers who have booked services in the past, you may end up with a biased view of your customer base that does not accurately represent the preferences and behaviors of those who have not yet booked a service.

 

Combining Top-Down and Bottom-Up Approaches

 

If you have the resources to do both UX research and K-means clustering, combining the two approaches can be extremely beneficial to understanding your customers. The UX research will give you a deep understanding of your users’ motivations and needs, while the K-means clustering will provide a more comprehensive view of your entire user base.

 

Advantages of Combining Top-Down and Bottom-Up Approaches:

 

  • Representative Sample: Clustering shows how large each group really is, so you can check whether the users you interviewed reflect your wider user base and recruit from any groups you missed.
  • Better Decision-Making: With a more complete understanding of your users, you’ll be able to make more informed product and marketing decisions. 
  • Increased Empathy: The top-down approach can help you develop empathy for your users, while the bottom-up approach gives you a more comprehensive view of your entire user base. Combining the two approaches can give you an even deeper understanding of your users and their needs. 

Other Considerations

 

There are a few other things to consider when it comes to user clustering:

 

  • Dynamic Groups: Your user groups may change over time, so it’s important to regularly review and update your clusters to ensure they accurately represent your current user base. 
  • User Journeys: Don’t forget that your users may move between different groups over time. For example, a customer who starts out using your product for one reason may eventually switch to using it for a different purpose. It’s important to consider these user journeys when creating clusters. 
  • Multiple Segments: You may find that you have multiple groups of users with different needs and behaviors. It’s important to create clusters for each of these segments to better understand and cater to their unique needs. 
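
One lightweight way to keep an eye on changing groups and user journeys is to compare each user’s segment from one period to the next. The sketch below assumes you already have a segment label per user for two periods; the user IDs and segment names are made up for illustration.

```python
from collections import Counter

# Hypothetical segment labels per user for two consecutive quarters.
last_quarter = {"u1": "occasional", "u2": "occasional", "u3": "loyal", "u4": "loyal"}
this_quarter = {"u1": "loyal", "u2": "occasional", "u3": "loyal", "u5": "occasional"}

def segment_transitions(before, after):
    """Count (old segment, new segment) pairs for users present in both periods."""
    shared_users = before.keys() & after.keys()
    return Counter((before[user], after[user]) for user in shared_users)

transitions = segment_transitions(last_quarter, this_quarter)
for (old, new), count in sorted(transitions.items()):
    print(f"{old} -> {new}: {count}")
```

Users who appear in only one period (here u4 and u5) are left out of the counts; they are churned or new users and are worth tracking separately.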

How to Segment Using Unsupervised Learning: 

 

If you want to dive deeper into creating your first customer segment clusters using unsupervised learning, below is a gentle introduction:

1. Creating a Strong Data Set

 

For any algorithmic solution, data is crucial. A simple model with highly relevant features will often outperform a complex neural network fed generic data. It’s like cooking a great meal: it doesn’t matter if you have the best pots and pans from Harrods; if the ingredients are poor, the dish will taste bad.

 

When building a data set or designing a brief for an analyst, it’s important to consider the following:

 

  • What is the goal of creating these segments? The metrics should align with the intended use case. For example, if the goal is to boost acquisition, the metrics should focus on data prior to the first sale.
  • Are you including metrics that duplicate each other? For example, cancellation rate and completion rate are often mirror images, so adding both to a model contributes little new information and can give that one behavior extra weight.
  • Time can be a valuable source of insights. Instead of just using conversion rate, consider enhancing it with metrics such as time to complete the first purchase or time between the first and second purchase.
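
As a rough illustration of the time-based metrics in the last point, the sketch below derives two of them from a sign-up table and an order log. The customer names, dates, and field names are all hypothetical; your own data will almost certainly be shaped differently.

```python
from datetime import date

# Hypothetical sign-up dates and order log.
signups = {"ana": date(2023, 1, 1), "ben": date(2023, 1, 5), "cy": date(2023, 1, 9)}
orders = [
    ("ana", date(2023, 1, 3)),
    ("ben", date(2023, 2, 4)),
    ("ana", date(2023, 1, 10)),
]

def time_features(signups, orders):
    """Days from sign-up to first purchase, and from first to second purchase."""
    purchase_days = {}
    for customer, day in orders:
        purchase_days.setdefault(customer, []).append(day)

    features = {}
    for customer, signup in signups.items():
        days = sorted(purchase_days.get(customer, []))
        features[customer] = {
            # None marks a metric that does not exist yet for this customer.
            "days_to_first_purchase": (days[0] - signup).days if days else None,
            "days_first_to_second": (days[1] - days[0]).days if len(days) > 1 else None,
        }
    return features

features = time_features(signups, orders)
for customer, row in features.items():
    print(customer, row)
```

How you treat the missing values matters: a customer who has not bought yet is different from one who bought instantly, so filling the gaps with zero would mislead the model.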

2. Determining the Optimal Number of Clusters

Ask your analyst how they arrived at the number of clusters. It’s important to find the sweet spot: too many clusters and it’s just going to be noise, too few and you might miss important insights. A good approach is to use the silhouette score or the elbow method to choose a suitable number of clusters. If neither is used, it’s just a guessing game and your clusters may not be tailored to your business needs. You don’t want to be playing pin the tail on the donkey with your data.
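
For a feel of what the silhouette check involves, here is a small sketch in plain Python. It runs a bare-bones K-means for several candidate cluster counts and keeps the one with the highest average silhouette score. The twelve data points are invented (three obvious blobs), and seeding with the first k points is a shortcut for repeatability; an analyst would normally use a library implementation.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm, seeded with the first k points for repeatability."""
    centroids = [list(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels

def silhouette(points, labels):
    """Average silhouette: values near 1 mean tight, well-separated clusters."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a single-member cluster scores 0 by convention
        # a: average distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: average distance to the members of the nearest other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        total += (b - a) / max(a, b)
    return total / len(points)

# Invented data: three visibly separate groups, interleaved so the seeds differ.
points = [
    (0, 0), (10, 10), (20, 0), (0, 1), (10, 11), (20, 1),
    (1, 0), (11, 10), (21, 0), (1, 1), (11, 11), (21, 1),
]

scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On real customer data the peak is rarely this clear-cut, so treat the score as one input alongside whether the resulting groups make business sense.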

 

3. Standardizing your Metrics

Once you have a bunch of different metrics in your data set, it’s important to make sure they’re all on the same playing field. For example, let’s say you’re grouping soccer players into clusters and you’ve included each player’s number of goals and age. If you don’t rescale these metrics to a common range, such as 0.0 to 1.0, age will be given more weight than goals because ages are larger numbers that span a wider range, while goal counts in soccer are usually small. So make sure to standardize your metrics to avoid any weird biases creeping in.
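
Rescaling to the 0.0 to 1.0 range is usually called min-max scaling, and it takes only a few lines. The player ages and goal counts below are invented for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every value is identical, so the metric carries no information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical soccer players.
ages = [19, 24, 31, 35]
goals = [2, 12, 7, 0]

scaled_ages = min_max_scale(ages)    # [0.0, 0.3125, 0.75, 1.0]
scaled_goals = min_max_scale(goals)

print(scaled_ages)
print(scaled_goals)
```

After scaling, the oldest player and the top scorer both sit at 1.0, so neither metric dominates the distance calculation just because of its units.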

 

4. Choosing the Right Model 

When it comes to unsupervised learning, there are a ton of different models to choose from. But the key is to pick one that’s as simple as possible. Sure, it might be tempting to go for the latest and greatest neural network, but sometimes a simple, well-understood model will do the job just fine. And while it’s fun for data scientists to try out the cutting-edge stuff, keep in mind that more complex models can be tough to scale and maintain.

 

Take Uber’s CRM department, for example. They still use a basic K-means model, so if your team is recommending something more advanced like DBSCAN or a Gaussian mixture model, make sure to ask them why they think it’s necessary.

 

Wrap Up

Understanding your customers is crucial to the success of your business. While personas can be a useful tool for developing empathy and understanding user motivations, they have limitations when it comes to scalability and representative samples. 

Unsupervised learning, such as K-means clustering, can provide a more comprehensive view of your user base, but it’s important to be aware of the risks of confirmation bias. Combining the two approaches can be a powerful way to get a well-rounded understanding of your users, but it’s also important to regularly review and update your user clusters to ensure they accurately reflect your current user base. It’s like a never-ending puzzle that you’re always trying to solve, but with the right tools and approaches, you can get closer to understanding your customers and meeting their needs.

If you’re interested in applying customer segmentation, reach out to us to learn more.