5
0 Comments

Guide to model training: Part 3 — Estimating your missing data

TLDR

Oftentimes when collecting consumer data, there are times when you’re unable to retrieve all the data. Instead of having a lack of data ruin your results, you’ll want to “guestimate” what the data should be.

Outline

  • Recap
  • Before we begin
  • What does impute mean?
  • 3 ways to impute
  • Impute in Pandas
  • Next step

Recap

In the last section, we completed scaling categorical and numerical data so that all of our data is scaled properly. The higher ups want a list of past customers to target for our sales campaign, so we’re given new data that shows the history of how past customers interacted with our past 4 promotional emails.

<center>Image descriptionBig sales are coming soon!</center>

Using the new data, our goal is to build a model for the remarketing campaign. There’s just one small problem. Code embedded in the marketing campaign email contained bugs, leaving us unable to identify what actions the people who clicked the email took. The bug occurs every 5 emails, but was patched by the 2nd wave of emails. In this section, we’ll go over imputing, a technique used to fill in unknown results.

Read more...

Trending on Indie Hackers
I talked to 8 SaaS founders, these are the most common SaaS tools they use 20 comments What are your cold outreach conversion rates? Top 3 Metrics And Benchmarks To Track 19 comments How I Sourced 60% of Customers From Linkedin, Organically 12 comments Hero Section Copywriting Framework that Converts 3x 12 comments Promptzone - first-of-its-kind social media platform dedicated to all things AI. 8 comments How to create a rating system with Tailwind CSS and Alpinejs 7 comments