
Guide to model training: Part 3 — Estimating your missing data

TLDR

When collecting consumer data, you often can’t retrieve all of it. Instead of letting the missing data ruin your results, you’ll want to “guesstimate” what the data should be.

Outline

  • Recap
  • Before we begin
  • What does impute mean?
  • 3 ways to impute
  • Impute in Pandas
  • Next step

Recap

In the last section, we finished scaling our categorical and numerical data. The higher-ups want a list of past customers to target for our sales campaign, so we’ve been given new data showing how past customers interacted with our last 4 promotional emails.

<center>Big sales are coming soon!</center>

Using the new data, our goal is to build a model for the remarketing campaign. There’s just one small problem: the code embedded in the marketing campaign emails contained a bug, leaving us unable to identify what actions people took after clicking an email. The bug affected one in every 5 emails, but it was patched by the time the 2nd wave of emails went out. In this section, we’ll go over imputing, a technique used to fill in missing values.
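To make the idea concrete, here is a minimal sketch of imputing in Pandas. The table, the column names (`customer_id`, `action`, `action_imputed`) and the values are hypothetical and made up for illustration. Filling with the most frequent value is only one simple strategy, not necessarily the one used later in this series.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the action each customer took after clicking an email.
# NaN marks the row where the tracking bug lost the result.
emails = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "action": ["purchase", "browse", "browse", "unsubscribe", np.nan],
})

# Find the most frequent known action (mode() ignores NaN by default).
most_common = emails["action"].mode()[0]

# Fill the missing entries with that value, keeping the original column intact.
emails["action_imputed"] = emails["action"].fillna(most_common)

print(emails)
```

Writing the result to a separate column keeps the original gaps visible, so you can still tell which rows were observed and which were estimated.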


Posted to Developers on November 19, 2021