September 9, 2020

I used python to upgrade my Substack Analytics

stusim

Right now Substack only offers basic analytics.

I'm now 9 weeks into my free weekly newsletter where I cover 114 Tech ETFs at https://finlister.substack.com

I thought I would start giving my analytics an upgrade with my recently acquired Python skills to see if I had a loyal following of readers, beyond an average open rate of 45%, on 219 subscribers.

The first step is going to the bottom of the settings page of your publication and exporting your settings.

substack

Your folder will contain 4 files for each newsletter you've sent, in addition to some main files plus a couple of other files.

Not all of them are needed.

substack

I load up the main "posts" file which gives a summary information of all the newsletter posts which you've sent out.

Here I want to isolate the "post_id"

substack

I don't want all posts. Just the ones which I've actually sent.

So I need to filter out posts without a subtitle (which are likely default posts when you make a newsletter) as well as anything which hasn't been published.

This leaves me with my 7 sent emails.

substack

With the correct post_id's, I can then access all of the files with that post_id which also include the word "opens".

But there's a problem. As the logs show multiple email opens, sometimes, very close together from the same person...which probably should count as just one view

substack

I want to modify my data to only include the first open from a user (for now). I could simply strip out the email list from this list and clear duplicates, but then I lose out on meta data like timestamps of the open.

In this case, I'm actually saving the time (in a time format)

substack

I then loop over all the "opens" files and save all of that data to the same table/sheet (or Dataframe in python).

This now gives me a single sheet of everyone who has opened at least 1 of my emails, as well as which specific email they opened and when.

substack

substack

Now that I have all the "open" info. I have to see who actually got which emails.

Maybe the person who opened 5 emails, only has received 5 emails.

Given everyone only gets sent 1 email. All I have to do is loop over all the "delivered" files and record emails.

substack

From there, I can merge the "delivered" table with the "opened" table.

And reformat that into a simple 4 column table of, posts opens, sent, and perform a quick calculation to get open rate.

substack

I can do another reformat and see how the different openrates per reader.

Finding out that 63 readers have read all the issues that they have received.

...somewhat shockingly 57 have never opened one they have received.

substack

Right now, the cohort which is performing best all signed up in week 2 of the newsletter, and have an average open rate of 76%...but most likely because it's only 5 readers!

I'm excited to see how the cohort from two week ago performs as it's my largest cohort.

substack

Just scratching the surface on this. Have some good thoughts which I'll continue on over the next few days.

I have a lot of info, now my job is to keep tracking and improving. Reach out to customers who haven't opened...thank those that read a lot etc etc...

  1. 2

    Great tutorial! I imagine there's an unmet need for Substack analytics, if you ever get bored of writing..

    1. 2

      I figure if you spend a decent amount of time writing, and someone doesn't end up reading the email. Being able to send that email again with a friendly reminder to them is worth it. Right now, very hard to do in substack.

  2. 2

    $Tsla not looking so add after yesterday's close.

    Add finlister to Newslet (https://newslet.webflow.io). Building the world's newsletter library!

    1. 1

      Stocks don't always go up ;)

      But $TSLA is on a helluva ride so far this year!

Recommended Posts