ETL Data Pipelines as a Productized Service?

I’m a full-time software engineer looking to get into freelancing, but I’d like to build out a productized service instead of just selling my time at an hourly rate.

I’ve been creating data pipelines in various forms for the better part of a decade, so it’s something I know really well. Recently I’ve been building out pipelines from common 3rd-party services, such as Stripe, using their REST APIs to transfer the data into a relational database for easier analysis. I’m mostly doing this for my own side projects (so not getting paid) or as an employee (so no good sense of the $ value).

The process for building these pipelines looks similar for most APIs, and would be almost exactly the same for different companies - basically just swapping out credentials and provisioning new resources (server, db, etc), but using the same code.

I think that I can refine this to the point where, for example, if someone wants all of their Stripe data (or other 3rd-party API) piped into a relational database and kept current on a regular basis, I could have it up and running within a day or two vs. having an engineer build it from scratch over the course of some number of weeks or months.
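To make the "same code, different credentials" idea concrete, here is a minimal sketch, assuming a hypothetical `PipelineConfig` and an injected `fetch` function standing in for the real API calls (nothing here is Stripe's actual client). It uses SQLite from the standard library only so the example is self-contained; a real deployment would point at the customer's own database.

```python
import json
import os
import sqlite3
import tempfile
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PipelineConfig:
    # Hypothetical per-customer settings: the only part that changes per deployment.
    source_name: str
    api_key: str
    db_path: str


def run_pipeline(config: PipelineConfig,
                 fetch: Callable[[PipelineConfig], Iterable[dict]]) -> int:
    """Load fetched records into a raw table and return the table's row count."""
    if not config.source_name.isidentifier():
        raise ValueError("source_name must be a valid identifier")
    table = f"{config.source_name}_raw"
    conn = sqlite3.connect(config.db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {table} "
                "(id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
            )
            rows = [(str(r["id"]), json.dumps(r, sort_keys=True))
                    for r in fetch(config)]
            # Replacing by primary key keeps repeated runs idempotent.
            conn.executemany(
                f"INSERT OR REPLACE INTO {table} (id, payload) VALUES (?, ?)",
                rows,
            )
        return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    finally:
        conn.close()


def fake_fetch(config: PipelineConfig) -> Iterable[dict]:
    # Stand-in for paginated REST calls; the records are made up for illustration.
    return [{"id": "ch_1", "amount": 500}, {"id": "ch_2", "amount": 1200}]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = PipelineConfig("stripe", "sk_test_placeholder",
                             os.path.join(tmp, "demo.db"))
        print(run_pipeline(cfg, fake_fetch))
        print(run_pipeline(cfg, fake_fetch))  # same count on the second run
```

Because rows are keyed by `id`, rerunning the job on a schedule refreshes existing records instead of duplicating them, which is what "kept current on a regular basis" needs at a minimum.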

Basically I’m trying to get a sense of:

  1. What kind of fixed price makes sense? My instinct says somewhere around $3-5k per pipeline. Is that too high? Too low?
  2. What specific data sources to focus on offering, realizing that the overhead is fairly significant for the first-time build.
  3. What am I not considering?

Also here’s a glimpse of how I’m envisioning the customer journey.

  • Customer comes to my website
  • Sees a list of 3rd Party Services
  • Sees a price for each one
  • Can select one or more
  • Can pay and schedule an available time for me to do the work
  • On the scheduled date I configure and deploy the selected pipelines
  • Customer has their data!
Posted to Ideas and Validation on October 9, 2020
  1. 1

    I was just thinking about something like this last week, and I think it's a great idea.

    I found these two solutions to be pretty relevant:
    https://www.hevodata.com/pricing/ (They charge $249/mo for their basic plan)
    https://airbyte.io/ (open source)

    I would personally like to work on something like a managed version of airbyte, or provide some service on top of it (like a security/trust layer or a bundle of monitoring, analytics, a support channel, etc.) and charge a subscription fee.

    Have you talked to any potential users yet? Maybe they can highlight their pain points a bit more clearly.

    1. 1

      You're definitely right about talking to potential customers. Although I've done this kind of thing as an employee, I still don't have a good understanding of how to position it in the market as a product/service.

      Looking at Airbyte, they have some services that are similar to what I'm thinking, but maybe a slightly different overall focus.

      I'm not really interested in managing the pipelines for a subscription fee (at least not yet). My focus is on being able to quickly "drop in" a pipeline from any given REST API to a database/warehouse. At least starting out, the customer would probably own the infrastructure and I would simply build and deploy it. Although there are probably exceptions to this depending on the circumstances.

      Right now I'm building an engine that:

      • Generates a REST API adapter to translate JSON into a relational schema
      • Generates an ETL script to transfer the data
      • Generates the infrastructure to run it
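
      As a rough illustration of the first bullet, translating JSON into a relational schema might start out something like the sketch below. This is not the actual engine: the function names, the type mapping, and the underscore convention for nested keys are all assumptions made for the example.

```python
from typing import Any, Dict, Iterable, Optional

# Assumed mapping from Python/JSON value types to generic SQL column types.
SQL_TYPES = {bool: "BOOLEAN", int: "INTEGER", float: "REAL", str: "TEXT"}


def flatten(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested objects into one level, joining keys with underscores."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        column = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{column}_"))
        else:
            flat[column] = value
    return flat


def infer_columns(records: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Infer one SQL type per flattened column from sample records."""
    columns: Dict[str, Optional[str]] = {}
    for record in records:
        for name, value in flatten(record).items():
            if value is None:
                columns.setdefault(name, None)  # type unknown so far
                continue
            # Anything unmapped (e.g. lists) falls back to TEXT.
            sql_type = SQL_TYPES.get(type(value), "TEXT")
            previous = columns.get(name)
            if previous is None:
                columns[name] = sql_type
            elif {previous, sql_type} == {"INTEGER", "REAL"}:
                columns[name] = "REAL"  # widen mixed numbers
            elif previous != sql_type:
                columns[name] = "TEXT"  # conflicting types: store as text
    # Columns that were only ever null default to TEXT.
    return {name: (sql_type or "TEXT") for name, sql_type in columns.items()}


def create_table_sql(table: str, records: Iterable[Dict[str, Any]]) -> str:
    """Render a CREATE TABLE statement for the inferred columns."""
    columns = infer_columns(records)
    body = ",\n".join(f'    "{name}" {sql_type}'
                      for name, sql_type in columns.items())
    return f'CREATE TABLE "{table}" (\n{body}\n);'


if __name__ == "__main__":
    sample = [
        {"id": "cus_1", "balance": 0, "address": {"city": "Oslo", "zip": None}},
        {"id": "cus_2", "balance": 12.5, "address": {"city": "Rome", "zip": "00100"}},
    ]
    print(create_table_sql("customers", sample))
```

      Arrays are simply stored as text here; a fuller version would probably split them into child tables with foreign keys.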

      I'm trying to get to the point where a client could ask for a pipeline from an API I've never seen before, and all I need to do is execute a few commands to have it up and running. It's not a far leap from there to a self-serve product, but I'm trying not to get too far ahead of myself.

  2. 1

    Are you looking at strictly ETL as a service, like astronomer.io for Airflow, or an end-user tool like Snapboard?
    $3-5k per user is on the low end if you're targeting people who hire ETL/data-cleaning folks.
    Take a look at Taylor Davidson's site, foresight.is. He sells modelling plus consulting; you could do something similar for ETL.

    1. 1

      Looking over Taylor's site (foresight.is), that's more in line with what I was thinking. Basically, define some common standard templates that I can set up for customers in a repeatable way (although perhaps a bit more hands-on than what he's doing).

      I could see how it might evolve over time into more of a self-serve SaaS like Astronomer, but that's a much bigger product than I'd like to pursue starting out.
