I’m a full-time software engineer looking to get into freelancing, but I’d like to build out a productized service instead of just selling my time at an hourly rate.
I’ve been creating data pipelines in various forms for the better part of a decade, so it’s something I know really well. Recently I’ve been building out pipelines from common 3rd-party services, such as Stripe, using their REST APIs to transfer the data into a relational database for easier analysis. I’m mostly doing this for my own side projects (so not getting paid) or as an employee (so no good sense of the $ value).
The process for building these pipelines looks similar for most APIs, and would be almost exactly the same for different companies - basically just swapping out credentials and provisioning new resources (server, db, etc), but using the same code.
I think that I can refine this to the point where, for example, if someone wants all of their Stripe data (or other 3rd-party API) piped into a relational database and kept current on a regular basis, I could have it up and running within a day or two vs. having an engineer build it from scratch over the course of some number of weeks or months.
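To make the "same code, different credentials" idea concrete, here is a minimal sketch, not a real implementation. It assumes a cursor-paginated REST API and uses SQLite as a stand-in for the client's database. `PipelineConfig`, `fetch_page`, and `sync` are hypothetical names invented for illustration; `fetch_page` is whatever function wraps the actual HTTP calls for a given service.

```python
import json
import sqlite3
from dataclasses import dataclass
from typing import Callable, Iterator, Optional

# A page is (records, next_cursor); next_cursor is None on the last page.
Page = tuple[list[dict], Optional[str]]
FetchPage = Callable[[str, Optional[str]], Page]


@dataclass
class PipelineConfig:
    """Everything that changes between clients; the code below stays the same."""
    source: str   # hypothetical label, also used as the table name
    api_key: str  # the client's credential, handed to the fetcher
    db_path: str  # SQLite file standing in for the client's database


def iter_records(fetch_page: FetchPage, api_key: str) -> Iterator[dict]:
    """Follow cursors until the API reports there are no more pages."""
    cursor = None
    while True:
        records, cursor = fetch_page(api_key, cursor)
        yield from records
        if cursor is None:
            break


def sync(config: PipelineConfig, fetch_page: FetchPage) -> int:
    """Copy every record into the database; safe to re-run on a schedule."""
    if not config.source.isidentifier():
        raise ValueError(f"unsafe table name: {config.source!r}")
    conn = sqlite3.connect(config.db_path)
    count = 0
    try:
        with conn:  # one transaction per sync run
            conn.execute(
                f'CREATE TABLE IF NOT EXISTS "{config.source}" '
                "(id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
            )
            for record in iter_records(fetch_page, config.api_key):
                # Assumes each record has an "id"; replacing on that key
                # keeps repeated runs from creating duplicate rows.
                conn.execute(
                    f'INSERT OR REPLACE INTO "{config.source}" (id, payload) '
                    "VALUES (?, ?)",
                    (str(record["id"]), json.dumps(record)),
                )
                count += 1
    finally:
        conn.close()
    return count
```

Onboarding a new client would then mean writing a new `PipelineConfig` and running `sync` on a schedule (cron or similar). Storing the raw JSON payload keeps the sketch short; a real pipeline would likely map fields to proper columns.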
Basically I’m trying to get a sense of:
Also here’s a glimpse of how I’m envisioning the customer journey.
I was just thinking about something like this last week, and I think it's a great idea.
I found these two solutions to be pretty relevant:
https://www.hevodata.com/pricing/ (They charge $249/mo for their basic plan)
https://airbyte.io/ (open source)
I would personally like to work on something like a managed version of Airbyte, or provide some service on top of it (like a security/trust layer or a bundle of monitoring, analytics, a support channel, etc.) and charge a subscription fee.
Have you talked to any potential users yet? Maybe they can highlight their pain points a bit more clearly.
You're definitely right about talking to potential customers. Although I've done this kind of thing as an employee, I still don't have a good understanding of how to position it in the market as a product/service.
Looking at Airbyte, they have some services that are similar to what I'm thinking, but maybe a slightly different overall focus.
I'm not really interested in managing the pipelines for a subscription fee (at least not yet). My focus is on being able to quickly "drop in" a pipeline from any given REST API to a database/warehouse. At least starting out, the customer would probably own the infrastructure and I would simply build and deploy it. Although there are probably exceptions to this depending on the circumstances.
Right now I'm building an engine that:
I'm trying to get to the point where a client could ask for a pipeline from an API I've never seen before, and all I need to do is execute a few commands to have it up and running. It's not a far leap from there to a self-serve product, but I'm trying not to get too far ahead of myself.
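One ingredient that could help with an API you've never seen before is deriving the table definition from a sample of its JSON responses. The sketch below is only an illustration of that idea and makes no claim about how the engine described above actually works. The function names are hypothetical and the type mapping is a deliberately simple assumption (integers and booleans to INTEGER, floats to REAL, everything else to TEXT).

```python
from typing import Any


def sql_type(value: Any) -> str:
    """Map one JSON value to a SQL column type (simplified assumption)."""
    if isinstance(value, (bool, int)):
        return "INTEGER"
    if isinstance(value, float):
        return "REAL"
    return "TEXT"  # strings, plus nested objects/arrays stored as JSON text


def infer_columns(records: list[dict]) -> dict[str, str]:
    """Pick the narrowest type that fits every non-null value per field."""
    seen: dict[str, set[str]] = {}
    for record in records:
        for key, value in record.items():
            types = seen.setdefault(key, set())
            if value is not None:
                types.add(sql_type(value))
    columns: dict[str, str] = {}
    for key, types in seen.items():
        if not types or "TEXT" in types:
            columns[key] = "TEXT"     # all nulls, or any non-numeric value
        elif "REAL" in types:
            columns[key] = "REAL"     # mix of ints and floats widens to REAL
        else:
            columns[key] = "INTEGER"
    return columns


def quote_ident(name: str) -> str:
    """Quote an identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(table: str, columns: dict[str, str]) -> str:
    """Build a CREATE TABLE statement from the inferred columns."""
    cols = ", ".join(f"{quote_ident(n)} {t}" for n, t in columns.items())
    return f"CREATE TABLE IF NOT EXISTS {quote_ident(table)} ({cols})"
```

Inference from a sample can guess wrong when a field changes type later, so a human review of the generated statement before deploying is probably still worthwhile.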
Are you looking at strictly ETL as a service, like astronomer.io for Airflow, or an end-user tool like Snapboard?
3-5k per user is on the low end if you're targeting folks who hire ETL/data-cleaning people.
Take a look at Taylor Davidson's site, foresight.is. He sells modelling + consulting; you could do something similar for ETL.
Looking over Taylor's site (foresight.is), that's more in line with what I was thinking. Basically, define some common standard templates that I can set up for customers in a repeatable way (although perhaps a bit more hands-on than what he's doing).
I could see how it might evolve over time into more of a self-serve SaaS like Astronomer, but that's a much bigger product than I'd like to pursue starting out.