Daily Stand-up October 11, 2020

Houston: Laying the first stone

OmegaVesko

I held off on making this post for a little while because I was afraid I'd do the thing where you give yourself a happy brain chemical boost for having "done" something without actually making anything, but I think I'm finally at the point where it makes sense to talk about what I've done so far.

So, to get right into what I'm building: Houston is an app you can use to track a bunch of things about your website/web app that people commonly want to track. It's primarily aimed at indie devs building relatively small apps, particularly those who can't or don't want to use the big name apps to get these features (most of which you typically have to pay for to get something you can reasonably use in production).

The features I'm working on for the MVP are uptime monitoring and logging, but if/when this takes off, the end goal is to add a bunch of features, like (privacy-focused) analytics, performance monitoring (e.g. automated Lighthouse audits), etc.

Here's a bird's-eye view of what I'm looking at for the MVP:

[Image: architecture chart]

So far, I've built most of the core services, as well as the uptime monitoring infrastructure. Now I'm moving on to the logging infrastructure, and after that I'll be building the actual user-facing app (which I suspect is going to take up most of my time).

One thing that concerns me is that, at least in my local testing, the number of concurrent requests the monitor agent can make seems lower than I expected: it starts getting timeouts as soon as I push it past 500 monitors or so per minute. Obviously this is fine for an MVP, and I have ideas for how to make the monitoring infrastructure more scalable, but I was kind of hoping I wouldn't have to think about that for a while. Maybe this is just my home connection and it'll be fine in production? I guess we'll see.
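
For illustration, one generic way to keep a burst of checks from saturating a connection is to cap how many are in flight at once. This is only a minimal sketch under assumptions, not Houston's actual agent code: `run_bounded`, `max_in_flight`, and the shape of the jobs are all hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def run_bounded(
    jobs: List[Callable[[], Awaitable[T]]],
    max_in_flight: int,
) -> List[T]:
    """Run every job, but never more than `max_in_flight` at the same time."""
    sem = asyncio.Semaphore(max_in_flight)

    async def guarded(job: Callable[[], Awaitable[T]]) -> T:
        # Wait for a free slot before starting the job.
        async with sem:
            return await job()

    # gather() returns results in the same order as `jobs`.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

Lowering `max_in_flight` trades total cycle time for fewer simultaneous connections, which is one way to test whether the timeouts come from the sheer number of concurrent requests.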

So, yeah, I think that's all I have so far. What does everyone think? Is this something you can see yourself wanting to use? Am I barking up the wrong tree?

  1. 1

    Can you describe more about the connection between the monitoring agent and the API Gateway? For example, is it an HTTP POST request, a WebRTC socket, etc.?

    I don’t really monitor uptime per se because my apps / projects are generally always up, but the errors are where my problems are. I use Sentry to record the errors and sometimes the cloud provider's hosted error tracking. Honestly, I can’t say that there is a big problem for me after using those.

    1. 2

      Hi, thanks for commenting!

      Can you describe more about the connection between the monitoring agent and the API Gateway? For example, is it an HTTP POST request, a WebRTC socket, etc.?

      The API gateway uses GraphQL, so everything that talks to it does so via GraphQL. For monitoring agents, the current model I'm working with is that the agent fetches the checks it needs to do once per minute (= GraphQL query), then runs all of them concurrently and reports the results back to the gateway as it goes (= GraphQL mutation).
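
A minimal sketch of one such cycle, under assumptions: the three callables are hypothetical stand-ins (the real agent would issue a GraphQL query in `fetch_checks` and a GraphQL mutation in `report`), and `Check`, `Result`, and `run_cycle` are made-up names for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class Check:
    id: str
    url: str


@dataclass
class Result:
    check_id: str
    ok: bool


async def run_cycle(
    fetch_checks: Callable[[], Awaitable[List[Check]]],
    run_check: Callable[[Check], Awaitable[Result]],
    report: Callable[[Result], Awaitable[None]],
) -> int:
    """Fetch the checks, run them all concurrently, report each as it finishes."""
    checks = await fetch_checks()  # stands in for the GraphQL query
    reported = 0
    # as_completed yields results in completion order, so fast checks are
    # reported without waiting for slow ones.
    for finished in asyncio.as_completed([run_check(c) for c in checks]):
        result = await finished
        await report(result)  # stands in for the GraphQL mutation
        reported += 1
    return reported
```

Scheduling `run_cycle` once per minute would give the model described above.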

      I use Sentry to record the errors and sometimes the cloud provider's hosted error tracking. Honestly, I can’t say that there is a big problem for me after using those.

      That's fair. I guess the concept I'm getting at with this is that, for stuff like hobby projects and very early-stage products, it's simpler and cheaper to pay for one service that does all this stuff for you instead of hooking your app to a bunch of different services, each of which you pay a subscription for. Maybe I'm overestimating how much of a need people actually have for something like this? I guess we'll see.

      1. 1

        the current model I'm working with is that the agent fetches the checks it needs to do once per minute (= GraphQL query), then runs all of them concurrently and reports the results back to the gateway as it goes (= GraphQL mutation).

        As I mentioned previously, I’m not an expert in web development, but it sounds like it could be a limitation at the GraphQL level. I ran into a similar problem with some NoSQL databases (even on localhost), and I think there was a default setting you could change to increase the maximum number of connections. I would guess the best first step is to isolate the problem and see whether you have an infrastructure / network problem or an application problem. One time, I had a similar issue and felt so stupid, because the cause was rate limiting that I had set up myself to mitigate a DDoS attack.

        Just my advice!
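
One way to start isolating it, sketched under assumptions: fire the same check at a given concurrency level and measure what fraction times out. Running this against a server on localhost and then against a remote host would help separate network limits from application limits. `timeout_rate` and its parameters are hypothetical names, and `check` is whatever coroutine performs a single request.

```python
import asyncio
from typing import Awaitable, Callable


async def timeout_rate(
    check: Callable[[], Awaitable[object]],
    concurrency: int,
    timeout_s: float,
) -> float:
    """Run `concurrency` copies of `check` at once; return the fraction that timed out."""

    async def timed_out() -> bool:
        try:
            await asyncio.wait_for(check(), timeout_s)
            return False
        except asyncio.TimeoutError:
            return True

    outcomes = await asyncio.gather(*(timed_out() for _ in range(concurrency)))
    return sum(outcomes) / concurrency
```

Calling it at, say, 100, 250, 500, and 1000 concurrent checks shows where the failure rate starts to climb.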

      2. 1

        Hi,

        Maybe I'm overestimating how much of a need people actually have for something like this? I guess we'll see.

        No, I wouldn’t say that. I’m just a (mostly) self-taught coder, and only one person, so you shouldn’t take anything from me as definitive. You should try to ask more people. If you ask 10-20 people from different backgrounds (enterprise all the way to startup) and they all say it’s not a problem, then maybe you should consider that you’ve overestimated the need.

        I think it can be a challenge to speak to 10-20 developers from a range of backgrounds, but I truly believe it’s the fastest way to a successful product or service.
