Hey guys, I wanted to share how I learned to use cloud resource anomaly detection for my cloud infrastructure. It can be a real game changer for devs, and I haven't seen many people talking about it. If you know of other threads on open-source options for this, I'd love to read them and see what people are doing. It doesn't seem like many people are using this yet, so if you'd use it but didn't know about it, I hope this helps. And if you already use something, tell me what you think of it and whether you'd recommend it.
Some background: I'm a freelance cloud dev and I work multiple jobs at a time. I often have so much going on that when something breaks on one project, diagnosing and fixing it eats into my time on everything else. Then I can't keep up with the other projects, which snowballs: more problems come up and I have no time to get back to people.
My issue is basically that a freelancer is a one-man band. A company might have a whole department for security; I just have me. But here's the thing: even in big companies, the security review mostly happens after things are in staging or even live, right? At that point it's really hard to research an issue and figure out what's going on or what to do. I can't even tell you how many hours I've spent locating and isolating issues that I never managed to fix!
I recently read a newsletter piece about security concerns in cloud computing. The gist was that preventing breaches is essential, but there isn't really an easy-to-implement solution.
Enter “infrastructure as code”
Obviously a great solution would be code that maintains your infrastructure in a hands-off way: code that not only identifies a problem but actually fixes it without your intervention. I honestly thought that was pretty magical when I first heard the idea. But it doesn't solve my problem as a freelance dev if I have to build it myself, right? That's way too much time and effort. Writing something I could reuse across all my different projects is basically a whole separate project, so I'd have to take a huge pause from paid work to get it done. I'm getting off track here, but you get the idea.
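To make the "identify it, then fix it" idea concrete, here's a rough sketch in Python. Everything in it is made up for illustration: the `Bucket` class, the two rules, and the `heal` function are my own hypothetical names, not the API of any real tool or cloud provider. A real setup would read resource state from the provider and apply fixes through its API.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """Hypothetical stand-in for a cloud storage bucket's settings."""
    name: str
    public: bool = False
    encrypted: bool = True


def find_issues(bucket):
    """Return the names of the (illustrative) rules this bucket violates."""
    issues = []
    if bucket.public:
        issues.append("public_access")
    if not bucket.encrypted:
        issues.append("no_encryption")
    return issues


def remediate(bucket, issue):
    """Apply the fix for one issue by changing the in-memory settings."""
    if issue == "public_access":
        bucket.public = False
    elif issue == "no_encryption":
        bucket.encrypted = True


def heal(buckets):
    """Detect and fix issues on every bucket; return a report of what was fixed."""
    fixed = {}
    for bucket in buckets:
        issues = find_issues(bucket)
        for issue in issues:
            remediate(bucket, issue)
        if issues:
            fixed[bucket.name] = issues
    return fixed


if __name__ == "__main__":
    buckets = [Bucket("logs"), Bucket("uploads", public=True, encrypted=False)]
    print(heal(buckets))
```

The part that takes real effort is everything this sketch skips: talking to the provider, deciding which fixes are safe to apply automatically, and keeping the rules up to date.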
Anyway, I started wondering whether there was something open-source I could use: something pre-written that I could adapt to my own needs to get self-healing cloud infrastructure. The appeal is that you just loop it into your project and the code identifies the problems you were having and, even better, fixes them, which saves a ton of time on manual checks and makes the project more secure. I don't know about everyone else, but as a freelancer it's super frustrating to keep hearing about tools that would improve my work and then find out they're only available for a high monthly fee or at a team price for companies. (I'm not saying people shouldn't charge for their products, it's just really nice to have an open-source option!)
I started looking into this and found a few options. Red Hat might be a place to find more, and Healenium could work depending on your project (as far as I can tell it's aimed at self-healing Selenium tests rather than infrastructure). One option I found is called Matos (https://github.com/cloudmatos/matos). Based on their GitHub page, it seems to have a pretty small user base so far. Has anyone used Matos and have an opinion about it? I chose to give it a shot, and it was easier to get started than I expected because they have a quick start guide and a community where you can ask questions and get information. They provide the info you need to start detecting some of the issues in your cloud resources, so I found it a great value. It seems like all they really need is more visibility, and this kind of thing could take off.
So, yeah, I wanted to share my experience learning about cloud resource anomaly detection in case it helps someone else, and also to find other people who have tried open-source cloud anomaly detection and hear their opinions. Has anyone out there used anything like this, or do you think it would be especially good for use cases I didn't mention? If you've used something similar, or used this in a different way than I did, what did you like about it and what problems did it have?
For sure, developing anomaly detection solutions from scratch takes a huge effort, so it makes sense to use an open-source one if you just want to plug it into your project.
I know I'm biased, but I would recommend checking out Qdrant, an open-source solution. My team and I developed it about a year ago, and we already have a case study on how our similarity learning approach was used for anomaly detection in the coffee bean industry. Here is the detailed case study: https://qdrant.tech/articles/detecting-coffee-anomalies/
Does it look like what you are looking for?
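In case it helps to picture the similarity-based idea, here is a generic minimal sketch in Python. It is not Qdrant's API and not the exact method from the case study; the function names, the toy vectors, and the 0.9 threshold are all assumptions chosen for illustration. The idea shown is simply: embed each item as a vector, compare it with vectors of known-normal items, and flag it when nothing normal is similar enough.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_anomaly(vector, normal_vectors, threshold=0.9):
    """Flag a vector whose best match among known-normal vectors is below the threshold.

    The threshold is an illustrative value; in practice it would be tuned on real data.
    """
    best = max(
        (cosine_similarity(vector, normal) for normal in normal_vectors),
        default=0.0,  # with no normal examples, everything counts as anomalous
    )
    return best < threshold


if __name__ == "__main__":
    # Toy 2-D "embeddings" of items known to be normal (made-up numbers).
    normal = [[1.0, 0.0], [0.9, 0.1]]
    print(is_anomaly([1.0, 0.05], normal))  # close to the normal examples
    print(is_anomaly([0.0, 1.0], normal))   # points in a very different direction
```

With many stored vectors, the linear scan inside `is_anomaly` is the part a vector search engine would replace with a nearest-neighbor query.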