Prototyping is really my jam! Long devlog/standup

"Send counter two hundred and ninety"

My desk is littered with a few different FireTV devices, USB hubs, chargers, adapters, audio splitters, USB micro cables, and Bluetooth speakers. I'm actively hunting for a new full-time position, and am working on building something while I search, apply, upload resume, fill out form, email, follow up... on endless repeat, so I can feel a little productive.

"Send counter two hundred and ninety-one"

Yesterday I pulled my Announcr project out of the mothballs. I stopped working on it early this year around the time COVID hit. I am not sure exactly why I suddenly started thinking about it again, but I had been a little lost lately with a combination of a lot of personal garbage, an extended job hunt, and the tough decision to shutter my Lift Academy project for now. My world needed some joy. A little random delight.

"Send counter two hundred and ninety-two"

Not just my world needs a little joy. It feels like a lot of people need a little joy. So I kept thinking of Announcr again, and the impetus for it. I want to be announced when I walk into a location. Over a loudspeaker. I want there to be trumpets! and a funny (extravagant!?) voice and people around to hear and smile and laugh.

"Well well, look who's back to grace our presence. Ladies and Gentlemen, say hello to Dave!"

"Send counter two hundred and ninety-three"

Last year I figured out how to put many of the pieces together, though some awkward pieces remained. I knew that I'd be using something cheap, which meant something like a FireTV Stick connected to a loudspeaker (and maybe a TV too). I got the tech to the point where I had a nice node.js back-end and could send a payload to the device, which would download the necessary audio files, mix them, generate some text-to-speech, and cue an announcement.

"Send counter three hundred"

It was really tightly coupled though, and there were many areas prone to error. I designed "for the future" and ended up building a little too much into the location client app, making it complex, heavily plugin dependent with socket.io polling, TTS generators, and an insane level of audio mixing/event detection which was frankly unreliable.

"Send counter three hundred and one"

When I tried to get the project running again and fire up the backend services as well, I found some nasty issues: the version of node I had running was v8, some dependencies wouldn't install, and other apps I built were taking over the socket.io layers. Furthermore, I did a little digging and discovered I was going to have some scalability issues running socket.io behind the load balancers, because they drop sessions after only 60 seconds! All my SSL was being handled on the load balancers now, so this was a major issue, as I like that design quite a bit and the other apps I run in production depend on it.

"Send counter three hundred and two"

My first real problem to solve was the nasty audio mixing/timing issue. In my original design, which was somewhere near 95% implemented, I could have audio clips and soundfx and background clips which all came down with the speech payload. The client app would be responsible for downloading/caching all the audio and then mixing it together, with pitch/volume changes and fading out and behaving nicely on errors. And processing queues nicely, cleaning up all unused channels etc.

"Send counter three hundred and three"

I decided to solve this problem by not letting the location client app be responsible at all for doing any of this. Instead, I wrote some server-side stuff to generate an AWS Polly clip with the exact voice/SSML I want (thanks, 3 years of working on Alexa stuff!), upload it to S3, and just send the client app the path to the file. Any mixing of other clips, etc., I can do server side with SoX or the new AWS MediaConvert APIs. I've done a lot of that kind of thing in Alexa projects.
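A rough sketch of what the server-side half of this might look like, not the actual Announcr code. It only builds the SSML and the request parameters in the shape Polly's SynthesizeSpeech operation accepts, plus the tiny message the client would get. The voice ID `Brian`, the function names, and the payload field names are all assumptions for illustration; the Polly call and the S3 upload themselves are left out.

```javascript
// Escape characters that have special meaning in XML so arbitrary
// announcement text can sit safely inside an SSML document.
function escapeXml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Wrap the announcement text in a minimal SSML document.
function buildSsml(text) {
  return `<speak>${escapeXml(text)}</speak>`;
}

// Parameters in the shape Polly's SynthesizeSpeech operation expects.
// The default VoiceId is an assumption, not a detail from the post.
function buildPollyParams(text, voiceId = 'Brian') {
  return {
    OutputFormat: 'mp3',
    Text: buildSsml(text),
    TextType: 'ssml',
    VoiceId: voiceId,
  };
}

// What the location client receives: just a pointer to the finished clip.
// The field names are hypothetical.
function buildClientPayload(audioUrl) {
  return JSON.stringify({ type: 'announcement', audioUrl });
}
```

The point of the split is that everything voice-related stays on the server, and the device only ever sees a URL.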

"Send counter three hundred and four"

This instantly simplified my location client app! Now I could ask for a payload, get a simple URL, download and play it. I was finally going to turn the device into a smart media player instead of making it do all the heavy lifting. This means my fault tolerance "out in the field" will go up significantly and let me sleep much better at night. It's one thing to have some dev devices running on a desk; it's quite another when I hope to install one of these in every retail store entryway/foyer in the world.

"Send counter three hundred and five"

My next major issue was the sockets. I NEED a socket style implementation for rapid event relay. When someone walks up to a location, and the signal is triggered to play an announcement (triggering mechanism TBD), I want the location client device to know instantly and do it. Not to poll a website every couple of minutes and drive requests through the roof. However, unless I started up some dedicated droplets and punted on node.js processor usage, I was going to have major issues running socket.io like I always planned on.

"Send counter three hundred and six"

While doing Alexa/Google Home smart-speaker development over the last couple of years, I've learned a lot about Lambda and API Gateway. I had a funny feeling that there was some solution there, and I dug HARD into the docs and weeds. I found out that I could indeed set up a socket implementation in API Gateway using the WebSocket protocol, and use Lambdas for the logic implementation and heavy lifting. This was MAGIC for me, because I want to be able to scale without worrying about the tech of it much. I know there are costs involved, but tech hurdles are harder to overcome when you're trying to scale.

"Send counter three hundred and seven"

Lambdas die randomly unless they're provisioned to run all the time, which kind of defeats the purpose. Sockets need constant reliability. Serendipitously, I found an article someone at Amazon wrote about using WebSockets to write a chat program. This smelled about right to me, and so I pulled it apart and put it all back together with some significant changes.
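To make the shape of this concrete, here is a minimal sketch of a Lambda handler behind an API Gateway WebSocket API. It is an illustration only, not the code from that article or from Announcr. API Gateway passes the route (`$connect`, `$disconnect`, `$default`, or a custom route) and the connection ID in `event.requestContext`. The in-memory `Set` is a stand-in, and an assumption: a real deployment needs a shared store for connection IDs, since separate Lambda instances do not share memory.

```javascript
// Stand-in for a connection table (assumption: the real storage choice
// is not described in this post).
const connections = new Set();

// Decide what to do for one API Gateway WebSocket event.
// `store` is injected so the logic can be exercised without AWS.
function routeEvent(event, store) {
  const { routeKey, connectionId } = event.requestContext;
  switch (routeKey) {
    case '$connect':
      // A device just opened a socket: remember it.
      store.add(connectionId);
      return { statusCode: 200, body: 'connected' };
    case '$disconnect':
      // The socket closed or was dropped: clean up.
      store.delete(connectionId);
      return { statusCode: 200, body: 'disconnected' };
    default:
      // '$default' and any custom routes: acknowledge and move on.
      return { statusCode: 200, body: 'ok' };
  }
}

// Lambda entry point shape for Node.js.
const handler = async (event) => routeEvent(event, connections);
```

Because each invocation is short and stateless, it does not matter that individual Lambdas come and go; API Gateway holds the socket open and the store remembers who is connected.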

"Send counter three hundred and eight"

First I used a command line program to test the WebSockets persistence and connect/error/disconnect/cleanup logic. Go figure, AWS has outages today for event logs and that was frustrating ;) But once things looked halfway decent, I fired up Unity and transplanted my socket.io code to be WebSockets based, deleted a ton of crap, and connected to the Lambda functions like an obedient little client app.

"Send counter three hundred and nine"

Oops! Started getting timeouts. Not right away, but after 10 minutes of inactivity. I solved that, then there was a funky timeout at 20 minutes, no matter if there was activity or not. Seemed to get through that, then things straight crashed at 2 hours. This was not good at all. Devices running in the wild need reliability and fault tolerance.

I wrote a tiny node.js program to send the WebSockets some messages every so often, and decided on one minute intervals. Then I could just let it run while I worked on debugging the Lambdas and the Unity app running on device to handle the payload.
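A driver like that can be tiny. The sketch below is a guess at the general shape rather than the actual program: the `send` function is injected so any WebSocket client library can sit behind it, and the message format (`type` and `value`) is hypothetical.

```javascript
// Build a sender that increments a counter and hands a small JSON
// message to `send` on every call.
function makeCounterSender(send, start = 0) {
  let counter = start;
  return function tick() {
    counter += 1;
    send(JSON.stringify({ type: 'counter', value: counter }));
    return counter;
  };
}

// Usage (not executed here): fire once a minute on an open socket.
// const tick = makeCounterSender((msg) => socket.send(msg));
// setInterval(tick, 60 * 1000);
```

Keeping the timer outside the function means the counting logic can be checked without waiting a minute per message.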

"Send counter three hundred and ten"

I read through the docs and found out that there was an idle connection timeout at 10 minutes on API Gateway for WebSockets, and a hard connection limit at 2 hours per socket. I also realized my FireTV device was going to sleep after 20 minutes. So, some ADB commands and some socket pinging/reconnection logic later, I got through the 20 minute mark just fine, going strong. The screen didn't go dark on me.
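The two limits suggest a simple rule for the client: ping before the idle timeout can hit, and reconnect on purpose before the 2 hour limit cuts the socket. A minimal sketch of that decision follows. The 10 minute and 2 hour figures are from above; the safety margins and the function name are assumptions picked for illustration, and the real client is a Unity app, so this only shows the logic.

```javascript
const IDLE_TIMEOUT_MS = 10 * 60 * 1000;       // idle timeout: 10 minutes
const MAX_CONNECTION_MS = 2 * 60 * 60 * 1000; // hard limit: 2 hours

// Assumed safety margins: ping at half the idle timeout, and reconnect
// five minutes before the hard limit.
const PING_AFTER_IDLE_MS = IDLE_TIMEOUT_MS / 2;
const RECONNECT_MARGIN_MS = 5 * 60 * 1000;

// Given timestamps in milliseconds, decide what the client should do now.
function nextAction({ connectedAt, lastActivityAt, now }) {
  if (now - connectedAt >= MAX_CONNECTION_MS - RECONNECT_MARGIN_MS) {
    return 'reconnect'; // replace the socket before it is force-closed
  }
  if (now - lastActivityAt >= PING_AFTER_IDLE_MS) {
    return 'ping'; // keep the connection from going idle
  }
  return 'wait';
}
```

Checking the connection age first matters: a ping cannot extend the 2 hour limit, so a proactive reconnect has to win over a keepalive.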

"Send counter three hundred and eleven"

At this point I had been setting things up for about 9 hours straight and it was time to head out to the store to get some last-minute fixings for Thanksgiving. I couldn't sit around babysitting socket logs, CloudWatch issues, and logcat messages. So in a final flourish, I ported over the AWS Polly code I had prototyped. Instead of just sending a small data message, I'd use the incremented counter being generated on the node.js server and build some speech output with that, upload it, and send the URL as the payload. I had already modified the client app to accept a URL as a payload and do its magic. So I put it into PM2, logged off the server, and took off to the store.
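The upgraded loop boils down to: counter in, speech text out, clip uploaded, URL sent. Here is a hedged sketch of that pipeline. The object key layout, the base URL, and the payload fields are hypothetical, and `synthesize`, `upload`, and `send` are injected placeholders standing in for the Polly call, the S3 upload, and the WebSocket send.

```javascript
// Pure part: turn a counter value into the text to speak, the storage
// key for the clip, and the message for the client.
function buildCounterAnnouncement(counter, baseUrl) {
  const text = `Send counter ${counter}`;
  const key = `announcements/counter-${counter}.mp3`; // hypothetical layout
  const payload = JSON.stringify({
    type: 'announcement',
    audioUrl: `${baseUrl}/${key}`,
  });
  return { text, key, payload };
}

// Side-effecting part: the three steps run in order, so the client is
// only told about a clip once the upload has finished.
async function announce(counter, baseUrl, { synthesize, upload, send }) {
  const { text, key, payload } = buildCounterAnnouncement(counter, baseUrl);
  const audio = await synthesize(text);
  await upload(key, audio);
  await send(payload);
  return payload;
}
```

Sending the URL only after the upload resolves avoids the client trying to download a clip that does not exist yet.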

"Send counter three hundred and twelve"

Hours later, after store trip, cooking dinner, shenanigans, phone interview (fingers crossed), and more job applications, I remembered that I had set up the loops to send the audio payloads. So I snuck into my office and turned my volume up on the speaker attached to the FireTV stick. I waited several seconds and didn't hear anything. Then from the other room shenanigans ensued again and I had to step away.

"Send counter three hundred and thirteen"

Half a minute later I had dealt with the shenanigans in the other room and was wiping the steam off my glasses when the kids and I were startled by a CRAZY LOUD British voice!!!

"Send counter three hundred and fourteen"

My client app and server driver had been running without error for over 5 hours and 14 minutes. The 2 hour limit had been overcome (twice now!) and a quick look showed proper reconnect and error-correction behavior all around. I can't tell you how good it feels.

"Send counter three hundred and fifteen"

I've greatly reduced the infrastructure risks on Announcr for the server/sockets and the FireTV App (Android) as well. I will have some logistical issues to solve, and many more pieces to put back together but I'm off to a great start :)

So that's my standup for the day ;)