Semantic segmentation/FCN isn't necessary since the spray isn't targeting the cat's location specifically - you could just use a whole-image classifier. You don't need a TX1 either; you could run this on a spare phone.
This could certainly be done well sans deep model, but a motion detector alone would probably end up soaking the occasional delivery guy / neighborhood kid.
If you create any kind of safe path in such a system, the cat will learn it quickly too.
One simpler way to go IMO would be to put the system at ground level and use two motion detectors: one aimed low along the ground, and one aimed specifically to "see" only things that are a meter or more above the ground. Humans would trigger both, but the cat would only trigger the one aimed at the ground.
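For what it's worth, the two-detector idea above boils down to a one-line rule. Here is a rough sketch, assuming each detector just gives you a boolean; the names `should_spray`, `low_triggered`, and `high_triggered` are made up for illustration and don't correspond to any particular sensor API:

```python
def should_spray(low_triggered: bool, high_triggered: bool) -> bool:
    """Decide whether to fire the sprayer (hypothetical helper).

    low_triggered:  the detector aimed along the ground saw motion.
    high_triggered: the detector aimed a meter or more up saw motion.
    """
    # A cat trips only the low detector; a person trips both.
    # Nothing on the low detector means nothing worth spraying.
    return low_triggered and not high_triggered


if __name__ == "__main__":
    # Print the decision for every combination of detector states.
    for low in (False, True):
        for high in (False, True):
            print(f"low={low!s:5} high={high!s:5} -> spray={should_spray(low, high)}")
```

In practice you would probably also want a short debounce window so a person whose legs trip the low detector a moment before the high one fires doesn't get sprayed, but that depends on the hardware.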
See, the advantage with that plan is that even if the cats figure out how to bypass that system, now you can assuage the pain of failure by filming the cats leaping across your yard like pogo sticks and monetize the video on YouTube. Tens of millions of views guaranteed.