The users love your system - interactions increase
100s → 1000s → 10,000s of users!
And you worry
You are event-driven and therefore more easily scalable. You are using Akka properly, so you just need to tweak some settings to achieve the best responsiveness and resilience.
Your code is reactive and tuned, but reality brings things that are:
Divide your application not only by functional area, but also by classification of the problem.
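One way to read "classification of the problem" is to separate CPU-bound work from blocking I/O, so that slow calls cannot starve fast ones. The sketch below uses plain JDK thread pools rather than Akka dispatchers. The class name `WorkPools`, the two categories, and the pool sizes are illustrative assumptions, not something the talk prescribes.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: one pool per class of problem, not per functional area.
final class WorkPools {
    final ExecutorService cpuBound;   // short, non-blocking computations
    final ExecutorService blockingIo; // calls that may wait on a database, disk, or network

    // Pool sizes are assumptions; pick them from measurements of your own system.
    WorkPools(int cpuThreads, int ioThreads) {
        this.cpuBound = Executors.newFixedThreadPool(cpuThreads, named("cpu"));
        this.blockingIo = Executors.newFixedThreadPool(ioThreads, named("blocking-io"));
    }

    // Names threads by problem class so thread dumps and logs show where work ran.
    private static ThreadFactory named(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread t = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            t.setDaemon(true); // do not keep the JVM alive for idle workers
            return t;
        };
    }

    void shutdown() {
        cpuBound.shutdown();
        blockingIo.shutdown();
    }
}
```

With this split, a burst of slow I/O can exhaust only the `blockingIo` threads, and work submitted to `cpuBound` keeps its own capacity. In an Akka application the same idea would typically be expressed with separate dispatchers.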
It is far better for your system to know what its dependencies can cope with than to deal with the big bang.
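A minimal sketch of "knowing what a dependency can cope with": cap the number of in-flight calls and reject the excess immediately instead of piling load onto the dependency. `DependencyGate` and `maxInFlight` are hypothetical names, and the limit is assumed to come from load testing the dependency. This is a simple bulkhead, not Akka's own circuit breaker.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: limits concurrent calls to a single dependency.
final class DependencyGate {
    private final Semaphore permits;

    // maxInFlight is the assumed, measured capacity of the dependency.
    DependencyGate(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Runs the call if a permit is free; otherwise returns empty without waiting.
    // Note: a call that returns null is also reported as empty.
    <T> Optional<T> tryCall(Supplier<T> call) {
        if (!permits.tryAcquire()) {
            return Optional.empty(); // shed load rather than queue it
        }
        try {
            return Optional.ofNullable(call.get());
        } finally {
            permits.release(); // always give the permit back, even on exceptions
        }
    }

    int available() {
        return permits.availablePermits();
    }
}
```

A caller that receives an empty result can fall back to a cached value or return a fast error, which keeps the failure small and local.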
Record just enough information. Too much slows down the monitored system, too little lets events go unnoticed.
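The trade-off above can be made explicit with sampling: count every event cheaply, but run the expensive recording step only for every Nth one. The sketch below is one possible approach. `SampledRecorder` and the `every` parameter are hypothetical, and the right sampling rate is an assumption to be tuned against your own traffic.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: counts all events, records details for 1 in `every`.
final class SampledRecorder {
    private final int every;
    private final AtomicLong seen = new AtomicLong();
    private final AtomicLong recorded = new AtomicLong();

    SampledRecorder(int every) {
        if (every < 1) {
            throw new IllegalArgumentException("every must be >= 1");
        }
        this.every = every;
    }

    // Returns true if this event was sampled and `record` was run.
    boolean observe(Runnable record) {
        long n = seen.incrementAndGet(); // cheap counter, updated for every event
        if (n % every != 0) {
            return false; // skipped: only the counter was touched
        }
        record.run(); // expensive part, e.g. writing a detailed log line
        recorded.incrementAndGet();
        return true;
    }

    long seen() {
        return seen.get();
    }

    long recorded() {
        return recorded.get();
    }
}
```

A larger `every` lowers the overhead on the monitored system but raises the chance that a rare event is never recorded in detail. The `seen` counter still shows that it happened.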
Law of Murphy for devops: if thing can able go wrong, is mean is already wrong but you not have Nagios alert of it yet.