I'm running a Python 3.13 Flask app in production using gunicorn and gevent (workers=1) with gevent monkey patching. Benchmarking with ab, I get around 320 requests per second. Performance is decent, but I'm wondering how much of a lift it would be to migrate to FastAPI. Alternatively, would I see performance gains by staying with gunicorn + gevent and upgrading Python to 3.14?
Did you profile your code? Is it CPU-bound or IO-bound? Does it max out your CPU? Usually it's the DB access that determines the single-threaded performance of backend code.
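A rough way to tell the two cases apart, as a sketch rather than a real profiler: time a handler in isolation and compare wall-clock time with CPU time. If the CPU time is close to the wall time, the work is CPU-bound; if it is much smaller, the handler is mostly waiting on IO. The names `measure`, `fake_io_handler` and `fake_cpu_handler` are made up for illustration and stand in for your own view functions.

```python
import time


def measure(fn, *args, **kwargs):
    """Run fn once and return (wall_seconds, cpu_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    fn(*args, **kwargs)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu


def fake_io_handler():
    # Hypothetical stand-in for a handler that waits on a DB or network call.
    time.sleep(0.2)


def fake_cpu_handler():
    # Hypothetical stand-in for a handler that does pure-Python computation.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    for handler in (fake_io_handler, fake_cpu_handler):
        wall, cpu = measure(handler)
        # A ratio near 1 suggests CPU-bound; a ratio near 0 suggests IO-bound.
        print(f"{handler.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s ratio={cpu / wall:.2f}")
```

Measure outside the running server if you can: `time.process_time()` counts CPU time for the whole process, so under gevent other greenlets running at the same time would blur the numbers.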
I did some quick tests with workers=2 and workers=3, and requests per second scaled nearly linearly, so it seems that just throwing more CPU cores at it is the quick answer for the mid-term.
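For reference, scaling workers with the core count could look something like the gunicorn config below. This is only a sketch: the file name, the one-worker-per-core choice, and the bind address are assumptions, not what I actually run in production. `workers`, `worker_class`, `worker_connections` and `bind` are standard gunicorn settings.

```python
# gunicorn.conf.py (hypothetical file; values are illustrative assumptions)
import os

# os.cpu_count() can return None, so fall back to a single worker.
cores = os.cpu_count() or 1

# Assumption: one gevent worker per core, since each worker runs a
# single-threaded event loop and throughput scaled roughly with worker count.
workers = cores

# Keep the gevent worker class used in the original setup.
worker_class = "gevent"

# Maximum simultaneous connections per gevent worker (gunicorn's default value).
worker_connections = 1000

# Assumed local bind address; adjust to match the real deployment.
bind = "127.0.0.1:8000"
```

It would be loaded with something like `gunicorn -c gunicorn.conf.py app:app`, where `app:app` is a placeholder for the real module and Flask object.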