Exactly. This lets you test it out for yourself, and it can actually pick up on some highly nonlinear behavior. I applied this myself to housing-price prediction, with better results than using a single activation function per layer.
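To make the idea concrete, here's a minimal sketch (in NumPy rather than TensorFlow, and with made-up weights) of a dense layer where each unit gets its own activation function instead of one shared across the whole layer:

```python
import numpy as np

# Candidate activation functions, keyed by name.
ACTIVATIONS = {
    "relu": lambda z: np.maximum(0.0, z),
    "tanh": np.tanh,
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
}

def mixed_dense(x, W, b, unit_activations):
    """Forward pass where unit_activations[i] names the function for unit i."""
    z = x @ W + b
    return np.array(
        [ACTIVATIONS[name](z[i]) for i, name in enumerate(unit_activations)]
    )

# Toy example: 3 inputs, 4 units, each unit with its own nonlinearity.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
W = rng.normal(size=(3, 4))
b = np.zeros(4)
out = mixed_dense(x, W, b, ["relu", "tanh", "sigmoid", "relu"])
```

The per-unit list is the only difference from a standard dense layer; everything else (weights, bias, training) stays the same.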
Makes sense. Out of all the permutations of functions, I see no reason why using the same function at every node would yield an optimal network.
I really commend the ML people out there who have developed an intuition for network architecture and which functions to use... If there's a Feynman of machine learning, I wanna listen to some of his lectures.
Simulate a population of neural networks as individuals and evolve their structure to find your optimal mix.
Written in pure Python and TensorFlow. Also buggy as hell, considering I wrote it in an afternoon.
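The evolutionary idea above can be sketched in a few lines of pure Python. This is a hypothetical toy, not the linked code: each "individual" is a per-unit assignment of activation names, selection keeps the fittest half, and mutation swaps one choice. The fitness function here is a stand-in (matching a pretend-optimal mix) where a real version would train and score each candidate network:

```python
import random

CHOICES = ["relu", "tanh", "sigmoid"]
TARGET = ["tanh", "relu", "relu", "sigmoid"]  # pretend-optimal mix (toy stand-in)

def fitness(genome):
    # Toy fitness: positions that match the pretend-optimal mix.
    # In practice this would be a validation score of the trained network.
    return sum(g == t for g, t in zip(genome, TARGET))

def mutate(genome):
    # Swap one randomly chosen activation for a random alternative.
    child = genome[:]
    i = random.randrange(len(child))
    child[i] = random.choice(CHOICES)
    return child

def evolve(pop_size=20, generations=50, seed=0):
    random.seed(seed)
    pop = [[random.choice(CHOICES) for _ in TARGET] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]          # elitist selection
        pop = survivors + [mutate(random.choice(survivors)) for _ in survivors]
    return max(pop, key=fitness)

best = evolve()
```

Crossover, architecture mutations (adding/removing units), and speciation are the usual next steps, but selection-plus-mutation alone is enough to show the loop.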
Enjoy