Nadam (Nesterov-accelerated Adaptive Moment Estimation) combines NAG (Nesterov accelerated gradient) and Adam. To do so, Adam's momentum term $\hat{m}_t$ has to be modified: instead of applying the bias-corrected momentum estimate of the previous step, the update looks ahead by applying the momentum estimate of the current step. For more information, check out the paper ('Incorporating Nesterov Momentum into Adam') or the Nadam section of 'An overview of gradient descent optimization algorithms'.
The final update rule looks as follows:
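As given in the Nadam section of 'An overview of gradient descent optimization algorithms', where $\hat{m}_t$ and $\hat{v}_t$ are Adam's bias-corrected first- and second-moment estimates, $g_t$ is the gradient at step $t$, and $\eta$ is the learning rate:

$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon} \left( \beta_1 \hat{m}_t + \frac{(1 - \beta_1)\, g_t}{1 - \beta_1^t} \right)$$

Compared to Adam's rule $\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon} \hat{m}_t$, the bias-corrected momentum of the previous step has been replaced by that of the current step, which is where the Nesterov look-ahead enters.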
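For concreteness, here is a minimal NumPy sketch of a single Nadam step implementing the rule above; the function name `nadam_step`, its signature, and the default hyperparameters are illustrative choices, not from the source:

```python
import numpy as np

def nadam_step(theta, g, m, v, t, lr=0.002, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Nadam update. t is the 1-based step count."""
    # Adam's exponential moving averages of the gradient and squared gradient
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    # Bias-corrected moment estimates
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    # Nadam's modified momentum term: the Nesterov look-ahead blends the
    # current bias-corrected momentum with the bias-corrected gradient
    m_bar = beta1 * m_hat + (1 - beta1) * g / (1 - beta1**t)
    theta = theta - lr * m_bar / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = ||theta||^2, whose gradient is 2 * theta
theta, m, v = np.array([1.0, -2.0]), np.zeros(2), np.zeros(2)
for t in range(1, 101):
    g = 2 * theta
    theta, m, v = nadam_step(theta, g, m, v, t)
```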