The paper proposes a unifying view of several problems of interest to the RL community (reward maximization, pure exploration, risk-averse RL). It presents a generic Policy Gradient Theorem and studies the convergence of the corresponding policy gradient ascent, which is an important contribution.