We propose a novel response‐adaptive randomization procedure for multi‐armed trials with continuous outcomes that are assumed to be normally distributed. The proposed rule is non‐myopic and oriented toward a patient benefit objective, yet remains computationally feasible. We derive the response‐adaptive algorithm from the Gittins index for the multi‐armed bandit problem, as a modification of the method first introduced in Villar et al. (Biometrics, 71, pp. 969‐978). The resulting procedure can be implemented under the assumption of either known or unknown variance. We illustrate the proposed procedure by simulations in the context of phase II cancer trials. Our results show that, in a multi‐armed setting, using a response‐adaptive allocation procedure with a continuous endpoint instead of a binary one yields gains in both efficiency and patient benefit. These gains persist even if an anticipated low rate of missing data due to deaths, dropouts, or complete responses is imputed online through a procedure first introduced in this paper. Additionally, we discuss response‐adaptive designs that outperform the traditional equal randomization design in terms of both efficiency and patient benefit measures in the multi‐armed trial context.