Schooling, an archetype of collective behaviour, emerges from the interactions of fish responding to sensory information mediated by their aqueous environment. A fundamental and largely unexplored question in fish schooling concerns the role of hydrodynamics. Here, we investigate this question by modelling swimmers as vortex dipoles whose interactions are governed by the Biot–Savart law. When we equip these dipoles with behavioural rules from classical agent-based models, we find that, because of flow-mediated interactions, these rules do not lead robustly to schooling. We therefore propose swimmers equipped with adaptive decision-making, which adjust their gaits through a reinforcement learning algorithm in response to nonlinearly varying hydrodynamic loads. We demonstrate that these swimmers can maintain their relative positions within a formation by adapting their strength, and that they can school in a variety of prescribed geometrical arrangements. Furthermore, using evolutionary optimization, we identify schooling patterns that minimize individual and collective swimming effort. The present work suggests that the adaptive response of individual swimmers to flow-mediated interactions is critical in fish schooling.
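
To make the dipole picture concrete, the following is a minimal sketch of the classical two-dimensional point-vortex form of the Biot–Savart law, in which a swimmer is idealized as a pair of counter-rotating vortices. It is an illustration under stated assumptions, not the model used in this work: the two-dimensional point-vortex idealization, the function names `induced_velocity` and `make_dipole`, and the parameters `gamma` (circulation) and `ell` (vortex separation) are all hypothetical choices made for this example.

```python
import math


def induced_velocity(point, vortices):
    """Velocity (u, v) at `point` induced by 2D point vortices.

    Each vortex is a tuple (x, y, gamma), with gamma its circulation.
    Assumption: classical point-vortex Biot-Savart kernel in an
    unbounded, inviscid, two-dimensional fluid.
    """
    px, py = point
    u = 0.0
    v = 0.0
    for x, y, gamma in vortices:
        dx = px - x
        dy = py - y
        r2 = dx * dx + dy * dy
        if r2 == 0.0:
            # A point vortex does not advect itself.
            continue
        u += -gamma * dy / (2.0 * math.pi * r2)
        v += gamma * dx / (2.0 * math.pi * r2)
    return u, v


def make_dipole(cx, cy, heading, gamma, ell):
    """Hypothetical swimmer: two counter-rotating vortices a distance
    `ell` apart, centred at (cx, cy), placed so that the pair
    self-propels along the `heading` angle (radians)."""
    # Unit vector perpendicular to the heading, pointing to the left.
    nx = -math.sin(heading)
    ny = math.cos(heading)
    left = (cx + 0.5 * ell * nx, cy + 0.5 * ell * ny, gamma)
    right = (cx - 0.5 * ell * nx, cy - 0.5 * ell * ny, -gamma)
    return [left, right]


# An isolated dipole heading along +x: each vortex is advected by its
# partner at the classical speed gamma / (2 * pi * ell).
dipole = make_dipole(0.0, 0.0, 0.0, gamma=1.0, ell=0.1)
left_x, left_y, _ = dipole[0]
u, v = induced_velocity((left_x, left_y), dipole)
print(u, v)
```

In this idealization, placing several such dipoles in the same fluid and summing the induced velocities over all vortices is what couples the swimmers: each one is advected by the flow that the others generate.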