ABSTRACT: Motor training with the upper limb affected by stroke partially reverses the loss of cortical representation after lesion and has been proposed to increase spontaneous arm use. Moreover, repeated attempts to use the affected hand in daily activities create a form of practice that can potentially lead to further improvement in motor performance. We thus hypothesized that if motor retraining after stroke increases spontaneous arm use sufficiently, then the patient will enter a virtuous circle in which spontaneous arm use and motor performance reinforce each other. In contrast, if the dose of therapy is not sufficient to bring spontaneous use above threshold, then performance will not increase and the patient will further develop compensatory strategies with the less affected hand. To refine this hypothesis, we developed a computational model of bilateral hand use in arm reaching to study the interactions between adaptive decision making and motor relearning after motor cortex lesion. The model contains a left and a right motor cortex, each controlling the opposite arm, and a single action choice module. The action choice module learns, via reinforcement learning, the value of using each arm for reaching in specific directions. Each motor cortex uses a neural population code to specify the initial direction along which the contralateral hand moves towards a target. The motor cortex learns to minimize directional errors and to maximize neuronal activity for each movement. The derived learning rule accounts for the reversal of the loss of cortical representation after rehabilitation and the increase of this loss after stroke with insufficient rehabilitation. 
Further, our model exhibits nonlinear and bistable behavior: if natural recovery, motor training, or both bring performance above a certain threshold, then training can be stopped, as the repeated spontaneous arm use provides a form of motor learning that further bootstraps performance and spontaneous use. Below this threshold, motor training is "in vain": there is little spontaneous arm use after training, the model exhibits learned nonuse, and compensatory movements with the less affected hand are reinforced. By exploring the nonlinear dynamics of stroke recovery with a biologically plausible neural model that accounts for both the reversal of the loss of motor cortex representation following rehabilitation and the growth of that loss when rehabilitation is insufficient, we can explain previously hard-to-reconcile data on spontaneous arm use in stroke recovery. Further, our threshold prediction could be tested with an adaptive train-wait-train paradigm: if spontaneous arm use has increased during the "wait" period, then the threshold has been reached and rehabilitation can be stopped. If spontaneous arm use is still low or has decreased, then another bout of rehabilitation should be provided.
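The threshold behavior and the train-wait-train test described above can be illustrated with a deliberately minimal sketch. This is not the neural model summarized in the abstract: it replaces the motor cortices and the reinforcement-learning choice module with a single scalar "performance" variable and a logistic choice rule. All names and parameter values (`LEARN_RATE`, `DECAY_RATE`, `CHOICE_GAIN`, `CHOICE_MIDPOINT`, the bout and wait lengths) are assumptions chosen only to make the bistability visible.

```python
import math

# All constants below are illustrative assumptions, not values from the model.
LEARN_RATE = 0.05       # performance gain per step of affected-arm use
DECAY_RATE = 0.02       # performance loss per step of nonuse
CHOICE_GAIN = 15.0      # steepness of the arm-choice rule
CHOICE_MIDPOINT = 0.5   # performance at which both arms are chosen equally often


def spontaneous_use(performance):
    """Probability of choosing the affected arm (logistic stand-in for action choice)."""
    return 1.0 / (1.0 + math.exp(-CHOICE_GAIN * (performance - CHOICE_MIDPOINT)))


def step(performance, forced_use=False):
    """Advance one time step; therapy is modeled as forced use of the affected arm."""
    use = 1.0 if forced_use else spontaneous_use(performance)
    gain = LEARN_RATE * use * (1.0 - performance)      # use-dependent learning
    loss = DECAY_RATE * (1.0 - use) * performance      # loss through nonuse
    return performance + gain - loss


def run(performance, n_steps, forced_use=False):
    """Iterate the update n_steps times and return the final performance."""
    for _ in range(n_steps):
        performance = step(performance, forced_use)
    return performance


def train_wait_train(performance, bout=5, wait=5, max_bouts=20):
    """Repeat therapy bouts until spontaneous use rises during the wait period."""
    for bouts in range(1, max_bouts + 1):
        performance = run(performance, bout, forced_use=True)
        use_before = spontaneous_use(performance)
        performance = run(performance, wait)
        if spontaneous_use(performance) > use_before:
            break  # use increased without therapy: threshold reached
    return performance, bouts


if __name__ == "__main__":
    start = 0.2  # assumed post-stroke performance on a 0-1 scale
    for sessions in (3, 15):
        final = run(run(start, sessions, forced_use=True), 500)
        print(sessions, "sessions -> long-term use", round(spontaneous_use(final), 3))
    print("train-wait-train (performance, bouts):", train_wait_train(start))
```

In this toy system a short course of therapy leaves performance below the unstable threshold, so spontaneous use collapses toward nonuse, whereas a longer course pushes performance past the threshold and use then sustains itself. The `train_wait_train` loop mirrors the proposed adaptive paradigm by using the direction of change in spontaneous use during the wait period as the stopping criterion.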