ABSTRACT: The early and accurate differential diagnosis of parkinsonian disorders remains a significant challenge for clinicians. In recent years, a number of studies have combined magnetic resonance imaging data with machine learning and statistical classifiers to differentiate successfully between forms of parkinsonism. However, several methodological questions remain about how to minimize bias and artefact-driven classification. In this study, we compared different approaches for feature selection, as well as different magnetic resonance imaging modalities, using well-matched patient groups and tight control of data quality issues related to patient motion. Our sample was drawn from a cohort of 69 healthy controls and patients with idiopathic Parkinson's disease (n = 35), progressive supranuclear palsy Richardson's syndrome (n = 52) and corticobasal syndrome (n = 36). Participants underwent standardized T1-weighted and diffusion-weighted magnetic resonance imaging. Strict data quality control and group matching reduced these numbers to 43 controls and to 32, 33 and 26 patients with Parkinson's disease, progressive supranuclear palsy and corticobasal syndrome, respectively. We compared two methods for feature selection and dimensionality reduction: whole-brain principal components analysis, and an anatomical region-of-interest based approach. In both cases, support vector machines were used to construct a statistical model for pairwise classification of healthy controls and patients. The accuracy of each model was estimated using leave-two-out cross-validation, as well as independent validation in a different set of subjects. Our cross-validation results suggest that using principal components analysis for feature extraction provides higher classification accuracies than a region-of-interest based approach. 
However, the differences between the two feature extraction methods were significantly reduced when an independent sample was used for validation, suggesting that the principal components analysis approach may be more vulnerable to overfitting with cross-validation. Both T1-weighted and diffusion magnetic resonance imaging data could be used to differentiate successfully between subject groups, with neither modality outperforming the other across all pairwise comparisons in the cross-validation analysis. However, features obtained from diffusion magnetic resonance imaging data resulted in significantly higher classification accuracies when an independent validation cohort was used. Overall, our results support the use of statistical classification approaches for differential diagnosis of parkinsonian disorders. However, classification accuracy can be affected by group size, age, sex and movement artefacts. With appropriate controls and out-of-sample cross-validation, diagnostic biomarker evaluation, including magnetic resonance imaging-based classifiers, may be an important adjunct to clinical evaluation.
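
The leave-two-out scheme for pairwise classification can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: that each fold holds out exactly one subject from each of the two groups, and that accuracy is pooled over all held-out subjects. The function names (leave_two_out_folds, nearest_centroid_predict, cross_validated_accuracy) and the toy data are hypothetical. The nearest-centroid rule is only a stand-in so that the example runs with the standard library; it is not the support vector machine used in the study. Any feature extraction step, such as principal components analysis, would need to be fitted inside fit_predict on the training subjects only, so that held-out subjects do not influence the features.

```python
from itertools import product


def leave_two_out_folds(labels):
    """Yield (train_idx, test_idx) pairs, holding out one subject per class.

    Assumption: a fold is one subject from each of the two groups.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("pairwise classification needs exactly two classes")
    idx_a = [i for i, y in enumerate(labels) if y == classes[0]]
    idx_b = [i for i, y in enumerate(labels) if y == classes[1]]
    for i, j in product(idx_a, idx_b):
        train = [k for k in range(len(labels)) if k != i and k != j]
        yield train, [i, j]


def nearest_centroid_predict(train_X, train_y, test_X):
    """Stand-in classifier (NOT the study's SVM): assign the closest class mean."""
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in test_X]


def cross_validated_accuracy(X, y, fit_predict=nearest_centroid_predict):
    """Pool correct predictions over every leave-two-out fold."""
    correct = total = 0
    for train, test in leave_two_out_folds(y):
        # The model only ever sees training subjects; feature extraction
        # (e.g. PCA) would belong inside fit_predict for the same reason.
        preds = fit_predict(
            [X[k] for k in train], [y[k] for k in train], [X[k] for k in test]
        )
        correct += sum(p == y[k] for p, k in zip(preds, test))
        total += len(test)
    return correct / total


# Toy, made-up feature vectors for two well-separated groups.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 1.1], [1.2, 0.9], [0.9, 1.0]]
y = ["control"] * 3 + ["patient"] * 3
print(len(list(leave_two_out_folds(y))), cross_validated_accuracy(X, y))
```

Under this assumed scheme the number of folds is the product of the two group sizes, so a comparison of 43 controls with 32 patients would give 43 × 32 = 1376 folds. Because every fold is drawn from the same cohort, a high pooled accuracy here does not replace testing on an independent set of subjects.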