In this paper, we consider a linear supervised dimension reduction method for classification settings: stochastic discriminant analysis (SDA). The method matches similarities between points in the projection space with those in a response space. The similarities are obtained by transforming distances between points into joint probabilities, using a transformation that resembles Student’s t-distribution. The matching is done by minimizing the Kullback–Leibler divergence between the two probability distributions. We compare the performance of SDA against several state-of-the-art methods for supervised linear dimension reduction. In our experiments, the performance of SDA was often better than, and typically at least equal to, that of the compared methods. The experiments covered data sets of low, medium, and high dimensionality with widely varying numbers of samples, including both sparse and dense data. When the data set contains several classes, the low-dimensional projections computed with SDA often yield higher classification accuracies than those of the compared methods.
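The matching criterion summarized above can be illustrated with a small numerical sketch. This is not the authors' implementation. It assumes a kernel of the form 1 / (1 + squared distance) for the projection-space similarities, and it assumes a label-based target distribution in which same-class pairs receive weight 1 and all other pairs a small constant `eps`; the abstract only states that the target similarities live in a "response space". All function names and the toy data are hypothetical.

```python
import numpy as np


def t_joint_probabilities(Z):
    """Joint probabilities over point pairs from a Student-t-like kernel.

    Assumed form: q_ij proportional to 1 / (1 + ||z_i - z_j||^2), with q_ii = 0.
    """
    sq_norms = np.sum(Z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (Z @ Z.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off negatives
    kernel = 1.0 / (1.0 + sq_dists)
    np.fill_diagonal(kernel, 0.0)         # a point is not paired with itself
    return kernel / kernel.sum()


def label_joint_probabilities(y, eps=1e-12):
    """Target joint probabilities built from class labels (an assumption)."""
    y = np.asarray(y)
    same_class = y[:, None] == y[None, :]
    weights = np.where(same_class, 1.0, eps)
    np.fill_diagonal(weights, 0.0)
    return weights / weights.sum()


def kl_divergence(P, Q):
    """Kullback-Leibler divergence KL(P || Q) over entries where P > 0."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / Q[mask])))


def sda_like_objective(W, X, y):
    """Objective value for a linear projection Z = X W (no regularization)."""
    Z = X @ W
    return kl_divergence(label_joint_probabilities(y), t_joint_probabilities(Z))


# Toy data: the two classes are separated along the first coordinate only.
X = np.array([[0.0, 0.0], [0.1, 5.0], [5.0, 0.2], [5.1, 5.1]])
y = np.array([0, 0, 1, 1])

W_separating = np.array([[1.0], [0.0]])  # keeps the class-separating axis
W_mixing = np.array([[0.0], [1.0]])      # keeps the uninformative axis

loss_separating = sda_like_objective(W_separating, X, y)
loss_mixing = sda_like_objective(W_mixing, X, y)
print(loss_separating < loss_mixing)
```

In this toy setting, the projection that keeps the class-separating coordinate attains the lower divergence, which is the behavior a minimizer of this kind of objective would favor. A full method would optimize `W` with a gradient-based routine rather than compare two fixed candidates.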