Data segmentation is one of the critical issues in model selection for parallel/distributed machine learning, as it affects both the generalization performance and the parallel efficiency of the resulting models. Existing approaches to data segmentation rely on empirical evidence or on the number of processors, without an explicit criterion. In this paper, we propose a parallel-efficiency-sensitive data segmentation criterion with a generalization-theoretic guarantee, which improves the computational efficiency of parallel/distributed machine learning while retaining test accuracy. We first derive an upper bound on the generalization error with respect to the number of blocks in the data segmentation. We then present a data segmentation criterion that trades off generalization error against parallel efficiency. Finally, we implement large-scale Gaussian kernel support vector machines (SVMs) in the random Fourier feature space under the alternating direction method of multipliers (ADMM) framework on high-performance computing clusters, adopting the proposed data segmentation criterion. Experimental results on several large-scale benchmark datasets show that the proposed data segmentation criterion is effective and efficient for large-scale SVMs.
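To make the setting concrete, the sketch below illustrates the general pipeline the abstract refers to: a Gaussian kernel approximated with standard random Fourier features and a linear SVM trained by consensus ADMM over data blocks. It is a minimal illustration, not the authors' implementation; the block count p, feature dimension D, kernel width gamma, squared-hinge local loss, and the small number of gradient steps per subproblem are all illustrative assumptions, and the block count would in practice be chosen by the proposed segmentation criterion.

```python
# Illustrative sketch only: RFF-approximated Gaussian-kernel SVM trained with
# consensus ADMM over p data blocks. Hyperparameters and the local solver are
# assumptions for demonstration, not the paper's actual configuration.
import numpy as np

def rff_map(X, W, b):
    """Random Fourier features approximating the Gaussian kernel exp(-gamma||x-x'||^2)."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def consensus_admm_svm(X, y, p=4, D=200, gamma=0.1, C=1.0, rho=1.0, iters=50):
    n, d = X.shape
    rng = np.random.default_rng(0)
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, D))  # spectral samples of the Gaussian kernel
    b = rng.uniform(0, 2 * np.pi, size=D)
    Z = rff_map(X, W, b)                                   # explicit feature map
    blocks = np.array_split(np.arange(n), p)               # data segmentation into p blocks
    w = np.zeros((p, D)); z = np.zeros(D); u = np.zeros((p, D))
    for _ in range(iters):
        for k, idx in enumerate(blocks):
            Zk, yk = Z[idx], y[idx]
            wk = w[k]
            # local squared-hinge subproblem, solved approximately by a few gradient steps
            for _ in range(10):
                margin = np.maximum(1 - yk * (Zk @ wk), 0)
                grad = -(Zk * (yk * margin)[:, None]).sum(0) / len(idx) \
                       + rho * (wk - z + u[k])
                wk = wk - 0.1 * grad
            w[k] = wk
        # consensus update with l2 shrinkage from the SVM regularizer (lambda = 1/C)
        z = (w + u).mean(axis=0) * (C * rho * p) / (1.0 + C * rho * p)
        u += w - z
    return z, W, b

# Toy usage on synthetic data.
X = np.random.randn(1000, 5)
y = np.sign(np.random.randn(1000))
z, W, b = consensus_admm_svm(X, y)
pred = np.sign(rff_map(X, W, b) @ z)
```

In this sketch the only coupling between blocks is the averaged consensus variable z, which is why the choice of p directly trades off parallel efficiency (smaller, cheaper local subproblems) against statistical behavior, the trade-off the proposed criterion is designed to balance.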