Abstract: Short text information flows carry potentially high-value information during transmission. Based on the characteristics of such flows, a decision tree model for hot topic detection is built using the information entropy of the training data set. First, the algorithm computes the average information content (entropy) of the topic categories and the information gain ratio of each characteristic word used to distinguish short texts. Then, at each node, the characteristic word with the maximum information gain ratio is selected as the test attribute, and the decision tree is constructed top-down. Finally, the hot topic is determined from the type of the leaf node that a text reaches. Experimental results on real short text information flows show that the proposed algorithm is more stable and faster than the algorithms it was compared with.
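The pipeline summarized in the abstract (entropy of the topic categories, gain ratio per characteristic word, top-down tree construction, decision at the leaf) can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: each short text is modeled as a set of words, each test is a binary "word present / word absent" split, and the gain ratio follows the standard C4.5 definition (information gain divided by split information). The function names, the labels `hot` and `other`, and the toy data are hypothetical and serve only as illustration; this is not the authors' implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of topic labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(docs, labels, word):
    """Gain ratio of splitting on presence/absence of `word` (C4.5 style)."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    parts = [p for p in (present, absent) if p]
    if len(parts) < 2:  # the word does not split the data at all
        return 0.0
    total = len(labels)
    conditional = sum(len(p) / total * entropy(p) for p in parts)
    gain = entropy(labels) - conditional
    split_info = -sum(len(p) / total * math.log2(len(p) / total)
                      for p in parts)
    return gain / split_info


def majority(labels):
    """Most frequent label, used as the leaf node type."""
    return Counter(labels).most_common(1)[0][0]


def build_tree(docs, labels, vocabulary):
    """Top-down construction: test the word with the maximum gain ratio."""
    if len(set(labels)) == 1 or not vocabulary:
        return majority(labels)
    word = max(vocabulary, key=lambda w: gain_ratio(docs, labels, w))
    if gain_ratio(docs, labels, word) <= 0.0:  # no informative word left
        return majority(labels)
    rest = [w for w in vocabulary if w != word]
    yes = [(d, y) for d, y in zip(docs, labels) if word in d]
    no = [(d, y) for d, y in zip(docs, labels) if word not in d]
    return {
        "word": word,
        "present": build_tree([d for d, _ in yes], [y for _, y in yes], rest),
        "absent": build_tree([d for d, _ in no], [y for _, y in no], rest),
    }


def classify(tree, doc):
    """Follow the tests down to a leaf; the leaf type is the decision."""
    while isinstance(tree, dict):
        tree = tree["present"] if tree["word"] in doc else tree["absent"]
    return tree


# Hypothetical toy data: each short text is a set of characteristic words.
docs = [
    {"earthquake", "rescue"},
    {"earthquake", "relief"},
    {"concert", "tickets"},
    {"sale", "discount"},
]
labels = ["hot", "hot", "other", "other"]
vocabulary = sorted(set().union(*docs))

tree = build_tree(docs, labels, vocabulary)
print(tree)
print(classify(tree, {"earthquake", "aftershock"}))
```

On this toy set, the word "earthquake" separates the two categories perfectly, so it is chosen as the root test and both children are leaves. Real short text flows would need tokenization, a much larger vocabulary, and a stopping rule or pruning step, none of which the abstract specifies.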