Machine Learning on Non-Homogeneous, Distributed Text Data - PhD thesis by Dunja Mladenic, University of Ljubljana, Slovenia. Proposes an approach to automatic document categorization based on a large categorization hierarchy. www.cs.cmu.edu/~TextLearning/pww/PhD.html