March 12, 1999
School of Information Technology and Engineering
University of Ottawa
Machine Learning with Skewed Class Distributions
Many real-world concept learning applications involve detecting rare events. Datasets for such applications are highly skewed, with positive examples (the events of interest) far outnumbered by negative examples. This severe imbalance often causes existing concept learning systems to perform poorly. In this talk I will report on my current investigations into this phenomenon, which focus on the nearest neighbour learning algorithm (IB1).