This project is a WEKA (Waikato Environment for Knowledge Analysis) compatible implementation of MODLEM, a machine learning algorithm that induces a minimal set of rules. These rules can be used as a classifier. MODLEM is a sequential covering algorithm designed to handle numeric data without prior discretization. Nominal and numeric attributes are treated in the same way: during rule induction, each attribute's value space is searched for the best rule condition. As a result, conditions on numeric attributes are more precise and describe the class more closely. The algorithm also draws on Rough Set Theory: a class definition can be described by its lower or upper approximation. For more information, see: Stefanowski, Jerzy. The rough set based rule induction technique for classification problems. In: Proc. 6th European Congress on Intelligent Techniques and Soft Computing, vol. 1. Aachen, 1998, pp. 109-113.
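The sequential covering idea described above can be sketched in a few lines of Python. This is a simplified illustration, not the code shipped in weka-modlem.jar: the function names (`matches`, `candidates`, `induce_rules`), the toy data, and the scoring rule (purity of the covered set, then number of newly covered positive examples) are assumptions made for this example. MODLEM itself evaluates conditions with measures such as class entropy and works on rough approximations, which are omitted here.

```python
from typing import Any, Dict, List, Set, Tuple

Example = Tuple[Dict[str, Any], str]   # (attribute values, class label)
Condition = Tuple[str, str, Any]       # (attribute, operator, value)


def matches(cond: Condition, x: Dict[str, Any]) -> bool:
    attr, op, val = cond
    if op == "<":
        return x[attr] < val
    if op == ">=":
        return x[attr] >= val
    return x[attr] == val  # op == "=="


def candidates(data: List[Example], covered: Set[int]) -> List[Condition]:
    """Candidate conditions built from the examples the rule still covers."""
    conds: List[Condition] = []
    for attr in data[0][0]:
        values = sorted({data[i][0][attr] for i in covered})
        if len(values) < 2:
            continue  # a single value cannot narrow the rule any further
        if all(isinstance(v, (int, float)) for v in values):
            # Numeric attribute: cut points midway between neighbouring values,
            # so no prior discretization is needed.
            for lo, hi in zip(values, values[1:]):
                cut = (lo + hi) / 2
                conds += [(attr, "<", cut), (attr, ">=", cut)]
        else:
            # Nominal attribute: one equality test per observed value.
            conds += [(attr, "==", v) for v in values]
    return conds


def induce_rules(data: List[Example], target: str) -> List[List[Condition]]:
    positives = {i for i, (_, y) in enumerate(data) if y == target}
    uncovered = set(positives)
    rules: List[List[Condition]] = []
    while uncovered:                       # outer loop: cover all positives
        rule: List[Condition] = []
        covered = set(range(len(data)))
        while not covered <= positives:    # inner loop: specialize the rule
            best, best_key, best_cov = None, (-1.0, -1), covered
            for cond in candidates(data, covered):
                cov = {i for i in covered if matches(cond, data[i][0])}
                gain = len(cov & uncovered)
                if gain == 0:
                    continue               # must keep an uncovered positive
                key = (len(cov & positives) / len(cov), gain)
                if key > best_key:
                    best, best_key, best_cov = cond, key, cov
            if best is None:
                break  # identical examples with different labels: give up
            rule.append(best)
            covered = best_cov
        rules.append(rule)
        uncovered -= covered
    return rules


# Toy data (hypothetical): one numeric and one nominal attribute.
data = [
    ({"temp": 36.0, "cough": "no"}, "healthy"),
    ({"temp": 37.0, "cough": "yes"}, "healthy"),
    ({"temp": 39.0, "cough": "yes"}, "sick"),
    ({"temp": 40.0, "cough": "no"}, "sick"),
]
rules = induce_rules(data, "sick")
print(rules)  # [[('temp', '>=', 38.0)]]
```

On this toy data a single condition on the numeric attribute separates the class, with the cut point placed between the observed values 37.0 and 39.0. The sketch skips the pruning of redundant conditions and rules that a minimal rule set would require.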
This version of MODLEM does not consider an important condition for developing decision rules from noisy datasets, as described in J. Stefanowski, "On Combined Classifiers, Rule Induction and Rough Sets", Transactions on Rough Sets VI, pp. 329-350.
Abstract.
In the paper we discuss inducing rule-based classifiers from imbalanced data, where one class (a minority class) is under-represented in comparison to the remaining classes (majority classes). To improve the ability of a classifier to recognize this class, we propose a new selective pre-processing approach that is applied to data before inducing a rule-based classifier. The approach combines selective filtering of the majority classes with focused over-sampling of the minority class. Results of a comparative experimental study show that our approach improves sensitivity for the minority class while preserving the ability of a classifier to recognize examples from the majority classes.
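To make the term "sensitivity for the minority class" concrete, here is a small Python sketch. The class labels and the 95/5 split are made up for illustration and do not come from the paper's experiments. It shows why overall accuracy can look good on imbalanced data while the minority class is missed entirely.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of all examples that were classified correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def sensitivity(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Fraction of examples of class `positive` that were recognized as such."""
    total = sum(1 for t in y_true if t == positive)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    return hits / total if total else 0.0


# Hypothetical imbalanced data: 95 majority examples, 5 minority examples.
y_true = ["majority"] * 95 + ["minority"] * 5
# A classifier that always predicts the majority class.
y_pred = ["majority"] * 100

acc = accuracy(y_true, y_pred)
sens = sensitivity(y_true, y_pred, "minority")
print(acc, sens)  # 0.95 0.0
```

The trivial classifier reaches 95% accuracy with zero sensitivity for the minority class, which is the failure mode the pre-processing approach in the abstract aims to reduce.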
Looking for the latest version? Download weka-modlem.jar (7.1 MB)
Introduction
Many real-life knowledge discovery problems involve learning from imbalanced data, which means that one of the classes (further called a minority class) includes a much smaller number of examples than the others (further referred to as majority classes). Moreover, examples from the minority class are usually of primary interest. Such a situation is typical for medical problems, where the number of patients requiring special attention (e.g., therapy or treatment) is much smaller than the number of patients who do not need it. Similar situations occur in other domains; in [4, 14] the following problems are reported: detecting fraud/intrusion, managing risk, detecting oil spills in satellite images, predicting technical equipment failures, and information filtering.
Learning methods usually do not work properly on imbalanced data as they are "somehow biased" to focus on the majority classes while "missing" examples from the minority class. As a result, created classifiers are also biased toward better recognition of the majority classes and they usually have difficulties (or even are unable) to classify correctly new objects from the minority class. This problem also affects rough set rule-based classifiers as elementary sets for the minority class are "weaker" than the ones for the majority classes and consequently rules generated on their basis have a lesser chance to contribute to the
See the Downloads section for installer and a plain zip file.