Implements a voting-ensemble selection step for sequential-search methods, intended to improve the robustness and stability of the feature selection result.
#include <seq_step_ensemble.hpp>
Classes
class | SubsetCandidate |
Nested class to hold feature/subset candidate info in the course of the ensemble voting process.
Public Types
typedef Sequential_Step < RETURNTYPE, DIMTYPE, SUBSET, CRITERION > | parent |
typedef boost::shared_ptr < CRITERION > | PCriterion |
typedef boost::shared_ptr < Criterion< RETURNTYPE, SUBSET > > | PAbstractCriterion |
typedef std::vector < PAbstractCriterion > | PAbstractCriteria |
typedef boost::shared_ptr < PAbstractCriteria > | PEnsembleCriteria |
typedef boost::shared_ptr< SUBSET > | PSubset |
Public Member Functions
Sequential_Step_Ensemble (const PEnsembleCriteria ensemble)
bool | evaluate_candidates (RETURNTYPE &result, const PSubset sub, const PCriterion crit, const DIMTYPE _generalization_level=1, std::ostream &os=std::cout) |
Chooses among the candidate subsets offered by sub->get*CandidateSubset().
virtual std::ostream & | print (std::ostream &os) const |
Protected Types
typedef boost::shared_ptr < SubsetCandidate > | PSubsetCandidate |
typedef std::list < SubsetCandidate > | CANDIDATELIST |
typedef boost::shared_ptr < CANDIDATELIST > | PCANDIDATELIST |
typedef std::vector < PCANDIDATELIST > | CANDIDATELISTS |
typedef std::map< DIMTYPE, PSubsetCandidate > | FINALVALUES |
Protected Member Functions
bool | order_candidates (const PSubset sub) |
bool | test_candidate (const PSubset sub) |
Protected Attributes
PEnsembleCriteria | _ensemble |
CANDIDATELISTS | _lists |
FINALVALUES | _final |
PAbstractCriteria::iterator | citer |
CANDIDATELISTS::iterator | oiter |
CANDIDATELIST::iterator | iter |
FINALVALUES::iterator | fiter |
Private Attributes
boost::scoped_ptr< SUBSET > | tmp_step_sub |
bool | track |
Criteria ensembles may improve the generalization and stability properties of the selected feature subset, see "P. Somol, J. Grim, and P. Pudil. Criteria Ensembles in Feature Selection. In Proc. MCS, LNCS 5519, pages 304–313. Springer, 2009." The idea is to reduce possible over-training, i.e., excessive adjustment of the result to the properties of one particular criterion. By employing several different criteria, the result is likely to be more robust across different contexts.

Criteria ensembles as implemented here are based on voting on feature preferences. In each sequential algorithm step, feature candidates are ordered separately according to each considered criterion. The resulting orderings are then joined (by averaging each feature's position index) to produce the final feature ordering. The best feature is then selected for addition to the current working subset.

Note that this mechanism allows completely unrelated criteria to be used, because the criterion values themselves are never combined; the only information that is combined is the position index in the ordered feature lists. This is advantageous because it enables, for example, combinations of filter and wrapper criteria, which yield values from different intervals.

Note, however, that the value obtained as the result of ensemble evaluation is not usable for assessing the whole feature subset. It can be used only within the selection step to identify one feature (extensible to a feature c-tuple) for the next inclusion or removal. For this reason it is also necessary to employ a single criterion that the selection algorithm uses to evaluate current subsets (and thus to direct the next search steps). This single criterion is used in the same way as in other FST3 Sequential_Step implementations, i.e., it must be passed in the evaluate_candidates call. The criteria that form the ensemble are passed to the Sequential_Step_Ensemble constructor.
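
The position-index voting described above can be illustrated with a small standalone sketch. It does not use the FST3 classes and is not the library's implementation: `ScoreFn`, `rank_positions` and `vote_best_candidate` are hypothetical names introduced only for illustration, each criterion is assumed to return a score where higher is better, and ties are assumed to be broken in favour of the lowest candidate index.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical stand-in for one ensemble criterion (not an FST3 type):
// maps a candidate feature index to a score, higher meaning better.
using ScoreFn = std::function<double(std::size_t)>;

// Returns, for each candidate, its 0-based position in the ordering
// induced by `score` (the best candidate gets position 0).
std::vector<std::size_t> rank_positions(std::size_t n_candidates,
                                        const ScoreFn& score) {
    std::vector<double> values(n_candidates);
    for (std::size_t f = 0; f < n_candidates; ++f) values[f] = score(f);

    // Sort candidate indices by descending score; stable_sort keeps
    // equal-scored candidates in index order.
    std::vector<std::size_t> order(n_candidates);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&values](std::size_t a, std::size_t b) {
                         return values[a] > values[b];
                     });

    std::vector<std::size_t> position(n_candidates);
    for (std::size_t p = 0; p < n_candidates; ++p) position[order[p]] = p;
    return position;
}

// Sums the position indices over all criteria and returns the candidate
// with the smallest sum, which is also the smallest average position.
// Only positions are combined; raw criterion values never are.
// Assumes n_candidates > 0.
std::size_t vote_best_candidate(std::size_t n_candidates,
                                const std::vector<ScoreFn>& ensemble) {
    std::vector<std::size_t> position_sum(n_candidates, 0);
    for (const ScoreFn& criterion : ensemble) {
        const std::vector<std::size_t> pos =
            rank_positions(n_candidates, criterion);
        for (std::size_t f = 0; f < n_candidates; ++f) {
            position_sum[f] += pos[f];
        }
    }
    return static_cast<std::size_t>(std::distance(
        position_sum.begin(),
        std::min_element(position_sum.begin(), position_sum.end())));
}
```

Because only positions are summed, one criterion may return values in [0, 1] and another values in the hundreds without either dominating the vote. This is the property that makes mixing filter and wrapper criteria possible.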