Feature Selection Toolbox FST3 Library / Documentation

demo52t.cpp File Reference

Example 52t: (Threaded SFRS) Result regularization using secondary criterion.

#include <boost/smart_ptr.hpp>
#include <exception>
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>
#include "error.hpp"
#include "global.hpp"
#include "subset.hpp"
#include "data_intervaller.hpp"
#include "data_splitter.hpp"
#include "data_splitter_5050.hpp"
#include "data_splitter_cv.hpp"
#include "data_scaler.hpp"
#include "data_scaler_void.hpp"
#include "data_accessor_splitting_memTRN.hpp"
#include "data_accessor_splitting_memARFF.hpp"
#include "criterion_wrapper.hpp"
#include "criterion_subsetsize.hpp"
#include "criterion_negative.hpp"
#include "distance_euclid.hpp"
#include "classifier_knn.hpp"
#include "seq_step_straight_threaded.hpp"
#include "search_seq_sfrs.hpp"
#include "result_tracker_regularizer.hpp"

Functions

int main ()

Detailed Description

Example 52t: (Threaded SFRS) Result regularization using secondary criterion.


Function Documentation

int main (  ) 

Example 52t: (Threaded SFRS) Result regularization using secondary criterion.

Feature selection is known to be prone to over-fitting. As with over-trained classifiers, over-selected feature subsets may generalize poorly, which can seriously degrade the behavior of the model or decision rule on previously unseen data. It has been suggested (Raudys: Feature Over-Selection, LNCS 4109, 2006, or Somol et al., ICPR 2010) that preferring a subset with a slightly-worse-than-maximal criterion value can actually improve generalization. FST3 makes this possible through result tracking and subsequent selection of an alternative solution by maximizing a secondary criterion.

This example shows a feature selection process based on a 3-Nearest Neighbor wrapper, in which the final result is chosen from a group of solutions close enough to the achieved maximum so as to optimize the secondary criterion. The group of candidate solutions is defined by a user-selected margin (the permitted difference of the primary criterion value from the known maximum). Here we show that even the simplest secondary criterion (a mere preference for smaller subsets) can improve classification accuracy on previously unknown data.
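The margin-based choice described above can be illustrated with a small standalone sketch. It does not use the FST3 API: the `TrackedResult` struct and the `subset_size` and `pick_regularized` functions are hypothetical names introduced only for this illustration, and the tie-breaking rule (higher primary value among equally small subsets) is an assumption. The sketch assumes C++11 and that tracked results are already available as pairs of primary criterion value and feature mask.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record of one evaluated subset (not an FST3 type).
struct TrackedResult {
    double primary;              // primary criterion value, e.g. wrapper accuracy
    std::vector<bool> selected;  // feature mask; true means the feature is selected
};

// Counts the selected features in a mask.
inline std::size_t subset_size(const std::vector<bool>& mask) {
    std::size_t n = 0;
    for (bool b : mask) {
        if (b) ++n;
    }
    return n;
}

// Returns the index of the regularized choice: among all results whose
// primary value lies within `margin` of the best primary value, the one
// with the smallest subset (secondary criterion). Ties in size are broken
// by the higher primary value (an assumption of this sketch).
// Returns results.size() if `results` is empty.
inline std::size_t pick_regularized(const std::vector<TrackedResult>& results,
                                    double margin) {
    assert(margin >= 0.0);
    const std::size_t none = results.size();
    if (results.empty()) return none;

    // Step 1: find the maximum of the primary criterion.
    double best = results[0].primary;
    for (const TrackedResult& r : results) {
        if (r.primary > best) best = r.primary;
    }

    // Step 2: among candidates inside the margin, prefer smaller subsets.
    std::size_t chosen = none;
    for (std::size_t i = 0; i < results.size(); ++i) {
        const TrackedResult& r = results[i];
        if (r.primary < best - margin) continue;  // outside the margin
        if (chosen == none) { chosen = i; continue; }
        const TrackedResult& c = results[chosen];
        const std::size_t sr = subset_size(r.selected);
        const std::size_t sc = subset_size(c.selected);
        if (sr < sc || (sr == sc && r.primary > c.primary)) chosen = i;
    }
    return chosen;
}
```

With a margin of zero the sketch simply returns a best-scoring subset; widening the margin trades a small amount of primary criterion value for a smaller subset, which is the regularizing effect the example aims to demonstrate.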

References FST::Search_SFRS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::search(), FST::Search< RETURNTYPE, DIMTYPE, SUBSET, CRITERION >::set_output_detail(), and FST::Search_SFRS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::set_search_direction().


Generated on Thu Mar 31 11:36:36 2011 for FST3Library by doxygen 1.6.1