Feature Selection Toolbox FST3 Library / Documentation

demo55.cpp File Reference

Example 55: Evaluating Similarity of Two Feature Selection Processes.

#include <boost/smart_ptr.hpp>
#include <exception>
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>
#include "error.hpp"
#include "global.hpp"
#include "subset.hpp"
#include "data_intervaller.hpp"
#include "data_splitter.hpp"
#include "data_splitter_cv.hpp"
#include "data_splitter_randrand.hpp"
#include "data_scaler.hpp"
#include "data_scaler_void.hpp"
#include "data_accessor_splitting_memTRN.hpp"
#include "data_accessor_splitting_memARFF.hpp"
#include "criterion_normal_bhattacharyya.hpp"
#include "criterion_wrapper.hpp"
#include "distance_L1.hpp"
#include "classifier_knn.hpp"
#include "seq_step_straight.hpp"
#include "search_seq_sffs.hpp"
#include "search_seq_dos.hpp"
#include "result_tracker_stabileval.hpp"

Functions

int main ()

Detailed Description

Example 55: Evaluating Similarity of Two Feature Selection Processes.


Function Documentation

int main (  ) 

Example 55: Evaluating Similarity of Two Feature Selection Processes.

To study how feature preferences differ among principally different feature selection methods, or among differently parametrized instances of the same method, FST3 provides measures that evaluate the level of similarity between two sets of trials (Somol and Novovicova, IEEE TPAMI, 2010). In analogy to stability evaluation (see Example 54: Feature Selection Stability Evaluation), a series of trials is conducted for each of the two feature selection scenarios on various samplings of the same data. In this example, ten feature selection trials are performed per scenario, each on a randomly sampled 95% of the data. In the first scenario, the resulting subset in each trial is obtained using the DOS procedure, optimizing 3-Nearest Neighbour accuracy estimated by means of 3-fold cross-validation. In the second scenario, the resulting subset in each trial is obtained using the SFFS procedure, maximizing the Bhattacharyya distance based on the normal model.

A selection of standard stability measures is evaluated separately for each of the two scenarios. Finally, the similarity of the two scenarios is evaluated using analogously founded similarity measures. All measures yield values from [0,1], where values close to 0 denote low stability/similarity and values close to 1 denote high stability/similarity. Note that in this experiment the inter-measures (IATI, ICW, IANHI) yield markedly lower values than the corresponding stability measures (ATI, CW, ANHI). This illustrates that considerably different results can be expected from differently founded feature selection methods.
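To make the relation between a stability measure and its inter-measure counterpart more concrete, the following is a minimal standalone sketch, not FST3 code. It assumes that ATI is the mean Tanimoto (Jaccard) index over all pairs of subsets within one scenario, and that IATI is the mean of the same index over all cross pairs between two scenarios. The names `Subset`, `Trials`, `tanimoto`, `ati` and `iati` are hypothetical and do not correspond to the FST3 result tracker interface.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper types for illustration only (not part of FST3).
using Subset = std::set<int>;        // indices of the selected features
using Trials = std::vector<Subset>;  // one selected subset per trial

// Tanimoto (Jaccard) index: |A and B| / |A or B|.
// Two empty subsets are treated as identical (index 1).
double tanimoto(const Subset& a, const Subset& b) {
    std::size_t common = 0;
    for (int f : a) common += b.count(f);
    const std::size_t uni = a.size() + b.size() - common;
    return uni == 0 ? 1.0 : static_cast<double>(common) / uni;
}

// Assumed ATI: mean Tanimoto index over all unordered pairs of
// subsets within one scenario. A single trial is trivially stable.
double ati(const Trials& s) {
    double sum = 0.0;
    std::size_t pairs = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        for (std::size_t j = i + 1; j < s.size(); ++j) {
            sum += tanimoto(s[i], s[j]);
            ++pairs;
        }
    }
    return pairs == 0 ? 1.0 : sum / pairs;
}

// Assumed IATI: mean Tanimoto index over all cross pairs, taking one
// subset from each of the two scenarios. Both must hold at least one trial.
double iati(const Trials& s1, const Trials& s2) {
    assert(!s1.empty() && !s2.empty());
    double sum = 0.0;
    for (const Subset& a : s1)
        for (const Subset& b : s2)
            sum += tanimoto(a, b);
    return sum / (s1.size() * s2.size());
}
```

For instance, if one scenario selects {1,2} in both of its trials and the other selects {2,3,4} in both of its trials, each scenario is perfectly stable under this sketch (ATI of 1), while the cross-scenario value is only 0.25. This mirrors the effect described above, where the inter-measures come out lower than the per-scenario stability measures.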

References FST::Search_SFFS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::search(), FST::Search_DOS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::search(), FST::Search_DOS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::set_delta(), and FST::Search_SFFS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::set_search_direction().


Generated on Thu Mar 31 11:36:44 2011 for FST3 Library by doxygen 1.6.1