Feature Selection Toolbox FST3 Library / Documentation

demo32t.cpp File Reference

Example 32t: Threaded individual ranking (BIF) with SVM wrapper in very high-dimensional feature selection.

#include <boost/smart_ptr.hpp>
#include <exception>
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>
#include "error.hpp"
#include "global.hpp"
#include "subset.hpp"
#include "data_intervaller.hpp"
#include "data_splitter.hpp"
#include "data_splitter_cv.hpp"
#include "data_splitter_randrand.hpp"
#include "data_scaler.hpp"
#include "data_scaler_void.hpp"
#include "data_accessor_splitting_memTRN.hpp"
#include "data_accessor_splitting_memARFF.hpp"
#include "criterion_multinom_bhattacharyya.hpp"
#include "criterion_wrapper.hpp"
#include "classifier_svm.hpp"
#include "search_bif_threaded.hpp"
#include "seq_step_straight.hpp"
#include "search_seq_os.hpp"

Functions

int main ()

Detailed Description

Example 32t: Threaded individual ranking (BIF) with SVM wrapper in very high-dimensional feature selection.


Function Documentation

int main (  ) 

Example 32t: Threaded individual ranking (BIF) with SVM wrapper in very high-dimensional feature selection

Very high-dimensional feature selection is applied, e.g., in text categorization, where dimensionality is on the order of 10000 or 100000. Individual feature ranking (Best Individual Feature, BIF) is the most commonly applied approach because of its key advantages: speed and high stability. In this example we illustrate a less common but effective approach based on a Support Vector Machine feature selection wrapper. A randomly sampled 50% of the data is used in the actual feature selection process; another, disjoint 40% of the data is randomly sampled for testing. The selected subset is finally validated: an SVM classifier is trained on the training data restricted to the selected subspace, and its classification accuracy is then estimated on the test data.
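The procedure described above can be sketched in plain C++ without the FST3 classes. The sketch below is illustrative only: `Split`, `random_disjoint_split`, and `bif_select` are hypothetical names that do not exist in FST3, and the criterion is passed in as a generic callback rather than the SVM wrapper the demo uses. It shows the two ideas from the paragraph: a random, disjoint train/test partition, and ranking features by scoring each one on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper (not part of FST3): sample indices of a random,
// disjoint train/test partition covering the given fractions of n samples.
struct Split {
    std::vector<std::size_t> train;
    std::vector<std::size_t> test;
};

Split random_disjoint_split(std::size_t n, double train_frac,
                            double test_frac, unsigned seed) {
    assert(train_frac >= 0.0 && test_frac >= 0.0 &&
           train_frac + test_frac <= 1.0);
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(idx.begin(), idx.end(), rng);
    // Take the first part of the shuffled order for training and the
    // next part for testing, so the two sets cannot overlap.
    const std::size_t n_train =
        std::min(static_cast<std::size_t>(train_frac * n), n);
    const std::size_t n_test =
        std::min(static_cast<std::size_t>(test_frac * n), n - n_train);
    Split s;
    s.train.assign(idx.begin(), idx.begin() + n_train);
    s.test.assign(idx.begin() + n_train, idx.begin() + n_train + n_test);
    return s;
}

// Best Individual Feature ranking: evaluate the criterion on every single
// feature and return the indices of the d highest-scoring features.
// The criterion callback stands in for a wrapper such as SVM accuracy.
std::vector<std::size_t> bif_select(
    std::size_t n_features, std::size_t d,
    const std::function<double(std::size_t)>& criterion) {
    typedef std::pair<double, std::size_t> Scored;
    std::vector<Scored> scored;
    scored.reserve(n_features);
    for (std::size_t f = 0; f < n_features; ++f)
        scored.emplace_back(criterion(f), f);
    d = std::min(d, n_features);
    // Higher score first; ties are broken by the lower feature index.
    std::partial_sort(scored.begin(), scored.begin() + d, scored.end(),
                      [](const Scored& a, const Scored& b) {
                          return a.first > b.first ||
                                 (a.first == b.first && a.second < b.second);
                      });
    std::vector<std::size_t> selected;
    selected.reserve(d);
    for (std::size_t i = 0; i < d; ++i)
        selected.push_back(scored[i].second);
    return selected;
}
```

With these helpers, `random_disjoint_split(n, 0.5, 0.4, seed)` would mirror the 50%/40% partition of the demo, and `bif_select` would be called with a criterion that is evaluated on the training part only. Because every feature is scored independently, the cost grows linearly with the number of features, which is what makes ranking practical at this dimensionality.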

Warning:
Threaded search on very high-dimensional data based on a complex wrapper, as presented here, may need a large amount of memory.
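One plausible way to picture the memory cost is sketched below, under the assumption that every worker thread holds its own criterion instance; this page does not state how FST3 manages criterion state internally. The function `score_features_threaded` and its `make_criterion` factory are hypothetical names, not FST3 API, and the code uses only the C++ standard library.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical sketch (not FST3 code): score all features in parallel.
// Each worker builds its own criterion through make_criterion(), so the
// criterion's working memory is held once per thread.
std::vector<double> score_features_threaded(
    std::size_t n_features, unsigned n_threads,
    const std::function<std::function<double(std::size_t)>()>& make_criterion) {
    std::vector<double> scores(n_features, 0.0);
    n_threads = std::max(1u, n_threads);
    std::vector<std::thread> workers;
    workers.reserve(n_threads);
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&, t] {
            // Private criterion instance for this thread.
            std::function<double(std::size_t)> criterion = make_criterion();
            // Strided assignment: thread t scores features t, t+T, t+2T, ...
            // Threads write to distinct elements, so no locking is needed.
            for (std::size_t f = t; f < n_features; f += n_threads)
                scores[f] = criterion(f);
        });
    }
    for (std::thread& w : workers) w.join();
    return scores;
}
```

In this sketch, the total working memory is roughly the size of one criterion instance multiplied by the number of threads, so lowering the thread count is the most direct way to trade speed for memory.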

References FST::Search_BIF_Threaded< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, max_threads >::search(), and FST::Search< RETURNTYPE, DIMTYPE, SUBSET, CRITERION >::set_output_detail().


Generated on Thu Mar 31 11:36:06 2011 for FST3 Library by doxygen 1.6.1