Example 32t: Threaded individual ranking (BIF) with SVM wrapper in very high-dimensional feature selection.
#include <boost/smart_ptr.hpp>
#include <exception>
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>
#include "error.hpp"
#include "global.hpp"
#include "subset.hpp"
#include "data_intervaller.hpp"
#include "data_splitter.hpp"
#include "data_splitter_cv.hpp"
#include "data_splitter_randrand.hpp"
#include "data_scaler.hpp"
#include "data_scaler_void.hpp"
#include "data_accessor_splitting_memTRN.hpp"
#include "data_accessor_splitting_memARFF.hpp"
#include "criterion_multinom_bhattacharyya.hpp"
#include "criterion_wrapper.hpp"
#include "classifier_svm.hpp"
#include "search_bif_threaded.hpp"
#include "seq_step_straight.hpp"
#include "search_seq_os.hpp"
Functions
int main ()
Example 32t: Threaded individual ranking (BIF) with SVM wrapper in very high-dimensional feature selection.
int main ()
Very high-dimensional feature selection is applied, e.g., in text categorization, where dimensionality is on the order of 10000 or 100000. Individual feature ranking (Best Individual Feature, BIF) is the most commonly applied approach in this setting because of its key advantages: speed and high stability. This example illustrates a less common but effective approach based on a Support Vector Machine feature selection wrapper. A randomly sampled 50% of the data is used in the actual feature selection process; another, disjoint 40% of the data is randomly sampled for testing. The selected subset is finally validated: an SVM classifier is trained on the training data restricted to the selected subspace, and its classification accuracy is estimated on the test data.
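The procedure described above can be sketched without the library as follows. This is a minimal illustration, not the FST3 implementation: the names Split, random_disjoint_split and bif_select are hypothetical, and the criterion callable merely stands in for the SVM wrapper score of a single feature. The sketch evaluates features sequentially; since each feature is scored independently of the others, the evaluations could in principle be distributed over threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical container (not part of FST3): indices of the samples used for
// feature selection/training and of the disjoint samples used for testing.
struct Split {
    std::vector<std::size_t> train;
    std::vector<std::size_t> test;
};

// Shuffle the sample indices 0..n-1 and cut two disjoint parts whose sizes
// are given as integer percentages of n (e.g. 50 and 40).
Split random_disjoint_split(std::size_t n, unsigned train_pct,
                            unsigned test_pct, unsigned seed)
{
    assert(train_pct + test_pct <= 100);  // the two parts must fit into the data
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t(0));
    std::mt19937 rng(seed);
    std::shuffle(idx.begin(), idx.end(), rng);

    const std::size_t n_train = n * train_pct / 100;
    const std::size_t n_test = n * test_pct / 100;
    Split s;
    s.train.assign(idx.begin(), idx.begin() + n_train);
    s.test.assign(idx.begin() + n_train, idx.begin() + n_train + n_test);
    return s;
}

// Individual ranking (BIF): score every feature on its own with the supplied
// criterion, sort by score, and return the indices of the d best features.
// The criterion is an assumption of this sketch; in the example above its
// role is played by the SVM wrapper evaluated on the training part.
std::vector<std::size_t> bif_select(
    std::size_t n_features, std::size_t d,
    const std::function<double(std::size_t)>& criterion)
{
    std::vector<std::pair<double, std::size_t>> scored;
    scored.reserve(n_features);
    for (std::size_t f = 0; f < n_features; ++f)
        scored.emplace_back(criterion(f), f);

    // Higher score is better; stable_sort keeps ties in feature-index order.
    std::stable_sort(scored.begin(), scored.end(),
        [](const std::pair<double, std::size_t>& a,
           const std::pair<double, std::size_t>& b) {
            return a.first > b.first;
        });

    d = std::min(d, n_features);
    std::vector<std::size_t> selected;
    selected.reserve(d);
    for (std::size_t i = 0; i < d; ++i)
        selected.push_back(scored[i].second);
    return selected;
}
```

A caller would first obtain the split, for instance random_disjoint_split(n, 50, 40, seed), then pass bif_select a criterion that scores one feature using only the train indices, and finally train a classifier on the selected features and estimate its accuracy on the test indices.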
References FST::Search_BIF_Threaded< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, max_threads >::search(), and FST::Search< RETURNTYPE, DIMTYPE, SUBSET, CRITERION >::set_output_detail().