Feature Selection Toolbox FST3 Library / Documentation

demo33.cpp File Reference

Example 33: Oscillating Search in very high-dimensional feature selection.

#include <boost/smart_ptr.hpp>
#include <exception>
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>
#include "error.hpp"
#include "global.hpp"
#include "subset.hpp"
#include "data_intervaller.hpp"
#include "data_splitter.hpp"
#include "data_splitter_randrand.hpp"
#include "data_scaler.hpp"
#include "data_scaler_void.hpp"
#include "data_accessor_splitting_memTRN.hpp"
#include "data_accessor_splitting_memARFF.hpp"
#include "criterion_multinom_bhattacharyya.hpp"
#include "criterion_wrapper.hpp"
#include "classifier_multinom_naivebayes.hpp"
#include "search_bif.hpp"
#include "seq_step_straight.hpp"
#include "search_seq_os.hpp"

Functions

int main ()

Detailed Description

Example 33: Oscillating Search in very high-dimensional feature selection.


Function Documentation

int main (  ) 

This example demonstrates very high-dimensional feature selection in text categorization, with dimensionality on the order of 10000 or 100000. The standard approach in this setting is BIF (best individual features ranking), yet we show here that a non-trivial search procedure, Oscillating Search (OS), can be feasible. OS is applied in its fastest form (delta=1) and is initialized by means of BIF. The multinomial Bhattacharyya distance serves as the feature selection criterion (it has been shown capable of outperforming traditional tools such as Information Gain, cf. Novovicova et al., LNCS 4109, 2006). A randomly sampled 50% of the data is used to estimate the multinomial model parameters needed in the actual feature selection process; another, disjoint 40% of the data is randomly sampled for testing. The selected subset is finally validated: a multinomial Naive Bayes classifier is trained on the training data restricted to the selected features, and its classification accuracy is then estimated on the test data.
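The BIF initialization described above can be illustrated with a minimal standalone sketch. It does not use the FST3 classes: the function names bhattacharyya_binary and best_individual_features are hypothetical, and the per-feature score is a simplifying assumption (the general Bhattacharyya distance applied to a two-outcome "term occurs / other term occurs" distribution in two classes), not necessarily the exact criterion implemented in criterion_multinom_bhattacharyya.hpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Assumption for illustration: p and q are the probabilities that a given
// term occurs in class A and class B. The score is the Bhattacharyya
// distance -ln(BC) between the two-outcome distributions {p, 1-p} and
// {q, 1-q}. It is 0 for identical distributions and grows (up to
// infinity) as the two classes use the term more differently.
double bhattacharyya_binary(double p, double q) {
    assert(p >= 0.0 && p <= 1.0 && q >= 0.0 && q <= 1.0);
    double coeff = std::sqrt(p * q) + std::sqrt((1.0 - p) * (1.0 - q));
    coeff = std::min(1.0, coeff);  // guard against rounding slightly above 1
    return -std::log(coeff);
}

// BIF in its simplest form: rank features by their individual scores
// and keep the indices of the d best ones (ties keep the original order).
std::vector<std::size_t> best_individual_features(
        const std::vector<double>& scores, std::size_t d) {
    assert(d <= scores.size());
    std::vector<std::size_t> idx(scores.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::stable_sort(idx.begin(), idx.end(),
                     [&scores](std::size_t a, std::size_t b) {
                         return scores[a] > scores[b];
                     });
    idx.resize(d);
    return idx;
}
```

Because every feature is scored exactly once, the cost of such a ranking grows only linearly with dimensionality plus the sort, which is why it is the usual choice when there are tens of thousands of features.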

References FST::Search_OS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::search(), FST::Search_BIF< RETURNTYPE, DIMTYPE, SUBSET, CRITERION >::search(), and FST::Search_OS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::set_delta().
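To make the role of the referenced search() and set_delta() calls more concrete, the following is a rough standalone sketch of an oscillating search restricted to oscillation depth 1, assuming that delta bounds the oscillation depth. It is not the FST3 implementation: the types Subset and Criterion and the functions best_flip, swing and oscillating_search_delta1 are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

using Subset = std::vector<bool>;                        // true = feature selected
using Criterion = std::function<double(const Subset&)>;  // higher is better

// Flip the single feature whose membership currently equals `from`
// and whose flip gives the highest criterion value.
// Returns false if no such feature exists.
bool best_flip(Subset& s, bool from, const Criterion& J) {
    double best = -std::numeric_limits<double>::infinity();
    std::size_t best_i = s.size();
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (s[i] != from) continue;
        s[i] = !from;              // try the flip
        const double v = J(s);
        s[i] = from;               // undo it
        if (v > best) { best = v; best_i = i; }
    }
    if (best_i == s.size()) return false;
    s[best_i] = !from;
    return true;
}

// One swing of depth 1. Down-swing: remove one feature, then add one.
// Up-swing: add one feature, then remove one. The subset size is
// unchanged; the result is kept only if the criterion strictly improves.
bool swing(Subset& s, bool remove_first, const Criterion& J) {
    Subset trial = s;
    if (!best_flip(trial, remove_first, J)) return false;
    if (!best_flip(trial, !remove_first, J)) return false;
    if (J(trial) > J(s)) { s = trial; return true; }
    return false;
}

// Alternate down- and up-swings until neither improves the subset.
Subset oscillating_search_delta1(Subset s, const Criterion& J) {
    assert(J);  // the criterion must be callable
    bool improved = true;
    while (improved) {
        improved = swing(s, true, J);
        improved = swing(s, false, J) || improved;
    }
    return s;
}
```

In this sketch the initial subset passed to oscillating_search_delta1 would be the one produced by BIF. Each swing evaluates the criterion roughly once per feature, so restricting the depth to 1 keeps the search tractable at high dimensionality at the price of exploring only single-feature exchanges.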


Generated on Thu Mar 31 11:36:10 2011 for FST3Library by doxygen 1.6.1