Feature Selection Toolbox FST3 Library / Documentation

demo23.cpp File Reference

Example 23: Combined feature subset contents, size and SVM parameters optimization.

#include <boost/smart_ptr.hpp>
#include <exception>
#include <iostream>
#include <cstdlib>
#include <string>
#include <vector>
#include "error.hpp"
#include "global.hpp"
#include "subset.hpp"
#include "data_intervaller.hpp"
#include "data_splitter.hpp"
#include "data_splitter_cv.hpp"
#include "data_splitter_randrand.hpp"
#include "data_scaler.hpp"
#include "data_scaler_void.hpp"
#include "data_accessor_splitting_memTRN.hpp"
#include "data_accessor_splitting_memARFF.hpp"
#include "criterion_wrapper.hpp"
#include "classifier_svm.hpp"
#include "seq_step_straight.hpp"
#include "search_seq_dos.hpp"

Functions

int main ()

Detailed Description

Example 23: Combined feature subset contents, size and SVM parameters optimization.


Function Documentation

int main (  ) 

Example 23: Combined feature subset contents, size and SVM parameters optimization.

Support Vector Machine (SVM) performance depends strongly on the SVM parameters. Moreover, the optimal SVM parameters on a subspace of the original feature space may differ from those on the full space. We therefore suggest optimizing both the feature subset and the SVM parameters in a repeated, consecutive process. In this example, a feature subset search is followed by SVM parameter optimization for the current subset; this sequence of two operations is repeated as long as the criterion value increases. For this purpose we use the DOS procedure, which is capable of starting the search from a previously obtained subset. We illustrate the approach on an SVM wrapper with a sigmoid kernel. 50% of the data is randomly chosen to form the training dataset (which remains the same throughout), and 40% of the data is randomly chosen to be used at the end for validating classification performance on the finally selected subspace (the training and test parts are disjoint and together cover 90% of the original data). During the search, the training part is accessed by means of 3-fold cross-validation. The SVM parameters are optimized on the currently best known feature subset, which is then used to initialize the next DOS search. The calls are repeated as long as better SVM performance (on the training data) is achieved.
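The alternating scheme described above can be sketched independently of the FST3 classes. The following is a minimal illustration only, not the code of demo23.cpp: the names criterion, search_subset, tune_param and optimize are hypothetical, the toy criterion stands in for the cross-validated SVM accuracy, a greedy single-flip search stands in for DOS, and a plain grid search stands in for the SVM parameter optimization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// All names below are hypothetical stand-ins; none of them is part of FST3.
using Subset = std::vector<bool>;

// Toy criterion standing in for cross-validated SVM accuracy.
// Features 0 and 2 are useful, every other selected feature costs 0.25.
// The best parameter value equals the subset size, which mimics the
// dependence of optimal parameters on the chosen subspace.
double criterion(const Subset& s, double param) {
    double score = 0.0;
    std::size_t count = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (!s[i]) continue;
        ++count;
        score += (i == 0 || i == 2) ? 1.0 : -0.25;
    }
    const double diff = param - static_cast<double>(count);
    return score - 0.1 * diff * diff;
}

// Stand-in for the subset search (NOT the DOS algorithm): greedy single-flip
// hill climbing that starts from the subset passed in and updates it in place.
double search_subset(Subset& s, double param) {
    double best = criterion(s, param);
    bool improved = true;
    while (improved) {
        improved = false;
        for (std::size_t i = 0; i < s.size(); ++i) {
            s[i] = !s[i];                       // try flipping feature i
            const double v = criterion(s, param);
            if (v > best) { best = v; improved = true; }
            else          { s[i] = !s[i]; }     // no gain: revert the flip
        }
    }
    return best;
}

// Stand-in for SVM parameter optimization: grid search on a fixed subset.
double tune_param(const Subset& s, double& param, const std::vector<double>& grid) {
    assert(!grid.empty());
    double best = criterion(s, param);
    for (double p : grid) {
        const double v = criterion(s, p);
        if (v > best) { best = v; param = p; }
    }
    return best;
}

struct Result { Subset subset; double param; double value; int rounds; };

// Alternate subset search and parameter tuning while the criterion increases.
Result optimize(Subset subset, double param, const std::vector<double>& grid) {
    double best = criterion(subset, param);
    int rounds = 0;
    for (;;) {
        ++rounds;
        search_subset(subset, param);           // continues from current subset
        const double v = tune_param(subset, param, grid);
        if (v <= best) break;                   // no further gain: stop
        best = v;
    }
    return {subset, param, best, rounds};
}
```

Calling optimize(Subset(4, false), 0.0, {0.0, 1.0, 2.0, 3.0, 4.0}) on this toy criterion stops after two rounds with features 0 and 2 selected and the parameter set to 2. Because neither step ever accepts a worse value, the loop ends in the first round that brings no improvement, which corresponds to the stopping rule stated in the text.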

References FST::Search_DOS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::search(), and FST::Search_DOS< RETURNTYPE, DIMTYPE, SUBSET, CRITERION, EVALUATOR >::set_delta().


Generated on Thu Mar 31 11:35:48 2011 for FST3Library by  doxygen 1.6.1