We make science discovery happen

This page helps users of the data mining web application (the beta release currently available here) who select the SVM model for their experiments. It can also be reached directly from the web application through the help button. Its main purpose is to assist with model parameter selection and setup by describing each parameter, its role in the model, its default value, and suggestions for choosing it. The content is organized into two main sections, one for each functionality that can be associated with the SVM model: Classification and Regression.

For each of these two functionalities there are four sub-sections, one per use case that users can select to perform experiments: Train, Test, Run and Full.

- Input dataset
- kernel type
  - **0** linear: u'*v
  - **1** polynomial: (gamma*u'*v + coef0)^degree
  - **2** radial basis function: exp(-gamma*|u-v|^2)
  - **3** sigmoid: tanh(gamma*u'*v + coef0)
- svm type
  - **0** C-SVC
  - **1** nu-SVC
- gamma
- degree
- coeff 0
- error tolerance
- C
- nu
- weight
- shrinking

**this parameter is a required field!**

This is the dataset file to be used as input for the learning phase of the model. It typically must include both input and target columns, where each row is an entire pattern (or sample of data). The format (hence its extension) must be one of the types allowed by the application (ASCII, FITS, CSV, VOTABLE).

This is the kernel type selection parameter. It defines the kind of projection function used to define the support vectors in the parameter space. Choose it with care. If left empty, the default is radial basis function (2). The available options are the four kernels listed above: linear (0), polynomial (1), radial basis function (2) and sigmoid (3).
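The four kernel formulas listed for this parameter can be written out directly. The following is a minimal sketch in plain Python, not the application's own code; the function names and the `KERNELS` lookup table are hypothetical, and the defaults for `coef0` and `degree` are the ones stated on this page.

```python
import math


def dot(u, v):
    """Inner product u'*v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(u, v):
    # Option 0: u'*v
    return dot(u, v)


def polynomial_kernel(u, v, gamma, coef0=0.0, degree=3):
    # Option 1: (gamma*u'*v + coef0)^degree
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma):
    # Option 2: exp(-gamma*|u-v|^2)
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(u, v, gamma, coef0=0.0):
    # Option 3: tanh(gamma*u'*v + coef0)
    return math.tanh(gamma * dot(u, v) + coef0)


# Hypothetical lookup from the option code used on this page to a function.
KERNELS = {0: linear_kernel, 1: polynomial_kernel, 2: rbf_kernel, 3: sigmoid_kernel}
```

Gamma appears in every kernel except the linear one, degree only in the polynomial kernel, and coef0 in the polynomial and sigmoid kernels, so a parameter that does not appear in the selected kernel has no effect on it.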

The svm type indicates the choice between two different model engines: C-SVC (0) and nu-SVC (1).

The two types are different implementations with the same behavior. If left empty, the default is C-SVC (0).

Gamma is one of the parameters of the kernel definition function (see the kernel type above). It can be any positive real number. If left empty, the default value is 0.

Degree is one of the parameters of the kernel definition function: the exponent of the polynomial kernel. If left empty, the default value is 3.

Coeff 0 (coef0 in the kernel formulas) is one of the kernel parameters. If left empty, the default value is 0.

Error tolerance is the threshold of the learning loop, that is, the stopping criterion of the algorithm. If left empty, the default value is 0.001.

C is the penalty parameter of the C-SVC model engine. If left empty, its default is 1.

Nu is the parameter of the nu-SVC and one-class SVM model engines. If left empty, the default value is 0.5.

Weight must be set only when C-SVC is chosen. It is the weight of the penalty parameter: a multiplicative factor applied to C (C*weight). If left empty, the default value is 1.

Shrinking is a flag indicating whether or not to use the shrinking heuristics to speed up the algorithm. The default value is 1 (enabled).
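As a compact summary, the defaults described for the classification Train use case can be collected in one place. This is only a sketch: the dictionary keys and both function names are hypothetical, and the fallback logic is an assumption about how empty fields are treated, based on the "if left empty" statements on this page.

```python
# Default values as stated on this page for classification (keys are hypothetical names).
CLASSIFICATION_DEFAULTS = {
    "kernel_type": 2,        # radial basis function
    "svm_type": 0,           # C-SVC
    "gamma": 0.0,
    "degree": 3,
    "coef0": 0.0,
    "error_tolerance": 0.001,
    "C": 1.0,
    "nu": 0.5,
    "weight": 1.0,
    "shrinking": 1,          # enabled
}


def resolve_parameters(user_values):
    """Return a full parameter set, using the default for every empty field."""
    resolved = dict(CLASSIFICATION_DEFAULTS)
    for name, value in user_values.items():
        if name not in CLASSIFICATION_DEFAULTS:
            raise KeyError(f"unknown parameter: {name}")
        # None or an empty string is treated as "left empty".
        if value is not None and value != "":
            resolved[name] = value
    return resolved


def effective_penalty(params):
    """Penalty actually applied when weight is used with C-SVC: C*weight."""
    return params["C"] * params["weight"]
```

For example, `resolve_parameters({"C": 10.0, "weight": 2.0, "gamma": ""})` keeps gamma at its default and gives an effective penalty of 20.0.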

(See the user manual for more details)

- Input Dataset
- Model File

**this parameter is a required field!**

This is the input dataset file. It contains all input columns and the single target column.

It must have the same number of input and target columns as the training input file.

For example, it could be the same dataset file used as the training input file.
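The column requirement can be checked before submitting an experiment. The sketch below assumes the datasets are in CSV format without a header row (the page also allows ASCII, FITS and VOTABLE, which are not handled here); the function names are hypothetical.

```python
import csv
import io


def column_count(csv_text):
    """Number of columns in a CSV dataset; every non-empty row must agree."""
    counts = {len(row) for row in csv.reader(io.StringIO(csv_text)) if row}
    if len(counts) != 1:
        raise ValueError("dataset is empty or its rows have different lengths")
    return counts.pop()


def same_layout(train_text, test_text):
    """True when the test file has as many columns as the training file."""
    return column_count(train_text) == column_count(test_text)
```

A check of this kind only compares the number of columns; it cannot tell whether the columns appear in the same order as in the training file.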

**this parameter is a required field!**

This is a file generated by the model during the training phase. It contains the resulting model configuration as stored at the end of a training session. Users should normally not edit or modify this file, so that its content is preserved as generated by the model itself. Its extension is usually .model

(See the user manual for more details)

- Input Dataset
- Model File

**this parameter is a required field!**

This is the input dataset file. It contains only input columns (without the target column).

It must have the same number of input columns as the training input file.

For example, it could be the same dataset file used as the training input file, BUT WITHOUT THE TARGET COLUMN.
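Deriving such a Run input from a training file can be sketched as follows. Two assumptions are made that this page does not state: the dataset is in CSV format, and the target is the last column (the hypothetical `target_index` argument changes that).

```python
import csv
import io


def drop_target_column(csv_text, target_index=-1):
    """Return the CSV text with the target column removed from every row."""
    reader = csv.reader(io.StringIO(csv_text))
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        row = list(row)
        del row[target_index]  # assumed position of the target column
        writer.writerow(row)
    return output.getvalue()
```

Applied to the two-pattern dataset `"1.0,2.0,0\n3.0,4.0,1\n"`, the function returns `"1.0,2.0\n3.0,4.0\n"`, which has the same input columns as the training file and no target.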

**this parameter is a required field!**

This is a file generated by the model during the training phase. It contains the resulting model configuration as stored at the end of a training session. Users should normally not edit or modify this file, so that its content is preserved as generated by the model itself. Its extension is usually .model

(See the user manual for more details)

- Training Set
- Test Set
- kernel type
  - **0** linear: u'*v
  - **1** polynomial: (gamma*u'*v + coef0)^degree
  - **2** radial basis function: exp(-gamma*|u-v|^2)
  - **3** sigmoid: tanh(gamma*u'*v + coef0)
- svm type
  - **0** C-SVC
  - **1** nu-SVC
- gamma
- degree
- coeff 0
- error tolerance
- C
- nu
- weight
- shrinking

**this parameter is a required field!**

This is the dataset file to be used as input for the learning phase of the model. It typically must include both input and target columns, where each row is an entire pattern (or sample of data). The format (hence its extension) must be one of the types allowed by the application (ASCII, FITS, CSV, VOTABLE).

**this parameter is a required field!**

This is the dataset file used as test input. It contains all input columns and the single target column.

It must have the same number of input and target columns as the training input file.

For example, it could be the same dataset file used as the training input file.

This is the kernel type selection parameter. It defines the kind of the projection function used to define the support vectors in the parameter space. Take care of this choice. If left empty, the default is radial basis function (option 2).

The svm type indicates the choice between two different model engines: C-SVC (0) and nu-SVC (1).

The two types are different implementations with the same behavior. The default is C-SVC (0).

Gamma is one of the parameters of the kernel definition function. It can be any positive real number. If left empty, the default value is 0.

Degree is one of the parameters of the kernel definition function: the exponent of the polynomial kernel. If left empty, the default value is 3.

Coeff 0 (coef0 in the kernel formulas) is one of the kernel parameters. If left empty, the default value is 0.

Error tolerance is the threshold of the learning loop, that is, the stopping criterion of the algorithm. If left empty, the default value is 0.001.

C is the penalty parameter of the C-SVC model engine. If left empty, its default is 1.

Nu is the parameter of the nu-SVC and one-class SVM model engines. If left empty, the default value is 0.5.

Weight must be set only when C-SVC is chosen. It is the weight of the penalty parameter: a multiplicative factor applied to C (C*weight). If left empty, the default value is 1.

Shrinking is a flag indicating whether or not to use the shrinking heuristics to speed up the algorithm. The default value is 1 (enabled).

(See the user manual for more details)

- Input dataset
- kernel type
  - **0** linear: u'*v
  - **1** polynomial: (gamma*u'*v + coef0)^degree
  - **2** radial basis function: exp(-gamma*|u-v|^2)
  - **3** sigmoid: tanh(gamma*u'*v + coef0)
- svm type
  - **3** epsilon-SVR
  - **4** nu-SVR
- gamma
- degree
- coeff 0
- tolerance of termination criterion
- C
- nu
- epsilon
- shrinking

**this parameter is a required field!**

This is the dataset file to be used as input for the learning phase of the model. It typically must include both input and target columns, where each row is an entire pattern (or sample of data). The format (hence its extension) must be one of the types allowed by the application (ASCII, FITS, CSV, VOTABLE).

This is the kernel type selection parameter. It defines the kind of the projection function used to define the support vectors in the parameter space. Take care of this choice. If left empty, the default is radial basis function (2).

The svm type indicates the choice between two different regression model engines: epsilon-SVR (3) and nu-SVR (4).

**this parameter is a required field!**

Gamma is one of the parameters of the kernel definition function. It can be any positive real number. If left empty, the default value is 0.

Degree is one of the parameters of the kernel definition function: the exponent of the polynomial kernel. If left empty, the default value is 3.

Coeff 0 (coef0 in the kernel formulas) is one of the kernel parameters. If left empty, the default value is 0.

The tolerance of the termination criterion is the threshold of the learning loop, that is, the stopping criterion of the algorithm. If left empty, the default value is 0.001.

C is the penalty parameter of the epsilon-SVR model engine. If left empty, its default is 1.

Nu is the parameter of the nu-SVR model engine. If left empty, the default value is 0.5.

Epsilon is the parameter of the loss function in epsilon-SVR. If left empty, the default value is 0.001.
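The loss that epsilon enters is commonly written as the epsilon-insensitive loss: prediction errors smaller than epsilon cost nothing, and larger errors cost only the excess over epsilon. A minimal sketch of that standard definition follows; the function name is hypothetical and the default is the value stated on this page.

```python
def epsilon_insensitive_loss(target, prediction, epsilon=0.001):
    """Standard epsilon-insensitive loss: max(0, |target - prediction| - epsilon)."""
    return max(0.0, abs(target - prediction) - epsilon)
```

A larger epsilon therefore tolerates larger deviations from the target before they are penalized, while a smaller one makes the regression follow the training targets more closely.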

Shrinking is a flag indicating whether or not to use the shrinking heuristics to speed up the algorithm. The default value is 1 (enabled).
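The option codes used on this page (svm type 3 and 4, kernel type 0 to 3) coincide with the numbering of the LIBSVM `svm-train` command-line tool. The page does not name the backend it uses, so the mapping below is an assumption offered for readers who want to reproduce an experiment with LIBSVM; the dictionary keys and the function name are hypothetical.

```python
def svm_train_options(params):
    """Translate parameter names from this page into LIBSVM svm-train flags."""
    flag_for = {
        "svm_type": "-s",
        "kernel_type": "-t",
        "degree": "-d",
        "gamma": "-g",
        "coef0": "-r",
        "C": "-c",
        "nu": "-n",
        "epsilon": "-p",
        "tolerance": "-e",
        "shrinking": "-h",
    }
    options = []
    # Only parameters the user actually set are emitted; the rest keep the tool's defaults.
    for name, flag in flag_for.items():
        if name in params:
            options += [flag, str(params[name])]
    return options
```

Note that the defaults of the tool itself may differ from the defaults listed on this page, so passing every value explicitly is the safer choice when comparing results.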

(See the user manual for more details)

- Input Dataset
- Model File

**this parameter is a required field!**

This is the input dataset file. It contains all input columns and the single target column.

It must have the same number of input and target columns as the training input file.

For example, it could be the same dataset file used as the training input file.

**this parameter is a required field!**

This is a file generated by the model during the training phase. It contains the resulting model configuration as stored at the end of a training session. Users should normally not edit or modify this file, so that its content is preserved as generated by the model itself. Its extension is usually .model

(See the user manual for more details)

- Input Dataset
- Model File

**this parameter is a required field!**

This is the input dataset file. It contains only input columns (without the target column).

It must have the same number of input columns as the training input file.

For example, it could be the same dataset file used as the training input file, BUT WITHOUT THE TARGET COLUMN.

**this parameter is a required field!**

(See the user manual for more details)

- Training Set
- Test Set
- kernel type
  - **0** linear: u'*v
  - **1** polynomial: (gamma*u'*v + coef0)^degree
  - **2** radial basis function: exp(-gamma*|u-v|^2)
  - **3** sigmoid: tanh(gamma*u'*v + coef0)
- svm type
  - **3** epsilon-SVR
  - **4** nu-SVR
- gamma
- degree
- coeff 0
- error tolerance
- C
- nu
- epsilon
- shrinking

**this parameter is a required field!**

**this parameter is a required field!**

This is the dataset file used as test input. It contains all input columns and the single target column.

It must have the same number of input and target columns as the training input file.

For example, it could be the same dataset file used as the training input file.

This is the kernel type selection parameter. It defines the kind of the projection function used to define the support vectors in the parameter space. Take care of this choice. If left empty, the default is radial basis function (2).

The svm type indicates the choice between two different regression model engines: epsilon-SVR (3) and nu-SVR (4).

**this parameter is a required field!**

Gamma is one of the parameters of the kernel definition function. It can be any positive real number. If left empty, the default value is 0.

Coeff 0 (coef0 in the kernel formulas) is one of the kernel parameters. If left empty, the default value is 0.

Error tolerance is the threshold of the learning loop, that is, the stopping criterion of the algorithm. If left empty, the default value is 0.001.

C is the penalty parameter of the epsilon-SVR model engine. If left empty, its default is 1.

Nu is the parameter of the nu-SVR model engine. If left empty, the default value is 0.5.

Epsilon is the parameter of the loss function in epsilon-SVR. If left empty, the default value is 0.001.

(See the user manual for more details)