Stopping Criterion. As previously explained, the evolution of solutions is essentially an iterative process. It is therefore necessary to specify a criterion that establishes when execution ends. Again, there are different options; the most common ones are the following. The fittest individuals in the population represent solutions good enough to solve the problem.


The population has converged. Once all the genes have converged, the population is said to have converged. When this happens, the average fitness of the population is close to the fitness of the fittest individual. The difference between the best solutions found in successive generations becomes small.

At best, this indicates that the population has reached a global solution; otherwise, it means that the population has come to a standstill at a local minimum. One advantage of such techniques worth mentioning is the simplicity of their implementation.
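The stopping criteria described above can be sketched as a single check run once per generation. This is a minimal illustration, not the authors' implementation: the function name, the tolerances, and the convention that fitness is an error to be minimized are all assumptions made for the example.

```python
from statistics import mean

def should_stop(fitness_history, population_fitness, *, target=None,
                convergence_tol=1e-3, stall_generations=20, stall_tol=1e-6):
    """Return the name of the first stopping criterion met, or None.

    Assumption: fitness is an error, so lower is better.
    fitness_history holds the best fitness of each past generation.
    """
    best = min(population_fitness)

    # 1. The fittest individual is already good enough.
    if target is not None and best <= target:
        return "target"

    # 2. Convergence: the population average is close to the best individual.
    if abs(mean(population_fitness) - best) <= convergence_tol:
        return "converged"

    # 3. Stagnation: the best value has barely changed over recent generations.
    if len(fitness_history) >= stall_generations:
        recent = fitness_history[-stall_generations:]
        if max(recent) - min(recent) <= stall_tol:
            return "stalled"

    return None
```

Note that neither the convergence check nor the stagnation check can tell a global solution from a local minimum; they only detect that the search has stopped making progress.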

No technical knowledge of the problem is required to solve it, only a way of evaluating a candidate solution so that the fitness function can be defined. Also noteworthy is the simplicity of the ideas, taken from the natural environment, on which the evolution of solutions is based. In addition, this type of technique adapts easily to multimodal problems (those with multiple solutions) [59] and multiobjective problems (those in which different criteria are optimized simultaneously) [60].


When computational cost is a criterion to be considered, these techniques are easy to distribute thanks to their inherently parallel operation. At the very least, the evaluation of the solutions, which is often the bottleneck, can be distributed, with a marked improvement in response time. Finally, we should note that these techniques, unlike others, always provide a solution to the problem posed and, in addition, this solution keeps improving the longer the execution runs.
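As a rough sketch of why fitness evaluation distributes so easily: each individual is scored independently of the others, so the population can simply be mapped over a pool of workers. The function name and the toy fitness below are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_population(population, fitness, max_workers=4):
    """Evaluate every individual concurrently; results keep population order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fitness, population))

# Toy fitness: the number of selected variables, standing in for an
# expensive model fit.
scores = evaluate_population([[0, 3], [1], [2, 4, 5]], fitness=len)
```

Threads help when the fitness function waits on an external program or releases the interpreter lock; for CPU-bound pure-Python fitness functions, `ProcessPoolExecutor` from the same module offers the same `map` interface.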

Support vector machines (SVM) are general methods for solving classification, regression, and estimation problems. They are learning systems based on Vapnik's studies on statistical learning theory [24, 25]. From their inception to the present day, they have been the subject of continuous research and application. Interest in this method has increased considerably, and it has become a reference for other disciplines of machine learning and data mining.

At first, SVM were developed to solve binary classification problems (two classes), but over the course of their evolution they have widened their field of action and are now applied to many other kinds of problems.

SVM aim to find an optimal linear hyperplane that separates the data into two or more classes, so that all the elements belonging to the same class lie on the same side. This is equivalent to solving a classical quadratic programming problem, which guarantees the existence of a unique solution and reasonable efficiency for real problems with thousands of examples and attributes.

Intuitively, a linear classification problem is likely to admit several solutions that classify the information correctly, as shown in Figure 2. The question, therefore, is which of the alternatives is the ideal one. In his studies, Vapnik answered this question by defining the concept of the optimal hyperplane.
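To make the question concrete, the sketch below compares two hyperplanes that both classify a toy data set correctly. It assumes the usual SVM definition of the optimal hyperplane, namely the separating hyperplane with the largest margin; the data and function name are illustrative only.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    Labels are +1 or -1; a positive result means every point lies on its
    correct side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

points = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0)]
labels = [-1, -1, +1, +1]

# Two vertical lines (x = 1.5 and x = 0.5) that both separate the classes.
centered = geometric_margin((1.0, 0.0), -1.5, points, labels)
off_center = geometric_margin((1.0, 0.0), -0.5, points, labels)
```

Both lines separate the classes, but the centered one leaves a larger gap to the nearest point, so it is the preferred candidate under the maximum-margin criterion.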


Once the concept of the optimal hyperplane had been defined, further studies showed that the hyperplane can be determined from only certain data in the training set (the support vectors). However, in most real problems the data are not linearly separable, so applying the above-mentioned process does not achieve a good result.

To overcome this drawback, such problems are tackled with a different strategy: a linear separation is still sought, but in a different space.

To this end, the input variables are transformed into a space of higher dimension than the one to which they belong (the higher-dimensional space being a Hilbert space). The next step is to find, in this new space, a hyperplane that separates the data linearly; it is defined through a scalar product of vectors that can be expressed as a function of the input space x. The result of this scalar product is called a kernel, and several kernels are in common use. In general, a kernel is any function $K(u, v)$ that satisfies Mercer's theorem [62], that is, any symmetric function for which $\iint K(u, v)\, g(u)\, g(v)\, du\, dv \ge 0$ for every square-integrable function $g$.
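As an illustration of the idea that a kernel is a scalar product taken in a higher-dimensional space, the sketch below defines three kernels that are widely used in the SVM literature (linear, polynomial, and Gaussian RBF). These are assumed examples and not necessarily the list given by the authors; the default parameter values are arbitrary.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, degree=2, coef0=1.0):
    return (dot(u, v) + coef0) ** degree

def rbf_kernel(u, v, gamma=0.5):
    sq_dist = sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return math.exp(-gamma * sq_dist)

def phi(x):
    """Explicit feature map of the homogeneous degree-2 kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

u, v = (1.0, 2.0), (3.0, 4.0)
implicit = polynomial_kernel(u, v, degree=2, coef0=0.0)  # computed in input space
explicit = dot(phi(u), phi(v))                           # computed in feature space
```

The two values agree up to rounding: the kernel returns the scalar product in the three-dimensional feature space without ever building the transformed vectors.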

Given the above, if the transformation gives rise to a linearly separable search space, Vapnik and Chervonenkis [63] showed that maximizing the separation margin between classes is equivalent to minimizing the Euclidean norm of the weight vector. But what happens when some datum is still not linearly separable in this new space? In that case, slack variables are introduced to measure the loss incurred by each such datum. The problem therefore becomes the search for a classification function $f(x)$ that minimizes the sum of the losses reflected by the slack variables.


More specifically, the greater the value of the penalty parameter C, the higher the penalty assigned to errors. Depending on the value of C, the margins of the decision boundary vary in shape. As a result, we can conclude that the higher the value, the narrower the margin and the lower the classification error in the training phase. Conversely, the wider the margin, the higher the classification error in the training phase.
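The effect of C can be checked numerically. The sketch below assumes the standard soft-margin objective, half the squared norm of the weight vector plus C times the sum of the slack variables, and compares a wide-margin and a narrow-margin boundary on a one-dimensional toy set; the data and names are illustrative only.

```python
def soft_margin_objective(w, b, points, labels, C):
    """0.5 * ||w||^2 + C * sum(slack), slack_i = max(0, 1 - y_i * (w.x_i + b))."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    slack = [
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in zip(points, labels)
    ]
    return regularizer + C * sum(slack)

points = [(-2.0,), (2.0,), (0.5,)]
labels = [-1, +1, +1]  # the last point lies close to the boundary at x = 0

wide = ((0.5,), 0.0)    # small ||w||: wide margin, the last point falls inside it
narrow = ((2.0,), 0.0)  # large ||w||: narrow margin, no point falls inside it

low_c = {name: soft_margin_objective(w, b, points, labels, C=1.0)
         for name, (w, b) in (("wide", wide), ("narrow", narrow))}
high_c = {name: soft_margin_objective(w, b, points, labels, C=10.0)
          for name, (w, b) in (("wide", wide), ("narrow", narrow))}
```

With C = 1 the wide margin has the lower objective, so the margin violation is tolerated; with C = 10 the narrow margin wins because the violation has become too expensive.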

The method proposed in this work is based on a hybrid model that combines a GA and support vector machines, with the aim of classifying samples using the minimum number of significant variables. The GAlib library [64], developed by Matthew Wall in and last modified in , was used to encode the genetic algorithms. WinSVM provides as output a mean squared error (MSE) that measures the distance of the incorrectly classified samples from the optimal hyperplane.

Thus, the MSE value obtained is deterministic, and it will be one of the criteria considered later when comparing different executions. As mentioned in the previous section, SVM are extremely useful for dichotomous classification of data, that is, for distinguishing between two classes.


However, this idea can be generalized to identify to which of a set of n categories a given datum belongs. In this work we have adopted the following approach: given the n categories $C_1, C_2, \ldots, C_n$ that can be defined over the total data set, one SVM is created for each category $C_i$ present in the input set. This SVM tries to distinguish whether a datum belongs to the category $C_i$ or to the remaining set. Finally, to determine the specific category of each datum, we simply run all the defined SVM and select the output that indicates the greatest degree of membership in a particular class.
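The selection step of this one-against-the-rest scheme can be sketched as follows. The scoring functions are hypothetical stand-ins for trained SVM outputs, and the category names are placeholders.

```python
def classify_one_vs_rest(datum, scorers):
    """Pick the category whose scorer reports the strongest membership.

    scorers maps a category name to a function that returns a membership
    score (higher means "more likely in this category than in the rest").
    """
    return max(scorers, key=lambda category: scorers[category](datum))

# Hypothetical stand-ins for trained one-vs-rest SVM outputs on a 1-D input:
# each score grows as the datum approaches the centre of its category.
scorers = {
    "C1": lambda x: -abs(x - 0.0),
    "C2": lambda x: -abs(x - 5.0),
    "C3": lambda x: -abs(x - 10.0),
}

predicted = classify_one_vs_rest(4.0, scorers)
```

Only the comparison between outputs matters here, so the scheme works with any score that increases with the degree of membership.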

With the aim of creating the above-mentioned hybrid model, we have modified the traditional operation of the GA so that it can generate a population of individuals of varying length [66]. To this end, we have implemented an initialization function that, first, sets the length of each individual in the population, either at random or to a predetermined size, and, second, initializes the value of each gene. For this purpose, a subset of random size is drawn from the total set of variables; if the variables drawn are all different, the subset is assigned to the individual.

Otherwise, a new random combination is generated until the stipulated condition is met. This procedure is repeated until all individuals in the GA population have an assigned subset. The result is therefore a set of individuals as shown in Algorithm 1. In this paper, we propose using the SVM as the fitness function of the genetic algorithm. Thus, for each individual, a training set and a validation set are created from the initial data using the variables at the positions the individual indicates.

These sets are passed to the SVM which, once the prediction is made, yields a mean squared error (MSE) used as the scale for determining the fittest individual (see Figure 5). Nevertheless, when the classification is not binary (dichotomous), a small modification is necessary.

In this case a separate SVM is applied, using the variables specified by the genetic individual, to discriminate between each of the classes $c_i$ and the rest, and an MSE is obtained for each. For multiclass problems, the fitness value of the individual is the sum of the MSE obtained for every classification. This choice of fitness function allows, among other advantages, repeatability of results, which hardly ever applies to other techniques such as artificial neural networks, used in similar problems tackled in previous studies [13, 67, 68].
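The initialization and fitness evaluation described above can be sketched as follows. This is not the authors' GAlib and WinSVM code: `svm_mse` is a hypothetical callable that trains and validates an SVM on the reduced data and returns its MSE, and all other names are chosen for the example.

```python
import random

def init_population(pop_size, n_variables, rng, fixed_length=None):
    """Create individuals as sorted lists of distinct variable indices.

    Each individual gets a random length unless fixed_length is given.
    rng.sample draws without replacement, which replaces the retry loop
    needed to guarantee that the variables of an individual are different.
    """
    population = []
    for _ in range(pop_size):
        length = rng.randint(1, n_variables) if fixed_length is None else fixed_length
        population.append(sorted(rng.sample(range(n_variables), length)))
    return population

def project(samples, individual):
    """Keep only the variables (columns) selected by the individual."""
    return [[row[i] for i in individual] for row in samples]

def fitness(individual, samples, labels, classes, svm_mse):
    """Lower is better: one MSE for two classes, otherwise a sum of
    one-against-the-rest MSEs, one per class."""
    reduced = project(samples, individual)
    if len(classes) == 2:
        targets = [1 if label == classes[0] else -1 for label in labels]
        return svm_mse(reduced, targets)
    total = 0.0
    for c in classes:
        targets = [1 if label == c else -1 for label in labels]
        total += svm_mse(reduced, targets)
    return total
```

Because the SVM training is deterministic, calling `fitness` twice on the same individual returns the same value, which is the repeatability property mentioned above.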

Therefore, this paper is a step forward with respect to previous studies on variable selection, in an experimental field in which numerous tests have already been carried out. Still, as shown below, the results obtained significantly improve on the previous ones.

Nowadays, society has become increasingly aware of the need for a healthier diet in order to improve quality of life.

Undoubtedly, the juice manufacturing industry, influenced by these new circumstances, has enjoyed a boom in both production and sales. However, the increasing production of this industry has been accompanied by an increase in the level of adulteration of its products. Consequently, the search for new methods that identify the exact amount of pure juice used to produce these products has become an issue of great importance in recent years [69].

In order to prevent and detect adulteration, food must be subjected to an increasingly strict series of quality control tests. This is due to the fact that the commonly used analysis techniques have become obsolete with their development and progress. Techniques such as HPLC (high-performance liquid chromatography), gas chromatography, or isotope methods are too slow and relatively expensive and are therefore not suitable for routine analysis.

On the other hand, infrared (IR) spectroscopy provides a quick and cheap alternative which, besides these characteristics, provides a great deal of information about the main components of the juice. Therefore, in order to perform the tests, we have used information that allows verifying the authenticity of the apple juice quality.


As a result, we have obtained a series of samples of diluted pure juice, which will be used to obtain the different training and validation sets. Two sets of samples will be considered: one for samples with a high concentration of juice (see Table 1) and one for samples with a low concentration of juice (see Table 2). Beverages with a low concentration of juice fall within what are called energy drinks (among which soft drinks are included), while beverages with a higher concentration receive the generic name of juices.

All samples were characterized by means of infrared spectroscopy. As a result of this characterization, a spectrum as shown in Figure 6 is obtained for each sample, which represents the amount of energy absorbed (absorbance) for a total of wavelengths or variables.

The objective is to determine the amount of juice in a sample using only the information provided by this spectrum. However, using all the information in the spectrum does not lead to fully satisfactory results, as shown below. Hence the need to apply a variable selection process. This would have two obvious advantages.

On the one hand, it saves time both when obtaining the spectrum and in the subsequent classification, since a smaller amount of information is involved. On the other hand, and perhaps most importantly, the expert is provided with further information about which part of the spectrum, that is, which specific type of sugar (fructose, sucrose, etc.)


The first experiment to be performed before running the developed model is the choice of configuration parameters for both the support vector machines and the GA. To this end, we have applied the following procedure. This function is based on a random generation of different combinations of the SVM parameters, which are then applied to the initial data, obtaining a mean squared error (MSE) as a result. A total of different combinations were generated for this purpose and, taking the training MSE as the scale, we have chosen those with the best results in the training cases.
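A random parameter search of this kind can be sketched as follows. The parameter names (`C`, `gamma`), their ranges, and the `evaluate` callable are assumptions made for the example; the source does not list the parameters or ranges actually explored.

```python
import random

def random_search(evaluate, n_trials, keep, rng):
    """Sample random parameter combinations and keep the `keep` combinations
    with the lowest training MSE, best first."""
    trials = []
    for _ in range(n_trials):
        params = {
            "C": 10 ** rng.uniform(-2, 3),      # assumed log-uniform penalty
            "gamma": 10 ** rng.uniform(-4, 1),  # assumed log-uniform kernel width
        }
        trials.append((evaluate(params), params))
    # Sort on the MSE only, so that parameter dictionaries are never compared.
    trials.sort(key=lambda trial: trial[0])
    return trials[:keep]

# Hypothetical stand-in for "train the SVM and return the training MSE".
best = random_search(lambda p: abs(p["C"] - 10.0), n_trials=50, keep=3,
                     rng=random.Random(1))
```

Sampling on a logarithmic scale is a common choice for parameters such as C, whose useful values span several orders of magnitude.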

However, this first test produced no entirely satisfactory solutions (see Table 3). This is not really significant, since the aim in this phase is to determine the configuration parameters of the algorithm, not to carry out a real data classification. The genetic algorithm parameters were configured in a similar way to those of the SVM, by performing a series of tests in which their values were varied. From this range of tests, the configuration shown in Table 5 was taken as optimal.

First, to allow the comparison of results, a reference model should be established. In this case, we have used the original set of variables provided by the IR spectrometer to build different classification models. More specifically, we have chosen some of the models most widely used in analytical chemistry with this type of data (partial least squares (PLS), SIMCA, or potential functions), together with a model based on ANN and another using SVM for the classification.