In the previous post I talked about an SVM implementation in Matlab. I find that post and implementation worthwhile because it is not easy to find a simple SVM implementation. Instead, I found plenty of files that may implement a very interesting algorithm but are extremely difficult to read in order to learn how it works. This is the main reason why I put so much effort into this implementation, which I developed following the algorithm in [1].
First improvement
After I implemented that algorithm everything seemed to work:
Example 1:
Data points coordinates and target: Distance to the border from each point: 
Very acceptable results, right? However, when I added more points I experienced an odd behavior.
Example 2:

It clearly fails to find the optimal boundary. The funny thing is that the second and third points are at the same distance from the boundary. This means that the rest of the samples were ignored and the algorithm focused only on those two. In fact, $\alpha_2$ and $\alpha_3$ were the only nonzero values.
After debugging, the code seemed to match the algorithm I followed [1], but after many trials I found what finally led me to the error. In my trials the first two elements belong to class 1, whereas the rest belong to the other class. As the following examples show, when I changed the order of the elements in the second class, the boundary changed depending only on the first element of the second class, that is, the third sample.
Example 3:

Example 4:

In this last trial we get the best solution because the algorithm happens to focus on the third sample, which is the one closest to the other class. However, this will not always be the case, so I needed to fix it. The fix is very simple but was not easy to find (at least not quickly).
When I was debugging the code, I realized that the first loop (iterating over $i$) never processed the fourth and fifth samples. The reason was easy to understand: after calculating the temporary boundary (it may not be the best one, which is why I call it “temporary”), those samples produced no error because the boundary already classified them correctly, so execution never entered the inner loops, which sit behind the “if” that checks the tolerance. In other words, if a sample has no error, the algorithm does not try to fix it because it is already classified correctly (and this actually makes sense).
If the samples were skipped in the first loop on purpose, then they should be reached in the other loop. Surprisingly, the algorithm did not reach any of them in the inner loop either. After checking that I had written the code according to the algorithm [1], I thought that there had to be a mistake in the algorithm itself. The mistake was the “Continue to next $i$” step. Because of that line, the rest of the $j$'s were ignored, so it should be “Continue to next $j$”.
Thus, the fix in the Matlab code was pretty simple: changing “break” to “continue”. “break” stops iterating over the current (inner) loop, so execution resumes in the outer loop, whereas “continue” skips the rest of the current iteration and moves on to the next value in the same loop.
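The difference is easy to see in a small nested loop. The sketch below is in Python rather than Matlab (both keywords behave the same way here), and the function `visited_pairs` and its skip rule `j == 1` are hypothetical stand-ins for a check such as the one that triggers “Continue to next $j$”.

```python
def visited_pairs(n, use_break):
    """Return the (i, j) pairs that reach the update step of a nested loop."""
    reached = []
    for i in range(n):
        for j in range(n):
            if j == i:
                continue
            if j == 1:  # hypothetical stand-in for a failed check on this pair
                if use_break:
                    break   # leaves the j loop: the remaining j values are never tried
                continue    # skips only this j and moves on to the next one
            reached.append((i, j))
    return reached


print(visited_pairs(3, use_break=True))   # [(1, 0), (1, 2), (2, 0)]
print(visited_pairs(3, use_break=False))  # [(0, 2), (1, 0), (1, 2), (2, 0)]
```

With `break`, the pair (0, 2) is never examined because the inner loop is abandoned as soon as one candidate fails, which is the same kind of skipped sample described above.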
Second improvement
After the first improvement was implemented, the algorithm worked in many trials, but when I tried more complex examples, it failed again.
The original algorithm [1] uses the variable $num\_changed\_alphas$ to see whether any alpha changed. If no alphas change for $max\_passes$ consecutive iterations, the algorithm stops. I think the idea of iterating several times over the main algorithm is correct, but the algorithm must focus on those samples that will help build the boundary. After I implemented the modification, the algorithm iterated fewer times than the original one. Additionally, the original algorithm seemed to fail in many cases where my implementation works.
When the algorithm iterates once, the alphas are updated such that the nonzero alphas correspond to the samples that will help build the boundary. In this example, after the first iteration, alpha values correspond to this:
Therefore, in the next iteration it will restrict the samples to focus only on sample #3 and sample #6. After this change was implemented, all the trials I tried worked perfectly. This is the result of the same problem:
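As a rough illustration of the selection step, the Python sketch below picks the samples with nonzero alpha. The helper `useful_samples`, the threshold `eps`, and the alpha values are all made up for the example; they are not the values produced by the actual run, only a vector in which samples #3 and #6 are the nonzero ones.

```python
def useful_samples(alphas, eps=1e-8):
    """Return the 1-based indices of samples whose alpha is nonzero (above eps)."""
    return [k + 1 for k, a in enumerate(alphas) if abs(a) > eps]


# Made-up alphas for illustration only: samples #3 and #6 are nonzero.
alphas = [0.0, 0.0, 0.5, 0.0, 0.0, 0.5]
print(useful_samples(alphas))  # [3, 6]
```

The next iteration would then run on that subset only.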
Algorithm
This is the algorithm [1] after both improvements:
Initialize
Initialize
Initialize input and
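for $i = 1, \dots, m$
    Calculate $E_i = f(x^{(i)}) - y^{(i)}$ using (2)
    if (($y^{(i)} E_i < -tol$ and $\alpha_i < C$) or ($y^{(i)} E_i > tol$ and $\alpha_i > 0$))
        for $j = 1, \dots, m$ with $j \neq i$
            Calculate $E_j = f(x^{(j)}) - y^{(j)}$ using (2)
            Save old $\alpha$'s: $\alpha_i^{(old)} = \alpha_i$, $\alpha_j^{(old)} = \alpha_j$
            Compute $L$ and $H$ by (10) and (11)
            if ($L = H$)
                Continue to next $j$
            Compute $\eta$ by (14)
            if ($\eta \geq 0$)
                Continue to next $j$
            Compute and clip new value for $\alpha_j$ using (12) and (15)
            if ($|\alpha_j - \alpha_j^{(old)}| < 10^{-5}$) (*A*)
                Continue to next $j$
            Determine value for $\alpha_i$ using (16)
            Compute $b_1$ and $b_2$ using (17) and (18) respectively
            Compute $b$ by (19)
Restrict the data to the useful samples (*B*)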
Algorithm Legend
(*A*): If the difference between the new and the old $\alpha_j$ is negligible, it makes no sense to update the rest of the variables.
(*B*): Useful data are those samples whose $\alpha$ had a nonzero value during the previous algorithm iteration.
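(2): $f(x) = \sum_{i=1}^{m} \alpha_i y^{(i)} \langle x^{(i)}, x \rangle + b$
(10): If $y^{(i)} \neq y^{(j)}$: $L = \max(0, \alpha_j - \alpha_i)$, $H = \min(C, C + \alpha_j - \alpha_i)$
(11): If $y^{(i)} = y^{(j)}$: $L = \max(0, \alpha_i + \alpha_j - C)$, $H = \min(C, \alpha_i + \alpha_j)$
(12): $\alpha_j := \alpha_j - \frac{y^{(j)} (E_i - E_j)}{\eta}$
(14): $\eta = 2 \langle x^{(i)}, x^{(j)} \rangle - \langle x^{(i)}, x^{(i)} \rangle - \langle x^{(j)}, x^{(j)} \rangle$
(15): $\alpha_j := \begin{cases} H & \text{if } \alpha_j > H \\ \alpha_j & \text{if } L \leq \alpha_j \leq H \\ L & \text{if } \alpha_j < L \end{cases}$
(16): $\alpha_i := \alpha_i + y^{(i)} y^{(j)} (\alpha_j^{(old)} - \alpha_j)$
(17): $b_1 = b - E_i - y^{(i)} (\alpha_i - \alpha_i^{(old)}) \langle x^{(i)}, x^{(i)} \rangle - y^{(j)} (\alpha_j - \alpha_j^{(old)}) \langle x^{(i)}, x^{(j)} \rangle$
(18): $b_2 = b - E_j - y^{(i)} (\alpha_i - \alpha_i^{(old)}) \langle x^{(i)}, x^{(j)} \rangle - y^{(j)} (\alpha_j - \alpha_j^{(old)}) \langle x^{(j)}, x^{(j)} \rangle$
(19): $b := \begin{cases} b_1 & \text{if } 0 < \alpha_i < C \\ b_2 & \text{if } 0 < \alpha_j < C \\ (b_1 + b_2)/2 & \text{otherwise} \end{cases}$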
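To make the update for a single pair $(i, j)$ concrete, here is a minimal Python sketch of the steps that use equations (2) and (10) to (19) from [1]. It is not the Matlab code of this post: it assumes a linear kernel, and the names `dot`, `f` and `update_pair` are hypothetical. The returned flag tells the caller whether the pair was updated or whether it should move on to the next $j$.

```python
def dot(u, v):
    """Inner product of two vectors given as lists (linear kernel assumed)."""
    return sum(a * b for a, b in zip(u, v))


def f(x, X, y, alphas, b):
    """Decision function, equation (2)."""
    return sum(a * t * dot(xk, x) for a, t, xk in zip(alphas, y, X)) + b


def update_pair(i, j, X, y, alphas, b, C=1.0):
    """Try one update of (alpha_i, alpha_j). Returns (alphas, b, changed)."""
    E_i = f(X[i], X, y, alphas, b) - y[i]
    E_j = f(X[j], X, y, alphas, b) - y[j]
    a_i_old, a_j_old = alphas[i], alphas[j]
    if y[i] != y[j]:   # equation (10)
        L, H = max(0.0, a_j_old - a_i_old), min(C, C + a_j_old - a_i_old)
    else:              # equation (11)
        L, H = max(0.0, a_i_old + a_j_old - C), min(C, a_i_old + a_j_old)
    if L == H:
        return alphas, b, False            # caller continues to next j
    eta = 2 * dot(X[i], X[j]) - dot(X[i], X[i]) - dot(X[j], X[j])  # (14)
    if eta >= 0:
        return alphas, b, False            # caller continues to next j
    a_j = a_j_old - y[j] * (E_i - E_j) / eta   # equation (12)
    a_j = min(H, max(L, a_j))                  # clipping, equation (15)
    if abs(a_j - a_j_old) < 1e-5:
        return alphas, b, False            # negligible change, see (*A*)
    a_i = a_i_old + y[i] * y[j] * (a_j_old - a_j)  # equation (16)
    b1 = (b - E_i - y[i] * (a_i - a_i_old) * dot(X[i], X[i])
          - y[j] * (a_j - a_j_old) * dot(X[i], X[j]))   # equation (17)
    b2 = (b - E_j - y[i] * (a_i - a_i_old) * dot(X[i], X[j])
          - y[j] * (a_j - a_j_old) * dot(X[j], X[j]))   # equation (18)
    if 0 < a_i < C:    # equation (19)
        b = b1
    elif 0 < a_j < C:
        b = b2
    else:
        b = (b1 + b2) / 2
    new_alphas = list(alphas)
    new_alphas[i], new_alphas[j] = a_i, a_j
    return new_alphas, b, True


# Two made-up points, one per class, placed symmetrically around the origin.
X = [[1.0, 0.0], [-1.0, 0.0]]
y = [1, -1]
alphas, b, changed = update_pair(0, 1, X, y, [0.0, 0.0], 0.0)
print(alphas, b, changed)             # [0.5, 0.5] 0.0 True
print(f([1.0, 0.0], X, y, alphas, b))  # 1.0
```

For this toy pair a single update already places both points exactly on the margin, with the boundary halfway between them.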
The code is provided in the Source code section.
References
1. The Simplified SMO Algorithm http://cs229.stanford.edu/materials/smo.pdf