One of the simplest ways to extract features from a picture is subtraction followed by thresholding. Imagine that we want to extract any figure that appears in front of a static background. For instance, say we want to measure the height of different people as they lean against a wall. First, the background (the wall) is captured on its own. Then someone steps in front of it, and by subtracting the new picture from the background picture we can isolate the person who appeared. From there, estimating the height is not difficult. The thresholding step is needed because we have to decide how large the difference at a pixel must be before we consider that the pixel has really changed.
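To make the idea concrete, here is a minimal sketch in plain Python (not the Matlab code from this post). Gray-scale images are nested lists, and the names `subtract_and_threshold` and `height_in_pixels`, the use of the absolute difference, and the threshold of 30 are all my own assumptions for illustration.

```python
def subtract_and_threshold(background, picture, threshold):
    """Return a binary mask: 1 where the picture differs from the background."""
    mask = []
    for bg_row, pic_row in zip(background, picture):
        # Assumption: compare the absolute difference, so both darker and
        # brighter foreground pixels are detected.
        mask.append([1 if abs(bg - px) > threshold else 0
                     for bg, px in zip(bg_row, pic_row)])
    return mask


def height_in_pixels(mask):
    """Number of rows spanned by the foreground, from its top to its bottom row."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


# Tiny synthetic example: a uniform wall and a darker "person" 3 rows tall.
background = [[200] * 5 for _ in range(6)]
picture = [row[:] for row in background]
for y in range(2, 5):
    picture[y][2] = 60

mask = subtract_and_threshold(background, picture, threshold=30)
print(height_in_pixels(mask))  # 3
```

The height comes out in pixels. Turning it into centimeters would need a known reference length in the scene, which this sketch does not cover.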
The main advantage of this method is that it is easy to implement and fast, much like the temporal median, which can in fact be used to generate the background. The biggest drawback is that it is very sensitive to noise and to luminosity changes, as we will see.
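As a reminder of how a temporal median could produce the background, here is a small Python sketch under my own assumptions: frames are equally sized gray-scale nested lists, and `temporal_median` is a hypothetical helper name.

```python
from statistics import median


def temporal_median(frames):
    """Per-pixel median over a list of equally sized gray-scale frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [[median(frame[y][x] for frame in frames) for x in range(width)]
            for y in range(height)]


frames = [
    [[100, 100], [100, 100]],
    [[100, 20], [100, 100]],   # something passes in front of one pixel
    [[102, 100], [98, 100]],   # slight noise
]
background = temporal_median(frames)
print(background)  # [[100, 100], [100, 100]]
```

Because the passing object covers that pixel in only one of the three frames, the median ignores it and the wall value survives.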
To try this out, I cropped a couple of the pictures I had used for the temporal median.
And these are the results after applying subtraction and thresholding. As you can see, the first result has a lot of noise, although I think it is easy to remove in this case. I believe the noise appears because there was a bit of wind when the pictures were taken, so the pixels do not match exactly. To remove the noise, in the second attempt I applied a Gaussian filter first. By the way, both pictures were first converted to gray scale to keep things simple and fast, which works here because the background and the foreground have different colors. Even so, the Gaussian version took much longer than the first attempt. In any case, in the original image my right arm almost blends into the background, which is why it is not detected well.
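The two preprocessing steps can be sketched as follows in plain Python. This is not the code I ran. The plain channel average for gray scale, the 3x3 kernel, and the replicated borders are assumptions chosen to keep the example short.

```python
def to_gray(rgb_image):
    """Assumption: gray level is the plain average of the (r, g, b) channels."""
    return [[sum(pixel) / 3 for pixel in row] for row in rgb_image]


# 3x3 Gaussian-like kernel; the weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def gaussian_blur(gray):
    """Smooth a gray-scale image; border pixels are replicated at the edges."""
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)  # clamp to the image
                    xx = min(max(x + dx, 0), width - 1)
                    total += KERNEL[dy + 1][dx + 1] * gray[yy][xx]
            out[y][x] = total / 16
    return out


# One isolated noisy pixel on a black image.
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 160
blurred = gaussian_blur(noisy)
print(blurred[2][2])  # 40.0
```

With a threshold of 50, for example, the isolated spike of 160 would be flagged before smoothing but not after it. The nine multiply-adds per pixel also hint at why the filtered attempt runs slower than plain subtraction.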
When subtracting, you can be creative and try different approaches depending on the background, the foreground and the application:
a) Using a separate threshold for each color channel
b) Using a global threshold on the sum of the per-channel differences
c) Combining a) and b)
d) Using gray scale
e) Applying a template convolution (Gaussian or any other)
f) Checking each pixel's neighborhood
and so on.
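Options a), b) and c) can behave quite differently on the same pixel. The following Python sketch compares them. The function names and the threshold values are hypothetical, and I use absolute differences as an assumption.

```python
def per_channel_changed(bg_pixel, pixel, thresholds):
    """Option a): changed if any channel exceeds its own threshold."""
    return any(abs(b - p) > t for b, p, t in zip(bg_pixel, pixel, thresholds))


def global_changed(bg_pixel, pixel, threshold):
    """Option b): changed if the summed channel differences exceed one threshold."""
    return sum(abs(b - p) for b, p in zip(bg_pixel, pixel)) > threshold


def combined_changed(bg_pixel, pixel, thresholds, global_threshold):
    """Option c): changed if either of the two tests fires."""
    return (per_channel_changed(bg_pixel, pixel, thresholds)
            or global_changed(bg_pixel, pixel, global_threshold))


bg = (100, 100, 100)
small_everywhere = (115, 115, 115)  # every channel differs by 15
big_in_one = (100, 100, 140)        # only blue differs, by 40

print(per_channel_changed(bg, small_everywhere, (30, 30, 30)))  # False
print(global_changed(bg, small_everywhere, 40))                 # True
print(per_channel_changed(bg, big_in_one, (30, 30, 30)))        # True
print(global_changed(bg, big_in_one, 40))                       # False
```

A small change spread over all channels is caught only by the global test, while a large change in a single channel is caught only by the per-channel test. That is one reason why combining them can be worth trying.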
I also wanted to try using the webcam to process and show the results “in real time”. My computer is not very fast, so I could only take a picture every 1 to 1.5 seconds, but it is still interesting. I also wrote a post about how to take pictures from the webcam with Matlab.
This is the algorithm behind the code I wrote:
Initialize webcam, max_times
Wait 2 seconds before taking a picture of the background (to let me hide)
Take a picture of the background
Wait 1 second
Take a picture
result = blackPicture (background color)
Iterate over each pixel (y,x)
    If background(y,x,redChannel) - picture(y,x,redChannel) > threshold || background(y,x,greenChannel) - picture(y,x,greenChannel) > threshold || background(y,x,blueChannel) - picture(y,x,blueChannel) > threshold
        result(y,x,channels) = newColor
Show picture (or store it)
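The per-pixel part of these steps can be sketched in plain Python as shown below. This is an illustration, not the Matlab code from the Source code section: the webcam capture and the waits are left out, the frames are synthetic, `mark_changes` and `NEW_COLOR` are hypothetical names, and I compare absolute differences (an assumption) so that brighter foreground pixels are detected as well.

```python
NEW_COLOR = (0, 255, 0)  # assumed marker color for changed pixels (green)
BLACK = (0, 0, 0)        # color of the unchanged background in the result


def mark_changes(background, picture, threshold):
    """Return an RGB image that is black except where any channel changed."""
    height, width = len(background), len(background[0])
    result = [[BLACK] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            bg_pixel, pixel = background[y][x], picture[y][x]
            # One threshold, tested on the red, green and blue channels in turn.
            if any(abs(b - p) > threshold for b, p in zip(bg_pixel, pixel)):
                result[y][x] = NEW_COLOR
    return result


# Synthetic frames standing in for the two webcam pictures.
background = [[(120, 120, 120)] * 3 for _ in range(2)]
picture = [row[:] for row in background]
picture[1][2] = (30, 40, 50)  # something appears at row 1, column 2

result = mark_changes(background, picture, threshold=25)
print(result[1][2])  # (0, 255, 0)
print(result[0][0])  # (0, 0, 0)
```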
Sometimes the whole area turns green. This happens because the luminosity changes depending on where the camera is focusing, and the focus changes depending on where the camera detects something. When the luminosity of the whole picture changes, every pixel exceeds the threshold and everything turns green. Shadows also change the luminosity.
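The effect is easy to reproduce with synthetic data. In this Python sketch, the uniform brightness offset of 40, the threshold of 30 and the helper name `changed_fraction` are assumptions for illustration. Nothing in the scene moves, yet every pixel is flagged.

```python
def changed_fraction(background, picture, threshold):
    """Fraction of gray-scale pixels whose absolute difference exceeds the threshold."""
    flags = [abs(b - p) > threshold
             for bg_row, pic_row in zip(background, picture)
             for b, p in zip(bg_row, pic_row)]
    return sum(flags) / len(flags)


background = [[90, 100, 110], [95, 105, 115]]

# Same scene, but the camera made the whole picture brighter by 40 levels.
brighter = [[min(v + 40, 255) for v in row] for row in background]

print(changed_fraction(background, brighter, threshold=30))    # 1.0
print(changed_fraction(background, background, threshold=30))  # 0.0
```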
An example of how luminosity changes depending on where the webcam is focused.
The code is provided in the Source code section.
1. M. Nixon and A. Aguado. 2008. “First order edge detection operators”, Feature Extraction & Image Processing.