Authors: Mary Chris Go and Sebastian Cajas Ordonez
Introduction
The role of computer vision is becoming essential in today's life, with applications ranging from medical imaging to security. Computer scientists are still searching for the most efficient and accurate detection methods. This laboratory report presents different methods for foreground segmentation to detect motion against a static background. All algorithms were implemented in C++ using Eclipse and the OpenCV library, and were tested on the ChangeDetection.NET baseline, shadow, and dynamicBackground datasets [1]. The methods include frame difference, selective running average, and ghost suppression on both grayscale and RGB channels. Further methods were also explored: shadow suppression and advanced background subtraction with a unimodal Gaussian and a Gaussian mixture model.
After many attempts at the various methods, this report concludes that the most effective and well-implemented algorithm is the simple foreground segmentation. This conclusion is further analyzed in the following sections.
Method
Three approaches were implemented. First is the simple foreground segmentation, with three sub-tasks. Second, shadow suppression was added to obtain a cleaner detection. Last, a unimodal Gaussian model and a multimodal Gaussian model were used for advanced background subtraction.
Simple foreground segmentation
1. Frame difference
The binary mask of the foreground was first extracted by taking the difference between the background and each frame. A threshold determines which pixels belong to the foreground. The background was initialized to the first frame.
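The per-pixel logic of this step can be sketched as follows. This is a minimal illustration using plain vectors in place of cv::Mat (in the actual OpenCV implementation the same effect is obtained with cv::absdiff and cv::threshold); the threshold value is illustrative.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Frame difference on a single grayscale channel: a pixel is foreground
// (255) when |frame - background| exceeds the threshold, else 0.
std::vector<uint8_t> frameDifference(const std::vector<uint8_t>& frame,
                                     const std::vector<uint8_t>& background,
                                     int threshold) {
    std::vector<uint8_t> mask(frame.size(), 0);
    for (size_t i = 0; i < frame.size(); ++i) {
        int diff = std::abs(static_cast<int>(frame[i]) -
                            static_cast<int>(background[i]));
        mask[i] = (diff > threshold) ? 255 : 0;
    }
    return mask;
}
```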
A similar approach is performed for the RGB channel analysis. Since OpenCV works with BGR ordering, this conversion must be handled as well. Once the absolute difference between the image and the background has been computed (with the first frame as the initial background), we use the split function to extract each channel and then a for-loop to threshold each one. Once the segmentation is finished, we merge the channels again; several techniques were tried for this, including ORing, ANDing, and built-in functions such as bitwise_and and bitwise_or. However, the best results were obtained with the ORing logical expression in an inner for-loop.
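The per-channel thresholding with an ORing merge can be sketched as below; plain vectors stand in for the split cv::Mat channels, and the threshold is illustrative.

```cpp
#include <array>
#include <cstdint>
#include <cstdlib>
#include <vector>

using Channel = std::vector<uint8_t>;

// Each of the three channels is thresholded independently; a pixel is
// foreground if ANY channel's difference exceeds the threshold (ORing).
Channel segmentRGB(const std::array<Channel, 3>& frame,
                   const std::array<Channel, 3>& background,
                   int threshold) {
    Channel mask(frame[0].size(), 0);
    for (int c = 0; c < 3; ++c) {
        for (size_t i = 0; i < mask.size(); ++i) {
            int diff = std::abs(static_cast<int>(frame[c][i]) -
                                static_cast<int>(background[c][i]));
            if (diff > threshold) mask[i] = 255;  // OR across channels
        }
    }
    return mask;
}
```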
2. Selective Running Average
An update to the background was introduced in this method. The update rate is governed by the adaptation speed α, which is set numerically (e.g. α = 0.05).
N.B.: A blind update means the background model is always updated. A selective update means that only the pixels classified as background at each time instant are updated.
The update rule is B_t = α·F_t + (1 − α)·B_{t−1}, applied at a pixel only when the foreground mask there is equal to zero, i.e. when the pixel is currently background. The running average thus blends the current frame, weighted by α, into the background pixel-wise, so the model gradually adapts as new objects appear in the scene.
3. Ghost Suppression
The third method depends on the second step. It removes stationary objects that appear in, or are removed from, the scene. Each pixel is assigned a counter, which is incremented every time the pixel is classified as foreground and reset every time it is classified as background. When the counter exceeds a threshold, the pixel is forced back into the background.
Shadow Suppression
This approach restricts, under certain thresholds, the ratios and absolute differences of the frame channels in HSV space, which has been shown to be a more effective color representation for this task than the RGB color model [2]. Following this study, the ratio of the value channels must lie within an [α, β] interval.
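The shadow test of Cucchiara et al. [2] can be sketched for a single pixel as follows; in addition to the value-ratio interval, that work also bounds the saturation and hue differences, and the threshold values below are illustrative.

```cpp
#include <cmath>

// A foreground pixel is reclassified as shadow when the value ratio
// V_frame / V_background lies in [alpha, beta] and the saturation and hue
// differences stay below their thresholds (tauS, tauH).
bool isShadow(float hF, float sF, float vF,   // frame pixel (HSV)
              float hB, float sB, float vB,   // background pixel (HSV)
              float alpha, float beta, float tauS, float tauH) {
    if (vB <= 0.0f) return false;             // avoid division by zero
    float ratio = vF / vB;
    return ratio >= alpha && ratio <= beta &&
           std::fabs(sF - sB) <= tauS &&
           std::fabs(hF - hB) <= tauH;
}
```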
Advanced background subtraction
1. Unimodal Gaussian
This method involves the elimination of Gaussian noise. Each pixel is assigned a Gaussian distribution that represents the background [3]. The initial mean and variance are estimated from an initial sequence of frames containing only stationary objects; both are then refined with a selective update.
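A per-pixel sketch of this model is shown below; the constants k and α are illustrative.

```cpp
// Unimodal Gaussian per pixel: foreground when the observation deviates
// from the mean by more than k standard deviations, i.e. diff^2 > k^2*var;
// otherwise the mean and variance are selectively updated.
struct GaussianPixel {
    float mean;
    float var;
};

bool classifyAndUpdate(GaussianPixel& g, float intensity,
                       float k, float alpha) {
    float diff = intensity - g.mean;
    bool foreground = (diff * diff) > (k * k * g.var);
    if (!foreground) {  // selective update: background observations only
        g.mean += alpha * diff;
        g.var = alpha * diff * diff + (1.0f - alpha) * g.var;
    }
    return foreground;
}
```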
2. Multimodal Gaussian Model
This method extends the unimodal Gaussian by allowing several Gaussians to represent a single pixel, using an online approximation to update the model. With this technique, the method can absorb background flicker such as water motion and gradual daylight changes.
Conclusion
Figure: video frames obtained from the implemented frame difference algorithm.
Figure: video frames obtained from the implemented selective running average algorithm.
After many trials and evaluations of all the methods, it was concluded that the best method is the simple foreground segmentation, specifically the selective update variant. Although the unimodal Gaussian showed good results, it failed to match the reference results on its own benchmark. Only the simple foreground segmentation competed well with the results from the benchmark website. The more complex methods improve on some factors, such as background motion, but their overall performance is still not sufficient: when they try to detect the actual objects, performance degrades, yielding poor scores.
In a nutshell, this project did not achieve the best performance, but the methods were implemented correctly. Given more time, the parameters could be optimized further to improve the results.
References
[1] N. Goyette, P.-M. Jodoin, F. Porikli, J. Konrad, and P. Ishwar, "changedetection.net: A new change detection benchmark dataset," in Proc. IEEE Workshop on Change Detection (CDW-2012) at CVPR-2012, Providence, RI, 16-21 Jun. 2012.
[2] R. Cucchiara, C. Grana, M. Piccardi, and A. Prati, "Detecting moving objects, ghosts, and shadows in video streams," IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10):1337-1342, 2003.
[3] C. R. Wren et al., "Pfinder: Real-time tracking of the human body," IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):780-785, July 1997.
If you want to know more about the implementation and detailed results, feel free to email us!