In this paper, a moving object is extracted from a video using a video object detection algorithm based on spatial and temporal segmentation. The technique begins with temporal segmentation, in which an edge map is extracted using an edge operator. An initial binary mask is obtained by applying a morphological operation to this initial edge map. The next phase is spatial segmentation, in which a gradient image is obtained with a multi-scale morphological operator. A modified gradient image is then obtained by applying the operator to the current frame. Finally, the moving object is extracted precisely by watershed segmentation performed on the modified gradient image. A morphological operation is applied again to the output to obtain the final binary mask. This binary mask is then complemented to yield the contour line of the video object. Using the binary mask, the video object is extracted from the video frames. After the video object has been detected, it is tracked using the Kanade–Lucas–Tomasi (KLT) feature tracker.
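To make the spatial-segmentation step more concrete, the following is a minimal sketch of one common formulation of a multi-scale morphological gradient: at each scale the difference between a dilation and an erosion is taken, eroded by the next smaller structuring element, and the results are averaged over the scales. The abstract does not specify the exact operator, so the square structuring elements, the number of scales, the border handling, and the function names (`dilate`, `erode`, `multiscale_gradient`) are all assumptions made for illustration only.

```python
# Illustrative sketch only: the operator form, square structuring elements,
# and default of three scales are assumptions, not taken from the paper.

def _window_op(img, r, op):
    """Apply op (min or max) over a (2r+1)x(2r+1) window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - r), min(h, y + r + 1))
                    for i in range(max(0, x - r), min(w, x + r + 1))]
            row.append(op(vals))
        out.append(row)
    return out


def dilate(img, r):
    """Grayscale dilation with a square structuring element of radius r."""
    return _window_op(img, r, max)


def erode(img, r):
    """Grayscale erosion with a square structuring element of radius r."""
    return _window_op(img, r, min)


def multiscale_gradient(img, scales=3):
    """Average of eroded (dilation - erosion) differences over radii 1..scales."""
    h, w = len(img), len(img[0])
    acc = [[0.0] * w for _ in range(h)]
    for i in range(1, scales + 1):
        d, e = dilate(img, i), erode(img, i)
        grad = [[d[y][x] - e[y][x] for x in range(w)] for y in range(h)]
        # Eroding with radius i-1 thins the response; radius 0 leaves it unchanged.
        grad = erode(grad, i - 1)
        for y in range(h):
            for x in range(w):
                acc[y][x] += grad[y][x]
    return [[v / scales for v in row] for row in acc]


# Toy frame: a vertical step edge between columns 5 and 6.
frame = [[0] * 6 + [100] * 6 for _ in range(5)]
gradient = multiscale_gradient(frame)
```

On this toy frame the response is concentrated on the two columns adjacent to the step and is zero in the flat regions, which is the behaviour a watershed stage relies on. A practical implementation would more likely use an image-processing library than nested Python lists.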