Video Segmentation Techniques for Instructional Videos – Survey

  • Jyoti Parsola, Department of Computer Application, Graphic Era (Deemed to be University), Dehradun, Uttarakhand, India
  • Durgaprasad Gangodkar, Department of Computer Science and Engineering, Graphic Era (Deemed to be University), Dehradun, Uttarakhand, India
  • Ankush Mittal, Department of Computer Science and Engineering, Graphic Era (Deemed to be University), Dehradun, Uttarakhand, India
Keywords: Video Segmentation, E-Learning Applications


Low-cost smartphones and easy internet access have increased the viewership of e-learning videos. Because mobile phones usually have limited memory, reducing the size of these instructional videos is extremely important. Video segmentation is the fundamental task in reducing the size of e-learning videos. This paper gives an overview of existing techniques for segmenting e-learning videos. Most of the methods used so far for segmenting instructional videos fall broadly into two categories: i) feature-extraction-based segmentation and ii) motion-based segmentation. The performance, comparative merits, and limitations of each approach are thoroughly examined and contrasted. The analysis is useful for applying existing methods appropriately, for enhancing their performance, and for forming new methods by combining existing ones.



