Video Understanding

Video Action Recognition

  • S. Sun, Z. Kuang, L. Sheng, W. Ouyang and W. Zhang.
    Optical flow guided feature: A fast and robust motion representation for video action recognition.
    Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2018. [paper]

    We introduce a novel compact motion representation for video action recognition, named Optical Flow guided Feature (OFF), which enables the network to distill temporal information through a fast and robust approach. The OFF is derived from the definition of optical flow and is orthogonal to the optical flow, and it can be embedded in any existing CNN-based video action recognition framework at only a slight additional cost.

  • Z. Zhang, Z. Kuang, P. Luo, L. Feng and W. Zhang.
    Temporal sequence distillation: Towards few-frame action recognition in videos.
    Proc. ACM Multimedia (MM), 2018. [paper]

    Video Analytics Software as a Service (VA SaaS) has grown rapidly in recent years. Users typically access VA SaaS through a lightweight client. Because the transmission bandwidth between the client and the cloud is usually limited and expensive, cloud video analysis algorithms that require little data transmission are highly beneficial. As a first attempt in this direction, this work introduces the problem of few-frame action recognition, which aims to maintain high recognition accuracy when only a few frames are accessed during both training and testing.
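
    For the Optical Flow guided Feature entry above, the statement that the feature is "orthogonal to the optical flow" can be illustrated with the standard optical flow constraint. The block below is only a sketch: it assumes the constancy constraint is applied to a feature map f rather than to raw pixel intensities, and the symbols f, v_x, v_y are notation chosen here for illustration, not taken from this page.

    ```latex
    % Constancy assumption on a feature map f (assumed notation):
    %   f(x, y, t) = f(x + \Delta x, y + \Delta y, t + \Delta t)
    % A first-order Taylor expansion, divided by \Delta t, gives
    \frac{\partial f}{\partial x}\, v_x
      + \frac{\partial f}{\partial y}\, v_y
      + \frac{\partial f}{\partial t} = 0 ,
    % which says that the vector of partial derivatives is orthogonal
    % to the (homogeneous) flow vector:
    \left( \frac{\partial f}{\partial x},\;
           \frac{\partial f}{\partial y},\;
           \frac{\partial f}{\partial t} \right)
    \cdot \left( v_x,\; v_y,\; 1 \right) = 0 .
    ```

    Under this reading, spatial gradients and a temporal difference of features carry motion information without computing the flow (v_x, v_y) itself.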

Video Summarization

  • L. Feng, Z. Li, Z. Kuang and W. Zhang.
    Extractive video summarizer with memory augmented neural networks.
    Proc. ACM Multimedia (MM), 2018.

    Humans usually create a summary after viewing and understanding the whole video, and a global attention mechanism that captures information from all video frames plays a key role in the summarization process. Motivated by this observation, we propose a memory augmented extractive video summarizer, which uses a high-capacity external memory to record visual information of the whole video. With the external memory, the video summarizer predicts the importance score of a video shot based on a global understanding of the video frames.

Machine Learning & Pattern Recognition


Graph-based Agglomerative Clustering


Semi-supervised Dimensionality Reduction


Learning Partial Differential Equations via Optimal Control

Cross-modality Computer Vision


Inter-modality Face Recognition


Face (Portrait) Sketch Synthesis

[Project Page]

Other Projects


Internet Image Reranking

Proposed a novel semi-supervised learning approach to internet image reranking.
The images are retrieved by search engines given a query keyword.


Photo Quality Evaluation

Implemented the features from the following two papers and investigated their performance with different classifiers.
Y. Ke, X. Tang, and F. Jing. The design of high-level features for photo quality assessment. CVPR, 2006.
Y. Luo and X. Tang. Photo and video quality evaluation: Focusing on the subject. ECCV, 2008.


Automatic Panoramic Mosaic Stitching

The skeleton code was borrowed from UW's course CSE576.
It is also used in other universities' courses, such as Cornell's course CS6670.
[Project Page]


Seam Carving for Content Aware Image Resizing

Implemented the following paper:
Shai Avidan and Ariel Shamir. Seam Carving for Content-Aware Image Resizing. ACM SIGGRAPH, 2007.
[Project Page]
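
As a rough illustration of the seam carving idea, the sketch below finds and removes one minimum-energy vertical seam by dynamic programming. It is a minimal pure-Python version, not the project's code: the function names, the list-of-lists grayscale image format, and the simple gradient-magnitude energy are all assumptions made for this example.

```python
def energy_map(gray):
    """Gradient-magnitude energy |dI/dx| + |dI/dy| with clamped borders."""
    h, w = len(gray), len(gray[0])
    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            dy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            energy[y][x] = abs(dx) + abs(dy)
    return energy


def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy 8-connected seam."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Forward pass: cumulative minimum cost of reaching each pixel from the top.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backward pass: start at the cheapest bottom pixel and walk upwards.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_vertical_seam(image, seam):
    """Delete the seam pixel from every row, shrinking the width by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]
```

Repeating `energy_map`, `find_vertical_seam`, and `remove_vertical_seam` narrows the image one column at a time; horizontal seams can be handled by transposing the image first.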


Image Morphing

Implemented the Beier-Neely algorithm and a deformable surface algorithm.
[Project Page]
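
The core step of Beier-Neely (feature-based) morphing is mapping a pixel through a pair of corresponding line segments. The sketch below shows that mapping for a single line pair only; the full algorithm blends the results of many line pairs with distance-based weights, which is omitted here. The function names and the tuple-based point format are assumptions for illustration.

```python
import math


def perp(v):
    """Rotate a 2-D vector by 90 degrees."""
    return (-v[1], v[0])


def warp_point(x, dst_line, src_line):
    """Map point x, given relative to dst_line (P, Q), to its position
    relative to src_line (P', Q') using the (u, v) line coordinates."""
    (p, q), (p2, q2) = dst_line, src_line
    d = (q[0] - p[0], q[1] - p[1])          # Q - P
    d2 = (q2[0] - p2[0], q2[1] - p2[1])     # Q' - P'
    len_d = math.hypot(*d)
    len_d2 = math.hypot(*d2)
    xp = (x[0] - p[0], x[1] - p[1])         # X - P
    # u: normalized position along the line; v: signed pixel distance from it.
    u = (xp[0] * d[0] + xp[1] * d[1]) / (len_d ** 2)
    n = perp(d)
    v = (xp[0] * n[0] + xp[1] * n[1]) / len_d
    n2 = perp(d2)
    return (p2[0] + u * d2[0] + v * n2[0] / len_d2,
            p2[1] + u * d2[1] + v * n2[1] / len_d2)
```

With identical lines the point is unchanged, and with a translated or rotated line the point is translated or rotated accordingly, which makes the function easy to sanity-check before adding multi-line blending.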


Photometric Stereo
Implemented photometric stereo with a chrome ball, photometric stereo without a chrome ball (Hayakawa's algorithm),
example-based photometric stereo, and the Frankot-Chellappa algorithm.
The skeleton code was borrowed from UW's course CSEP576.
[Project Page]
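
In chrome-ball setups, the ball is commonly used to estimate each light direction from the position of its specular highlight. The sketch below shows that calculation under stated assumptions that are not taken from this page: an orthographic camera, a viewing direction of (0, 0, 1), and highlight coordinates given in the same right-handed frame as the ball center. The function name is hypothetical.

```python
import math


def light_from_chrome_ball(center, radius, highlight):
    """Estimate a unit light direction from a specular highlight on a sphere.

    The surface normal N at the highlight follows from the sphere geometry;
    the light direction is the view vector R = (0, 0, 1) mirrored about N:
    L = 2 (N . R) N - R.
    """
    nx = (highlight[0] - center[0]) / radius
    ny = (highlight[1] - center[1]) / radius
    r2 = nx * nx + ny * ny
    if r2 > 1.0:
        raise ValueError("highlight lies outside the ball")
    nz = math.sqrt(1.0 - r2)
    # N . R equals nz because R = (0, 0, 1).
    return (2.0 * nz * nx, 2.0 * nz * ny, 2.0 * nz * nz - 1.0)
```

A highlight at the ball's center yields a light pointing straight at the camera, and the angle between light and view is always twice the angle between normal and view.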


Copyright © 2018 Wei Zhang