Large Scale Visual Recognition

  • Vision: to build an engine to recognize everything in Internet images

  • Large-scale visual recognition for Internet-scale applications is both commercially important and technically challenging.

  • The major challenges brought by a large number of tags and a huge volume of live data include:

    • Cost of human labeling;

    • Large variation of object scales;

    • Difficulty of collecting samples for long-tail tags.

  • Our research directions include:

    • Data-Efficient Learning (Webly-Supervised Learning, Weakly-Supervised Object Detection, etc.),

    • Neural Architecture Design (Multi-Scale Fusion, etc.),

    • Few-shot Learning (Instance Recognition, Instance Retrieval, etc.).

Data-Efficient Learning

Webly supervised image classification, metadata, Webly supervised learning of convolutional neural networks, visual-semantic graph 
Webly supervised image classification with self-contained confidence, Webly supervised learning of convolutional neural networks 
WebVision image classification challenge, Webly supervised image classification, Webly supervised learning of convolutional neural networks 
Object instance mining for weakly supervised object detection 

Neural Architecture Design

Scale aggregation networks, ScaleNet, multi-scale convolutional networks
Scale-equalizing pyramid convolution for object detection 

Instance Retrieval

Fashion retrieval via graph reasoning networks on a similarity pyramid, cross-domain fashion image retrieval, DeepFashion
Learning local similarity with spatial relations for object retrieval, Oxford 5k, Paris 6k, INSTRE 
Aggregated deep feature from activation clusters for particular object retrieval, Oxford 5k, Paris 6k 

Copyright © 2021 Wayne Zhang