Detecting Nonexistent Pedestrians

Jui-Ting Chien

Chia-Jung Chou

Ding-Jie Chen

Hwann-Tzong Chen

National Tsing-Hua University

In ICCV 2017 Workshops

[Paper] [Bibtex] [GitHub]

Examples of the synthesis pipeline. Image brightness is adjusted for better visualization. From left to right: input images, predicted heatmaps, and synthesized images with phantom pedestrians inserted according to the predicted heatmaps. The likelihoods of head and feet positions are depicted in red and blue, respectively.

Abstract

We explore beyond object detection and semantic segmentation, and propose to address the problem of estimating the presence probabilities of nonexistent pedestrians in a street scene. Our method builds upon a combination of generative and discriminative procedures to achieve the perceptual capability of figuring out missing visual information. We adopt state-of-the-art inpainting techniques to generate the training data for nonexistent pedestrian detection. The learned detector can predict the probability of observing a pedestrian at a given location in an image, even if that location exhibits only background. We evaluate our method by inserting pedestrians into images according to the predicted presence probabilities and conducting a user study to assess the 'realisticness' of the synthetic images. The empirical results show that our method captures where the reasonable places are for pedestrians to walk or stand in a street scene.
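To make the training-data generation concrete, here is a minimal sketch, assuming pedestrians are erased with OpenCV's Telea inpainting and the annotated head/feet positions are converted into Gaussian target heatmaps; the annotation format, the choice of inpainting method, and the Gaussian radius are illustrative assumptions rather than the paper's exact settings.

import cv2
import numpy as np

def make_training_pair(image, ped_boxes, keypoints, sigma=8.0):
    # image: H x W x 3 uint8 street-scene photo
    # ped_boxes: list of (x0, y0, x1, y1) pedestrian bounding boxes
    # keypoints: list of ((hx, hy), (fx, fy)) head and feet positions
    mask = np.zeros(image.shape[:2], np.uint8)
    for x0, y0, x1, y1 in ped_boxes:
        mask[y0:y1, x0:x1] = 255  # mark every pedestrian region for removal
    # Erase the pedestrians so the detector sees background only.
    background = cv2.inpaint(image, mask, 5, cv2.INPAINT_TELEA)

    # Two-channel supervision: channel 0 = head likelihood, channel 1 = feet.
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    target = np.zeros((2, h, w), np.float32)
    for (hx, hy), (fx, fy) in keypoints:
        target[0] = np.maximum(
            target[0], np.exp(-((xs - hx) ** 2 + (ys - hy) ** 2) / (2 * sigma ** 2)))
        target[1] = np.maximum(
            target[1], np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2 * sigma ** 2)))
    return background, target

The key point is that the supervision stays attached to the background-only image, so the detector must learn to predict pedestrian presence from scene context alone.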


Supplementary




Proposed Model

The architecture of the proposed model FCN+D.
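As a rough PyTorch sketch of how an FCN paired with a discriminator (FCN+D) could be wired up: the FCN maps an RGB image to a two-channel head/feet likelihood map, and the discriminator scores image/heatmap pairs so that predicted heatmaps are pushed toward the statistics of annotated ones. All layer widths here are illustrative assumptions, not the paper's verified configuration.

import torch
import torch.nn as nn

class FCN(nn.Module):
    # Fully convolutional predictor: RGB image -> 2-channel (head, feet) heatmap.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 2, 1), nn.Sigmoid())

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    # Scores an image/heatmap pair: high for annotated (real) heatmaps,
    # low for predicted ones, pushing the FCN toward plausible layouts.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + 2, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(128, 1, 4, stride=2, padding=1))

    def forward(self, image, heatmap):
        score_map = self.net(torch.cat([image, heatmap], dim=1))
        return score_map.mean(dim=(1, 2, 3))  # one scalar score per sample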




Predicted Results

Hover over the input images to see the predicted heatmaps. (The likelihoods of head and feet positions are depicted in red and blue, respectively.)




Synthesis Results

Hover over the input images to see the synthesis output.




Empirical Experiments

Target image

Input images.

Our method

Pedestrians are chosen and placed coherently with the predicted heatmap; a sampling sketch follows the baselines below.
(Hover over the synthesis output to see the predicted heatmap.)

Baseline A

Pedestrians are chosen and placed randomly.
(Hover over the synthesis output to see different random alternatives.)

Baseline B

Pedestrians are borrowed from another street scene.
(Hover over the synthesis output to see the original scene.)
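To make the heatmap-coherent placement used in "Our method" concrete, the following hypothetical helper samples a feet position in proportion to the predicted feet likelihood and derives a crude pedestrian height from the head channel; both the sampling scheme and the height heuristic are our simplifications for illustration, not the paper's exact procedure.

import numpy as np

def sample_placement(head_heatmap, feet_heatmap, rng=None):
    # Treat the feet heatmap as a probability mass and draw a standing point.
    rng = np.random.default_rng() if rng is None else rng
    p = feet_heatmap.astype(np.float64).ravel()
    p /= p.sum()
    idx = rng.choice(p.size, p=p)
    fy, fx = np.unravel_index(idx, feet_heatmap.shape)
    # Crude height estimate: strongest head response in the same column.
    hy = int(np.argmax(head_heatmap[:, fx]))
    height_px = max(fy - hy, 1)  # used to scale the pedestrian cut-out
    return (int(fx), int(fy)), height_px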




Recent Related Work

Xiaolong Wang*, Rohit Girdhar*, and Abhinav Gupta. Binge Watching: Scaling Affordance Learning from Sitcoms. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (spotlight). (* indicates equal contribution.) [pdf]

Shiyu Huang and Deva Ramanan. Expecting the Unexpected: Training Detectors for Unusual Pedestrians with Adversarial Imposters. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017. [pdf]

Jin Sun and David Jacobs. Seeing What Is Not There: Learning Context to Determine Where Objects Are Missing. Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017 (spotlight). [pdf]

Junting Pan, Cristian Canton Ferrer, Kevin McGuinness, Noel O'Connor, Jordi Torres, Elisa Sayrol, and Xavier Giro-i-Nieto. SalGAN: Visual Saliency Prediction with Generative Adversarial Networks. Proc. of IEEE Conference on Computer Vision and Pattern Recognition Scene Understanding Workshop (CVPR SUNw), 2017 (spotlight). [pdf]

Namhoon Lee, Xinshuo Weng, Vishnu Naresh Boddeti, Yu Zhang, Fares Beainy, Kris Kitani, and Takeo Kanade. Visual Compiler: Synthesizing a Scene-Specific Pedestrian Detector and Pose Estimator. [pdf]




Acknowledgements

This work is supported in part by MOST grants 103-2221-E-007-045-MY3, 103-2218-E-007-017-MY3, 106-3114-E-007-008, and 106-2221-E-007-079-MY3. Chia-Jung would like to thank UmboCV for providing the fellowship.