Person detection - The Fastest Pedestrian Detector in the West

Original paper

Key idea of the approach

The authors point out that the speed problems of current state-of-the-art person detectors stem mainly from the need to construct an image pyramid and to compute features at every level of that pyramid, which is time-consuming.

Nevertheless, the image pyramid is needed to find small persons (the original image is upscaled) as well as large persons (the original image is downscaled), because the dimensions of the detection window stay the same.

To address this, the authors explored how gradient magnitudes and orientations change when an image is up- or downsampled.

They found that, given a feature (e.g. HOG, histograms of oriented gradients) computed at one scale, the corresponding feature at nearby higher and lower scales can be approximated by re-weighting it:


[…] our key insight is that for a broad family of features, including gradient histograms, the feature responses computed at a single scale can be used to approximate feature responses at nearby scales.


[…] given gradients computed at one scale, is it possible to approximate gradient histograms at a different scale? If so, then we can avoid computing gradients over a finely sampled image pyramid.
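
The re-weighting idea can be tried out on a toy example. The following Matlab sketch is not the authors' experiment: it assumes a synthetic smooth image, plain bilinear upsampling via interp2, and the summed gradient magnitude as a stand-in for a single-bin gradient histogram. Under these assumptions, upsampling by a factor of 2 roughly halves the per-pixel gradient but quadruples the pixel count, so the sum over the upsampled image should be about twice the sum over the original.

```matlab
% Synthetic smooth test image (an assumption; not data from the paper)
[X, Y] = meshgrid(1:64, 1:64);
I = sin(X / 8) + cos(Y / 10);

% Upsample by a factor of 2 with bilinear interpolation
[Xq, Yq] = meshgrid(1:0.5:64, 1:0.5:64);
I2 = interp2(X, Y, I, Xq, Yq, 'linear');

% Summed gradient magnitude at both scales
[gx, gy]   = gradient(I);
[gx2, gy2] = gradient(I2);
h  = sum(sum(sqrt(gx.^2  + gy.^2)));
h2 = sum(sum(sqrt(gx2.^2 + gy2.^2)));

% Feature at the upsampled scale, approximated from the original scale
hApprox = 2 * h;
ratio   = h2 / h;
fprintf('measured ratio: %.2f, approximation error: %.1f%%\n', ...
    ratio, 100 * abs(hApprox - h2) / h2);
```

On real images the factor for downsampling generally differs from this idealized value, because downsampling removes high-frequency content.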

Since this approximation gradually degrades as the scale difference grows, it cannot replace the image pyramid entirely. It is, however, sufficient to construct a sparse image pyramid and to approximate the features at the scales in between from the features already computed at a nearby pyramid level, which saves time.
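
A sparse pyramid of this kind can be sketched as follows. The numbers (8 scales per octave, 3 octaves) and the choice to compute features directly only once per octave are assumptions for illustration, and all variable names are made up for this sketch. It only lists which scales would be computed from a resized image and which would be approximated from their nearest computed neighbour.

```matlab
nPerOctave = 8;   % hypothetical number of scales per octave
nOctaves   = 3;   % hypothetical number of octaves to cover

k      = 0:(nPerOctave * nOctaves - 1);   % scale index
scales = 2 .^ (-k / nPerOctave);          % 1, 2^(-1/8), 2^(-2/8), ...

% Assumption: features are computed directly only once per octave
isReal  = mod(k, nPerOctave) == 0;
realIdx = find(isReal);

% For every scale, find the nearest directly computed scale
% (index distance is proportional to distance in log2 of the scale)
d = abs(bsxfun(@minus, k(:), k(realIdx)));   % numel(k) x numel(realIdx)
[~, nearest] = min(d, [], 2);
srcScale = reshape(scales(realIdx(nearest)), 1, []);

% Relative scale change that the feature re-weighting has to bridge
relScale = scales ./ srcScale;

for i = 1:numel(k)
    if isReal(i)
        fprintf('scale %.3f: computed from the resized image\n', scales(i));
    else
        fprintf('scale %.3f: approximated from scale %.3f (ratio %.3f)\n', ...
            scales(i), srcScale(i), relScale(i));
    end
end
```

With these numbers only 3 of the 24 scales require resizing the image and recomputing the features.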

Dollár et al. applied exactly this idea to their own integral channel features person detector, which is described in an earlier paper:

P. Dollár, Z. Tu, P. Perona, and S. Belongie
Integral channel features
BMVC 2009

Reference implementation

Getting Dollar's Matlab code running

Update from 16 Sep 2013: the link no longer works. It seems Piotr Dollár removed the code from his website and integrated the object detection code into his toolbox. See here (detector directory).

Then type

   nm='I.png'; I=imread(nm);
   prm=struct('imgNm',nm,'modelNm','ChnFtrs01','resize',1,'fast',1);
   tic;
   bbs=detect(prm);
   toc;
   figure(1); im(I,[],0); bbApply('draw',bbs,'g');

as explained in the comments in detect.m.

The test image 'I.png' comes with Dollar's Matlab code.

Example detections of FPDW with standard parameters

Finding big + small persons

To detect larger or smaller persons, set the image resize factor, as the comment in detect.m says:

% (e.g. resize=2 detects 50+ pixel peds, resize=.5 detects 200+ pixel peds).

Note: in my case, the Matlab code aborted with an error when I used a resize factor < 1 that was not a power of 2.

So use

  • 2^-1 = 0.5
  • 2^-2 = 0.25
  • 2^-3 = 0.125
  • etc.

So e.g.

prm=struct('imgNm',nm,'modelNm','ChnFtrs01','resize', 0.125 ,'fast',1);
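
If the resize factor is derived from something else (e.g. an expected person height), it can be snapped to the nearest power of 2 before it goes into prm. This is only a small workaround sketch for the error described above; the variable desired and its value are made up for illustration.

```matlab
desired = 0.3;                        % hypothetical factor one would like to use
resize  = 2 ^ round(log2(desired));   % nearest power of 2 on a log scale

nm  = 'I.png';
prm = struct('imgNm', nm, 'modelNm', 'ChnFtrs01', 'resize', resize, 'fast', 1);
fprintf('requested %.3f, using %.3f\n', desired, prm.resize);
```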

Further FPDW parameters

These can be set in the prm structure:

    % Prepare FPDW parameter structure
    % INPUTS
    %  prm
    %   .imgNm      - ['REQ'] image file name or actual image
    %   .resNm      - [''] target file name (if empty output bbs)
    %   .resize     - [1] image upscaling (increase to detect small peds)
    %   .modelNm    - ['REQ'] file name for trained classifer
    %   .thr        - [-100] detection threshold (discard bbs below thr)
    %   .nScale     - [10] number of scale per octave (8-10 work well)
    %   .d          - [4] spatial step in pixels (4-8 work well)
    %   .pad        - [0] image padding in pixels (pad=1 is max padding)
    %   .fast       - [1] if true use BMVC10 fast features

E.g.

    prm=struct('imgNm',nm,'modelNm','ChnFtrs01','resize',2,'fast',1,'thr',-100);
    bbs=detect(prm);
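
With thr=-100 practically every candidate box is returned, so it can be handy to filter the result afterwards. The sketch below assumes that bbs holds one detection per row in the form [x y w h score]; this layout is an assumption here and should be checked against the toolbox documentation. The bbs values and minScore are made up so that the snippet runs without the detector.

```matlab
% Hypothetical detections, assuming one row per box: [x y w h score]
bbs = [ 10  20  40  80   35.2;
       200  50  38  76   -4.7;
        90  15  42  84   12.0];

minScore = 0;                               % hypothetical stricter threshold
strong   = bbs(bbs(:, 5) > minScore, :);    % keep confident boxes only

% Sort the remaining boxes by descending score
[~, order] = sort(strong(:, 5), 'descend');
strong = strong(order, :);
disp(strong);
```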

My Matlab script for running person detection on a batch of images

clear;
 
% Process every second image of a set of 281 images (indices 0, 2, ..., 280)
for i = 0:2:280
 
    % Read image    
    nm = sprintf('H:\\data\\%06d.png', i);
    disp(nm);    
    I=imread(nm);
 
    % Prepare FPDW parameter structure
    % (fields as documented in detect.m and in the parameter list above)
    prm=struct('imgNm',nm,'modelNm','ChnFtrs01','resize',2,'fast',1,'thr',-100);
 
    % Detect persons and measure how long it takes
    tic;
    bbs=detect(prm);
    toc;
 
    % Show persons
    h = figure(1);
    im(I,[],0);
    bbApply('draw',bbs,'g');
 
    % Save figure with bboxes as image    
    resNm = sprintf('H:\\tmp\\%06d.png', i);
    saveas(h, resNm);
 
end % for
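
The script above prints one timing per image via tic/toc. To get a summary over the whole batch, the elapsed times can be collected instead. In the following sketch detectFn is a hypothetical stand-in that returns no boxes, so that the snippet runs without the toolbox and without the images; in practice it would be replaced by the real detect call.

```matlab
% Stand-in for detect(prm) so the sketch runs without the toolbox
% (hypothetical; replace with the real detect call)
detectFn = @(prm) zeros(0, 5);

idx     = 0:2:280;                   % same image indices as in the script above
elapsed = zeros(1, numel(idx));

for k = 1:numel(idx)
    nm  = sprintf('H:\\data\\%06d.png', idx(k));
    prm = struct('imgNm', nm, 'modelNm', 'ChnFtrs01', ...
                 'resize', 2, 'fast', 1, 'thr', -100);

    t = tic;                         % timer handle, safe against nested tic calls
    bbs = detectFn(prm);
    elapsed(k) = toc(t);
end

fprintf('%d images, mean %.4f s, max %.4f s per image\n', ...
    numel(idx), mean(elapsed), max(elapsed));
```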
 
Last modified: 2013/09/16 11:08