Geodesic Object Proposals

Philipp Krähenbühl, Vladlen Koltun
ECCV 2014
[pdf] [code] [data]

We present an approach for identifying a set of candidate objects in a given image. This set of candidates can be used for object recognition, segmentation, and other object-based image parsing tasks. To generate the proposals, we identify critical level sets in geodesic distance transforms computed for seeds placed in the image. The seeds are placed by specially trained classifiers that are optimized to discover objects. Experiments demonstrate that the presented approach achieves significantly higher accuracy than alternative approaches, at a fraction of the computational cost.
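
To make the core idea concrete, here is a minimal sketch (not the released implementation): it runs Dijkstra over a hypothetical per-pixel boundary-strength map to obtain a geodesic distance transform from a single seed, then thresholds that transform at a few fixed levels to produce nested candidate masks. The boundary map, the seed location, and the fixed levels are illustrative assumptions; the actual method places seeds with trained classifiers and selects critical level sets rather than fixed thresholds.

```python
# Illustrative sketch of geodesic-distance-based proposals (assumptions noted
# above); not the authors' implementation.
import heapq
import numpy as np

def geodesic_distance(boundary, seed):
    """Dijkstra on the 4-connected pixel grid; stepping onto a pixel costs
    its boundary strength, so distances grow quickly across strong edges."""
    h, w = boundary.shape
    dist = np.full((h, w), np.inf)
    dist[seed] = 0.0
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + boundary[nr, nc]
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist

def level_set_proposals(boundary, seed, levels=(0.5, 1.0, 2.0)):
    """Threshold the geodesic distance transform at several levels; each
    threshold yields one candidate mask, nested inside the next."""
    dist = geodesic_distance(boundary, seed)
    return [dist <= t for t in levels]

# Toy usage: a vertical edge in the middle of the image keeps the first two
# masks on the seed's side; only the largest level crosses the edge.
boundary = np.full((32, 32), 0.01)
boundary[:, 16] = 1.5
masks = level_set_proposals(boundary, (8, 8))
print([int(m.sum()) for m in masks])
```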

Code

You’ll need to download both the code and the additional data, which includes pre-trained boundary detectors for Sketch Tokens and Structured Forests. The readme contains instructions on how to compile and run the code. If you find a bug, please feel free to contact me. However, please don’t contact me for help compiling or running the code.

Changes

v1.0: Initial Release

v1.1: Added a Matlab wrapper, made it easier to use learned seeds and masks from Matlab and C++, and added a function to compute bounding boxes from proposals (a sketch of this operation follows the change log).

v1.2: Added evaluation code for COCO, as well as seed proposals (proposals containing just the seeds themselves). The seed proposals help on both COCO and VOC in terms of segmentation. If you’re only interested in bounding boxes you probably don’t want to use them, as most small bounding boxes are labeled as difficult.

v1.3: Fixed compilation issues on older systems. Python 2.7 should now work too.
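
For reference, here is a minimal sketch of the bounding-box computation mentioned in v1.1, under the assumption that a proposal is a boolean mask over image pixels; the wrapper shipped with the code may use a different convention.

```python
# Hedged sketch: tightest axis-aligned box around a boolean proposal mask.
import numpy as np

def mask_to_box(mask):
    """Return (x0, y0, x1, y1) enclosing the True pixels, or None if empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # empty proposal: no box
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

# Example: a small rectangular blob yields its enclosing box.
mask = np.zeros((10, 10), dtype=bool)
mask[2:5, 3:7] = True
print(mask_to_box(mask))  # (3, 2, 6, 4)
```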

Matlab

After getting a few requests for Matlab binaries, here they are. I compiled them using Matlab 2014a, but at least under Linux they should also work with older Matlab versions, as long as you launch Matlab with `LD_PRELOAD='/usr/lib/x86_64-linux-gnu/libstdc++.so.6' matlab` so that a recent C++ standard library is loaded.

More comparisons

Several people have asked me how GOP compares to SCG/MCG. I also added the new numbers including seed proposals (GOP v1.2). So here is the comparison:

| Method | # prop. | ABO | Covering | 50%-recall | 70%-recall | Time |
| --- | --- | --- | --- | --- | --- | --- |
| CPMC | 646 | 0.703 | 0.850 | 0.784 | 0.609 | 252s |
| Cat-Ind OP | 1536 | 0.718 | 0.840 | 0.820 | 0.624 | 119s |
| Selective Search | 4374 | 0.735 | 0.786 | 0.891 | 0.597 | 2.6s |
| SCG | 2125 | 0.754 | 0.835 | 0.870 | 0.663 | 5s |
| MCG | 5158 | 0.807 | 0.868 | 0.921 | 0.772 | 30s |
| MCG (best 2200 per image) | 2199 | 0.785 | 0.861 | 0.896 | 0.720 | 30s |
| Baseline GOP (130,5) | 653 | 0.712 | 0.812 | 0.833 | 0.622 | 0.6s |
| Baseline GOP (150,7) | 1090 | 0.727 | 0.828 | 0.847 | 0.644 | 0.65s |
| Baseline GOP (200,10) | 2089 | 0.744 | 0.843 | 0.867 | 0.673 | 0.9s |
| Baseline GOP (300,15) | 3958 | 0.756 | 0.849 | 0.881 | 0.699 | 1.2s |
| Learned GOP (140,4) | 652 | 0.720 | 0.815 | 0.844 | 0.632 | 1.0s |
| Learned GOP (160,6) | 1199 | 0.741 | 0.835 | 0.865 | 0.673 | 1.1s |
| Learned GOP (180,9) | 2286 | 0.756 | 0.852 | 0.877 | 0.699 | 1.4s |
| Learned GOP (200,15) | 4186 | 0.766 | 0.858 | 0.889 | 0.715 | 1.7s |
| Baseline GOP (v1.2) (130,5) | 780 | 0.723 | 0.812 | 0.850 | 0.631 | 0.6s |
| Baseline GOP (v1.2) (150,7) | 1237 | 0.741 | 0.828 | 0.870 | 0.657 | 0.65s |
| Baseline GOP (v1.2) (200,10) | 2281 | 0.759 | 0.843 | 0.892 | 0.688 | 0.9s |
| Baseline GOP (v1.2) (300,15) | 4242 | 0.771 | 0.849 | 0.910 | 0.711 | 1.2s |
| Learned GOP (v1.2) (140,4) | 754 | 0.731 | 0.815 | 0.865 | 0.640 | 1.0s |
| Learned GOP (v1.2) (160,6) | 1284 | 0.751 | 0.836 | 0.882 | 0.684 | 1.1s |
| Learned GOP (v1.2) (180,9) | 2319 | 0.767 | 0.851 | 0.891 | 0.710 | 1.4s |
| Learned GOP (v1.2) (200,15) | 4104 | 0.777 | 0.859 | 0.903 | 0.725 | 1.7s |

I also got some requests to compare to Edge Boxes and BING, so here is that comparison (using the 70%-overlap variant of Edge Boxes):

| VUS (2000 windows) | Linear | Log |
| --- | --- | --- |
| BING | 0.278 | 0.189 |
| Objectness | 0.323 | 0.225 |
| Edge Boxes | 0.526 | 0.320 |
| Randomized Prim | 0.511 | 0.274 |
| Selective Search | 0.528 | 0.301 |
| GOP | 0.546 | 0.310 |