Deformable Convolutional Networks (DCN)

Official Microsoft Research code implementing deformable convolution and deformable RoI pooling operators for object detection and segmentation.

Open Source · Image Generation · Source code (GitHub), Python, MXNet, CUDA/C++, Linux, Windows

What is Deformable Convolutional Networks (DCN)?

Deformable Convolutional Networks (DCN) is the official open-source implementation, from Microsoft Research Asia, of the techniques described in the ICCV 2017 paper of the same name. It adds learnable spatial offsets to standard convolution and RoI pooling so that a network can adapt its sampling locations, and therefore its receptive field, to the geometry of objects. The repository provides MXNet-based operators along with reference training and inference pipelines for R-FCN, Faster R-CNN, FPN and DeepLab. It is research code released under the MIT licence rather than a hosted product.

Key Features

Deformable Convolution

Convolution layers with learnable 2D offsets that let the sampling grid adapt to object shape and scale.
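
To make the idea of an adaptive sampling grid concrete, here is a minimal pure-Python sketch of a 3x3 deformable convolution evaluated at a single output location. It is an illustration only, not the repository's MXNet/CUDA operator: the function names `bilinear_sample` and `deformable_conv_at` are hypothetical, the offsets are passed in by hand rather than predicted by a learned layer, and out-of-bounds samples are assumed to contribute zero.

```python
import math

def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D list at fractional (y, x); zero outside the map."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    # Visit the four integer neighbours of (y, x) and weight each by proximity.
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value

def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """Response of a 3x3 deformable convolution centred at (cy, cx).

    kernel:  3x3 list of weights.
    offsets: 3x3 list of (dy, dx) pairs, one per kernel tap (hand-set here;
             in a real network they would come from a learned layer).
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            # Regular grid position of this tap, shifted by its own offset.
            out += kernel[i][j] * bilinear_sample(
                feature, cy + i - 1 + dy, cx + j - 1 + dx
            )
    return out
```

With all offsets set to zero the function reduces to an ordinary 3x3 convolution, which is a convenient sanity check when experimenting with it.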

Deformable RoI/PSRoI Pooling

Adaptive pooling that adjusts bin positions for more accurate region feature extraction in detection.
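
The bin adjustment can be pictured with a short geometric sketch. The code below only computes where each pooling bin would land once shifted; it does not pool any features. It assumes offsets are expressed as fractions of the RoI width and height and scaled by a factor `gamma`. The function name `shifted_bins`, the argument layout, and the default value of 0.1 are assumptions made for illustration, not the repository's API.

```python
def shifted_bins(roi, k, norm_offsets, gamma=0.1):
    """Return k*k bin boxes (x1, y1, x2, y2) for an RoI, each moved by its offset.

    roi:          (x1, y1, x2, y2) of the region of interest.
    norm_offsets: k x k list of (dx, dy) pairs, as fractions of RoI width/height.
    gamma:        illustrative scaling factor applied to the offsets.
    """
    x1, y1, x2, y2 = roi
    w, h = x2 - x1, y2 - y1
    bin_w, bin_h = w / k, h / k
    bins = []
    for i in range(k):          # bin row
        for j in range(k):      # bin column
            dx, dy = norm_offsets[i][j]
            # Convert the normalised offset into a shift in feature-map units.
            sx, sy = gamma * dx * w, gamma * dy * h
            bx, by = x1 + j * bin_w + sx, y1 + i * bin_h + sy
            bins.append((bx, by, bx + bin_w, by + bin_h))
    return bins
```

Passing all-zero offsets yields the regular grid used by standard RoI pooling, so the deformable variant can be seen as a strict generalisation of it.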

DCNv2 operators

Updated deformable operators that add modulation, a learned weight applied to each sampled location, available since December 2018 alongside the original v1 operators.
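
As a rough illustration of what modulation adds, the sketch below scales each sampled feature value by a scalar between 0 and 1 before the usual weighted sum. It assumes the values have already been sampled at their offset positions and that the modulation scalars are obtained by passing raw scores through a sigmoid; the names `modulated_response` and `mask_logits` are hypothetical and do not come from the repository.

```python
import math

def sigmoid(z):
    """Squash a raw score into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def modulated_response(weights, sampled_values, mask_logits):
    """Weighted sum of sampled values, each scaled by its modulation scalar.

    weights:        kernel weight for each sampling location.
    sampled_values: feature values already sampled at the offset locations.
    mask_logits:    raw scores turned into modulation scalars via sigmoid.
    """
    return sum(
        w * x * sigmoid(m)
        for w, x, m in zip(weights, sampled_values, mask_logits)
    )
```

A strongly negative score drives a location's contribution towards zero, which lets the layer ignore a sampling point entirely instead of only moving it.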

Detection framework support

Reference implementations for R-FCN, Faster R-CNN and Feature Pyramid Network (FPN).

Semantic segmentation

DeepLab integration using deformable components for dense prediction tasks.

Pre-trained models

Downloadable models trained on COCO, PASCAL VOC and Cityscapes with a ResNet-v1-101 backbone.

MIT-licensed code

Full training and inference scripts, released by Microsoft under the permissive MIT licence.

Pros & Cons

Advantages

  • Released as the official implementation by the original Microsoft Research authors, so it directly matches the published ICCV 2017 paper.
  • Permissive MIT licence allows free use in both research and commercial projects.
  • Covers several major tasks out of the box, including detection (R-FCN, Faster R-CNN, FPN) and segmentation (DeepLab).
  • Ships with pre-trained models and benchmark results on COCO, VOC and Cityscapes for reproducible comparison.
  • Reports measurable accuracy gains, such as Deformable R-FCN reaching 82.3 mAP@0.5 on the VOC 2007 test set versus 79.6 for standard R-FCN.

Limitations

  • The code targets Python 2.7 and a specific older MXNet commit, which makes setup difficult in modern environments.
  • It is no longer actively maintained, and the README itself points users to the PyTorch operators in the mmdetection codebase.
  • Building the custom CUDA operators from source requires an NVIDIA GPU and a fairly involved compilation step.

Use Cases

Computer vision researchers reproducing or extending the Deformable ConvNets results from the original paper.

Engineers building object detection systems who need adaptive receptive fields for objects with variable scale or shape.

Teams working on semantic segmentation that want to test deformable components inside a DeepLab pipeline.

Students and academics studying how learnable geometric transformations improve convolutional networks.

Developers porting deformable convolution ideas to other frameworks using this reference MXNet implementation as a baseline.