Python – How to convert a dense layer to an equivalent convolutional layer in Keras

conv-neural-networkkerasneural-networkpython

I would like to do something similar to the Fully Convolutional Networks paper (https://people.eecs.berkeley.edu/~jonlong/long_shelhamer_fcn.pdf) using Keras. I have a network which ends up flattening the feature maps and runs them through several dense layers. I would like to load the weights from a network like this into one where the dense layers are replaced with equivalent convolutions.

The VGG16 network which comes with Keras could be used as an example, where the 7x7x512 output of the last MaxPooling2D() is flattened and then goes into a Dense(4096) layer. In this case the Dense(4096) would be replaced with a 7x7x4096 convolution.

My real network is slightly different, there is a GlobalAveragePooling2D() layer instead of MaxPooling2D() and Flatten(). The output of GlobalAveragePooling2D() is a 2D tensor, and there is no need to additionally flatten it, so all the dense layers including the first would be replaced with 1×1 convolutions.

I've seen this question: Python keras how to transform a dense layer into a convolutional layer which seems very similar if not identical. The trouble is I can't get the suggested solution to work, because (a) I'm using TensorFlow as the backend, so the weights rearrangement/filter "rotation" isn't right, and (b) I can't figure out how to load the weights. Loading the old weights file into the new network with model.load_weights(by_name=True) doesn't work, because the names don't match (and even if they did the dimensions differ).

What should the rearrangement be when using TensorFlow?

How do I load the weights? Do I create one of each model, call model.load_weights() on both to load the identical weights and then copy some of the extra weights that need rearrangement?

Best Answer

Based on the answer of hars, I created this function to transform an arbitrary cnn into a fcn:

from keras.models import Sequential
from keras.layers.convolutional import Convolution2D
from keras.engine import InputLayer
import keras

def to_fully_conv(model):

    new_model = Sequential()

    input_layer = InputLayer(input_shape=(None, None, 3), name="input_new")

    new_model.add(input_layer)

    for layer in model.layers:

        if "Flatten" in str(layer):
            flattened_ipt = True
            f_dim = layer.input_shape

        elif "Dense" in str(layer):

            input_shape = layer.input_shape
            output_dim =  layer.get_weights()[1].shape[0]
            W,b = layer.get_weights()

            if flattened_ipt:
                shape = (f_dim[1],f_dim[2],f_dim[3],output_dim)
                new_W = W.reshape(shape)
                new_layer = Convolution2D(output_dim,
                                          (f_dim[1],f_dim[2]),
                                          strides=(1,1),
                                          activation=layer.activation,
                                          padding='valid',
                                          weights=[new_W,b])
                flattened_ipt = False

            else:
                shape = (1,1,input_shape[1],output_dim)
                new_W = W.reshape(shape)
                new_layer = Convolution2D(output_dim,
                                          (1,1),
                                          strides=(1,1),
                                          activation=layer.activation,
                                          padding='valid',
                                          weights=[new_W,b])


        else:
            new_layer = layer

        new_model.add(new_layer)

    return new_model

you can test the function like this:

model = keras.applications.vgg16.VGG16()
new_model = to_fully_conv(model)
Related Topic