Multi-task learning in Keras

deep-learning, keras, keras-layer, machine-learning, neural-network

I am trying to implement shared layers in Keras. I see that Keras has keras.layers.concatenate, but the documentation leaves me unsure about its use. Can I use it to create multiple shared layers? What would be the best way to implement a simple shared neural network like the one shown below using Keras?

[Figure: shared neural network]


Edit 1:
Note that the shapes of the input, output, and shared layers are the same for all three NNs. There are multiple shared layers (and multiple non-shared layers) in the three NNs. The coloured layers are unique to each NN and have the same shape.

Basically, the figure represents 3 identical NNs with multiple shared hidden layers, followed by multiple non-shared hidden layers.

I am unsure how to share multiple layers, since the Twitter example in the API documentation shares just one layer.
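For reference, the single-shared-layer pattern from the docs boils down to reusing one layer instance on several inputs. A minimal sketch with made-up shapes (Dense instead of the docs' LSTM):

```python
from keras.layers import Input, Dense
from keras.models import Model

# One layer instance = one set of weights; calling it on two inputs shares the weights.
shared_dense = Dense(16, activation='relu')

input_a = Input(shape=(32,))
input_b = Input(shape=(32,))
out_a = Dense(1, activation='sigmoid')(shared_dense(input_a))
out_b = Dense(1, activation='sigmoid')(shared_dense(input_b))

model = Model(inputs=[input_a, input_b], outputs=[out_a, out_b])
# shared_dense appears only once in model.layers even though it is used twice
```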


Edit 2:
Based on geompalik's helpful comments, this is what I initially came up with:

from keras.layers import Input, LSTM, Dense, Dropout
from keras.models import Model

# Shared layer instances: created once, outside the function, so every model reuses them
sharedLSTM1 = LSTM(data.shape[1], return_sequences=True)
sharedLSTM2 = LSTM(data.shape[1])

def createModel(dropoutRate=0.0, numNeurons=40):
    # timesteps and data come from the dataset (not shown)
    inputLayer = Input(shape=(timesteps, data.shape[1]))
    sharedLSTM1Instance = sharedLSTM1(inputLayer)
    sharedLSTM2Instance = sharedLSTM2(sharedLSTM1Instance)
    # Layers below are created inside the function, so they are NOT shared
    dropoutLayer = Dropout(dropoutRate)(sharedLSTM2Instance)
    denseLayer1 = Dense(numNeurons)(dropoutLayer)
    denseLayer2 = Dense(numNeurons)(denseLayer1)
    outputLayer = Dense(1, activation='sigmoid')(denseLayer2)
    return (inputLayer, outputLayer)

inputLayer1, outputLayer1 = createModel()
inputLayer2, outputLayer2 = createModel()
model = Model(inputs=[inputLayer1, inputLayer2], outputs=[outputLayer1, outputLayer2])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

In the above code, I expect that the LSTM layers in the two models are shared, whereas the dropout and two dense layers are not shared. Is that correct?

If so, I do not need keras.layers.concatenate in this example, right?
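One way to check which layers are actually shared is to count layer instances in model.layers: a shared layer instance appears once no matter how many branches use it, while per-branch layers appear once per createModel call. A self-contained version of the code above with toy dimensions standing in for timesteps and data.shape[1]:

```python
from keras.layers import Input, LSTM, Dense, Dropout
from keras.models import Model

timesteps, features = 5, 3  # toy stand-ins for the real dataset dimensions

sharedLSTM1 = LSTM(features, return_sequences=True)
sharedLSTM2 = LSTM(features)

def createModel(dropoutRate=0.0, numNeurons=8):
    inputLayer = Input(shape=(timesteps, features))
    x = sharedLSTM2(sharedLSTM1(inputLayer))
    x = Dropout(dropoutRate)(x)
    x = Dense(numNeurons)(Dense(numNeurons)(x))
    return inputLayer, Dense(1, activation='sigmoid')(x)

in1, out1 = createModel()
in2, out2 = createModel()
model = Model(inputs=[in1, in2], outputs=[out1, out2])

# The two shared LSTM instances appear once each; the Dense layers were
# created inside createModel, so two calls produce 2 * 3 = 6 Dense layers.
n_lstm = sum(1 for l in model.layers if isinstance(l, LSTM))
n_dense = sum(1 for l in model.layers if isinstance(l, Dense))
```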

I get the following image if I try to visualise the network using plot_model (which is what I was expecting):

[Figure: model plot]

Best Answer

Implementing the shown architecture is quite straightforward with the functional API of Keras. Check the functional API documentation for more information on that.

In your case you have the input layer and the first hidden layer shared, and then one layer for each of the three subjects. How you design your model now depends on what your data look like: for instance, if for a given input you have a different output for each subject, you should define a model like:

model = Model(inputs=[your_main_input], outputs=[subject1_output, subject2_output, subject3_output])
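A minimal self-contained sketch of that single-input, three-output layout (the layer sizes and names here are placeholders, not taken from your data):

```python
from keras.layers import Input, Dense
from keras.models import Model

main_input = Input(shape=(8,))                            # placeholder input width
shared_hidden = Dense(16, activation='relu')(main_input)  # shared trunk

# One output head per subject, all branching off the shared hidden layer
subject1_output = Dense(1, activation='sigmoid', name='subject1')(shared_hidden)
subject2_output = Dense(1, activation='sigmoid', name='subject2')(shared_hidden)
subject3_output = Dense(1, activation='sigmoid', name='subject3')(shared_hidden)

model = Model(inputs=[main_input],
              outputs=[subject1_output, subject2_output, subject3_output])
model.compile(loss='binary_crossentropy', optimizer='adam')
```

With this layout a single fit call trains all three heads at once, and the shared trunk receives gradients from all three losses.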

If that is not the case, and you have separate training data for each of the subjects, you can define three NNs and have the first two layers shared across them. Check under "Shared layers" in the above-cited documentation.
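That second setup can be sketched as follows (again with placeholder sizes): the two shared layers are instantiated once and reused, while each subject's model gets its own non-shared tail:

```python
from keras.layers import Input, Dense
from keras.models import Model

# Shared first two layers: instantiated once, reused by every subject's model
shared1 = Dense(16, activation='relu')
shared2 = Dense(16, activation='relu')

subject_models = []
for i in range(3):
    inp = Input(shape=(8,))
    x = shared2(shared1(inp))                      # shared weights
    x = Dense(8, activation='relu')(x)             # subject-specific layer
    out = Dense(1, activation='sigmoid')(x)
    m = Model(inputs=inp, outputs=out)
    m.compile(loss='binary_crossentropy', optimizer='adam')
    subject_models.append(m)
```

Each model can then be trained on its own subject's data, and updates to shared1/shared2 from any model are seen by all three.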
