I'm trying to use the functional API to have a shared layer where only one of the paths is trainable:
a_in = Input(x_shape)
b_in = Input(x_shape)
a_out = my_model(a_in) # I want these weights to be trainable
b_out = my_model(b_in) # I want these weights to be non-trainable (no gradient update)
y_out = my_merge(a_out, b_out)
full_model = Model(inputs=[a_in, b_in], outputs=[y_out])
full_model.compile(...)
I can't figure out how to do this, though. Setting the trainable flag on my_model affects both paths, since there is only one set of weights. I can compile two different models with different trainable flags, but then I don't see how to combine two pre-compiled models to optimize my single merged cost function.
Is this even possible to do with Keras? And if not, is it possible in TensorFlow?
Yes, this is possible with the Keras functional API. The trainable flag belongs to the layer (or sub-model) itself, so setting it affects every call of that layer; you can't make one call trainable and the other not through that flag alone. What you can do instead is keep the weights shared and block the gradient on one path by wrapping K.stop_gradient in a Lambda layer on the output you want frozen. The forward pass of that branch still uses the shared weights, but no gradient flows back into my_model through it.
Here is an example:
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model

a_in = Input(x_shape)
b_in = Input(x_shape)

# Trainable path: gradients flow back into my_model's weights as usual
a_out = my_model(a_in)

# Frozen path: the same shared weights are used for the forward pass, but
# K.stop_gradient blocks any gradient from flowing back through this branch
b_out = Lambda(lambda x: K.stop_gradient(x))(my_model(b_in))

# Merge the outputs of the two paths
y_out = my_merge(a_out, b_out)

# Create and compile the full model
full_model = Model(inputs=[a_in, b_in], outputs=[y_out])
full_model.compile(...)
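If you want to convince yourself that the frozen branch really contributes no gradient, here is a small check with tf.GradientTape; it assumes TensorFlow 2 and uses a Dense layer as a hypothetical stand-in for my_model:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense

my_model = Dense(3)  # hypothetical stand-in for the shared sub-model
x = tf.constant(np.random.rand(8, 4), dtype=tf.float32)

with tf.GradientTape() as tape:
    a_out = my_model(x)                    # trainable path
    b_out = tf.stop_gradient(my_model(x))  # frozen path
    loss = tf.reduce_sum(a_out + b_out)

# The gradients below come only from the a path; if a_out were also wrapped
# in tf.stop_gradient, tape.gradient would return None for every weight.
grads = tape.gradient(loss, my_model.trainable_weights)
print([g.shape for g in grads])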
Note that setting trainable=False on the Lambda layer itself would not help: a Lambda layer has no weights of its own, so the flag has nothing to freeze, and gradients would still flow through it into my_model. It is the stop_gradient call that does the work. If you prefer, you can pass it to the Lambda directly:
b_out = Lambda(K.stop_gradient)(my_model(b_in))
It's also worth noting that the same trick works in TensorFlow 2 with tf.keras, where you can use tf.stop_gradient in place of K.stop_gradient.
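For completeness, here is a self-contained sketch of the whole wiring in tf.keras (TensorFlow 2 assumed); Dense stands in for my_model and Add for my_merge, both just placeholders for your real sub-models:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda, Add
from tensorflow.keras.models import Model

x_shape = (4,)        # hypothetical input shape
my_model = Dense(3)   # stand-in for the shared sub-model

a_in = Input(x_shape)
b_in = Input(x_shape)

a_out = my_model(a_in)                                         # trainable path
b_out = Lambda(lambda t: tf.stop_gradient(t))(my_model(b_in))  # frozen path
y_out = Add()([a_out, b_out])                                  # stand-in for my_merge

full_model = Model(inputs=[a_in, b_in], outputs=[y_out])
full_model.compile(optimizer="sgd", loss="mse")

x = np.random.rand(16, 4).astype("float32")
y = np.random.rand(16, 3).astype("float32")
full_model.fit([x, x], y, epochs=1, verbose=0)  # only the a path drives weight updates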