Question:

We have modified the CIFAR-10 tutorial (Convolutional Neural Network) to run on the Adience database for gender classification on faces. We read here that "parameter sharing" rests on the assumption that a patch feature that is useful at one location in the image is useful at any other location. Except:

Note that sometimes the parameter sharing assumption may not make sense. This is especially the case when the input images to a ConvNet have some specific centered structure, where we should expect, for example, that completely different features should be learned on one side of the image than another. One practical example is when the input are faces that have been centered in the image.

Objective: Therefore we would like to turn off parameter sharing for our CNN.

Code

I think the CIFAR-10 tutorial uses parameter sharing, and this part of the code in the inference(images) function seems to have something to do with it:

    biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
    bias = tf.nn.bias_add(conv, biases)

Which calls:

    def _variable_on_cpu(name, shape, initializer):
        with tf.device('/cpu:0'):
            var = tf.get_variable(name, shape, initializer=initializer)
        return var

Question

  • Is parameter sharing indeed happening in the CIFAR-10 tutorial?
  • Could you tell us whether we're looking at the right piece of code for turning off parameter sharing or where else to look?
  • Any other help / suggestions are welcome, because we don't know where to start.

Answer:

The CIFAR-10 model from the tutorial uses "parameter sharing" in the first two layers ('conv1' and 'conv2'). The sharing is implied by the use of the tf.nn.conv2d() operator, which effectively extracts patches from the input image and applies the same filter (i.e. a shared parameter) to each patch.
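To make the sharing concrete, here is a minimal pure-Python sketch of what a convolution does for a single channel. It is not the TensorFlow implementation: the helper name conv2d_valid, the toy image, and the 2x2 kernel are all made up for illustration, and it assumes stride 1 with no padding.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over a 2-D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # The same kernel weights are reused for the patch at every (i, j).
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical toy input, chosen only for illustration.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [[1, 0], [0, -1]]  # 4 shared weights, regardless of image size

feature_map = conv2d_valid(image, kernel)
num_parameters = len(kernel) * len(kernel[0])
```

The number of weights depends only on the kernel size, not on the image size, which is exactly what "shared" means here.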

It's not trivial to "turn off" parameter sharing when you have a set of convolutional layers: instead you have to replace them with a different type of layer. The simplest change might be to replace the convolutional layers with a fully connected layer, e.g. by using tf.nn.relu_layer() (as in the 'local3' and 'local4' layers), which internally performs a matrix multiplication and maintains a separate weight for every pair of input and output neurons.
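A rough parameter count shows what that replacement costs. The shapes below are assumptions for illustration (24x24 RGB crops, a 5x5 kernel, and 64 output channels with the output kept at the input's spatial size); only the 64 biases appear in the question's code.

```python
# Assumed shapes (not stated in the text above): 24x24 RGB input crops,
# a 5x5 kernel, and 64 output channels with the spatial size preserved.
height, width, in_channels = 24, 24, 3
kernel_h, kernel_w, out_channels = 5, 5, 64

# Convolution: one kernel per output channel, shared across all positions,
# plus one bias per output channel.
conv_params = kernel_h * kernel_w * in_channels * out_channels + out_channels

# Fully connected layer producing an output of the same size:
# every input value gets its own weight to every output value,
# plus one bias per output value.
num_inputs = height * width * in_channels
num_outputs = height * width * out_channels
fc_params = num_inputs * num_outputs + num_outputs

print(conv_params)
print(fc_params)
```

Under these assumptions the fully connected version needs several orders of magnitude more parameters than the convolution, so in practice the output size would have to be much smaller.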

N.B. Fully connected layers are often over-parameterized for vision tasks, and a more appropriate middle ground would be to use a "local receptive field", which (informally) maintains separate parameters for each input (as in a fully connected layer), but only combines values from "nearby" inputs to produce an output (as in a convolution). Unfortunately, TensorFlow doesn't yet contain an implementation of local receptive fields, but adding support for them would be a useful project.
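As a rough idea of what such a layer computes, here is a minimal pure-Python sketch for a single channel. It is not a TensorFlow API: the function name locally_connected_2d and the toy data are hypothetical, and it assumes stride 1 with no padding. The only difference from a convolution is that each output position looks up its own kernel.

```python
def locally_connected_2d(image, kernels):
    """Unshared 'valid' convolution: kernels[i][j] is used only at output (i, j)."""
    kh, kw = len(kernels[0][0]), len(kernels[0][0][0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    assert len(kernels) == out_h and len(kernels[0]) == out_w
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            kernel = kernels[i][j]  # position-specific weights, not shared
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    # Only "nearby" inputs inside the receptive field contribute.
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical toy input, chosen only for illustration.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
# A 3x3 image with 2x2 receptive fields gives a 2x2 output,
# so there are 2 * 2 separate kernels of 2 * 2 weights each.
kernels = [
    [[[1, 0], [0, 0]], [[0, 1], [0, 0]]],
    [[[0, 0], [1, 0]], [[0, 0], [0, 1]]],
]

output = locally_connected_2d(image, kernels)
num_parameters = sum(len(k) * len(k[0]) for row in kernels for k in row)
```

The parameter count grows with the number of output positions (16 weights here, versus 4 for a shared 2x2 kernel), but it stays far below a fully connected layer because each output still sees only its local patch.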
