[Now Reading] Maxout Networks

Title: Maxout Networks
Authors: Ian J. Goodfellow, David Warde-Farley, Mehdi Mirza, Aaron Courville, Yoshua Bengio
Link: https://arxiv.org/abs/1302.4389

Quick summary:
Maxout is an activation function that outputs the maximum value over a group of neurons. In one sense, one could think of dropout as being similar, since dropout discards some neurons and passes the others forward, whereas maxout only passes forward the maximum value within a group. In essence, maxout is like max pooling, since it reduces the dimensionality by keeping only the maximum values.
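
In the paper's notation, a maxout hidden unit computes

    h_i(x) = \max_{j \in [1, k]} z_{ij}, \qquad z_{ij} = x^{T} W_{\cdot ij} + b_{ij}

where W \in \mathbb{R}^{d \times m \times k} and b \in \mathbb{R}^{m \times k} are learned parameters and k is the number of linear pieces (neurons) each unit takes the maximum over.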

It is well explained in the following post: http://www.simon-hohberg.de/blog/2015-07-19-maxout
Goodfellow’s PhD defence (the part about maxout): https://www.youtube.com/watch?v=ckoD_bE8Bhs&t=28m

Nowadays it is also implemented as tf.contrib.layers.maxout, but here is a very simple implementation:

import tensorflow as tf

def maxout(inputs, num_units, axis=None):
    shape = inputs.get_shape().as_list()
    if axis is None:
        # Assume that the channel dimension is the last one
        axis = -1
    num_channels = shape[axis]
    if num_channels % num_units:
        raise ValueError('number of features ({}) is not a multiple of num_units ({})'
                         .format(num_channels, num_units))
    # Unknown dimensions (e.g. the batch size) become -1 so tf.reshape accepts them
    shape = [-1 if s is None else s for s in shape]
    # Split the channels into num_units groups and keep only the maximum of each group
    shape[axis] = num_units
    shape += [num_channels // num_units]
    outputs = tf.reduce_max(tf.reshape(inputs, shape), -1, keep_dims=False)
    return outputs
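
A minimal usage sketch (assuming TensorFlow 1.x, where these APIs exist): a dense layer producing 12 linear features, grouped into 4 maxout units of 3 pieces each.

x = tf.placeholder(tf.float32, [None, 8])
h = tf.layers.dense(x, 12)          # 12 linear features
y = maxout(h, num_units=4)          # shape [None, 4]: max over groups of 3 features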

Juan Miguel Valverde

"The only way to proof that you understand something is by programming it"
