lasagne.layers.dnn

This module houses layers that require cuDNN to work. Its layers are not automatically imported into the lasagne.layers namespace: to use them, you need to import lasagne.layers.dnn explicitly.

Note that these layers are not required to use cuDNN: if cuDNN is available, Theano will use it for the default convolution and pooling layers anyway. However, the layers in this module allow you to enforce the use of cuDNN, or to use features that are not available in lasagne.layers.
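
For example, assuming a Theano installation with cuDNN available, the module is imported alongside the main namespace (a minimal sketch; layer names and shapes are illustrative):

>>> import lasagne.layers
>>> import lasagne.layers.dnn  # not imported by the line above
>>> l_in = lasagne.layers.InputLayer((None, 3, 32, 32))
>>> l_conv = lasagne.layers.dnn.Conv2DDNNLayer(l_in, num_filters=16,
...                                            filter_size=3)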
class lasagne.layers.dnn.Pool2DDNNLayer(incoming, pool_size, stride=None, pad=(0, 0), ignore_border=True, mode='max', **kwargs) [source]

2D pooling layer

Performs 2D mean- or max-pooling over the two trailing axes of a 4D input tensor. This is an alternative implementation which uses theano.sandbox.cuda.dnn.dnn_pool directly.

Parameters
incoming : a Layer instance or tuple
    The layer feeding into this layer, or the expected input shape.
pool_size : integer or iterable
    The length of the pooling region in each dimension. If an integer, it is promoted to a square pooling region. If an iterable, it should have two elements.
stride : integer, iterable or None
    The strides between successive pooling regions in each dimension. If None, then stride = pool_size.
pad : integer or iterable
    Number of elements to be added on each side of the input in each dimension. Each value must be less than the corresponding stride.
ignore_border : bool (default: True)
    This implementation never includes partial pooling regions, so this argument must always be set to True. It exists only to make sure the interface is compatible with lasagne.layers.MaxPool2DLayer.
mode : string
    Pooling mode, one of 'max', 'average_inc_pad' or 'average_exc_pad'. Defaults to 'max'.
**kwargs
    Any additional keyword arguments are passed to the Layer superclass.

Notes
The value used to pad the input is chosen to be less than the minimum of the input, so that the output of each pooling region always corresponds to some element in the unpadded input region.

This is a drop-in replacement for lasagne.layers.MaxPool2DLayer. Its interface is the same, except it does not support the ignore_border argument.
get_output_for(input, **kwargs) [source]

Propagates the given input through this layer (and only this layer).

Parameters
input : Theano expression
    The expression to propagate through this layer.

Returns
output : Theano expression
    The output of this layer given the input to this layer.

Notes
This is called by the base lasagne.layers.get_output() to propagate data through a network. This method should be overridden when implementing a new Layer class. By default it raises NotImplementedError.
get_output_shape_for(input_shape) [source]

Computes the output shape of this layer, given an input shape.

Parameters
input_shape : tuple
    A tuple representing the shape of the input. The tuple should have as many elements as there are input dimensions, and the elements should be integers or None.

Returns
tuple
    A tuple representing the shape of the output of this layer. The tuple has as many elements as there are output dimensions, and the elements are all either integers or None.

Notes
This method will typically be overridden when implementing a new Layer class. By default it simply returns the input shape. This means that a layer that does not modify the shape (e.g. because it applies an elementwise operation) does not need to override this method.
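
As a usage sketch (assuming a cuDNN-enabled Theano setup; shapes are illustrative), pooling with pool_size=2 and the default stride halves the two trailing axes:

>>> from lasagne.layers import InputLayer, get_output_shape
>>> from lasagne.layers.dnn import Pool2DDNNLayer
>>> l_in = InputLayer((None, 16, 32, 32))
>>> l_pool = Pool2DDNNLayer(l_in, pool_size=2)
>>> get_output_shape(l_pool)
(None, 16, 16, 16)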
class lasagne.layers.dnn.MaxPool2DDNNLayer(incoming, pool_size, stride=None, pad=(0, 0), ignore_border=True, **kwargs) [source]

2D max-pooling layer

Subclass of Pool2DDNNLayer fixing mode='max', provided for compatibility with other MaxPool2DLayer classes.
class lasagne.layers.dnn.Pool3DDNNLayer(incoming, pool_size, stride=None, pad=(0, 0, 0), ignore_border=True, mode='max', **kwargs) [source]

3D pooling layer

Performs 3D mean- or max-pooling over the three trailing axes of a 5D input tensor. This is an alternative implementation which uses theano.sandbox.cuda.dnn.dnn_pool directly.

Parameters
incoming : a Layer instance or tuple
    The layer feeding into this layer, or the expected input shape.
pool_size : integer or iterable
    The length of the pooling region in each dimension. If an integer, it is promoted to a cubic pooling region. If an iterable, it should have three elements.
stride : integer, iterable or None
    The strides between successive pooling regions in each dimension. If None, then stride = pool_size.
pad : integer or iterable
    Number of elements to be added on each side of the input in each dimension. Each value must be less than the corresponding stride.
ignore_border : bool (default: True)
    This implementation never includes partial pooling regions, so this argument must always be set to True. It exists only to make sure the interface is compatible with lasagne.layers.MaxPool2DLayer.
mode : string
    Pooling mode, one of 'max', 'average_inc_pad' or 'average_exc_pad'. Defaults to 'max'.
**kwargs
    Any additional keyword arguments are passed to the Layer superclass.

Notes
The value used to pad the input is chosen to be less than the minimum of the input, so that the output of each pooling region always corresponds to some element in the unpadded input region.
get_output_for(input, **kwargs) [source]

Propagates the given input through this layer (and only this layer).

Parameters
input : Theano expression
    The expression to propagate through this layer.

Returns
output : Theano expression
    The output of this layer given the input to this layer.

Notes
This is called by the base lasagne.layers.get_output() to propagate data through a network. This method should be overridden when implementing a new Layer class. By default it raises NotImplementedError.
get_output_shape_for(input_shape) [source]

Computes the output shape of this layer, given an input shape.

Parameters
input_shape : tuple
    A tuple representing the shape of the input. The tuple should have as many elements as there are input dimensions, and the elements should be integers or None.

Returns
tuple
    A tuple representing the shape of the output of this layer. The tuple has as many elements as there are output dimensions, and the elements are all either integers or None.

Notes
This method will typically be overridden when implementing a new Layer class. By default it simply returns the input shape. This means that a layer that does not modify the shape (e.g. because it applies an elementwise operation) does not need to override this method.
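
The 3D case is analogous (again a sketch with illustrative shapes): pooling a 5D tensor with pool_size=2 halves the three trailing axes:

>>> from lasagne.layers import InputLayer, get_output_shape
>>> from lasagne.layers.dnn import Pool3DDNNLayer
>>> l_in = InputLayer((None, 8, 16, 32, 32))
>>> l_pool = Pool3DDNNLayer(l_in, pool_size=2)
>>> get_output_shape(l_pool)
(None, 8, 8, 16, 16)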
class lasagne.layers.dnn.MaxPool3DDNNLayer(incoming, pool_size, stride=None, pad=(0, 0, 0), ignore_border=True, **kwargs) [source]

3D max-pooling layer

Subclass of Pool3DDNNLayer fixing mode='max', provided for consistency with the MaxPool2DLayer classes.
class lasagne.layers.dnn.Conv2DDNNLayer(incoming, num_filters, filter_size, stride=(1, 1), pad=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=False, **kwargs) [source]

2D convolutional layer

Performs a 2D convolution on its input and optionally adds a bias and applies an elementwise nonlinearity. This is an alternative implementation which uses theano.sandbox.cuda.dnn.dnn_conv directly.

Parameters
incoming : a Layer instance or a tuple
    The layer feeding into this layer, or the expected input shape. The output of this layer should be a 4D tensor, with shape (batch_size, num_input_channels, input_rows, input_columns).
num_filters : int
    The number of learnable convolutional filters this layer has.
filter_size : int or iterable of int
    An integer or a 2-element tuple specifying the size of the filters.
stride : int or iterable of int
    An integer or a 2-element tuple specifying the stride of the convolution operation.
pad : int, iterable of int, 'full', 'same' or 'valid' (default: 0)
    By default, the convolution is only computed where the input and the filter fully overlap (a valid convolution). When stride=1, this yields an output that is smaller than the input by filter_size - 1. The pad argument allows you to implicitly pad the input with zeros, extending the output size.

    A single integer results in symmetric zero-padding of the given size on all borders; a tuple of two integers allows different symmetric padding per dimension.

    'full' pads with one less than the filter size on both sides. This is equivalent to computing the convolution wherever the input and the filter overlap by at least one position.

    'same' pads with half the filter size (rounded down) on both sides. When stride=1, this results in an output size equal to the input size. Even filter size is not supported.

    'valid' is an alias for 0 (no padding / a valid convolution).

    Note that 'full' and 'same' can be faster than equivalent integer values due to optimizations by Theano.
untie_biases : bool (default: False)
    If False, the layer will have a bias parameter for each channel, which is shared across all positions in this channel. As a result, the b attribute will be a vector (1D).

    If True, the layer will have separate bias parameters for each position in each channel. As a result, the b attribute will be a 3D tensor.
W : Theano shared variable, expression, numpy array or callable
    Initial value, expression or initializer for the weights. These should be a 4D tensor with shape (num_filters, num_input_channels, filter_rows, filter_columns). See lasagne.utils.create_param() for more information.
b : Theano shared variable, expression, numpy array, callable or None
    Initial value, expression or initializer for the biases. If set to None, the layer will have no biases. Otherwise, biases should be a 1D array with shape (num_filters,) if untie_biases is set to False. If it is set to True, its shape should be (num_filters, output_rows, output_columns) instead. See lasagne.utils.create_param() for more information.
nonlinearity : callable or None
    The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear.
flip_filters : bool (default: False)
    Whether to flip the filters and perform a convolution, or not to flip them and perform a correlation. Flipping adds a bit of overhead, so it is disabled by default. In most cases this does not make a difference anyway because the filters are learned. However, flip_filters should be set to True if weights are loaded into it that were learned using a regular lasagne.layers.Conv2DLayer, for example.
num_groups : int (default: 1)
    The number of groups to split the input channels and output channels into, such that data does not cross the group boundaries. Requires the number of channels to be divisible by the number of groups, and requires Theano 0.10 or later for more than one group.
**kwargs
    Any additional keyword arguments are passed to the Layer superclass.

Attributes
W : Theano shared variable or expression
    Variable or expression representing the filter weights.
b : Theano shared variable or expression
    Variable or expression representing the biases.
convolve(input, **kwargs) [source]

Symbolically convolves input with self.W, producing an output of shape self.output_shape. To be implemented by subclasses.

Parameters
input : Theano tensor
    The input minibatch to convolve.
**kwargs
    Any additional keyword arguments from get_output_for().

Returns
Theano tensor
    input convolved according to the configuration of this layer, without any bias or nonlinearity applied.
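
A usage sketch (assuming a cuDNN-enabled Theano setup; shapes are illustrative): with pad='same' and stride=1 the spatial dimensions are preserved, and flip_filters=True keeps the layer compatible with weights trained in a regular lasagne.layers.Conv2DLayer:

>>> from lasagne.layers import InputLayer, get_output_shape
>>> from lasagne.layers.dnn import Conv2DDNNLayer
>>> l_in = InputLayer((None, 3, 64, 64))
>>> l_conv = Conv2DDNNLayer(l_in, num_filters=32, filter_size=3,
...                         pad='same', flip_filters=True)
>>> get_output_shape(l_conv)
(None, 32, 64, 64)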
class lasagne.layers.dnn.Conv3DDNNLayer(incoming, num_filters, filter_size, stride=(1, 1, 1), pad=0, untie_biases=False, W=lasagne.init.GlorotUniform(), b=lasagne.init.Constant(0.), nonlinearity=lasagne.nonlinearities.rectify, flip_filters=False, **kwargs) [source]

3D convolutional layer

Performs a 3D convolution on its input and optionally adds a bias and applies an elementwise nonlinearity. This implementation uses theano.sandbox.cuda.dnn.dnn_conv3d directly.

Parameters
incoming : a Layer instance or a tuple
    The layer feeding into this layer, or the expected input shape. The output of this layer should be a 5D tensor, with shape (batch_size, num_input_channels, input_depth, input_rows, input_columns).
num_filters : int
    The number of learnable convolutional filters this layer has.
filter_size : int or iterable of int
    An integer or a 3-element tuple specifying the size of the filters.
stride : int or iterable of int
    An integer or a 3-element tuple specifying the stride of the convolution operation.
pad : int, iterable of int, 'full', 'same' or 'valid' (default: 0)
    By default, the convolution is only computed where the input and the filter fully overlap (a valid convolution). When stride=1, this yields an output that is smaller than the input by filter_size - 1. The pad argument allows you to implicitly pad the input with zeros, extending the output size.

    A single integer results in symmetric zero-padding of the given size on all borders; a tuple of three integers allows different symmetric padding per dimension.

    'full' pads with one less than the filter size on both sides. This is equivalent to computing the convolution wherever the input and the filter overlap by at least one position.

    'same' pads with half the filter size (rounded down) on both sides. When stride=1, this results in an output size equal to the input size. Even filter size is not supported.

    'valid' is an alias for 0 (no padding / a valid convolution).

    Note that 'full' and 'same' can be faster than equivalent integer values due to optimizations by Theano.
untie_biases : bool (default: False)
    If False, the layer will have a bias parameter for each channel, which is shared across all positions in this channel. As a result, the b attribute will be a vector (1D).

    If True, the layer will have separate bias parameters for each position in each channel. As a result, the b attribute will be a 4D tensor.
W : Theano shared variable, expression, numpy array or callable
    Initial value, expression or initializer for the weights. These should be a 5D tensor with shape (num_filters, num_input_channels, filter_depth, filter_rows, filter_columns). See lasagne.utils.create_param() for more information.
b : Theano shared variable, expression, numpy array, callable or None
    Initial value, expression or initializer for the biases. If set to None, the layer will have no biases. Otherwise, biases should be a 1D array with shape (num_filters,) if untie_biases is set to False. If it is set to True, its shape should be (num_filters, output_depth, output_rows, output_columns) instead. See lasagne.utils.create_param() for more information.
nonlinearity : callable or None
    The nonlinearity that is applied to the layer activations. If None is provided, the layer will be linear.
flip_filters : bool (default: False)
    Whether to flip the filters and perform a convolution, or not to flip them and perform a correlation. Flipping adds a bit of overhead, so it is disabled by default. In most cases this does not make a difference anyway because the filters are learned, but if you want to compute predictions with pre-trained weights, take care whether they need flipping.
num_groups : int (default: 1)
    The number of groups to split the input channels and output channels into, such that data does not cross the group boundaries. Requires the number of channels to be divisible by the number of groups, and requires Theano 0.10 or later for more than one group.
**kwargs
    Any additional keyword arguments are passed to the Layer superclass.

Attributes
W : Theano shared variable or expression
    Variable or expression representing the filter weights.
b : Theano shared variable or expression
    Variable or expression representing the biases.
convolve(input, **kwargs) [source]

Symbolically convolves input with self.W, producing an output of shape self.output_shape. To be implemented by subclasses.

Parameters
input : Theano tensor
    The input minibatch to convolve.
**kwargs
    Any additional keyword arguments from get_output_for().

Returns
Theano tensor
    input convolved according to the configuration of this layer, without any bias or nonlinearity applied.
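
A sketch of a grouped 3D convolution (illustrative shapes; num_groups > 1 requires Theano 0.10 or later and a cuDNN setup): with num_groups=2, the 8 input channels are split into two groups of 4, and each half of the filters only sees its own group:

>>> from lasagne.layers import InputLayer, get_output_shape
>>> from lasagne.layers.dnn import Conv3DDNNLayer
>>> l_in = InputLayer((None, 8, 16, 32, 32))
>>> l_conv = Conv3DDNNLayer(l_in, num_filters=16, filter_size=3,
...                         pad='same', num_groups=2)
>>> get_output_shape(l_conv)
(None, 16, 16, 32, 32)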
class lasagne.layers.dnn.SpatialPyramidPoolingDNNLayer(incoming, pool_dims=[4, 2, 1], mode='max', **kwargs) [source]

Spatial Pyramid Pooling Layer

Performs spatial pyramid pooling (SPP) over the input. It will turn a 2D input of arbitrary size into an output of fixed dimension. Hence, the convolutional part of a DNN can be connected to a dense part with a fixed number of nodes even if the dimensions of the input image are unknown.

The pooling is performed over \(l\) pooling levels. Each pooling level \(i\) creates \(M_i\) output features, where \(M_i = n_i \times n_i\) and \(n_i\) is the number of pooling operations per dimension in level \(i\). The list of the \(n_i\) values is passed to the layer as the pool_dims parameter; its length is the number of levels of the spatial pyramid.
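
For instance, with the default pool_dims=[4, 2, 1], each input channel yields 4·4 + 2·2 + 1·1 = 21 pooled values, regardless of the spatial size of the input. A sketch (assuming a cuDNN setup; whether the spatial input dimensions may be left as None depends on the rest of the network):

>>> from lasagne.layers import InputLayer
>>> from lasagne.layers.dnn import SpatialPyramidPoolingDNNLayer
>>> l_in = InputLayer((None, 64, None, None))  # arbitrary spatial size
>>> l_spp = SpatialPyramidPoolingDNNLayer(l_in, pool_dims=[4, 2, 1])
>>> sum(n * n for n in [4, 2, 1])  # fixed number of pooled values per channel
21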
Parameters
incoming : a Layer instance or tuple
    The layer feeding into this layer, or the expected input shape.
pool_dims : list of integers
    The list of \(n_i\) values that define the output dimension of each pooling level \(i\). The length of pool_dims is the number of levels of the spatial pyramid.
mode : string
    Pooling mode, one of 'max', 'average_inc_pad' or 'average_exc_pad'. Defaults to 'max'.
**kwargs
    Any additional keyword arguments are passed to the Layer superclass.

Notes
This layer should be inserted between the convolutional part of a DNN and its dense part. Convolutions can be used for arbitrary input dimensions, but the size of their output will depend on their input dimensions. Connecting the output of the convolutional part to the dense part then usually demands fixing the dimensions of the network's InputLayer. The spatial pyramid pooling layer, however, allows us to leave the network input dimensions arbitrary. The advantage over a global pooling layer is the added robustness against object deformations due to the pooling on different scales.

References
[R39] He, Kaiming et al. (2015): Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition. http://arxiv.org/pdf/1406.4729.pdf
get_output_for(input, **kwargs) [source]

Propagates the given input through this layer (and only this layer).

Parameters
input : Theano expression
    The expression to propagate through this layer.

Returns
output : Theano expression
    The output of this layer given the input to this layer.

Notes
This is called by the base lasagne.layers.get_output() to propagate data through a network. This method should be overridden when implementing a new Layer class. By default it raises NotImplementedError.
get_output_shape_for(input_shape) [source]

Computes the output shape of this layer, given an input shape.

Parameters
input_shape : tuple
    A tuple representing the shape of the input. The tuple should have as many elements as there are input dimensions, and the elements should be integers or None.

Returns
tuple
    A tuple representing the shape of the output of this layer. The tuple has as many elements as there are output dimensions, and the elements are all either integers or None.

Notes
This method will typically be overridden when implementing a new Layer class. By default it simply returns the input shape. This means that a layer that does not modify the shape (e.g. because it applies an elementwise operation) does not need to override this method.
class lasagne.layers.dnn.BatchNormDNNLayer(incoming, axes='auto', epsilon=1e-4, alpha=0.1, beta=lasagne.init.Constant(0), gamma=lasagne.init.Constant(1), mean=lasagne.init.Constant(0), inv_std=lasagne.init.Constant(1), **kwargs) [source]

Batch Normalization

This layer implements batch normalization of its inputs:

\[y = \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} \gamma + \beta\]

This is a drop-in replacement for lasagne.layers.BatchNormLayer that uses cuDNN for improved performance and reduced memory usage.

Parameters
incoming : a Layer instance or a tuple
    The layer feeding into this layer, or the expected input shape.
axes : 'auto', int or tuple of int
    The axis or axes to normalize over. If 'auto' (the default), normalize over all axes except for the second: this will normalize over the minibatch dimension for dense layers, and additionally over all spatial dimensions for convolutional layers. Only supports 'auto' and the equivalent axes list, or 0 and (0,) to normalize over the minibatch dimension only.
epsilon : scalar
    Small constant \(\epsilon\) added to the variance before taking the square root and dividing by it, to avoid numerical problems. Must not be smaller than 1e-5.
alpha : scalar
    Coefficient for the exponential moving average of batch-wise means and standard deviations computed during training; the closer to one, the more it will depend on the last batches seen.
beta : Theano shared variable, expression, numpy array, callable or None
    Initial value, expression or initializer for \(\beta\). Must match the incoming shape, skipping all axes in axes. Set to None to fix it to 0.0 instead of learning it. See lasagne.utils.create_param() for more information.
gamma : Theano shared variable, expression, numpy array, callable or None
    Initial value, expression or initializer for \(\gamma\). Must match the incoming shape, skipping all axes in axes. Set to None to fix it to 1.0 instead of learning it. See lasagne.utils.create_param() for more information.
mean : Theano shared variable, expression, numpy array, or callable
    Initial value, expression or initializer for \(\mu\). Must match the incoming shape, skipping all axes in axes. See lasagne.utils.create_param() for more information.
inv_std : Theano shared variable, expression, numpy array, or callable
    Initial value, expression or initializer for \(1 / \sqrt{\sigma^2 + \epsilon}\). Must match the incoming shape, skipping all axes in axes. See lasagne.utils.create_param() for more information.
**kwargs
    Any additional keyword arguments are passed to the Layer superclass.

See also
batch_norm_dnn : Convenience function to apply batch normalization to a layer.

Notes
This layer should be inserted between a linear transformation (such as a DenseLayer or Conv2DLayer) and its nonlinearity. The convenience function batch_norm_dnn() modifies an existing layer to insert cuDNN batch normalization in front of its nonlinearity.

For further information, see lasagne.layers.BatchNormLayer. This implementation is fully compatible, except for restrictions on the axes and epsilon arguments.
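
A sketch of inserting the layer manually (assuming a cuDNN setup): the preceding layer is created without its bias and nonlinearity, which are taken over by the normalization and a subsequent NonlinearityLayer:

>>> from lasagne.layers import InputLayer, DenseLayer, NonlinearityLayer
>>> from lasagne.nonlinearities import identity, rectify
>>> from lasagne.layers.dnn import BatchNormDNNLayer
>>> l_in = InputLayer((None, 100))
>>> l_lin = DenseLayer(l_in, num_units=50, b=None, nonlinearity=identity)
>>> l_bn = BatchNormDNNLayer(l_lin)
>>> l_out = NonlinearityLayer(l_bn, nonlinearity=rectify)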
get_output_for(input, deterministic=False, batch_norm_use_averages=None, batch_norm_update_averages=None, **kwargs) [source]

Propagates the given input through this layer (and only this layer).

Parameters
input : Theano expression
    The expression to propagate through this layer.

Returns
output : Theano expression
    The output of this layer given the input to this layer.

Notes
This is called by the base lasagne.layers.get_output() to propagate data through a network. This method should be overridden when implementing a new Layer class. By default it raises NotImplementedError.
lasagne.layers.dnn.batch_norm_dnn(layer, **kwargs) [source]

Apply cuDNN batch normalization to an existing layer. This is a drop-in replacement for lasagne.layers.batch_norm(); see there for further information.

Parameters
layer : A Layer instance
    The layer to apply the normalization to; note that it will be modified as specified in lasagne.layers.batch_norm().
**kwargs
    Any additional keyword arguments are passed on to the BatchNormDNNLayer constructor.

Returns
BatchNormDNNLayer or NonlinearityLayer instance
    A batch normalization layer stacked on the given modified layer, or a nonlinearity layer stacked on top of both if layer was nonlinear.
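
The convenience function achieves the same as the manual construction above in a single call (a sketch, assuming a cuDNN setup):

>>> from lasagne.layers import InputLayer, DenseLayer
>>> from lasagne.layers.dnn import batch_norm_dnn
>>> l_in = InputLayer((None, 100))
>>> l_out = batch_norm_dnn(DenseLayer(l_in, num_units=50))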