lasagne.init

Functions to create initializers for parameter variables.

Examples

>>> from lasagne.layers import DenseLayer
>>> from lasagne.init import Constant, GlorotUniform
>>> l1 = DenseLayer((100,20), num_units=50,
...                 W=GlorotUniform('relu'), b=Constant(0.0))

Initializers

Constant([val])

Initialize weights with a constant value.

Normal([std, mean])

Sample initial weights from the Gaussian distribution.

Uniform([range, std, mean])

Sample initial weights from the uniform distribution.

Glorot(initializer[, gain, c01b])

Glorot weight initialization.

GlorotNormal([gain, c01b])

Glorot with weights sampled from the Normal distribution.

GlorotUniform([gain, c01b])

Glorot with weights sampled from the Uniform distribution.

He(initializer[, gain, c01b])

He weight initialization.

HeNormal([gain, c01b])

He initializer with weights sampled from the Normal distribution.

HeUniform([gain, c01b])

He initializer with weights sampled from the Uniform distribution.

Orthogonal([gain])

Initialize weights as an orthogonal matrix.

Sparse([sparsity, std])

Initialize weights as a sparse matrix.

Detailed description

class lasagne.init.Initializer

Base class for parameter tensor initializers.

The Initializer class represents a weight initializer used to initialize weight parameters in a neural network layer. It should be subclassed when implementing new types of weight initializers.

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.

class lasagne.init.Constant(val=0.0)

Initialize weights with a constant value.

Parameters

val : float

Constant value for weights.

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.

class lasagne.init.Normal(std=0.01, mean=0.0)

Sample initial weights from the Gaussian distribution.

Initial weight parameters are sampled from N(mean, std).

Parameters

std : float

Standard deviation of the initial parameters.

mean : float

Mean of initial parameters.

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.

class lasagne.init.Uniform(range=0.01, std=None, mean=0.0)

Sample initial weights from the uniform distribution.

Parameters are sampled from U(a, b).

Parameters

range : float or tuple

When std is None, range determines a and b. If range is a float, the weights are sampled from U(-range, range). If range is a tuple, the weights are sampled from U(range[0], range[1]).

std : float or None

If std is a float then the weights are sampled from U(mean - np.sqrt(3) * std, mean + np.sqrt(3) * std).

mean : float

See std for a description.
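The link between std and the sampling interval can be checked numerically. The sketch below uses only the Python standard library rather than Lasagne itself, and the helper name uniform_bounds is hypothetical. It relies on the fact that U(a, b) has standard deviation (b - a) / sqrt(12), which is why a half-width of sqrt(3) * std gives the requested standard deviation.

```python
import math
import random
import statistics


def uniform_bounds(std, mean=0.0):
    """Return (a, b) so that U(a, b) has the given mean and standard deviation."""
    half_width = math.sqrt(3.0) * std
    return mean - half_width, mean + half_width


a, b = uniform_bounds(std=0.05, mean=0.0)

# Analytic check: the standard deviation of U(a, b) is (b - a) / sqrt(12).
analytic_std = (b - a) / math.sqrt(12.0)

# Empirical check on a fixed-seed sample (illustrative sample size).
rng = random.Random(0)
samples = [rng.uniform(a, b) for _ in range(100000)]
empirical_std = statistics.pstdev(samples)
```

Both analytic_std and empirical_std should come out close to the requested 0.05.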

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.

class lasagne.init.Glorot(initializer, gain=1.0, c01b=False)

Glorot weight initialization.

This is also known as Xavier initialization [R4].

Parameters

initializer : lasagne.init.Initializer

Initializer used to sample the weights. It must accept std in its constructor so that it can sample from a distribution with a given standard deviation.

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units, and to sqrt(2/(1+alpha**2)) for leaky rectified linear units with leakiness alpha. Other transfer functions may need different factors.

c01b : bool

For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.

See also

GlorotNormal

Shortcut with Gaussian initializer.

GlorotUniform

Shortcut with uniform initializer.

Notes

For a DenseLayer, if gain='relu' and initializer=Uniform, the weights are initialized as

\[\begin{split}a &= \sqrt{\frac{12}{fan_{in}+fan_{out}}}\\ W &\sim U[-a, a]\end{split}\]

If gain=1 and initializer=Normal, the weights are initialized as

\[\begin{split}\sigma &= \sqrt{\frac{2}{fan_{in}+fan_{out}}}\\ W &\sim N(0, \sigma)\end{split}\]
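The two formulas above can be reproduced with a few lines of arithmetic. This is a sketch that assumes a dense weight matrix of shape (fan_in, fan_out) and the general rule std = gain * sqrt(2 / (fan_in + fan_out)); the helper name glorot_std is hypothetical and does not call Lasagne.

```python
import math


def glorot_std(fan_in, fan_out, gain=1.0):
    """Standard deviation for a dense weight matrix (illustrative helper)."""
    return gain * math.sqrt(2.0 / (fan_in + fan_out))


# Weight matrix of the DenseLayer in the module example: 20 inputs, 50 units.
fan_in, fan_out = 20, 50

# gain=1 with a Normal initializer: sigma is used directly.
sigma = glorot_std(fan_in, fan_out, gain=1.0)

# gain='relu' corresponds to sqrt(2). A uniform distribution with standard
# deviation s spans [-sqrt(3) * s, sqrt(3) * s], which yields
# a = sqrt(12 / (fan_in + fan_out)).
a = math.sqrt(3.0) * glorot_std(fan_in, fan_out, gain=math.sqrt(2.0))
```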

References

[R4]

Xavier Glorot and Yoshua Bengio (2010): Understanding the difficulty of training deep feedforward neural networks. International conference on artificial intelligence and statistics.

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.

class lasagne.init.GlorotNormal(gain=1.0, c01b=False)

Glorot with weights sampled from the Normal distribution.

See Glorot for a description of the parameters.

class lasagne.init.GlorotUniform(gain=1.0, c01b=False)

Glorot with weights sampled from the Uniform distribution.

See Glorot for a description of the parameters.

class lasagne.init.He(initializer, gain=1.0, c01b=False)

He weight initialization.

Weights are initialized with a standard deviation of \(\sigma = gain \sqrt{\frac{1}{fan_{in}}}\) [R5].

Parameters

initializer : lasagne.init.Initializer

Initializer used to sample the weights. It must accept std in its constructor so that it can sample from a distribution with a given standard deviation.

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units, and to sqrt(2/(1+alpha**2)) for leaky rectified linear units with leakiness alpha. Other transfer functions may need different factors.

c01b : bool

For a lasagne.layers.cuda_convnet.Conv2DCCLayer constructed with dimshuffle=False, c01b must be set to True to compute the correct fan-in and fan-out.
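To make the effect of gain concrete, the sketch below evaluates the He standard deviation for the gains recommended above. It is plain arithmetic and does not call Lasagne; the helper names he_std and leaky_relu_gain, the fan-in of 20, and the leakiness of 0.1 are assumptions chosen for illustration.

```python
import math


def leaky_relu_gain(alpha):
    """Gain suggested above for leaky rectifiers with leakiness alpha."""
    return math.sqrt(2.0 / (1.0 + alpha ** 2))


def he_std(fan_in, gain=1.0):
    """sigma = gain * sqrt(1 / fan_in)."""
    return gain * math.sqrt(1.0 / fan_in)


fan_in = 20  # assumed: number of inputs of a dense layer

std_linear = he_std(fan_in)                             # gain=1.0
std_relu = he_std(fan_in, gain=math.sqrt(2.0))          # gain='relu'
std_leaky = he_std(fan_in, gain=leaky_relu_gain(0.1))   # leaky, alpha=0.1
```

With alpha = 0 the leaky rectifier gain reduces to sqrt(2), the value used for plain rectified linear units.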

See also

HeNormal

Shortcut with Gaussian initializer.

HeUniform

Shortcut with uniform initializer.

References

[R5]

Kaiming He et al. (2015): Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. arXiv preprint arXiv:1502.01852.

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.

class lasagne.init.HeNormal(gain=1.0, c01b=False)

He initializer with weights sampled from the Normal distribution.

See He for a description of the parameters.

class lasagne.init.HeUniform(gain=1.0, c01b=False)

He initializer with weights sampled from the Uniform distribution.

See He for a description of the parameters.

class lasagne.init.Orthogonal(gain=1.0)

Initialize weights as an orthogonal matrix.

Orthogonal matrix initialization [R6]. For n-dimensional shapes where n > 2, the n-1 trailing axes are flattened. For convolutional layers, this corresponds to the fan-in, so this makes the initialization usable for both dense and convolutional layers.

Parameters

gain : float or ‘relu’

Scaling factor for the weights. Set this to 1.0 for linear and sigmoid units, to ‘relu’ or sqrt(2) for rectified linear units, and to sqrt(2/(1+alpha**2)) for leaky rectified linear units with leakiness alpha. Other transfer functions may need different factors.
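The flattening of trailing axes can be illustrated with NumPy. The sketch below builds an orthogonal matrix from the SVD of a Gaussian random matrix, which is one common construction; it is an assumption made for illustration and not necessarily what Lasagne does internally. The function name orthogonal_sketch and its seed argument are hypothetical.

```python
import numpy as np


def orthogonal_sketch(shape, gain=1.0, seed=0):
    """Sketch: orthogonal weights via the SVD of a Gaussian random matrix."""
    if len(shape) < 2:
        raise ValueError("shape must have at least 2 dimensions")
    # Flatten all trailing axes into one, e.g. (32, 3, 5, 5) -> (32, 75).
    flat_shape = (shape[0], int(np.prod(shape[1:])))
    rng = np.random.RandomState(seed)
    a = rng.normal(0.0, 1.0, flat_shape)
    u, _, vt = np.linalg.svd(a, full_matrices=False)
    # At least one of u, vt has the flattened shape (both do if it is square).
    q = u if u.shape == flat_shape else vt
    return (gain * q).reshape(shape)


# Example: a convolutional weight tensor with 32 filters, 3 channels, 5x5 kernels.
w = orthogonal_sketch((32, 3, 5, 5))
w_flat = w.reshape(32, -1)
```

With gain=1.0, the rows of w_flat are orthonormal, so w_flat @ w_flat.T is the 32 x 32 identity up to floating-point error.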

References

[R6]

Andrew M. Saxe, James L. McClelland and Surya Ganguli (2013): Exact solutions to the nonlinear dynamics of learning in deep linear neural networks. arXiv preprint arXiv:1312.6120.

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.

class lasagne.init.Sparse(sparsity=0.1, std=0.01)

Initialize weights as a sparse matrix.

Parameters

sparsity : float

Exact fraction of non-zero values per column. Larger values give less sparsity.

std : float

Non-zero weights are sampled from N(0, std).
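As an illustration of what "exact fraction of non-zero values per column" means, the sketch below fills each column of a zero matrix with a fixed number of Gaussian entries, using only the Python standard library. The function name sparse_sketch, the rounding down via int(), and the seed argument are assumptions and are not taken from Lasagne.

```python
import random


def sparse_sketch(n_inputs, n_outputs, sparsity=0.1, std=0.01, seed=0):
    """Sketch: each column gets int(sparsity * n_inputs) non-zero entries."""
    rng = random.Random(seed)
    n_nonzero = int(sparsity * n_inputs)
    w = [[0.0] * n_outputs for _ in range(n_inputs)]
    for col in range(n_outputs):
        # Choose distinct rows for this column, then fill them with
        # draws from a Gaussian with mean 0 and standard deviation std.
        for row in rng.sample(range(n_inputs), n_nonzero):
            w[row][col] = rng.gauss(0.0, std)
    return w


w = sparse_sketch(n_inputs=100, n_outputs=5, sparsity=0.1)
nonzeros_per_column = [sum(1 for row in w if row[c] != 0.0) for c in range(5)]
```

Every column ends up with the same number of non-zero entries, here 10 out of 100 rows.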

sample(shape)

Sample should return a theano.tensor of size shape and data type theano.config.floatX.

Parameters

shape : tuple or int

Integer or tuple specifying the size of the returned matrix.

returns : theano.tensor

Matrix of size shape and dtype theano.config.floatX.