Caffe
caffe::RecurrentLayer< Dtype > Class Template Reference [abstract]

An abstract class for implementing recurrent behavior inside of an unrolled network. This Layer type cannot be instantiated – instead, you should use one of its implementations which defines the recurrent architecture, such as RNNLayer or LSTMLayer. More...

#include <recurrent_layer.hpp>

Inheritance diagram for caffe::RecurrentLayer< Dtype >:
Inherited by caffe::LSTMLayer< Dtype > and caffe::RNNLayer< Dtype >.

Public Member Functions

 RecurrentLayer (const LayerParameter &param)
 
virtual void LayerSetUp (const vector< Blob< Dtype > * > &bottom, const vector< Blob< Dtype > * > &top)
 
virtual void Reshape (const vector< Blob< Dtype > * > &bottom, const vector< Blob< Dtype > * > &top)
 
virtual void Reset ()
 
virtual const char * type () const
 
virtual int MinBottomBlobs () const
 
virtual int MaxBottomBlobs () const
 
virtual int ExactNumTopBlobs () const
 
virtual bool AllowForceBackward (const int bottom_index) const
 

Protected Member Functions

virtual void FillUnrolledNet (NetParameter *net_param) const =0
 Fills net_param with the recurrent network architecture. Subclasses should define this – see RNNLayer and LSTMLayer for examples.
 
virtual void RecurrentInputBlobNames (vector< string > *names) const =0
 Fills names with the names of the 0th timestep recurrent input Blobs. Subclasses should define this – see RNNLayer and LSTMLayer for examples.
 
virtual void RecurrentInputShapes (vector< BlobShape > *shapes) const =0
 Fills shapes with the shapes of the recurrent input Blobs. Subclasses should define this – see RNNLayer and LSTMLayer for examples.
 
virtual void RecurrentOutputBlobNames (vector< string > *names) const =0
 Fills names with the names of the Tth timestep recurrent output Blobs. Subclasses should define this – see RNNLayer and LSTMLayer for examples.
 
virtual void OutputBlobNames (vector< string > *names) const =0
 Fills names with the names of the output blobs, concatenated across all timesteps. Should return a name for each top Blob. Subclasses should define this – see RNNLayer and LSTMLayer for examples.
 
virtual void Forward_cpu (const vector< Blob< Dtype > * > &bottom, const vector< Blob< Dtype > * > &top)
 
virtual void Forward_gpu (const vector< Blob< Dtype > * > &bottom, const vector< Blob< Dtype > * > &top)
 
virtual void Backward_cpu (const vector< Blob< Dtype > * > &top, const vector< bool > &propagate_down, const vector< Blob< Dtype > * > &bottom)
 

Protected Attributes

shared_ptr< Net< Dtype > > unrolled_net_
 A Net to implement the Recurrent functionality.
 
int N_
 The number of independent streams to process simultaneously.
 
int T_
 The number of timesteps in the layer's input, and the number of timesteps over which to backpropagate through time.
 
bool static_input_
 Whether the layer has a "static" input copied across all timesteps.
 
int last_layer_index_
 The last layer to run in the network. (Any later layers are losses added to force the recurrent net to do backprop.)
 
bool expose_hidden_
 Whether the layer's hidden states at the first and last timesteps are exposed as layer inputs and outputs, respectively.
 
vector< Blob< Dtype > * > recur_input_blobs_
 
vector< Blob< Dtype > * > recur_output_blobs_
 
vector< Blob< Dtype > * > output_blobs_
 
Blob< Dtype > * x_input_blob_
 
Blob< Dtype > * x_static_input_blob_
 
Blob< Dtype > * cont_input_blob_
 

Detailed Description

template<typename Dtype>
class caffe::RecurrentLayer< Dtype >

An abstract class for implementing recurrent behavior inside of an unrolled network. This Layer type cannot be instantiated – instead, you should use one of its implementations which defines the recurrent architecture, such as RNNLayer or LSTMLayer.
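
Since RecurrentLayer itself cannot be instantiated, a concrete subclass such as LSTMLayer is typically declared in a net definition. A minimal prototxt fragment might look like the following; the layer/blob names and the output dimension are illustrative, not prescribed:

```
layer {
  name: "lstm1"
  type: "LSTM"
  bottom: "x"     # (T x N x ...) time-varying input
  bottom: "cont"  # (T x N) sequence continuation indicators
  top: "h"        # (T x N x D) time-varying output
  recurrent_param {
    num_output: 256  # D, the recurrent hidden dimension
  }
}
```

Using type "RNN" instead selects the plain RNN architecture; both share the bottom/top contract documented for Forward_cpu below.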

Member Function Documentation

◆ Forward_cpu()

template<typename Dtype >
void caffe::RecurrentLayer< Dtype >::Forward_cpu ( const vector< Blob< Dtype > * > & bottom, const vector< Blob< Dtype > * > & top )  [protected, virtual]
Parameters
bottom: input Blob vector (length 2-3)
  1. $ (T \times N \times ...) $ the time-varying input $ x $. After the first two axes, whose dimensions must correspond to the number of timesteps $ T $ and the number of independent streams $ N $, respectively, its dimensions may be arbitrary. Note that the ordering of dimensions – $ (T \times N \times ...) $, rather than $ (N \times T \times ...) $ – means that the $ N $ independent input streams must be "interleaved".
  2. $ (T \times N) $ the sequence continuation indicators $ \delta $. These inputs should be binary (0 or 1) indicators, where $ \delta_{t,n} = 0 $ means that timestep $ t $ of stream $ n $ is the beginning of a new sequence, and hence the previous hidden state $ h_{t-1} $ is multiplied by $ \delta_t = 0 $ and has no effect on the cell's output at timestep $ t $, and a value of $ \delta_{t,n} = 1 $ means that timestep $ t $ of stream $ n $ is a continuation from the previous timestep $ t-1 $, and the previous hidden state $ h_{t-1} $ affects the updated hidden state and output.
  3. $ (N \times ...) $ (optional) the static (non-time-varying) input $ x_{static} $. After the first axis, whose dimension must be the number of independent streams, its dimensions may be arbitrary. This is mathematically equivalent to using a time-varying input of $ x'_t = [x_t; x_{static}] $ – i.e., tiling the static input across the $ T $ timesteps and concatenating with the time-varying input. Note that if this input is used, all timesteps in a single batch within a particular one of the $ N $ streams must share the same static input, even if the sequence continuation indicators suggest that different sequences are ending and beginning within a single batch. This may require padding and/or truncation for uniform length.
top: output Blob vector (length 1)
  1. $ (T \times N \times D) $ the time-varying output $ y $, where $ D $ is recurrent_param.num_output(). Refer to documentation for particular RecurrentLayer implementations (such as RNNLayer and LSTMLayer) for the definition of $ y $.

The documentation for this class was generated from the following files: