sandbox.blocksparse – Block sparse dot operations (gemv and outer)

API

class theano.sandbox.blocksparse.SparseBlockGemv(inplace=False)

This op computes the dot product of specified pieces of vectors and matrices, returning pieces of vectors:

    for b in range(batch_size):
        for j in range(o.shape[1]):
            for i in range(h.shape[1]):
                o[b, j, :] += numpy.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])

where o, W, h, iIdx (inputIdx) and oIdx (outputIdx) are defined in the docstring of make_node, and b indexes the examples in the minibatch.
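The loop nest above can be run directly in NumPy. The following is a minimal sketch for checking shapes and indexing only; it is not Theano's implementation. The helper name sparse_block_gemv_ref and the sizes are assumptions made for illustration.

```python
import numpy as np


def sparse_block_gemv_ref(o, W, h, iIdx, oIdx):
    """Loop-based NumPy reference for the pseudocode (hypothetical helper)."""
    out = o.copy()  # work on a copy so the caller's array is not modified
    for b in range(h.shape[0]):            # examples in the minibatch
        for j in range(out.shape[1]):      # oWin selected output blocks
            for i in range(h.shape[1]):    # iWin selected input blocks
                # (iSize,) dot (iSize, oSize) -> (oSize,)
                out[b, j, :] += np.dot(h[b, i], W[iIdx[b, i], oIdx[b, j]])
    return out


# Illustrative sizes, chosen arbitrarily for this sketch.
batch, iBlocks, oBlocks, iSize, oSize, iWin, oWin = 2, 4, 5, 3, 2, 2, 3
rng = np.random.RandomState(0)
W = rng.randn(iBlocks, oBlocks, iSize, oSize)
h = rng.randn(batch, iWin, iSize)
o = np.zeros((batch, oWin, oSize))
iIdx = rng.randint(0, iBlocks, size=(batch, iWin))
oIdx = rng.randint(0, oBlocks, size=(batch, oWin))

result = sparse_block_gemv_ref(o, W, h, iIdx, oIdx)
print(result.shape)  # (2, 3, 2), i.e. (batch, oWin, oSize)
```

Only iWin * oWin of the iBlocks * oBlocks weight blocks are touched per example, which is where the saving over a dense product comes from.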

[Figure: blocksparse.png]

make_node(o, W, h, inputIdx, outputIdx)

Compute the dot product of the specified pieces of vectors and matrices.

Parameters:
  • o ((batch, oWin, oSize)) – output vector
  • W ((iBlocks, oBlocks, iSize, oSize)) – weight matrix
  • h ((batch, iWin, iSize)) – input from lower layer (sparse)
  • inputIdx ((batch, iWin)) – indexes of the input blocks
  • outputIdx ((batch, oWin)) – indexes of the output blocks

Returns:
  (batch, oWin, oSize) – dot(W[i, j], h[i]) + o[j]

Notation:
  • batch is the number of examples in a minibatch (batch size).
  • iBlocks is the total number of blocks in the input (from lower layer).
  • iSize is the size of each of these input blocks.
  • iWin is the number of blocks that will be used as inputs. Which blocks will be used is specified in inputIdx.
  • oBlocks is the number of possible output blocks.
  • oSize is the size of each of these output blocks.
  • oWin is the number of output blocks that will actually be computed. Which blocks will be computed is specified in outputIdx.
class theano.sandbox.blocksparse.SparseBlockOuter(inplace=False)

This op computes the outer product of two sets of pieces of vectors, updating a full matrix with the results:

    for b in range(batch_size):
        for i in range(xIdx.shape[1]):
            for j in range(yIdx.shape[1]):
                o[xIdx[b, i], yIdx[b, j]] += (alpha * numpy.outer(x[b, i], y[b, j]))

This op is involved in the gradient of SparseBlockGemv.
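As a rough illustration, the update can be written in NumPy with the loops over the selected x blocks (i) and y blocks (j) spelled out. This is a sketch, not Theano's implementation: the helper name sparse_block_outer_ref is hypothetical, and treating an unspecified alpha as 1.0 is an assumption.

```python
import numpy as np


def sparse_block_outer_ref(o, x, y, xIdx, yIdx, alpha=1.0):
    """Loop-based NumPy reference (hypothetical helper, not Theano's code)."""
    out = o.copy()  # work on a copy so the caller's array is not modified
    for b in range(x.shape[0]):
        for i in range(xIdx.shape[1]):
            for j in range(yIdx.shape[1]):
                # (xSize,) outer (ySize,) -> (xSize, ySize) block update
                out[xIdx[b, i], yIdx[b, j]] += alpha * np.outer(x[b, i], y[b, j])
    return out


xBlocks, yBlocks, xSize, ySize = 2, 2, 2, 3
o = np.zeros((xBlocks, yBlocks, xSize, ySize))
x = np.ones((2, 1, xSize))        # batch=2, xWin=1
y = np.ones((2, 1, ySize))        # batch=2, yWin=1
xIdx = np.array([[0], [0]])       # both examples select x block 0
yIdx = np.array([[1], [1]])       # both examples select y block 1

updated = sparse_block_outer_ref(o, x, y, xIdx, yIdx, alpha=0.5)
```

Because both examples point at block (0, 1), their contributions accumulate in that block; every other block of the matrix is left unchanged.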

make_node(o, x, y, xIdx, yIdx, alpha=None)

Compute the outer product of the specified pieces of vectors and accumulate it into the specified blocks of o.

Parameters:
  • o ((xBlocks, yBlocks, xSize, ySize)) – matrix to update
  • x ((batch, xWin, xSize)) – pieces of the first vector
  • y ((batch, yWin, ySize)) – pieces of the second vector
  • xIdx ((batch, xWin)) – indexes of the x blocks
  • yIdx ((batch, yWin)) – indexes of the y blocks

Returns:
  (xBlocks, yBlocks, xSize, ySize) – outer(x[i], y[j]) + o[i, j]

Notation:
  • batch is the number of examples in a minibatch (batch size).
  • xBlocks is the total number of blocks in x.
  • xSize is the size of each of these x blocks.
  • xWin is the number of blocks that will be used as x. Which blocks will be used is specified in xIdx.
  • yBlocks is the number of possible y blocks.
  • ySize is the size of each of these y blocks.
  • yWin is the number of y blocks that will actually be computed. Which blocks will be computed is specified in yIdx.
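
The remark that this op is involved in the gradient of SparseBlockGemv can be checked numerically. For a scalar loss sum(g * o_out), the gradient with respect to W is the block-sparse outer product of h and g accumulated at (inputIdx, outputIdx). The sketch below compares that against central finite differences; the helper names gemv and outer_update are hypothetical NumPy stand-ins, not Theano's code.

```python
import numpy as np


def gemv(o, W, h, iIdx, oIdx):
    """Vectorised form of the SparseBlockGemv pseudocode (hypothetical helper)."""
    # Gather the selected blocks: (batch, iWin, oWin, iSize, oSize)
    Wsel = W[iIdx[:, :, None], oIdx[:, None, :]]
    return o + np.einsum('bik,bijkl->bjl', h, Wsel)


def outer_update(o, x, y, xIdx, yIdx, alpha=1.0):
    """Loop form of the SparseBlockOuter pseudocode (hypothetical helper)."""
    out = o.copy()
    for b in range(x.shape[0]):
        for i in range(xIdx.shape[1]):
            for j in range(yIdx.shape[1]):
                out[xIdx[b, i], yIdx[b, j]] += alpha * np.outer(x[b, i], y[b, j])
    return out


rng = np.random.RandomState(1)
batch, iBlocks, oBlocks, iSize, oSize, iWin, oWin = 2, 3, 4, 2, 3, 2, 2
W = rng.randn(iBlocks, oBlocks, iSize, oSize)
h = rng.randn(batch, iWin, iSize)
o = rng.randn(batch, oWin, oSize)
g = rng.randn(batch, oWin, oSize)     # upstream gradient w.r.t. the output
iIdx = rng.randint(0, iBlocks, size=(batch, iWin))
oIdx = rng.randint(0, oBlocks, size=(batch, oWin))

# Analytic gradient of loss = sum(g * gemv(...)) with respect to W.
grad_W = outer_update(np.zeros_like(W), h, g, iIdx, oIdx)

# Central finite differences over every entry of W.
eps = 1e-6
numeric = np.zeros_like(W)
for idx in np.ndindex(*W.shape):
    Wp, Wm = W.copy(), W.copy()
    Wp[idx] += eps
    Wm[idx] -= eps
    loss_plus = np.sum(g * gemv(o, Wp, h, iIdx, oIdx))
    loss_minus = np.sum(g * gemv(o, Wm, h, iIdx, oIdx))
    numeric[idx] = (loss_plus - loss_minus) / (2 * eps)

max_err = np.max(np.abs(grad_W - numeric))
```

Since the loss is linear in W, max_err should be at the level of floating-point rounding. Blocks of W that no example selects receive a zero gradient.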
theano.sandbox.blocksparse.sparse_block_dot(W, h, inputIdx, b, outputIdx)

Compute the dot product (plus bias) of the specified pieces of vectors and matrices. See SparseBlockGemv for more information.

Parameters:
  • W ((iBlocks, oBlocks, iSize, oSize)) – weight matrix
  • h ((batch, iWin, iSize)) – input from lower layer (sparse)
  • inputIdx ((batch, iWin)) – indexes of the input blocks
  • b ((oBlocks, oSize)) – bias vector
  • outputIdx ((batch, oWin)) – indexes of the output blocks

Returns:
  (batch, oWin, oSize) – dot(W[i, j], h[i]) + b[j], where b[j] is added only once per output block, not once for each input block i

Notation:
  • batch is the number of examples in a minibatch (batch size).
  • iBlocks is the total number of blocks in the input (from lower layer).
  • iSize is the size of each of these input blocks.
  • iWin is the number of blocks that will be used as inputs. Which blocks will be used is specified in inputIdx.
  • oBlocks is the number of possible output blocks.
  • oSize is the size of each of these output blocks.
  • oWin is the number of output blocks that will actually be computed. Which blocks will be computed is specified in outputIdx.
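
A minimal NumPy sketch of what this function computes is given below. It assumes the result is equivalent to seeding the output with the bias of each selected output block and then accumulating as in SparseBlockGemv. The helper name sparse_block_dot_ref and the sizes are hypothetical; this is not Theano's implementation.

```python
import numpy as np


def sparse_block_dot_ref(W, h, inputIdx, b, outputIdx):
    """Loop-based NumPy reference (hypothetical helper, not Theano's code)."""
    # Seed with the bias of each selected output block, so b[j] is added once:
    # b is (oBlocks, oSize), outputIdx is (batch, oWin) -> (batch, oWin, oSize).
    out = b[outputIdx].astype(float)
    for n in range(h.shape[0]):                  # examples in the minibatch
        for j in range(outputIdx.shape[1]):      # oWin selected output blocks
            for i in range(inputIdx.shape[1]):   # iWin selected input blocks
                out[n, j, :] += np.dot(h[n, i], W[inputIdx[n, i], outputIdx[n, j]])
    return out


# Illustrative sizes, chosen arbitrarily for this sketch.
batch, iBlocks, oBlocks, iSize, oSize, iWin, oWin = 3, 4, 6, 2, 5, 2, 3
rng = np.random.RandomState(2)
W = rng.randn(iBlocks, oBlocks, iSize, oSize)
h = rng.randn(batch, iWin, iSize)
b = rng.randn(oBlocks, oSize)
inputIdx = rng.randint(0, iBlocks, size=(batch, iWin))
outputIdx = rng.randint(0, oBlocks, size=(batch, oWin))

out = sparse_block_dot_ref(W, h, inputIdx, b, outputIdx)
print(out.shape)  # (3, 3, 5), i.e. (batch, oWin, oSize)
```

With h set to zeros the result reduces to b[outputIdx], which is a quick way to confirm that the bias enters exactly once per output block.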