List of gpuarray Ops implemented
Normally you should not call these Ops directly: Theano automatically transforms CPU ops into their GPU equivalents. This list simply documents what is implemented on the GPU.
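For instance, a graph built from ordinary CPU tensor ops is rewritten to the GPU ops below at compilation time. A minimal sketch, assuming Theano is configured with a GPU device and the libgpuarray backend::

    import numpy
    import theano
    import theano.tensor as T

    x = T.matrix('x')
    # Build the graph with the usual CPU ops; no Gpu* op is called directly.
    f = theano.function([x], T.exp(x).sum(axis=0))

    # With a GPU enabled, the optimizer replaces Elemwise/CAReduce with their
    # Gpu* equivalents and inserts the required transfer ops.
    theano.printing.debugprint(f)
    print(f(numpy.random.rand(3, 4).astype(theano.config.floatX)))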
Basic Op
class theano.sandbox.gpuarray.basic_ops.GpuAlloc(context_name, memset_0=False)

    Allocate initialized memory on the GPU.

    Parameters:
    - context_name (str) – The name of the context in which to allocate the memory.
    - memset_0 (bool) – An optimization flag. When True, the fill value is known to always be 0, so the C code calls memset, which is faster than a generic fill.
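You normally obtain GpuAlloc through graph optimization rather than by instantiating it yourself. A hedged sketch, assuming a GPU-enabled configuration::

    import numpy
    import theano
    import theano.tensor as T

    n = T.iscalar('n')
    # T.alloc with a constant 0 fill value can become GpuAlloc with
    # memset_0=True after optimization on a GPU-enabled setup.
    f = theano.function([n], T.alloc(numpy.float32(0), n, n))
    theano.printing.debugprint(f)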
class theano.sandbox.gpuarray.basic_ops.GpuAllocEmpty(dtype, context_name)

    Allocate uninitialized memory on the GPU.
class theano.sandbox.gpuarray.basic_ops.GpuContiguous(use_c_code='/usr/bin/g++')

    Return a C-contiguous version of the input.

    This may either pass the object as-is (if it is already C contiguous) or make a copy.
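A typical use is to guarantee contiguity before an op whose C code requires it. A sketch; the module-level gpu_contiguous instance used here is an assumption based on later theano.gpuarray versions::

    import theano
    import theano.tensor as T
    from theano.sandbox.gpuarray.basic_ops import gpu_contiguous

    x = T.ftensor4('x')
    # dimshuffle yields a non-contiguous view; gpu_contiguous copies it into
    # C-contiguous memory only when necessary.
    y = gpu_contiguous(x.dimshuffle(0, 3, 1, 2))
    f = theano.function([x], y)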
class theano.sandbox.gpuarray.basic_ops.GpuKernelBase

    Base class for operations that need to compile kernels.

    Using it is not mandatory, but it takes care of many of the small details you would otherwise have to handle yourself (see the sketch under Kernel below).
class theano.sandbox.gpuarray.basic_ops.GpuReshape(ndim, name=None)

    Reshape for GPU variables.
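It is introduced by the optimizer when a reshape is applied to a variable that lives on the GPU, e.g.::

    import theano
    import theano.tensor as T

    x = T.fmatrix('x')
    # Reshape on a GPU-resident variable becomes GpuReshape after
    # optimization; -1 infers the remaining dimension as in NumPy.
    f = theano.function([x], x.reshape((-1,)))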
class theano.sandbox.gpuarray.basic_ops.HostFromGpu(use_c_code='/usr/bin/g++')

    Transfer data from the GPU to the CPU.
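You rarely apply it yourself: theano.function inserts it automatically when a graph output lives on the GPU. A sketch, assuming a configuration where float32 shared variables are stored on the device::

    import numpy
    import theano

    w = theano.shared(numpy.zeros((3, 3), dtype='float32'), name='w')
    f = theano.function([], w * 2)
    # The compiled graph ends with HostFromGpu so the caller receives an
    # ordinary numpy array.
    theano.printing.debugprint(f)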
class theano.sandbox.gpuarray.basic_ops.Kernel(code, params, name, flags, codevar=None, binvar=None, objvar=None)

    This class groups together all the attributes of a GPU kernel.
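Together with GpuKernelBase, Kernel is how a custom GPU op describes the kernels it wants compiled. The rough sketch below is modelled on the GpuEye example from Theano's GPU-op tutorial; the gpu_kernels() hook, the pygpu parameter constants and Kernel.get_flags() are assumptions that may differ across versions::

    from pygpu import gpuarray

    from theano.sandbox.gpuarray.basic_ops import GpuKernelBase, Kernel

    class MyGpuOp(GpuKernelBase):
        # A real op would also inherit Op and define make_node()/c_code()
        # that launch the kernel; this sketch only shows the kernel part.

        # gpu_kernels() is the hook GpuKernelBase is assumed to call to
        # collect the kernels this op needs.
        def gpu_kernels(self, node, name):
            code = """
    KERNEL void k_fill_diag(GLOBAL_MEM float *a, ga_size n, ga_size m) {
        ga_size nb = n < m ? n : m;
        for (ga_size i = LID_0; i < nb; i += LDIM_0) {
            a[i * m + i] = 1.0f;
        }
    }
    """
            return [Kernel(code=code, name="k_fill_diag",
                           params=[gpuarray.GpuArray,
                                   gpuarray.SIZE, gpuarray.SIZE],
                           flags=Kernel.get_flags('float32'))]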
-
theano.sandbox.gpuarray.basic_ops.
as_gpuarray_variable
(x, context_name)[source]¶ This will attempt to convert x into a variable on the GPU.
It can take either a value of another variable. If x is already suitable, it will be returned as-is.
Parameters: - x – Object to convert
- context_name (str or None) – target context name for the result
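This helper is mostly used inside make_node() implementations to normalize inputs. A small sketch; passing context_name=None is assumed to select the default context::

    import theano.tensor as T
    from theano.sandbox.gpuarray.basic_ops import as_gpuarray_variable

    x = T.fmatrix('x')
    # Returns a GPU variable in the default context; if x were already a
    # suitable GPU variable it would be returned unchanged.
    gx = as_gpuarray_variable(x, None)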
Blas Op

Elemwise Op
theano.sandbox.gpuarray.elemwise.GpuCAReduce

    Alias of GpuCAReduceCPY.
class theano.sandbox.gpuarray.elemwise.GpuCAReduceCPY(scalar_op, axis=None, dtype=None, acc_dtype=None)

    CAReduce that reuses the Python code from gpuarray.
class theano.sandbox.gpuarray.elemwise.GpuCAReduceCuda(scalar_op, axis=None, reduce_mask=None, dtype=None, acc_dtype=None, pre_scalar_op=None)

    GpuCAReduceCuda is a reduction along some dimensions by a scalar op.

    Parameters:
    - reduce_mask – The dimensions along which to reduce. The reduce_mask is a tuple of booleans (actually integers 0 or 1) that specifies, for each input dimension, whether to reduce it (1) or not (0).
    - pre_scalar_op – If present, must be a scalar op with only one input. It is executed on each input value before the reduction.
Examples
When scalar_op is a theano.scalar.basic.Add instance:
- reduce_mask == (1,) sums a vector to a scalar
- reduce_mask == (1,0) computes the sum of each column in a matrix
- reduce_mask == (0,1) computes the sum of each row in a matrix
- reduce_mask == (1,1,1) computes the sum of all elements in a 3-tensor.
Notes
Any reduce_mask of all zeros is a sort of ‘copy’, and may be removed during graph optimization.
This Op is a work in progress.
This op was recently upgraded from just GpuSum to a general CAReduce. Not many code cases are supported yet for scalar_op being anything other than scal.Add instances.
Important note: if you implement new cases for this op, be sure to benchmark them and make sure that they actually result in a speedup. GPUs are not especially well-suited to reduction operations so it is quite possible that the GPU might be slower for some cases.
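In practice you obtain these reductions by writing ordinary axis-wise sums; the optimizer selects the reduce_mask. The reduce_mask values listed above correspond to the following user-level code::

    import theano.tensor as T

    v = T.fvector('v')
    m = T.fmatrix('m')
    t = T.ftensor3('t')

    s0 = v.sum()        # reduce_mask (1,)     : vector -> scalar
    s1 = m.sum(axis=0)  # reduce_mask (1, 0)   : sum of each column
    s2 = m.sum(axis=1)  # reduce_mask (0, 1)   : sum of each row
    s3 = t.sum()        # reduce_mask (1, 1, 1): sum of all elements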
class theano.sandbox.gpuarray.elemwise.GpuDimShuffle(input_broadcastable, new_order, inplace=False)

    DimShuffle on the GPU.
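DimShuffle nodes are what dimshuffle() builds; on GPU variables they become GpuDimShuffle. For example::

    import theano.tensor as T

    m = T.fmatrix('m')
    mt = m.dimshuffle(1, 0)        # transpose: new_order (1, 0)
    mb = m.dimshuffle(0, 'x', 1)   # insert a broadcastable dimension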
Subtensor Op
class theano.sandbox.gpuarray.subtensor.GpuAdvancedIncSubtensor1(inplace=False, set_instead_of_inc=False)

    Implement AdvancedIncSubtensor1 on the GPU.
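AdvancedIncSubtensor1 is what T.inc_subtensor and T.set_subtensor build when indexing with an integer vector; on the GPU it becomes this op. For example::

    import theano.tensor as T

    x = T.fmatrix('x')
    y = T.fmatrix('y')
    idx = T.ivector('idx')

    z = T.inc_subtensor(x[idx], y)  # add y to the selected rows
    w = T.set_subtensor(x[idx], y)  # overwrite them (set_instead_of_inc=True)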
class theano.sandbox.gpuarray.subtensor.GpuAdvancedIncSubtensor1_dev20(inplace=False, set_instead_of_inc=False)

    Implement AdvancedIncSubtensor1 on the GPU, using functions only available on devices of compute capability 2.0 and more recent.
class theano.sandbox.gpuarray.subtensor.GpuAdvancedSubtensor1(sparse_grad=False)

    AdvancedSubtensor1 on the GPU.
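This is the GPU version of the vector-index gather, i.e. indexing with an integer vector::

    import theano.tensor as T

    x = T.fmatrix('x')
    idx = T.ivector('idx')
    rows = x[idx]  # AdvancedSubtensor1, moved to the GPU when x lives there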
class theano.sandbox.gpuarray.subtensor.GpuIncSubtensor(idx_list, inplace=False, set_instead_of_inc=False, destroyhandler_tolerate_aliased=None)

    Implement IncSubtensor on the GPU.

    Notes

    The optimization that makes this Op inplace lives in tensor/opt; the same optimization handles both IncSubtensor and GpuIncSubtensor. This Op has C code too: it inherits tensor.IncSubtensor's c_code, and helper methods such as do_type_checking() and copy_of_x() specialize that c_code for this Op.
    copy_into(view, source)

        Parameters:
        - view (string) – C code expression for an array.
        - source (string) – C code expression for an array.

        Returns: C code expression to copy source into view, and 0 on success.

        Return type: str
    copy_of_x(x)

        Parameters:
        - x – A string giving the name of a C variable pointing to an array.

        Returns: C code expression to make a copy of x.

        Return type: str

        Notes

        The base class uses PyArrayObject *; subclasses may override this for different types of arrays.
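Like its CPU counterpart, this op is reached through T.inc_subtensor / T.set_subtensor on basic (slice-based) indexing, e.g.::

    import theano.tensor as T

    x = T.fmatrix('x')
    y = T.fvector('y')

    # Slice indexing produces IncSubtensor, which is moved to
    # GpuIncSubtensor when x lives on the GPU.
    z = T.set_subtensor(x[0, :], y)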
Nnet Op
class theano.sandbox.gpuarray.nnet.GpuCrossentropySoftmax1HotWithBiasDx(use_c_code='/usr/bin/g++')

    Implement CrossentropySoftmax1HotWithBiasDx on the GPU.

    Gradient wrt x of the CrossentropySoftmax1Hot Op.
class theano.sandbox.gpuarray.nnet.GpuCrossentropySoftmaxArgmax1HotWithBias(use_c_code='/usr/bin/g++')

    Implement CrossentropySoftmaxArgmax1HotWithBias on the GPU.
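Both ops above usually come from optimizing a softmax followed by the categorical cross-entropy with integer targets. A hedged sketch of the user-level graph that can be rewritten to them::

    import theano
    import theano.tensor as T

    x = T.fmatrix('x')
    b = T.fvector('b')
    y = T.ivector('y')  # one-hot targets given as class indices

    p = T.nnet.softmax(x + b)
    cost = T.nnet.categorical_crossentropy(p, y).mean()
    g = theano.grad(cost, x)  # the gradient involves the ...Dx op above

    f = theano.function([x, b, y], [cost, g])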
class theano.sandbox.gpuarray.nnet.GpuSoftmax(use_c_code='/usr/bin/g++')

    Implement Softmax on the GPU.
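A plain softmax graph is moved to this op on a GPU-enabled setup::

    import theano
    import theano.tensor as T

    x = T.fmatrix('x')
    f = theano.function([x], T.nnet.softmax(x))
    theano.printing.debugprint(f)  # shows GpuSoftmax when a GPU is active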