=============================================
Writing an Op to work on an ``ndarray`` in C
=============================================

This section walks through a non-trivial example: an Op that does something
pretty weird and unrealistic, and that is hard to express with existing Ops.
(Technically, we could use ``Scan`` to implement the Op we're about to
describe, but we ignore that possibility for the sake of example.)

The following code works, but important error-checking has been omitted for
clarity. For example, when you write C code that assumes memory is
contiguous, you should check the strides and alignment.

.. testcode::

    import theano
    from theano import tensor

    class Fibby(theano.Op):
        """
        An arbitrarily generalized Fibonacci sequence
        """
        __props__ = ()

        def make_node(self, x):
            x_ = tensor.as_tensor_variable(x)
            assert x_.ndim == 1
            return theano.Apply(self,
                                inputs=[x_],
                                outputs=[x_.type()])
            # using x_.type() is dangerous: it copies x's broadcasting
            # behaviour

        def perform(self, node, inputs, output_storage):
            x, = inputs
            y = output_storage[0][0] = x.copy()
            for i in range(2, len(x)):
                y[i] = y[i - 1] * y[i - 2] + x[i]

        def c_code(self, node, name, inames, onames, sub):
            x, = inames
            y, = onames
            fail = sub['fail']
            return """
            Py_XDECREF(%(y)s);
            %(y)s = (PyArrayObject*)PyArray_FromArray(
                        %(x)s, 0, NPY_ARRAY_ENSURECOPY);
            if (!%(y)s)
                %(fail)s;
            { // New scope needed to make compilation work
                dtype_%(y)s * y = (dtype_%(y)s*)PyArray_DATA(%(y)s);
                dtype_%(x)s * x = (dtype_%(x)s*)PyArray_DATA(%(x)s);
                for (int i = 2; i < PyArray_DIMS(%(x)s)[0]; ++i)
                    y[i] = y[i - 1] * y[i - 2] + x[i];
            }
            """ % locals()

        def c_code_cache_version(self):
            return (1,)

    fibby = Fibby()

The first two statements of the C code make ``y`` point to a new array with
the correct size for the output, essentially simulating the Python line
``y = x.copy()``. The variables ``%(x)s`` and ``%(y)s`` are set up by the
``TensorType`` to be ``PyArrayObject`` pointers. ``TensorType`` also sets up
``dtype_%(x)s`` to be a typedef to the C type of ``x``.

.. code-block:: c

    Py_XDECREF(%(y)s);
    %(y)s = (PyArrayObject*)PyArray_FromArray(
                %(x)s, 0, NPY_ARRAY_ENSURECOPY);

The ``Py_XDECREF`` call drops the reference to the array that ``y``
originally pointed to. The ``PyArray_FromArray`` call allocates a new array,
copies the contents of ``x`` into it (because of the ``NPY_ARRAY_ENSURECOPY``
flag), and makes ``y`` point to it.

In the C code of a Theano Op, NumPy arrays are represented as
``PyArrayObject`` C structs. This is part of the NumPy C API, documented at
http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html
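As noted above, the example omits error checking. Because the C loop indexes
``x`` and ``y`` as flat C arrays, a more defensive implementation would first
verify that the input is C-contiguous and aligned. A minimal sketch of such a
check, using standard NumPy C API macros inside ``c_code`` (``%(x)s`` and
``%(fail)s`` are the usual Theano substitutions), might look like this:

.. code-block:: c

    /* Sketch of the error checking omitted above: before treating the
       input as a flat, contiguous C array, verify its layout and fail
       cleanly otherwise. */
    if (!PyArray_IS_C_CONTIGUOUS(%(x)s) || !PyArray_ISALIGNED(%(x)s))
    {
        PyErr_SetString(PyExc_ValueError,
                        "Fibby: input must be C-contiguous and aligned");
        %(fail)s;
    }

In this particular Op the output is freshly allocated by
``PyArray_FromArray``, so such a check matters mainly for the input ``x``.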
.. _op_contract_fibby:

Writing an Optimization
=======================

``fibby`` of a vector of zeros is another vector of zeros of the same size.
Theano does not attempt to infer this from the code provided via
``Fibby.perform`` or ``Fibby.c_code``. However, we can write an optimization
that makes use of this observation. This sort of local substitution of
special cases is common, and there is a stage of optimization
(specialization) devoted to such optimizations. The following optimization
(``fibby_of_zero``) tests whether the input is guaranteed to be all zero,
and if so it returns the input itself as a replacement for the old output.

.. If you modify this code, also change :
.. theano/tests/test_tutorial.py:T_fibby.test_fibby_1

.. testcode::

    import numpy

    from theano.tensor.opt import (get_scalar_constant_value,
                                   NotScalarConstantError)

    # Remove any fibby(zeros(...))
    @theano.tensor.opt.register_specialize
    @theano.gof.local_optimizer([fibby])
    def fibby_of_zero(node):
        if node.op == fibby:
            x = node.inputs[0]
            try:
                if numpy.all(0 == get_scalar_constant_value(x)):
                    return [x]
            except NotScalarConstantError:
                pass

The ``register_specialize`` decorator is what activates our optimization: it
tells Theano to use it in the specialization stage. The ``local_optimizer``
decorator builds a class instance around our global function, and its
``[fibby]`` argument is a hint that our optimizer works on nodes whose
``.op`` attribute equals ``fibby``. The function itself (``fibby_of_zero``)
expects an ``Apply`` instance for its ``node`` parameter. It uses
``get_scalar_constant_value``, which determines whether a Variable (``x``)
is guaranteed to be a constant, and if so, which constant.

Test the optimization
=====================

Here is some code to check that the optimization is applied only when it is
needed.

.. testcode::

    import numpy
    import theano.tensor as T
    from theano import function

    # Test that the optimization does not apply when it is not needed.
    x = T.dvector()
    f = function([x], fibby(x))

    # We call the function to make sure it runs.
    # If you run in DebugMode, it will compare the C and Python outputs.
    f(numpy.random.rand(5))
    topo = f.maker.fgraph.toposort()
    assert len(topo) == 1
    assert isinstance(topo[0].op, Fibby)

    # Test that the optimization gets applied.
    f_zero = function([], fibby(T.zeros([5])))

    # If you run in DebugMode, it will compare the outputs before
    # and after the optimization.
    f_zero()

    # Check that the optimization removed the Fibby Op.
    # For safety, Theano's memory interface ensures that a function's
    # output is never aliased to its input; that is why a DeepCopyOp
    # remains in the graph.
    topo = f_zero.maker.fgraph.toposort()
    assert len(topo) == 1
    assert isinstance(topo[0].op, theano.compile.ops.DeepCopyOp)
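As a final sanity check (not part of the original tests, and reusing ``f``
and ``numpy`` from the block above), we can evaluate the compiled function on
concrete data and compare the result against the recurrence
``y[i] = y[i-1] * y[i-2] + x[i]`` worked out by hand:

.. testcode::

    # With x = [1, 1, 1, 1, 1], y starts as a copy of x, and then
    # y[2] = 1*1 + 1 = 2, y[3] = 2*1 + 1 = 3, y[4] = 3*2 + 1 = 7.
    out = f(numpy.ones(5))
    assert numpy.allclose(out, [1., 1., 2., 3., 7.])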