[FRONTEND] A Python hybrid frontend (#1251)
were authored and tqchen committed Jun 22, 2018
1 parent a55bc29 commit 90db723
Showing 14 changed files with 1,232 additions and 5 deletions.
15 changes: 15 additions & 0 deletions docs/api/python/hybrid.rst
@@ -0,0 +1,15 @@
tvm.hybrid
----------
.. automodule:: tvm.hybrid

.. autosummary::

   tvm.hybrid.parse
   tvm.hybrid.script
   tvm.hybrid.popcount
   tvm.hybrid.sigmoid

.. autofunction:: tvm.hybrid.parse
.. autofunction:: tvm.hybrid.script
.. autofunction:: tvm.hybrid.popcount
.. autofunction:: tvm.hybrid.sigmoid
1 change: 1 addition & 0 deletions docs/api/python/index.rst
Expand Up @@ -21,3 +21,4 @@ Python API
dev
topi
nnvm/index
hybrid
76 changes: 76 additions & 0 deletions docs/dev/hybrid_script.rst
@@ -0,0 +1,76 @@
Hybrid Frontend Developer Guide
===============================

If you are a developer:

1. who is trying to write preliminary patterns that are not yet supported by TVM,
   :ref:`hybrid-langref-label` may be a better place for you.

2. who wants to know the implementation details of this module, you are in the right place!

Features
--------

Software emulation
~~~~~~~~~~~~~~~~~~

In software emulation, the most interesting part is the decorator ``tvm.hybrid.script``.
This decorator does two things:

1. It imports the runtime variables.

2. It overloads the function according to the arguments passed.

Correct me if I am wrong: I believe the way item 1 is implemented is dangerous, but I had no
choice. I add those names to the Python dict ``func.__globals__``, and once
the call to ``func`` is done, those names are cleaned up.

Overloading is simple: the decorator checks the arguments' types and determines which function
should actually be called.
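
A minimal sketch of the inject-then-restore idea described above, assuming a hypothetical name
table ``_RUNTIME_NAMES`` and a hypothetical decorator ``with_runtime``. It is not the code in
``tvm.hybrid``; it only illustrates why the approach needs care:

.. code-block:: python

    import functools
    import math

    # Hypothetical names made visible to the emulated function;
    # the real module's list may differ.
    _RUNTIME_NAMES = {"log": math.log, "exp": math.exp}

    def with_runtime(func):
        """Inject runtime names into func's globals for one call, then restore."""
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            globs = func.__globals__
            missing = object()
            # Remember what each name was bound to before the call.
            saved = {name: globs.get(name, missing) for name in _RUNTIME_NAMES}
            globs.update(_RUNTIME_NAMES)
            try:
                return func(*args, **kwargs)
            finally:
                # Restore shadowed bindings; delete names that did not exist before.
                for name, old in saved.items():
                    if old is missing:
                        del globs[name]
                    else:
                        globs[name] = old
        return wrapper

    @with_runtime
    def softplus(x):
        # ``log`` and ``exp`` are not imported here; the decorator supplies them.
        return log(1.0 + exp(x))

Because ``func.__globals__`` is the namespace shared by every function in the same module,
concurrent calls from several threads could observe the temporary names. The ``try``/``finally``
only guarantees cleanup for a single call.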


Backend Compilation
~~~~~~~~~~~~~~~~~~~

Compilation is a large module; see ``python/tvm/hybrid/var_decl.py`` and
``python/tvm/hybrid/parser.py`` for more details. The first stage determines the
usage, or more accurately the declaration, of each variable, and the second stage does
the actual IR generation.
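
As a rough illustration of the first stage, the sketch below uses Python's standard ``ast``
module to record the first store to each plain name. The class name ``FirstStore`` and the sample
source are assumptions made for this example; the actual logic lives in
``python/tvm/hybrid/var_decl.py`` and may differ.

.. code-block:: python

    import ast

    class FirstStore(ast.NodeVisitor):
        """Record the line number of the first store to each plain variable name."""

        def __init__(self):
            self.first_store = {}

        def visit_Name(self, node):
            # Only assignments (Store context) count; reads are ignored.
            if isinstance(node.ctx, ast.Store):
                self.first_store.setdefault(node.id, node.lineno)

    SRC = """
    for i in range(5):
        s = 0
        for j in range(5):
            s += a[i, j]
        b[i] = s
    """.replace("\n    ", "\n")  # strip the indentation added by this listing

    visitor = FirstStore()
    visitor.visit(ast.parse(SRC))
    decls = visitor.first_store  # loop variables i and j are recorded too

Note that ``b[i] = s`` stores into a subscript, not into the name ``b``, so ``b`` is not treated
as a declared variable here.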

Attributes
~~~~~~~~~~

So far, ONLY tensors' ``shape`` attribute is supported. See ``visit_Subscript``
in ``python/tvm/hybrid/parser.py`` for more details. This is a hacky solution: I just
check for the attribute when visiting a subscript.
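
The check can be pictured with the standard ``ast`` module. This is only a sketch under stated
assumptions: ``shape_accesses`` is a hypothetical helper, it assumes Python 3.9+ (where
``Subscript.slice`` holds the index node directly), and it is not the body of
``visit_Subscript``.

.. code-block:: python

    import ast

    def shape_accesses(src):
        """Return (tensor, index) pairs for every ``x.shape[<int constant>]``.

        Raises ValueError when ``shape`` is subscripted by anything else.
        """
        found = []
        for node in ast.walk(ast.parse(src)):
            if not isinstance(node, ast.Subscript):
                continue
            base = node.value
            # Only subscripts applied to a ``.shape`` attribute are of interest.
            if not (isinstance(base, ast.Attribute) and base.attr == "shape"):
                continue
            index = node.slice
            if not (isinstance(index, ast.Constant) and isinstance(index.value, int)):
                raise ValueError("shape must be indexed by an integer constant")
            found.append((ast.unparse(base.value), index.value))
        return found

For example, ``shape_accesses("x = a.shape[2]")`` accepts the access, while
``shape_accesses("n = a.shape[i]")`` raises ``ValueError`` because ``i`` is not a constant.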

Loops
~~~~~

HalideIR has four loop types: ``serial``, ``unrolled``, ``parallel``, and ``vectorized``.


.. note::

   Unlike in HalideIR, where a loop is described by its starting point and its trip count,
   here ``loop_type(a, b)`` indicates the half-open range ``[a, b)``. Thus, when lowering it
   to HalideIR, we need to do ``start, extent = a, b - a``.
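
The conversion in the note can be sketched in plain Python. ``to_min_extent`` is a hypothetical
helper name used only for this example, not a function in the module:

.. code-block:: python

    def to_min_extent(a, b):
        """Convert the half-open range ``[a, b)`` to a ``(start, extent)`` pair."""
        start, extent = a, b - a
        return start, extent

    # The extent equals the number of iterations Python's range would run.
    start, extent = to_min_extent(2, 10)
    trip_count = len(range(2, 10))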


.. note::

   In HalideIR these loop types are enums, named in the passive form.
   Here we use the active form to annotate loops, because they are ready to run.


Variables
~~~~~~~~~

Because there are no variables in ``HalideIR``, all mutable variables are lowered to an array of size 1.
The first store to a variable is taken as its declaration.
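
As a plain-Python analogy (not the generated IR), a scalar accumulator can be pictured as a
one-element buffer that is allocated at its first store and accessed through index 0 afterwards.
The function name ``row_sums`` is an assumption made for this example:

.. code-block:: python

    def row_sums(a):
        """Sum each row, keeping the accumulator in a one-element buffer."""
        out = []
        for row in a:
            s = [0]                # first store: acts as the declaration (size-1 array)
            for x in row:
                s[0] = s[0] + x    # every later read or write goes through index 0
            out.append(s[0])
        return out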

Math intrinsics
~~~~~~~~~~~~~~~
So far, the math intrinsics ``log``, ``exp``, ``sigmoid``, ``tanh``, ``power``, and ``popcount`` are supported.
Math intrinsics are imported by the decorator. Most of the intrinsics are borrowed from library implementations,
except ``popcount`` and ``sigmoid``, which I implemented manually.
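
For readers unfamiliar with the two hand-written intrinsics, here is one possible pure-Python
formulation. It is a sketch of the mathematical definitions only and is not claimed to match the
module's own implementation:

.. code-block:: python

    import math

    def popcount(x):
        """Count the set bits of a non-negative integer."""
        if x < 0:
            raise ValueError("popcount expects a non-negative integer")
        count = 0
        while x:
            x &= x - 1   # clear the lowest set bit
            count += 1
        return count

    def sigmoid(x):
        """Logistic function 1 / (1 + exp(-x)), arranged to avoid overflow."""
        if x >= 0:
            return 1.0 / (1.0 + math.exp(-x))
        z = math.exp(x)  # x < 0, so exp(x) cannot overflow
        return z / (1.0 + z)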
1 change: 1 addition & 0 deletions docs/dev/index.rst
Expand Up @@ -10,3 +10,4 @@ In this part of documentation, we share the rationale for the specific choices m
runtime
nnvm_json_spec
nnvm_overview
hybrid_script
172 changes: 172 additions & 0 deletions docs/langref/hybrid_script.rst
@@ -0,0 +1,172 @@
.. _hybrid-langref-label:

Hybrid Frontend Language Reference
==================================

Overview
--------

This hybrid frontend allows users to write preliminary versions of idioms that have not yet
been officially supported by TVM.

Features
--------

Software Emulation
~~~~~~~~~~~~~~~~~~

Both software emulation and compilation are supported. To define a function,
use the ``tvm.hybrid.script`` decorator to indicate that it is a hybrid function:

.. code-block:: python

    import numpy
    import tvm.hybrid

    @tvm.hybrid.script
    def outer_product(a, b, c):
        for i in range(a.shape[0]):
            for j in range(b.shape[0]):
                c[i, j] = a[i] * b[j]

    a = numpy.random.rand(100)
    b = numpy.random.rand(99)
    c = numpy.zeros((100, 99))
    outer_product(a, b, c)

The decorator automatically imports the required `Keywords`_ during software emulation.
After software emulation is done, the imported keywords are cleaned up, so users do not need
to worry about keyword conflicts or namespace pollution.

Every element passed in the argument list for software emulation is either a Python variable
or a ``numpy`` numeric type.

Backend Compilation
~~~~~~~~~~~~~~~~~~~

The current parse interface looks like this:

.. code-block:: python

    a = tvm.placeholder((100, ), name='a')
    b = tvm.placeholder((99, ), name='b')
    c = tvm.placeholder((100, 99), name='c')
    tvm.hybrid.parse(outer_product, [a, b, c]) # returns the IR root of this function

If we pass these TVM tensors to the function itself, it returns an op node:

**Under construction, we are still deciding what kind of node should be returned.**

.. code-block:: python

    a = tvm.placeholder((100, ), name='a')
    b = tvm.placeholder((99, ), name='b')
    c = tvm.placeholder((100, 99), name='c')
    op = outer_product(a, b, c) # returns the corresponding op node

Tuning
~~~~~~

**Under construction, not truly supported yet.**

Following the example above, you can use TVM-like interfaces to tune the code:

.. code-block:: python

    sch = tvm.create_schedule(op)
    jo, ji = sch.split(j, 4)
    sch.vectorize(ji)

``split``, ``reorder``, and loop_annotation will be supported!

Loops
~~~~~

HalideIR has four loop types: ``serial``, ``unrolled``, ``parallel``, and ``vectorized``.

Here we use the **4** keywords ``range`` (aka ``serial``), ``unroll``, ``parallel``, and ``vectorize``
to annotate the corresponding types of for loops.
The usage is roughly the same as Python's standard ``range``.
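
One way to picture the emulation side is to treat every loop annotation as an alias of
``range``. This is an assumption made for illustration; how the decorator actually binds these
names is not shown in this document.

.. code-block:: python

    # Assumption: under software emulation each annotation iterates like ``range``.
    serial = unroll = parallel = vectorize = range

    def scale(a, k):
        """Multiply every element by k using an annotated loop."""
        out = [0] * len(a)
        for i in vectorize(0, len(a)):   # iterates over [0, len(a))
            out[i] = a[i] * k
        return out

With such aliases the annotation changes nothing in emulation; it only matters once the loop is
compiled.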

Variables
~~~~~~~~~

All mutable variables are lowered to an array of size 1.
The first store to a variable is regarded as its declaration.

.. note::

   Unlike conventional Python, in hybrid script, the declared variable
   can only be used in the scope level it is declared.


.. note::

   Currently, you can ONLY use basic-typed variables, i.e. the type of the
   variable should be either ``float32``, or ``int32``.

.. code-block:: python

    for i in range(5):
        s = 0 # declaration, this s will be a 1-array in lowered IR
        for j in range(5):
            s += a[i, j] # accumulate the row sum into s
        b[i] = s # you can still use s in this level
    a[0] = s # you CANNOT use s here, even though it is allowed in conventional Python
    b = (1, 2) # this has NOT been supported yet!

Attributes
~~~~~~~~~~

So far, ONLY tensors' ``shape`` attribute is supported! The ``shape`` attribute is essentially a
tuple, so you MUST access it as an array. Also, currently, only constant-indexed access is supported.

.. code-block:: python

    x = a.shape[2] # OK!
    for i in range(3):
        for j in range(a.shape[i]): # BAD! i is not a constant!
            pass # do something

Conditional Statement and Expression
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    if condition:
        pass # do something
    a = b if condition else c

However, the ``True`` and ``False`` keywords are NOT supported yet.


Math Intrinsics
~~~~~~~~~~~~~~~

So far, the math intrinsics ``log``, ``exp``, ``sigmoid``,
``tanh``, ``power``, and ``popcount`` are supported.
No import is required: as mentioned in `Software Emulation`_, just use them!

Array Allocation
~~~~~~~~~~~~~~~~

**Under construction, this function will be supported later!**

Use a function call ``allocation(shape, type, share/local)`` to declare an array buffer.
The basic usage is roughly the same as a normal array.


Thread Bind
~~~~~~~~~~~


You can also bind a loop to a thread by writing code like this:

.. code-block:: python

    for tx in bind("threadIdx.x", 100):
        a[tx] = b[tx]

Keywords
~~~~~~~~
- For keywords: ``serial``, ``range``, ``unroll``, ``parallel``, ``vectorize``, ``bind``
- Math keywords: ``log``, ``exp``, ``sigmoid``, ``tanh``, ``power``, ``popcount``
5 changes: 5 additions & 0 deletions docs/langref/index.rst
Expand Up @@ -2,3 +2,8 @@ Language Reference
==================
This document provides references to
embedded languages in the TVM stack.

.. toctree::
   :maxdepth: 2

   hybrid_script
18 changes: 13 additions & 5 deletions python/tvm/build_module.py
Expand Up @@ -332,12 +332,20 @@ def lower(sch,
     lower_phase1 = [x[1] for x in add_lower_pass if x[0] == 1]
     lower_phase2 = [x[1] for x in add_lower_pass if x[0] == 2]
     lower_phase3 = [x[1] for x in add_lower_pass if x[0] > 2]
-    # normalize schedule first
-    sch = sch.normalize()
+
     # Phase 0
-    bounds = schedule.InferBound(sch)
-    stmt = schedule.ScheduleOps(sch, bounds)
-    stmt = ir_pass.InjectPrefetch(stmt)
+    if isinstance(sch, schedule.Schedule):
+        # normalize schedule first
+        sch = sch.normalize()
+        bounds = schedule.InferBound(sch)
+        stmt = schedule.ScheduleOps(sch, bounds)
+        stmt = ir_pass.InjectPrefetch(stmt)
+    else:
+        #So far there is no op for hybrid script, so a plain ir body is given
+        if not isinstance(sch, _stmt.Stmt):
+            raise ValueError("sch should be either a Schedule or a Stmt")
+        stmt = sch
+
     for f in lower_phase0:
         stmt = f(stmt)
     # Phase 1
10 changes: 10 additions & 0 deletions python/tvm/hybrid/__init__.py
@@ -0,0 +1,10 @@
"""Hybrid Programming APIs of TVM Python Package.

This package maps a subset of python to HalideIR so that:
1. Users can write some preliminary versions of the computation patterns
that have not been supported yet and verify them across the real execution and
python semantic emulation.
2. Developers can build HalideIR by writing Python code.
"""

from .api import script, parse
46 changes: 46 additions & 0 deletions python/tvm/hybrid/api.py
@@ -0,0 +1,46 @@
"""APIs of lowering the Python subset to HalideIR"""
from __future__ import absolute_import as _abs

import types
import decorator
from .parser import parse_python

@decorator.decorator
def script(func, *args):
    """If the arguments are tvm types, compile it to HalideIR.
    O.W. return the python emulated result"""
    from .util import _enter_hybrid_runtime, _restore_runtime, _is_tvm_arg_types
    if _is_tvm_arg_types(args):
        return parse(func, args)
    else:
        intersect = _enter_hybrid_runtime(func)
        func(*args)
        _restore_runtime(func, intersect)
        return func


def parse(func, args):
    """Parse a subset of Python to HalideIR

    Parameters
    ----------
    func : str or types.FunctionType
        If it is a string, parse the source code
        If it is a function, parse the function

    args : list of Buffer or Tensor or Var
        The argument lists to the function.
        Leave it None if no buffer is related to the function to be parsed

    Returns
    -------
    root : Stmt
        The result Halide IR and the parser class instance.
    """
    from .util import _pruned_source
    if isinstance(func, str):
        src = func
    else:
        assert isinstance(func, types.FunctionType)
        src = _pruned_source(func)
    return parse_python(src, args)