Tentative NMatrix Tutorial
- This tutorial is meant to mimic as closely as possible the Tentative NumPy Tutorial (as of August 2013) but for SciRuby's NMatrix. Because of this, examples and wording are borrowed extensively from the Tentative NumPy Tutorial with little or no modification. This is intentional, hereby disclosed, and hopefully justified given the nature of this project. The authors of this tutorial express deep appreciation to those responsible for the excellent NumPy and the Tentative NumPy Tutorial. In addition, NMatrix is based heavily on Masahiro Tanaka's excellent NArray work.
Before reading this tutorial you should know a bit of Ruby. If you would like to refresh your memory, take a look at the Ruby quickstart tutorial, the tutorials at rubymonk, or tryruby. If you wish to work the examples in this tutorial, you must also have some software installed on your computer. Minimally:
- Ruby >= 1.9.3 (Download Ruby)
- NMatrix (installation instructions)
You may find these useful for plotting:
Also, take a look at:
- pry for an enhanced interactive shell (better than irb)
- SciRuby (http://sciruby.com/) for additional scientific tools with growing NMatrix compatibility
- Follow the Tentative NumPy Tutorial as closely as possible.
- Aim to implement similar functionality, but in a ruby-ish way.
- Make a note on the wiki if something needs to be implemented to provide a particular functionality. Then, ideally, go and implement it and update the wiki.
An NMatrix is a homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NMatrix, dimensions are called axes. The number of axes is dim. For example, the coordinates of a point in 3D space, [1, 2, 1], form an array of dim 1, because it has one axis. That axis has a length of 3. In the example below, the Ruby Array of Arrays has dim 2 (it is 2-dimensional). The first dimension (axis) has a length of 2, the second dimension has a length of 3.
# an array of dim 1 -- it has only one axis
[1, 2, 1] # plain ol' ruby Array
N[1, 2, 1] # an NMatrix
# an array of dim 2 -- has 3 columns and 2 rows.
[[1, 0, 0], # the plain ol' ruby Array of two arrays
[0, 1, 2]]
N[[1, 0, 0], [0, 1, 2]] # as an NMatrix
Key attributes of an NMatrix are:
- NMatrix#dim: the number of axes (dimensions) of the array.
- NMatrix#shape: the length of the array along each axis. A matrix with m rows and n columns has shape (m, n). The right-most index changes most rapidly while traversing the array (like NumPy and Ruby Arrays of Arrays, and the opposite of Fortran, MATLAB, and PDL).
- NMatrix#dtype: a symbol representing the element type of the array (:byte, :int8, :int16, :int32, :int64, :float32, :float64, :complex64, :complex128, :object).
- NMatrix#stype: the storage type of the array (:dense, :list, :yale). Dense is the default and yale is for working with sparse arrays. List is less frequently used.
- NMatrix#size: the total number of elements of the array. This is equal to the product of the elements of shape.

[NOTE: The NumPy ndarray.itemsize has no real equivalent right now. Also, access to the underlying data (ndarray.data in NumPy) is not well-defined.]
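To make the relationship between dim, shape, and size concrete, here is a minimal plain-Ruby sketch that derives the same three quantities from a nested Array. It does not use NMatrix. The helper nested_shape is hypothetical, introduced only for illustration, and it assumes the nested Array is rectangular (every row has the same length).

```ruby
# Hypothetical helper (not part of NMatrix): walk down the first element
# of each nesting level and record the length found at that level.
def nested_shape(array)
  shape = []
  current = array
  while current.is_a?(Array)
    shape << current.length
    current = current.first
  end
  shape
end

point = [1, 2, 1]                 # one axis of length 3
table = [[1, 0, 0], [0, 1, 2]]    # 2 rows, 3 columns

point_shape = nested_shape(point) # [3]
table_shape = nested_shape(table) # [2, 3]

table_dim  = table_shape.length         # number of axes: 2
table_size = table_shape.inject(1, :*)  # product of the shape: 6
```

The product of the shape entries gives the element count, which is the same relationship stated for size in the list above.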
You should be able to type in every code block in this document from here on and duplicate these results. Some output lines are omitted for brevity.
% irb --simple-prompt
>> require 'nmatrix'
=> true
>> a = NMatrix.seq([3,5])
>> a.shape
=> [3, 5]
>> a.dim
=> 2
>> a.class
=> NMatrix
>> a.dtype
=> :int32
>> a.stype
=> :dense
# [[no concept of itemsize or size right now]]
>> b = N[6, 7, 8]
=> #<NMatrix:0x007fec8d777780 shape:[3] dtype:int32 stype:dense>
>> b.class
=> NMatrix
If you are using pry you should see a pretty-printed matrix in the output. If you are using irb, you can print out the matrix using pp:
>> require 'pp'
>> pp a
[
  [ 0,  1,  2,  3,  4]
  [ 5,  6,  7,  8,  9]
  [10, 11, 12, 13, 14]
]
>> pp b
[6, 7, 8]
Currently, you create NMatrix objects with the NMatrix.new method. The first parameter is the shape of the matrix, and the second parameter contains the initial values.
For example, the following creates a 3x4 matrix filled with zeros:
m = NMatrix.new([3,4], 0)
If the first parameter is an integer, NMatrix.new returns a square matrix.
m = NMatrix.new( 4, 0) # Same as NMatrix.new( [4,4], 0 )
Matrices need not have only two dimensions:
# Create a 3D matrix (4 x 4 x 4)
m = NMatrix.new([4,4,4], 0)
In addition, the second parameter to NMatrix.new can be an array which specifies all the values in the matrix:
m = NMatrix.new([2,5], [1,2,3,4,5,6,7,8,9,0])
# [1, 2, 3, 4, 5]
# [6, 7, 8, 9, 0]
If the array of values has fewer elements than the matrix requires, NMatrix.new will repeat the array as many times as required:
m = NMatrix.new([3,6], [1,2,3])
# [1, 2, 3, 1, 2, 3]
# [1, 2, 3, 1, 2, 3]
# [1, 2, 3, 1, 2, 3]
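The repetition rule can be pictured with plain Ruby Arrays. The sketch below is not how NMatrix implements it. The helper fill_rows is a hypothetical name, it handles only the two-dimensional case, and it uses Array#cycle to repeat the values until the shape is filled.

```ruby
# Hypothetical helper (not part of NMatrix): build a rows x cols nested
# Array, repeating `values` as often as needed to fill every cell.
def fill_rows(shape, values)
  rows, cols = shape
  values.cycle.first(rows * cols).each_slice(cols).to_a
end

exact    = fill_rows([2, 5], [1, 2, 3, 4, 5, 6, 7, 8, 9, 0])
repeated = fill_rows([3, 6], [1, 2, 3])

exact.each    { |row| p row }  # two rows of five values
repeated.each { |row| p row }  # three rows, each [1, 2, 3, 1, 2, 3]
```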
Thanks to Daniel Carrera, we have an in-line constructor that makes creating matrices a very simple and readable operation:
a = N[ [1,2,3,4] ] # => [ [1.0 2.0 3.0 4.0] ]
a = N[ [1,2,3,4], dtype: :int32 ] # => [ [1 2 3 4] ]
a = N[ [1,2,3], [3,4,5] ] # => 1.0 2.0 3.0
# 3.0 4.0 5.0
Note that N[1,2,3,4] will create a "dimensionless" (technically 1-dimensional) vector. We advise creating a vector with the same dimensionality as the matrices you're working with. For two dimensions:
N[ [1,2,3,4] ] # row vector # [1, 2, 3, 4]
N[ [1],[2],[3],[4] ] # column vector [ [1] [2] [3] [4] ]
N[ [1,2,3,4] ].transpose == N[[1], [2],[3],[4]] # true
One-dimensional matrices aren't really one-dimensional. They are n-dimensional, but n-1 of their dimensions are length 1.
Here's how you create vectors with different orientations:
>> x = NMatrix.new([4], [1,2,3,4]) # no orientation (dimensionless/1-dimensional)
>> x = NMatrix.new([4,1], [1,2,3,4]) # column vector in two dimensions (4 rows, 1 column)
>> x = NMatrix.new([1,4], [1,2,3,4]) # row vector in two dimensions (1 row, 4 columns)
>> x = NMatrix.new([4,1,1], [1,2,3,4]) # column vector in three dimensions (4 rows, 1 column, 1 layer)
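The difference between these orientations is visible even with plain nested Arrays. The following sketch uses only core Ruby (no NMatrix) to show that a row vector and a column vector hold the same values under different shapes, and that transposing one yields the other.

```ruby
flat   = [1, 2, 3, 4]            # no orientation: a single axis of length 4
row    = [[1, 2, 3, 4]]          # 1 row, 4 columns
column = [[1], [2], [3], [4]]    # 4 rows, 1 column

row_shape    = [row.length, row.first.length]       # [1, 4]
column_shape = [column.length, column.first.length] # [4, 1]

# Array#transpose swaps the two axes of a nested Array.
same_after_transpose = (row.transpose == column)    # true

# All three layouts contain the same values in the same order.
same_values = (row.flatten == flat && column.flatten == flat)
```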
There are two ways to print an NMatrix: puts and pp. The output of puts is mainly useful for debugging; prefer the pretty-printed output of pp. Here is an irb session that shows the different outputs of these methods:
- The pretty-print method offers nicely formatted output for NMatrix objects.
irb(main):001:0> n = NMatrix.new([2, 3], 0)
=> #<NMatrix:0x0000000174e860 shape:[2,3] dtype:int32 stype:dense>
irb(main):002:0> require "pp"
=> true
irb(main):003:0> pp n
[
[0, 0, 0]
[0, 0, 0]
]
- The puts output is more compact.
=> #<NMatrix:0x0000000174e860 shape:[2,3] dtype:int32 stype:dense>
irb(main):004:0> puts n
[0, 0, 0, 0, 0, 0]
=> nil
- The pp output for a 3D matrix is the following:
irb(main):005:0> n3d = NMatrix.new([2, 3, 4], 0)
=> #<NMatrix:0x00000001554f00 shape:[2,3,4] dtype:int32 stype:dense>
irb(main):006:0> pp n3d
{ layers:
[
[0, 0, 0]
[0, 0, 0]
]
[
[0, 0, 0]
[0, 0, 0]
]
[
[0, 0, 0]
[0, 0, 0]
]
[
[0, 0, 0]
[0, 0, 0]
]
}
- Printing big matrices
For now, unlike NumPy, NMatrix does not abbreviate its printed output based on the size of the matrix. For example, the pry interpreter calls pp on every NMatrix you create. This can be a problem when you create a big matrix, because pry displays its entire content:
pry -r nmatrix
[1] pry(main)> n = NMatrix.new([100, 100], 0)
=>
[
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# 99 lines like this one
# cut for conciseness
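One possible workaround is to abbreviate the data yourself before printing. The sketch below works on plain nested Arrays and is not an NMatrix feature. The helpers abbreviate and preview are hypothetical names, and the edge width of 3 is an arbitrary choice loosely modelled on NumPy's abbreviated output.

```ruby
# Hypothetical helper: keep the first and last `edge` items and replace
# everything in between with a single '...' marker.
def abbreviate(items, edge = 3)
  return items if items.length <= 2 * edge
  items.first(edge) + ['...'] + items.last(edge)
end

# Hypothetical helper: abbreviate every row, then the list of rows.
def preview(rows, edge = 3)
  abbreviate(rows.map { |row| abbreviate(row, edge) }, edge)
end

big   = Array.new(100) { Array.new(100, 0) }
small = preview(big)

small.each { |row| p row }   # 7 lines instead of 100
```

Short inputs pass through unchanged, so the same helper can be applied to matrices of any size.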
Arithmetic operations on NMatrix objects apply elementwise. A new NMatrix is created and filled with the result.
>> a = N[20.0,30.0,40.0,50.0]
=> [20.0, 30.0, 40.0, 50.0 shape:[4] dtype:float64 stype:dense>
>> b = N.seq [4]
=> [0, 1, 2, 3 shape:[4] dtype:int32 stype:dense>
>> c = a-b
=> [20.0, 29.0, 38.0, 47.0 shape:[4] dtype:float64 stype:dense>
>> b**2
=> [0, 1, 4, 9 shape:[4] dtype:int32 stype:dense>
>> a.map{|e| Math.sin(e)} * 10
=> [9.129452507276277, -9.880316240928618, 7.451131604793488, -2.6237485370392877 shape:[4] dtype:float64 stype:dense>
>> a < 35
=> [true, true, false, false shape:[4] dtype:object stype:dense>
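For readers who want to see what "elementwise" means without NMatrix, here is a rough plain-Ruby equivalent of part of the session above, using Array#zip and Array#map. It is only an illustration of the semantics, not of how NMatrix computes these results.

```ruby
a = [20.0, 30.0, 40.0, 50.0]
b = [0, 1, 2, 3]

# Pair up elements at the same position, then combine each pair.
difference = a.zip(b).map { |x, y| x - y }  # [20.0, 29.0, 38.0, 47.0]

# A scalar operation is applied to every element independently.
squares = b.map { |x| x**2 }                # [0, 1, 4, 9]

# Comparisons also work per element and yield booleans.
below_35 = a.map { |x| x < 35 }             # [true, true, false, false]
```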
Unlike in many matrix languages, the product operator * operates elementwise on NMatrix objects. The matrix product can be performed using the dot method.
>> a = N[[1,1], [0,1]]
>> b = N[[2,0], [3,4]]
>> a*b
=> [2, 0, 0, 4 shape:[2,2] dtype:int32 stype:dense>
>> a.dot b
=> [5, 4, 3, 4 shape:[2,2] dtype:int32 stype:dense>
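The two products can be worked out by hand with nested Arrays. This plain-Ruby sketch (no NMatrix involved) reproduces both results for the same a and b: the elementwise product multiplies matching positions, while the matrix product takes the dot product of each row of a with each column of b.

```ruby
a = [[1, 1], [0, 1]]
b = [[2, 0], [3, 4]]

# Elementwise product: multiply entries that share a position.
elementwise = a.zip(b).map do |row_a, row_b|
  row_a.zip(row_b).map { |x, y| x * y }
end
# [[2, 0], [0, 4]]

# Matrix product: entry (i, j) is the dot product of row i of a
# with column j of b. b.transpose gives the columns of b as rows.
columns_of_b = b.transpose
matrix_product = a.map do |row|
  columns_of_b.map do |col|
    row.zip(col).inject(0) { |acc, (x, y)| acc + x * y }
  end
end
# [[5, 4], [3, 4]]
```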
TODO: +=-like operators
When operating with NMatrix objects of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).
>> a = NVector.ones(3, :int32)
>> b = NVector.linspace(0, Math::PI, 3)
>> b.dtype
=> :float64
>> c = a+b
=> [1.0, 2.5707963267948966, 4.141592653589793 shape:[1,3] dtype:float64 stype:dense>
>> d = c*Complex(0,1)
=> [(0.0+1.0i), (0.0+2.5707963267948966i), (0.0+4.141592653589793i) shape:[1,3] dtype:complex128 stype:dense>
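Ruby's own numeric classes follow a similar promotion rule, which may help build intuition for upcasting. The sketch below uses plain Arrays and core numeric types only. It mirrors the values in the session above but says nothing about how NMatrix selects a dtype internally.

```ruby
ones = [1, 1, 1]                          # Integer elements
ramp = [0.0, Math::PI / 2, Math::PI]      # Float elements

# Integer + Float promotes to Float.
shifted = ones.zip(ramp).map { |x, y| x + y }
shifted_classes = shifted.map(&:class).uniq   # [Float]

# Float * Complex promotes to Complex.
rotated = shifted.map { |x| x * Complex(0, 1) }
rotated_classes = rotated.map(&:class).uniq   # [Complex]
```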
Many unary operations, such as computing the sum of all the elements in the NMatrix, are implemented as methods of the NMatrix class.
>> a = N.random([2,3])
=> [0.5441637193570046, 0.8916235693579796, 0.7244554237682101, 0.6617806999277982, 0.5792396150088834, 0.7118546436543527 shape:[2,3] dtype:float64 stype:dense>
>> a.sum
=> [1.2059444192848028, 1.4708631843668631, 1.4363100674225628 shape:[1,3] dtype:float64 stype:dense>
>> a.min
=> [0.5441637193570046, 0.5792396150088834, 0.7118546436543527 shape:[1,3] dtype:float64 stype:dense>
>> a.max
=> [0.6617806999277982, 0.8916235693579796, 0.7244554237682101 shape:[1,3] dtype:float64 stype:dense>
By default, these operations are applied over the first dimension of the NMatrix. However, by specifying an optional parameter, you can apply an operation over the specified dimension of an NMatrix:
>> b = N.indgen([3,4])
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 shape:[3,4] dtype:int32 stype:dense>
>> b.sum(0)
=> [12, 15, 18, 21 shape:[1,4] dtype:int32 stype:dense>
>> b.sum(1)
=> [6, 22, 38 shape:[3,1] dtype:int32 stype:dense>
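To check which axis each call collapses, the same sums can be reproduced with nested Arrays. In this plain-Ruby sketch (no NMatrix), summing over dimension 0 adds the rows together and leaves one value per column, while summing over dimension 1 adds across each row and leaves one value per row.

```ruby
# Same values as the 3x4 matrix above: 0..11 laid out row by row.
b = (0...12).each_slice(4).to_a
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

# Collapse dimension 0: one total per column.
sum_over_rows = b.transpose.map { |col| col.inject(0, :+) }
# [12, 15, 18, 21]

# Collapse dimension 1: one total per row.
sum_over_columns = b.map { |row| row.inject(0, :+) }
# [6, 22, 38]
```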