Tentative NMatrix Tutorial

Introduction

Credit & Disclaimer

This tutorial is meant to mimic as closely as possible the Tentative NumPy Tutorial (as of August 2013) but for SciRuby's NMatrix. Because of this, examples and wording are taken extensively from the Tentative NumPy Tutorial with little or no modification. This is intentional, hereby disclosed, and hopefully justified given the nature of this project. The authors of this tutorial express deep appreciation to those responsible for the excellent NumPy and Tentative NumPy Tutorial. In addition, NMatrix was based heavily on Masahiro Tanaka's excellent NArray work.

Prerequisites

Before reading this tutorial you should know a bit of Ruby. If you would like to refresh your memory, take a look at the Ruby quickstart tutorial, the tutorials at rubymonk, or tryruby. If you wish to work the examples in this tutorial, you must also have some software installed on your computer. Minimally:

Ruby >= 1.9.3 (Download Ruby)
NMatrix (installation instructions)

You may find these useful for plotting:

rubyvis
plotrb [in progress]
scruffy
plot with R using rserve
the gnuplot gem
others

Also, take a look at:

pry for an enhanced interactive shell (better than irb)
SciRuby (http://sciruby.com/) for additional scientific tools with growing NMatrix compatibility

Rules for Editing

Follow the Tentative NumPy Tutorial as closely as possible.
Aim to implement similar functionality, but in a ruby-ish way.
Make a note on the wiki if something needs to be implemented to provide a particular functionality. Then, ideally, go and implement it and update the wiki.

The Basics

An NMatrix is a homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NMatrix, dimensions are called axes. The number of axes is dim. For example, the coordinates of a point in 3D space, [1, 2, 1], form an array of dim 1, because it has one axis. That axis has a length of 3. In the example below, the Ruby Array has dim 2 (it is 2-dimensional). The first dimension (axis) has a length of 2, the second dimension has a length of 3.

# an array of dim 1 -- it has only one axis
[1, 2, 1]          # plain ol' ruby Array
N[1, 2, 1]         # an NMatrix
# an array of dim 2 -- has 3 columns and 2 rows.
[[1, 0, 0],        # the plain ol' ruby Array of two arrays
[0, 1, 2]] 
N[[1, 0, 0], [0, 1, 2]]  # as an NMatrix

Key attributes of an NMatrix are:

NMatrix#dim : the number of axes (dimensions) of the array.
NMatrix#shape : the length of the array along each axis. A matrix with m rows and n columns has shape [m, n]. The right-most index changes most rapidly while traversing the array (like NumPy and Ruby Arrays of Arrays, and unlike Fortran, Matlab, and PDL).
NMatrix#dtype : A symbol representing the type of array. (:byte, :int8, :int16, :int32, :int64, :float32, :float64, :complex64, :complex128, :object)
NMatrix#stype : The storage type of the array (:dense, :list, :yale). Dense is the default and yale is for working with sparse arrays. List is less frequently used.
NMatrix#size : the total number of elements of the array. This is equal to the product of the elements of shape.

[NOTE: The numpy ndarray.itemsize has no real equivalent right now. Also, access to underlying data (ndarray.data in numpy) is not well-defined]
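
The relationship between shape, dim, and size can be sketched in plain Ruby, without NMatrix, on a rectangular nested Array. The helper `shape_of` below is hypothetical and only illustrates the definitions above; it is not part of NMatrix.

```ruby
# Plain-Ruby sketch (no NMatrix required).
# shape_of is a hypothetical helper: it walks down the first element of each
# nesting level and records the length of every axis, outermost first.
# It assumes the nested Array is rectangular.
def shape_of(nested)
  shape = []
  level = nested
  while level.is_a?(Array)
    shape << level.length
    level = level.first
  end
  shape
end

table = [[1, 0, 0],
         [0, 1, 2]]

shape = shape_of(table)       # length of each axis
dim   = shape.length          # number of axes
size  = shape.inject(1, :*)   # total number of elements

puts "shape=#{shape.inspect} dim=#{dim} size=#{size}"
```

Here size is the product of the entries of shape, which is the property stated for the size attribute above.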

An example

You should be able to type in every code block in this document from here on and duplicate these results. Some output lines are omitted for brevity.

% irb --simple-prompt
>> require 'nmatrix'
=> true
>> a = NMatrix.seq([3,5])

>> a.shape
=> [3, 5]
>> a.dim
=> 2
>> a.class
=> NMatrix
>> a.dtype
=> :int32
>> a.stype
=> :dense
# [[no concept of itemsize or size right now]]

>> b = N[6, 7, 8]
=> #<NMatrix:0x007fec8d777780 shape:[3] dtype:int32 stype:dense>
>> b.class
=> NMatrix

If you are using pry you should see a pretty-printed matrix in the output. If you are using irb, you can print out the matrix using pp:

>> require 'pp'
>> pp a
[
  [ 0,  1,  2,  3,  4]
  [ 5,  6,  7,  8,  9]
  [10, 11, 12, 13, 14]
]
>> pp b
[6, 7, 8]

NMatrix Creation

Standard way.

Currently, you create NMatrix objects with the NMatrix.new class method. The first parameter is the shape of the matrix (the length of each dimension), and the second parameter contains the initial values.

For example, the following creates a 3x4 matrix filled with zeros:

m = NMatrix.new([3,4], 0)

If the first parameter is an integer, NMatrix.new returns a square matrix.

m = NMatrix.new( 4, 0)   # Same as NMatrix.new( [4,4], 0 )

Matrices need not have only two dimensions:

# Create a 3D matrix (4 x 4 x 4)
m = NMatrix.new([4,4,4], 0)

In addition, the second parameter to NMatrix.new can be an array which specifies all the values in the matrix:

m = NMatrix.new([2,5], [1,2,3,4,5,6,7,8,9,0])
# [1, 2, 3, 4, 5]
# [6, 7, 8, 9, 0]

If the array of values has fewer elements than the matrix requires, NMatrix.new will repeat the array as many times as required:

m = NMatrix.new([3,6], [1,2,3])
# [1, 2, 3, 1, 2, 3]
# [1, 2, 3, 1, 2, 3]
# [1, 2, 3, 1, 2, 3]
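
The repeat-to-fill behaviour described above can be imitated in plain Ruby for the two-dimensional case. This is only a sketch of the idea on nested Arrays; `fill_cyclic` is a hypothetical helper, not an NMatrix method, and it says nothing about how NMatrix implements the fill internally.

```ruby
# Plain-Ruby sketch (no NMatrix required).
# fill_cyclic is a hypothetical helper: it repeats `values` until there are
# rows * cols elements, then cuts the flat list into rows (row-major order).
def fill_cyclic(shape, values)
  rows, cols = shape
  values.cycle.take(rows * cols).each_slice(cols).to_a
end

m = fill_cyclic([3, 6], [1, 2, 3])
m.each { |row| p row }
```

When the list of values already has exactly rows * cols elements, the same helper simply lays them out row by row, as in the [2,5] example above.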

Simpler constructor for matrices

Thanks to Daniel Carrera, we have an in-line constructor that makes creating matrices a very simple and readable operation:

a = N[ [1,2,3,4] ]                    # =>  [ [1.0  2.0  3.0  4.0] ]
a = N[ [1,2,3,4], dtype: :int32 ]     # =>  [ [1    2    3    4] ]
a = N[ [1,2,3], [3,4,5] ]             # =>  1.0  2.0  3.0
                                      #     3.0  4.0  5.0

Note that N[1,2,3,4] will create a "dimensionless" (technically 1-dimensional) vector. We advise creating a vector with the same dimensionality as the matrices you're working with. For two dimensions:

N[ [1,2,3,4] ] # row vector # [1, 2, 3, 4]
N[ [1],[2],[3],[4] ] # column vector [ [1]   [2]   [3]   [4] ]
N[ [1,2,3,4] ].transpose == N[[1], [2],[3],[4]] # true

Dimensions and orientations

One-dimensional matrices aren't really one-dimensional. They are n-dimensional, but n-1 of their dimensions are length 1.

Here's how you create vectors with different orientations:

>> x = NMatrix.new([4], [1,2,3,4]) # no orientation (dimensionless/1-dimensional)
>> x = NMatrix.new([4,1], [1,2,3,4]) # column vector in two dimensions (4 rows, 1 column)
>> x = NMatrix.new([1,4], [1,2,3,4]) # row vector in two dimensions (1 row, 4 columns)
>> x = NMatrix.new([4,1,1], [1,2,3,4]) # column vector in three dimensions (4 rows, 1 column, 1 layer)

Printing Arrays

There are two methods to print an NMatrix: puts and pp. The output of puts is mainly used for debugging, and you should prefer the pretty-printed output. Here is an irb session that shows the different outputs of those methods:

The pretty print method offers a nice output for NMatrix objects.

irb(main):001:0> n = NMatrix.new([2, 3], 0)
=> #<NMatrix:0x0000000174e860 shape:[2,3] dtype:int32 stype:dense>
irb(main):002:0> require "pp"
=> true
irb(main):003:0> pp n

[
  [0, 0, 0]
  [0, 0, 0]
]

The puts output is more compact.

=> #<NMatrix:0x0000000174e860 shape:[2,3] dtype:int32 stype:dense>
irb(main):004:0> puts n
[0, 0, 0, 0, 0, 0]
=> nil

The pp output for a 3D matrix is the following:

irb(main):005:0> n3d = NMatrix.new([2, 3, 4], 0)
=> #<NMatrix:0x00000001554f00 shape:[2,3,4] dtype:int32 stype:dense>
irb(main):006:0> pp n3d

{ layers:
  [
    [0, 0, 0]
    [0, 0, 0]
  ]

  [
    [0, 0, 0]
    [0, 0, 0]
  ]

  [
    [0, 0, 0]
    [0, 0, 0]
  ]

  [
    [0, 0, 0]
    [0, 0, 0]
  ]
}

Printing big matrices

For now, the way NMatrix objects are printed does not depend on their size, unlike in NumPy. For example, when you create an NMatrix in the pry interpreter, pry calls pp on it. This can be a problem when you create a big matrix, because the entire content of the matrix is displayed:

pry -r nmatrix
[1] pry(main)> n = NMatrix.new([100, 100], 0)
=> 
[
  [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
# 99 lines like this one
# cut for conciseness

Basic Operations

Arithmetic operations on NMatrix objects apply elementwise. A new NMatrix is created and filled with the result.

 >> a = N[20.0,30.0,40.0,50.0]
 => [20.0, 30.0, 40.0, 50.0 shape:[4] dtype:float64 stype:dense>
 >> b = N.seq [4]
 => [0, 1, 2, 3 shape:[4] dtype:int32 stype:dense> 
 >> c = a-b
 => [20.0, 29.0, 38.0, 47.0 shape:[4] dtype:float64 stype:dense>
 >> b**2
 => [0, 1, 4, 9 shape:[4] dtype:int32 stype:dense>
 >> a.map{|e| Math.sin(e)} * 10
 => [9.129452507276277, -9.880316240928618, 7.451131604793488, -2.6237485370392877 shape:[4] dtype:float64 stype:dense>
 >> a < 35
 => [true, true, false, false shape:[4] dtype:object stype:dense>
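
For readers who want to check what "elementwise" means, the same operations can be written out by hand on plain Ruby Arrays. This is a sketch of the semantics only, using the values from the session above; it does not use NMatrix.

```ruby
# Plain-Ruby sketch (no NMatrix required): elementwise operations by hand.
a = [20.0, 30.0, 40.0, 50.0]
b = (0...4).to_a                             # [0, 1, 2, 3]

difference = a.zip(b).map { |x, y| x - y }   # pairwise subtraction
squares    = b.map { |e| e**2 }              # each element squared
below_35   = a.map { |e| e < 35 }            # each element compared with 35

p difference
p squares
p below_35
```

Each result is a new Array with one entry per input element, which mirrors the statement that a new NMatrix is created and filled with the result.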

Unlike in many matrix languages, the product operator * operates elementwise on NMatrix objects. The matrix product is computed with the dot method.

 >> a = N[[1,1], [0,1]]
 >> b = N[[2,0], [3,4]]
 >> a*b
 => [2, 0, 0, 4 shape:[2,2] dtype:int32 stype:dense>
 >> a.dot b
 => [5, 4, 3, 4 shape:[2,2] dtype:int32 stype:dense>
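
The difference between the two products can be verified by hand on nested Ruby Arrays. The helpers `elementwise_product` and `matrix_product` below are hypothetical names for this sketch and are not NMatrix methods; they reproduce the numbers from the session above.

```ruby
# Plain-Ruby sketch (no NMatrix required).

# Multiply entries that sit at the same position in both matrices.
def elementwise_product(a, b)
  a.zip(b).map { |row_a, row_b| row_a.zip(row_b).map { |x, y| x * y } }
end

# Classic matrix product: entry (i, j) is the sum over k of a[i][k] * b[k][j].
def matrix_product(a, b)
  columns = b.transpose
  a.map do |row|
    columns.map { |col| row.zip(col).inject(0) { |acc, (x, y)| acc + x * y } }
  end
end

a = [[1, 1], [0, 1]]
b = [[2, 0], [3, 4]]

p elementwise_product(a, b)
p matrix_product(a, b)
```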

TODO: +=-like operators

When operating with NMatrix objects of different types, the type of the resulting array corresponds to the more general or precise one (a behavior known as upcasting).

 >> a = NVector.ones(3, :int32)
 >> b = NVector.linspace(0, Math::PI, 3)
 >> b.dtype
 => :float64
 >> c = a+b
 => [1.0, 2.5707963267948966, 4.141592653589793 shape:[1,3] dtype:float64 stype:dense>
 >> d = c*Complex(0,1)
 => [(0.0+1.0i), (0.0+2.5707963267948966i), (0.0+4.141592653589793i) shape:[1,3] dtype:complex128 stype:dense>
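
Ruby's own numeric types follow a similar rule for single values, which may help build intuition for upcasting. The sketch below uses only core Ruby numerics; it illustrates the idea on scalars and makes no claim about how NMatrix chooses a dtype internally.

```ruby
# Plain-Ruby sketch (no NMatrix required): scalar results take the more
# general type of the two operands.
int_value   = 1
float_value = Math::PI / 2

sum     = int_value + float_value   # integer + float gives a Float
product = sum * Complex(0, 1)       # float * complex gives a Complex

puts sum.class
puts product.class
```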

Many unary operations, such as computing the sum of all the elements in the NMatrix, are implemented as methods of the NMatrix class.

 >> a = N.random([2,3])
 => [0.5441637193570046, 0.8916235693579796, 0.7244554237682101, 0.6617806999277982, 0.5792396150088834, 0.7118546436543527 shape:[2,3] dtype:float64 stype:dense>
 >> a.sum
 => [1.2059444192848028, 1.4708631843668631, 1.4363100674225628 shape:[1,3] dtype:float64 stype:dense> 
 >> a.min
 => [0.5441637193570046, 0.5792396150088834, 0.7118546436543527 shape:[1,3] dtype:float64 stype:dense> 
 >> a.max
 => [0.6617806999277982, 0.8916235693579796, 0.7244554237682101 shape:[1,3] dtype:float64 stype:dense>

By default, these operations are applied over the first dimension of the NMatrix. However, by specifying an optional parameter, you can apply an operation over the specified dimension of an NMatrix:

 >> b = N.indgen([3,4])
 => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 shape:[3,4] dtype:int32 stype:dense> 
 >> b.sum(0)
 => [12, 15, 18, 21 shape:[1,4] dtype:int32 stype:dense> 
 >> b.sum(1)
 => [6, 22, 38 shape:[3,1] dtype:int32 stype:dense>
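
The two sums above can be reproduced by hand on a nested Ruby Array, which shows what "over a dimension" means. This is a plain-Ruby sketch, not NMatrix code: summing over dimension 0 collapses the rows (one total per column), and summing over dimension 1 collapses the columns (one total per row).

```ruby
# Plain-Ruby sketch (no NMatrix required): sums along each axis of a
# 3 x 4 table holding the numbers 0 through 11 in row-major order.
b = (0...12).each_slice(4).to_a

sum_axis0 = b.transpose.map { |column| column.inject(:+) }   # one total per column
sum_axis1 = b.map { |row| row.inject(:+) }                   # one total per row

p sum_axis0
p sum_axis1
```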
