Tentative NMatrix Tutorial

John Prince edited this page Aug 8, 2013 · 29 revisions

Credit & Disclaimer

  • This tutorial is meant to mimic the Tentative NumPy Tutorial as closely as possible, but for SciRuby's NMatrix. Because of this, examples and wording from the Tentative NumPy Tutorial are used extensively, with little or no modification from the original. This is intentional, hereby disclosed, and hopefully justified given the nature of this project. The authors of this tutorial express deep appreciation to those responsible for the excellent NumPy and Tentative NumPy Tutorial. In addition, NMatrix is based heavily on Masahiro Tanaka's excellent NArray work.

Links

Prerequisites

The installation process is still being simplified, but with a fresh Ubuntu/Debian install you should be up and running with these commands:

sudo apt-get install libatlas-base-dev ruby ruby-dev
sudo apt-get --purge remove liblapack-dev liblapack3 liblapack3gf
export CPLUS_INCLUDE_PATH=/usr/include/atlas
export C_INCLUDE_PATH=/usr/include/atlas
sudo gem install nmatrix # or "gem install nmatrix" if you're using rbenv or rvm

See the detailed install instructions for Linux and Mac OS X.

Ruby does not have a single plotting package tied to NMatrix at the moment, but several excellent plotting options exist:

The Basics

An NMatrix is a multidimensional array with a low memory footprint that can perform operations on large amounts of data quickly. Here is some basic ruby code to consider:

# an array of rank 1 -- it has only one axis
[1, 2, 1]          # plain ol' ruby Array
N[1, 2, 1]         # an NMatrix
# an array of rank 2 -- has 3 columns and 2 rows.
[[1, 0, 0],        # the plain ol' ruby Array of two arrays
[0, 1, 2]] 
N[[1, 0, 0], [0, 1, 2]]  # as an NMatrix

Basic NMatrix attributes may be queried:

NMatrix#dim
the number of axes (dimensions) of the array. Often referred to as 'rank'.

NMatrix#shape
A matrix with m rows and n columns has shape [n, m]. The left-most number changes most rapidly while traversing the array (like FORTRAN and PDL, opposite of Matlab, Numpy, and ruby Arrays of Arrays).

NMatrix#size (#total, #length)
Total number of elements in the array. Equal to the product of the elements of #shape.

NMatrix#typecode
An Integer representing the type of array. ( 1=byte, 2=sint, 3=int, 4=sfloat, 5=float, 6=scomplex, 7=complex, 8=object )

NMatrix#element_size
Size in bytes of each element in the array. ( byte=1, sint=2, int=4, sfloat=4, float=8, scomplex=8, complex=16, object=8 )

NMatrix#to_s
Read-only (?) access to underlying raw data. NArray#to_string creates a string representation for viewing.

An example

[You should be able to type in every code block in this document from here on and duplicate these results]

  % irb --simple-prompt
  >> require 'narray'
  => true
  >> a = NArray.int(5,2).indgen!
  => NArray.int(5,2): 
  [ [ 0, 1, 2, 3, 4 ], 
    [ 5, 6, 7, 8, 9 ] ]

We created an array object, labeled a. a has these basic attributes:

  • a.shape is [5,2]
  • a.dim is 2
  • a.size is 10
  • a.typecode is 3 (for int)
  • a.element_size is 4 meaning each int takes 4 bytes in memory

Each can be checked in your interactive session:

  >> a.shape
  => [5, 2]
  >> a.typecode
  => 3
  # etc..
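
The relationship between shape, size and element_size can be checked by hand. The following is a plain-Ruby sketch that does not need NArray at all; the helper name footprint and the ELEMENT_SIZES hash are made up for illustration, and the byte sizes are simply copied from the element_size list above.

```ruby
# Byte sizes per element, copied from the element_size list above.
ELEMENT_SIZES = {
  byte: 1, sint: 2, int: 4, sfloat: 4,
  float: 8, scomplex: 8, complex: 16, object: 8
}.freeze

# Hypothetical helper: number of elements and raw data bytes for a shape.
def footprint(type, *shape)
  size = shape.inject(1) { |product, n| product * n }
  { size: size, bytes: size * ELEMENT_SIZES.fetch(type) }
end

p footprint(:int, 5, 2)      # the int(5,2) example: 10 elements of 4 bytes
p footprint(:complex, 3, 2)
```

This only counts the raw element data; any per-object overhead is ignored.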

NArray Creation

There are many ways to create NArrays.

  >> a = NArray[2,3,4]
  => NArray.int(3): 
  [ 2, 3, 4 ]
  >> a.class
  => NArray

The simplest is the NArray.[] class method. It understands arrays of arrays and deduces the type of the NArray from the element with the highest typecode.

  >> b = NArray[ [1.5,2,3], [4,5,6] ] 
  => NArray.float(3,2): 
  [ [ 1.5, 2.0, 3.0 ], 
    [ 4.0, 5.0, 6.0 ] ]

Once we have an array we can take a look at its attributes:

  >> b.dim
  => 2
  >> b.shape
  => [3, 2]
  >> b.typecode
  => 5
  >> b.element_size
  => 8

The type of array can be coerced after creation. The following examples use the 'complex' type, but any type will act similarly:

  >> c = NArray[ [1,2,3], [4,5,6] ].to_type('complex')
  => NArray.complex(3,2): 
  [ [ 1.0+0.0i, 2.0+0.0i, 3.0+0.0i ], 
    [ 4.0+0.0i, 5.0+0.0i, 6.0+0.0i ] ]
  # is there a closer equivalent to this numpy code: "array([[1,2],[3,4]], dtype=complex)"?

An array can also be created directly in the type of your choice by giving its shape. Notice that methods initializing an array without data always initialize to 0 (or nil in the case of an object).

  # There is a class method for each of the 8 datatypes:
  >> c = NArray.complex(3,2)
  => NArray.complex(3,2): 
  [ [ 0.0+0.0i, 0.0+0.0i, 0.0+0.0i ], 
    [ 0.0+0.0i, 0.0+0.0i, 0.0+0.0i ] ]

  # The generic form
  >> c = NArray.new('complex', 3, 2)  # or can use the typecode Integer
  => NArray.complex(3,2): 
  [ [ 0.0+0.0i, 0.0+0.0i, 0.0+0.0i ], 
    [ 0.0+0.0i, 0.0+0.0i, 0.0+0.0i ] ]

The methods fill! and indgen! are really handy for initializing an NArray to desired value(s). They are in-place methods but they return the object they act on for chaining.

  # fill 'er up with ones
  >> z = NArray.int(4,3,2).fill!(1)
  => NArray.int(4,3,2): 
  [ [ [ 1, 1, 1, 1 ], 
      [ 1, 1, 1, 1 ], 
      [ 1, 1, 1, 1 ] ], 
    [ [ 1, 1, 1, 1 ], 
      [ 1, 1, 1, 1 ], 
      [ 1, 1, 1, 1 ] ] ]

  # indgen! for creating ranges
  >> NArray.int(7).indgen!   # simple
  => NArray.int(7): 
  [ 0, 1, 2, 3, 4, 5, 6 ]
  >> NArray.int(5).indgen!(2, 3) # start=2, step=3
  => NArray.int(5): 
  [ 2, 5, 8, 11, 14 ]

[ Should these be implemented in core? ]

Creating an array with indgen or linspace isn't currently built-in to NArray, but here are some quick hacks to get the functionality (same interface as found in GSL::Vector):

  class NArray
    def self.linspace( start, stop, number=10 )
      NArray.to_na( (start..stop).step((stop-start).to_f/(number-1)).map{|x| x } ).to_f # always returns type float
    end

    def self.indgen( number, start=0, step=1 )
      vals = []
      number.times { vals << start; start += step }
      NArray.to_na(vals).to_f
    end
  end

  # now we can do these
  >> NArray.linspace( 0, 2, 9 )  # start, stop, n
  => NArray.float(9): 
  [ 0.0, 0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0 ]
  >> NArray.indgen(4,1.5,3)  # n, start, step
  => NArray.float(4): 
  [ 1.5, 4.5, 7.5, 10.5 ]

Also see random! and randomn (normally distributed random)

Displaying NArrays

An NArray displays a line of type information, followed by a ruby array (multidimensional) representation (truncated if it is very large).

  >> a = NArray.int(6).indgen!
  => NArray.int(6): 
  [ 0, 1, 2, 3, 4, 5 ]

Like normal ruby, the output given in an irb session is straight from the 'inspect' method:

  >> a.inspect
  => "NArray.int(6): \n[ 0, 1, 2, 3, 4, 5 ]"

If you want a nicely formatted array, the easiest thing to do is call 'to_a':

  >> b = NArray.int(12).indgen!.reshape(3,4)
  => NArray(ref).int(3,4): 
  [ [ 0, 1, 2 ], 
    [ 3, 4, 5 ], 
    [ 6, 7, 8 ], 
    [ 9, 10, 11 ] ]
  >> b.to_a
  => [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]

NArray truncates arrays that are too large to fit on the page:

  >> NArray.int(10000).indgen!
  => NArray.int(10000): 
  [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, ... ]
  >> NArray.int(10000).indgen!.reshape(100,100)
  => NArray(ref).int(100,100): 
  [ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, ... ], 
    [ 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, ... ], 
    [ 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, ... ], 
    [ 300, 301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, ... ], 
    [ 400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, ... ], 
    [ 500, 501, 502, 503, 504, 505, 506, 507, 508, 509, 510, 511, 512, ... ], 
    [ 600, 601, 602, 603, 604, 605, 606, 607, 608, 609, 610, 611, 612, ... ], 
    [ 700, 701, 702, 703, 704, 705, 706, 707, 708, 709, 710, 711, 712, ... ], 
    [ 800, 801, 802, 803, 804, 805, 806, 807, 808, 809, 810, 811, 812, ... ], 
    [ 900, 901, 902, 903, 904, 905, 906, 907, 908, 909, 910, 911, 912, ... ], 
   ...
  # [no way to turn off this behavior like in numpy [should this be implemented?]]

Basic Operations

Arithmetic operators on arrays apply elementwise. A new array is created and filled with the result.

  >> a = NArray[20,30,40,50]
  => NArray.int(4): 
  [ 20, 30, 40, 50 ]
  >> b = NArray.int(4).indgen!
  => NArray.int(4): 
  [ 0, 1, 2, 3 ]
  >> c = a-b
  => NArray.int(4): 
  [ 20, 29, 38, 47 ]
  >> b**2
  => NArray.int(4): 
  [ 0, 1, 4, 9 ]
  >> 10*NMath.sin(a)
  => NArray.float(4): 
  [ 9.12945, -9.88032, 7.45113, -2.62375 ]
  >> a < 35
  => NArray.byte(4): 
  [ 1, 1, 0, 0 ]

Unlike in many matrix languages (but like numpy), the product operator '*' operates elementwise with NArrays. Matrix product is performed if both NArrays are of the NMatrix subclass.

  >> a = NArray[ [1,1],[0,1] ] ;
  >> b = NArray[ [2,0],[3,4] ] ;
  >> a*b                               # element-wise
  => NArray.int(2,2): 
  [ [ 2, 0 ], 
    [ 0, 4 ] ]
  >> NMatrix.ref(a) * NMatrix.ref(b)   # dot-product
  => NMatrix.int(2,2): 
  [ [ 5, 4 ], 
    [ 3, 4 ] ]
  # dot product for NVector subclass
  >> vec_a = NVector[ 1,2,3 ] ;
  >> vec_b = NVector[ 4,5,6 ] ;
  >> vec_a * vec_b
  => 32
  # normal product if they are NArrays
  >> narr_a = NArray.to_na( vec_a.to_a ) ;
  >> narr_b = NArray.to_na( vec_b.to_a ) ;
  >> narr_a * narr_b
  => NArray.int(3): 
  [ 4, 10, 18 ]

It is possible to perform many operations in place so that no new array is created (e.g., add!, sbt!, mul!, div!, mod!, reshape!, ...).

  >> a = NArray.int(3,2).fill!(1)
  => NArray.int(3,2): 
  [ [ 1, 1, 1 ], 
    [ 1, 1, 1 ] ]
  >> a.mul!(3)
  => NArray.int(3,2): 
  [ [ 3, 3, 3 ], 
    [ 3, 3, 3 ] ]
  >> b = NArray.float(3,2).random!
  => NArray.float(3,2): 
  [ [ 0.873988, 0.165591, 0.518291 ], 
    [ 0.092512, 0.516683, 0.805215 ] ]
  >> b.add!(a)
  => NArray.float(3,2): 
  [ [ 3.87399, 3.16559, 3.51829 ], 
    [ 3.09251, 3.51668, 3.80522 ] ]
  >> a.add!(b)     # b is converted to datatype int
  => NArray.int(3,2): 
  [ [ 6, 6, 6 ], 
    [ 6, 6, 6 ] ]

Upcasting: when operating with arrays of different types, the type of the resulting array corresponds to the more general or precise one. NOTE: upcasting works between NArray objects, but not between an NArray and normal ruby objects [is this a feature to be added to NArray??]

  >> a = NArray.int(3).fill!(1)
  => NArray.int(3): 
  [ 1, 1, 1 ]
  >> b_arr = [0, Math::PI/2, Math::PI]
  => [0, 1.5707963267948966, 3.141592653589793]
  >> b_narr = NArray[0, Math::PI/2, Math::PI]
  => NArray.float(3): 
  [ 0.0, 1.5708, 3.14159 ]
  # no upcasting with a normal ruby array (or scalar)
  >> c = a + b_arr
  => NArray.int(3): 
  [ 1, 2, 4 ]
  # lower typecode + higher typecode => higher typecode
  >> c = a + b_narr
  => NArray.float(3): 
  [ 1.0, 2.5708, 4.14159 ]

  # no upcasting with a normal ruby scalar [error in this case]
  >> c * 'i'.to_c
  RangeError: can't convert 0+1i into Float
    ...
  # but we can do scalar upcasting using an NArray of 1 element
  # [under the hood, _broadcasting_ is being used to permit the operation]
  >> c * NArray[ 'i'.to_c ]
  => NArray.complex(3): 
  [ 0.0+1.0i, 0.0+2.5708i, 0.0+4.14159i ]
  >> d = Math::E**( c * NArray['i'.to_c] )
  => NArrayScalar.complex(3): 
  [ 0.540302+0.841471i, -0.841471+0.540302i, -0.540302-0.841471i ]

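Upcasting rules follow the basic order of the typecodes: an operation on arrays of two differing typecodes generally yields the larger typecode (e.g., sint + float => float). The table below shows one exception: float combined with scomplex yields complex rather than scomplex, so that no precision is lost.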

            none  byte      sint      int       sfloat    float     scomplex  complex  Object
  none      none  none      none      none      none      none      none      none     none
  byte      none  byte      sint      int       sfloat    float     scomplex  complex  Object
  sint      none  sint      sint      int       sfloat    float     scomplex  complex  Object
  int       none  int       int       int       sfloat    float     scomplex  complex  Object
  sfloat    none  sfloat    sfloat    sfloat    sfloat    float     scomplex  complex  Object
  float     none  float     float     float     float     float     complex   complex  Object
  scomplex  none  scomplex  scomplex  scomplex  scomplex  complex   scomplex  complex  Object
  complex   none  complex   complex   complex   complex   complex   complex   complex  Object
  Object    none  Object    Object    Object    Object    Object    Object    Object   Object
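
The upcasting table can be summarized in a few lines. This is only a plain-Ruby sketch of the rule as read off the table, not NArray's actual implementation; the names TYPES and upcast are invented for illustration, and the "none" row and column are not modeled.

```ruby
# Type names in typecode order (1..8), as listed under NMatrix#typecode.
TYPES = %i[byte sint int sfloat float scomplex complex object].freeze

# Hypothetical helper mirroring the table: take the type with the larger
# typecode, except that float combined with scomplex gives complex.
def upcast(a, b)
  return :complex if [a, b].sort == %i[float scomplex]
  [a, b].max_by { |type| TYPES.index(type) }
end

p upcast(:sint, :float)       # larger typecode wins
p upcast(:float, :scomplex)   # the one exception in the table
```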

Unary operations, like computing the sum of all the elements in the array, work across the entire NArray.

  >> a = NArray.float(3,2).random!
  => NArray.float(3,2): 
  [ [ 0.657951, 0.0634028, 0.126082 ], 
    [ 0.561376, 0.798723, 0.422753 ] ]
  >> a.sum
  => 2.6302875872835334
  >> a.min
  => 0.06340280672630172
  >> a.max
  => 0.7987225425165548

By default, these operations apply to the array as if it were a list of numbers, regardless of its shape. However, by specifying the axis parameter you can apply an operation along the specified axis of an array:

  >> b = NArray.int(4,3).indgen!
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> b.sum(1)   # sum of each column
  => NArray.int(4): 
  [ 12, 15, 18, 21 ]
  >> b.min(0)   # min of each row
  => NArray.int(3): 
  [ 0, 4, 8 ]
  # [NO CUMULATIVE SUM CURRENTLY IMPLEMENTED! ?? #cumsum ?? ]
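
To make concrete what reducing along one axis means, here is a plain-Ruby sketch on nested Arrays holding the same 4x3 values. It is not NArray code, and the helper names are invented for illustration. Collapsing the axis that runs along a row leaves one value per row; collapsing the axis that runs down a column leaves one value per column.

```ruby
# The 4x3 example as plain Ruby rows (4 columns, 3 rows).
ROWS = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11]].freeze

# Reduce along each row: one result per row.
def row_sums(rows)
  rows.map { |row| row.inject(:+) }
end

# Reduce down each column: transpose first, then reduce each former column.
def column_sums(rows)
  rows.transpose.map { |column| column.inject(:+) }
end

# Minimum of each row.
def row_mins(rows)
  rows.map(&:min)
end

p row_sums(ROWS)
p column_sums(ROWS)
p row_mins(ROWS)
```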

Indexing, Slicing and Iterating

  >> a = NArray.int(10).indgen!**3
  => NArray.int(10): 
  [ 0, 1, 8, 27, 64, 125, 216, 343, 512, 729 ]
  >> a[2]
  => 8
  >> a[2...5]
  => NArray.int(3): 
  [ 8, 27, 64 ]
  >> a[ (0...6).step(2).to_a ] = -1000 ;
  >> a[-1..0]
  => NArray.int(10): 
  [ 729, 512, 343, 216, 125, -1000, 27, -1000, 1, -1000 ]
  >> a.each {|x| print x**(1.0/3), " " }
  5.0+8.660254037844384i 1.0 5.0+8.660254037844384i 3.0 5.0+8.660254037844384i 4.999999999999999 5.999999999999999 6.999999999999999 7.999999999999999 8.999999999999998 => nil
  # [how to encourage floats that came to precisely the right value?]

Multidimensional arrays are indexed with one index per axis. The indices are given as a comma-separated list:

  # [NArray has no equivalent to the fromfunction method in numpy, so
  #  one is provided here. TODO: implement one for NArray in C!
  #  (maybe could just extend indgen!?)]
  class NArray
    def from_function!(level=0, indices=[], &block)
      if level < shape.size
        (0...(shape[level])).each do |s|
          new_indices = indices.dup
          new_indices << s
          self[*new_indices] = block.call(new_indices) if new_indices.size == shape.size
          from_function!(level+1, new_indices, &block)
        end
      end
      self
    end
  end

  >> b = NArray.int(4,5).from_function! {|x,y| 10*y+x }
  => NArray.int(4,5): 
  [ [ 0, 1, 2, 3 ], 
    [ 10, 11, 12, 13 ], 
    [ 20, 21, 22, 23 ], 
    [ 30, 31, 32, 33 ], 
    [ 40, 41, 42, 43 ] ]
  >> b[3,2]
  => 23

  # true means all the values across that dimension
  >> b[1,true]
  => NArray.int(5): 
  [ 1, 11, 21, 31, 41 ]
  # note that slice does the same thing as brackets, but PRESERVES THE RANK!!!
  >> b.slice(1,true)
  => NArray.int(1,5): 
  [ [ 1 ], 
    [ 11 ], 
    [ 21 ], 
    [ 31 ], 
    [ 41 ] ]
  >> b[true, 1..2]  # ranges are okay of course
  => NArray.int(4,2): 
  [ [ 10, 11, 12, 13 ], 
    [ 20, 21, 22, 23 ] ]

A multidimensional array accessed with a single index is treated as a flattened (one-dimensional) array:

  >> b[3]
  => 3
  >> b[4]
  => 10
  >> b[19]
  => 43
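
The mapping between a single flat index and a pair of indices follows from the storage order described under #shape (the left-most index varies fastest). The sketch below works it out in plain Ruby; flat_index and unflatten are hypothetical helper names, not NArray methods.

```ruby
# Hypothetical helper: offset into the flattened data for a full index list,
# assuming the left-most index varies fastest.
def flat_index(shape, indices)
  stride = 1
  indices.each_with_index.inject(0) do |offset, (index, axis)|
    contribution = index * stride
    stride *= shape[axis]
    offset + contribution
  end
end

# Hypothetical inverse: recover one index per axis from a flat offset.
def unflatten(shape, flat)
  shape.map do |n|
    flat, index = flat.divmod(n)
    index
  end
end

p flat_index([4, 5], [3, 2])   # offset of b[3,2] in the 4x5 example
p unflatten([4, 5], 19)        # which [x, y] the flat index 19 refers to
```

With b[x,y] = 10*y + x as defined above, flat index 19 maps to [3, 4], which agrees with b[19] being 43.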

[ Should discuss slice vs. [] here (or somewhere)]

A range of missing indices can be filled in with false. false is like true (selects all values in that dimension), but it will also fill in missing dimensions.

  >> b[false,-1]  # equivalent to b[true,-1]
  => NArray.int(4): 
  [ 40, 41, 42, 43 ]

If x is a rank 5 array (it has 5 axes), then

  • x[1,2,false] is equivalent to x[1,2,true,true,true]
  • x[false,3] is equivalent to x[true,true,true,true,3]
  • x[4,false,5,true] is equivalent to x[4,true,true,5,true]

  >> c = NArray[ [[0,1,2],[10,12,13]], [[100,101,102], [110,112,113]] ]
  => NArray.int(3,2,2): 
  [ [ [ 0, 1, 2 ], 
      [ 10, 12, 13 ] ], 
    [ [ 100, 101, 102 ], 
      [ 110, 112, 113 ] ] ]
  >> c.shape
  => [3, 2, 2]
  >> c[false,1]
  => NArray.int(3,2): 
  [ [ 100, 101, 102 ], 
    [ 110, 112, 113 ] ]
  >> c[2,false]
  => NArray.int(2,2): 
  [ [ 2, 13 ], 
    [ 102, 113 ] ]
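
The way false stands in for the missing dimensions can be spelled out explicitly. This plain-Ruby sketch rewrites an index list so that every dimension is named; expand_false is a hypothetical helper written for this tutorial, not something NArray provides.

```ruby
# Hypothetical helper: replace a single `false` in an index list with as
# many `true` entries as are needed to reach the array's rank.
def expand_false(indices, rank)
  position = indices.index(false)
  return indices.dup if position.nil?
  missing = rank - (indices.size - 1)
  indices[0...position] + [true] * missing + indices[(position + 1)..-1]
end

p expand_false([1, 2, false], 5)
p expand_false([false, 3], 5)
p expand_false([4, false, 5, true], 5)
```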

Iterating over multidimensional arrays can be done with respect to the first axis (each row):

  # [This should probably be a *built-in* behavior of NArray.  This is my hack]
  require 'enumerator'
  class NArray
  
    # returns all the indices to access the values of the NArray.  if start == 1,
    # then the first dimension (row) values are not returned, if start == 2,
    # then the first two dimensions are skipped.
    #
    # if a block is provided, the indices are yielded one at a time
    # [obviously, this could be made into a fast while loop instead of
    # recursive ... someone have at it]
    def indices(start=0, ar_of_indices=[], final=[], level=shape.size-1, &block)
      if level >= 0
        (0...(shape[level])).each do |s|
          new_indices = ar_of_indices.dup
          new_indices.unshift(s)
          if (new_indices.size == (shape.size - start))
            block.call(new_indices) if block  # the block is optional
            final << new_indices 
          end
          indices(start, new_indices, final, level-1, &block)
        end
      end
      final
    end
  
    # returns an enumerator that yields each row of the NArray
    def by_row
      Enumerator.new do |yielder|
        indices(1) do |ind|
          yielder << self[true, *ind]
        end
      end
    end
  end
  
  >> b.by_row.each {|row| p row }
  NArray.int(4): 
  [ 0, 1, 2, 3 ]
  NArray.int(4): 
  [ 10, 11, 12, 13 ]
  NArray.int(4): 
  [ 20, 21, 22, 23 ]
  NArray.int(4): 
  [ 30, 31, 32, 33 ]
  NArray.int(4): 
  [ 40, 41, 42, 43 ]
  => #<Enumerator::Generator:0x00000000d5e008>

Performing something for each element in the array is the default behavior:

  # iterate through each element:
  >> b.each {|v| print "#{v} " } #  => 0, 1, 2, 3, 10, 11, 12, 13 ...

  # collect back into an NArray
  >> c = b.collect {|v| v + 7 }
  => NArray.int(4,5): 
  [ [ 7, 8, 9, 10 ], 
    [ 17, 18, 19, 20 ], 
    [ 27, 28, 29, 30 ], 
    [ 37, 38, 39, 40 ], 
    [ 47, 48, 49, 50 ] ]
  # note that 'map' is currently undefined
  >> d = b.map {|v| v + 7 }
  # => NoMethodError: undefined method `map' for #<NArray:0x00000000eadd30>

Shape Manipulation

Changing the shape of an array

An array has a shape, given by the number of elements along each axis:

  >> a = NArray.int(4,3).random!(10)
  => NArray.int(4,3): 
  [ [ 0, 6, 7, 2 ], 
    [ 2, 1, 3, 9 ], 
    [ 3, 6, 4, 7 ] ]
  >> a.shape
  => [4, 3]

The shape of an array can be changed with various commands:

  # [NOTE: flatten and flatten! exist but are currently undocumented on the Method List page]
  >> a.flatten
  => NArray(ref).int(12): 
  [ 0, 6, 7, 2, 2, 1, 3, 9, 3, 6, 4, 7 ]
  >> a.reshape!(2,6)
  => NArray.int(2,6): 
  [ [ 0, 6 ], 
    [ 7, 2 ], 
    [ 2, 1 ], 
    [ 3, 9 ], 
    [ 3, 6 ], 
    [ 4, 7 ] ]
  # note that transpose with no arguments does nothing unless the object is an NMatrix
  # [should this default behavior be changed for a 2 dimensional matrix??]
  # we specify the dimensions to transpose
  >> a.transpose(1,0)
  => NArray.int(6,2): 
  [ [ 0, 7, 2, 3, 3, 4 ], 
    [ 6, 2, 1, 9, 6, 7 ] ]

The reshape method creates an NArray that still refers back to the original NArray. The reshape! method will modify the calling NArray, of course.

  # [NOTE: shape=(size,...) is also advertised to do this, but seems broken]
  >> a
  => NArray.int(2,6): 
  [ [ 0, 6 ], 
    [ 7, 2 ], 
    [ 2, 1 ], 
    [ 3, 9 ], 
    [ 3, 6 ], 
    [ 4, 7 ] ]
  >> a.reshape!(6,2)
  => NArray.int(6,2): 
  [ [ 0, 6, 7, 2, 2, 1 ], 
    [ 3, 9, 3, 6, 4, 7 ] ]

Stacking together different arrays

Several arrays can be stacked together, along different axes:

  # [NOTE: This functionality should probably be added to core NArray functionality]
  # [these methods are not rigorously tested yet]

  class NArray
    class << self
      # borrows other dimension lengths from the first object and relies on it to
      # raise errors (or not) upon concatenation.
      def cat(dim=0, *narrays)
        raise ArgumentError, "'dim' must be an integer (did you forget your dim arg?)" unless dim.is_a?(Integer)
        raise ArgumentError, "must have narrays to cat" if narrays.size == 0
        new_typecode = narrays.map(&:typecode).max
        narrays.uniq.each {|narray| narray.newdim!(dim) if narray.shape[dim].nil? }
        shapes = narrays.map(&:shape)
        new_dim_size = shapes.inject(0) {|sum,v| sum + v[dim] }
        new_shape = shapes.first.dup
        new_shape[dim] = new_dim_size
        narr = NArray.new(new_typecode, *new_shape)
        range_cnt = 0
        narrays.zip(shapes) do |narray, shape|
          index = shape.map {true}
          index[dim] = (range_cnt...(range_cnt += shape[dim]))
          narr[*index] = narray
        end
        narr
      end
      def vcat(*narrays) ; cat(1, *narrays) end
      def hcat(*narrays) ; cat(0, *narrays) end
    end
  
    # object method interface
  
    def cat(dim=0, *narrays) ; NArray.cat(dim, self, *narrays) end
    def vcat(*narrays) ; NArray.vcat(self, *narrays) end
    def hcat(*narrays) ; NArray.hcat(self, *narrays) end
  end

  >> a = NArray.float(2,2).random!.mul!(10).floor
  => NArray.int(2,2): 
  [ [ 4, 8 ], 
    [ 7, 2 ] ]
  >> b = NArray.int(2,2).random!(10)  # equivalent to above
  => NArray.int(2,2): 
  [ [ 3, 3 ], 
    [ 3, 7 ] ]
  >> NArray.vcat(a,b)   # same as 'a.vcat(b)'
  => NArray.int(2,4): 
  [ [ 4, 8 ], 
    [ 7, 2 ], 
    [ 3, 3 ], 
    [ 3, 7 ] ]
  >> NArray.hcat(a,b)   # same as 'a.hcat(b)'
  => NArray.int(4,2): 
  [ [ 4, 8, 3, 3 ], 
    [ 7, 2, 3, 7 ] ]

(Not sure why but) NArray doesn't seem to need numpy's column_stack or row_stack [I could be wrong]. Things just work as you would expect:

  >> a = NArray[4,2]
  => NArray.int(2): 
  [ 4, 2 ]
  >> b = NArray[2,8]
  => NArray.int(2): 
  [ 2, 8 ]
  >> a.reshape(1,true)
  => NArray(ref).int(1,2): 
  [ [ 4 ], 
    [ 2 ] ]
  >> a.reshape(1,true).hcat b.reshape(1,true)
  => NArray.int(2,2): 
  [ [ 4, 2 ], 
    [ 2, 8 ] ]
  >> NArray.vcat( a.reshape(1,true), b.reshape(1,true) )
  => NArray.int(1,4): 
  [ [ 4 ], 
    [ 2 ], 
    [ 2 ], 
    [ 8 ] ]

Splitting one array into several smaller ones

Using hsplit you can split an array along its horizontal axis, either by specifying the number of equally shaped arrays to return, or by specifying the columns after which the division should occur:

  # not in core, so we implement it here [should be in core]
  class NArray
    # returns an array of NArray objects
    # based on Dave Vasilevsky's comment (http://snippets.dzone.com/posts/show/3486)
    # if parts is an Array, it indicates the points at which to split
    def split(parts=2, dim=0)
      indices = shape.map {true}
      cut_points = 
        if parts.is_a?(Array)
          points = parts.dup 
          points.unshift(0) if points[0] != 0
          points.push(shape[dim]) unless [-1, shape[dim]].include?( points.last )
          points
        else
          q, r = shape[dim].divmod(parts)
          (0..parts).map {|i| i * q + [r, i].min }  
        end
      cut_points.each_cons(2).map do |a,b| 
        indices[dim] = a...b
        self[*indices]
      end
    end
  
    # splits into parts moving along the horizontal axis
    # NArray[1,2,3,4,5,6].hsplit  # => [NArray[1,2,3], NArray[4,5,6]]
    # can also take an array of split points
    def hsplit(parts=2)
      split(parts, 0)
    end
  
    # splits into parts moving along the vertical axis
    # NArray[ [1,2,3], 
    #         [4,5,6] ].vsplit  # => [NArray[[1,2,3]], NArray[[4,5,6]]]
    # can also take an array of split points
    def vsplit(parts=2)
      split(parts, 1)
    end
  end
  >> a = NArray.int(12,2).random!(10)
  => NArray.int(12,2): 
  [ [ 0, 5, 3, 7, 4, 9, 9, 9, 0, 7, 5, 4 ], 
    [ 6, 7, 9, 8, 1, 7, 2, 6, 1, 6, 2, 9 ] ]
  >> a.hsplit(3)
  => [NArray.int(4,2): 
  [ [ 0, 5, 3, 7 ], 
    [ 6, 7, 9, 8 ] ], NArray.int(4,2): 
  [ [ 4, 9, 9, 9 ], 
    [ 1, 7, 2, 6 ] ], NArray.int(4,2): 
  [ [ 0, 7, 5, 4 ], 
    [ 1, 6, 2, 9 ] ]]
  >> a.hsplit([3,4])
  => [NArray.int(3,2): 
  [ [ 0, 5, 3 ], 
    [ 6, 7, 9 ] ], NArray.int(1,2): 
  [ [ 7 ], 
    [ 8 ] ], NArray.int(8,2): 
  [ [ 4, 9, 9, 9, 0, 7, 5, 4 ], 
    [ 1, 7, 2, 6, 1, 6, 2, 9 ] ]]

vsplit splits along the vertical axis, and split lets you specify the axis along which to split.
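
The cut-point arithmetic inside split is easiest to follow in isolation. The sketch below repeats it in plain Ruby for a single axis of a given length. The names cut_points and piece_ranges are made up for illustration, and the special handling of a trailing -1 split point is left out to keep it short.

```ruby
# Hypothetical standalone version of the cut-point logic in split above.
# `length` is the size of the axis being split.
def cut_points(length, parts)
  if parts.is_a?(Array)
    points = parts.dup
    points.unshift(0) unless points.first == 0
    points.push(length) unless points.last == length
    points
  else
    q, r = length.divmod(parts)
    # When the length does not divide evenly, the first r pieces
    # each get one extra element.
    (0..parts).map { |i| i * q + [r, i].min }
  end
end

# Consecutive pairs of cut points become half-open ranges.
def piece_ranges(length, parts)
  cut_points(length, parts).each_cons(2).map { |a, b| a...b }
end

p cut_points(12, 3)        # three equal pieces of an axis of length 12
p cut_points(12, [3, 4])   # explicit split points
p piece_ranges(10, 3)      # uneven case: piece sizes 4, 3 and 3
```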

Copies and Views

When operating and manipulating NArrays, normal Ruby rules are followed for copying the data, making NArrays consistent and easy to work with. For a beginner's sake, here are three cases:

No Copy at All

Simple assignments don't make any copy of NArray objects nor of their data.

  >> a = NArray.int(12).indgen!
  => NArray.int(12): 
  [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ]
  >> b = a
  => NArray.int(12): 
  [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 ]
  >> a.equal?(b)  # a is the same object as b!
  => true
  >> b.reshape!(4,3)
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> a.shape
  => [4, 3]

Methods make no copy, since Ruby passes all objects as references.

  >> def f(x) ; x.object_id  end
  >> a.object_id                 # object id is a unique object identifier
  => 16955060
  >> f(a)
  => 16955060
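
The same reference semantics hold for every Ruby object, so they can be tried without NArray. The following is a minimal sketch using a plain Array; the method name assignment_vs_dup is invented for this example.

```ruby
# Plain assignment gives a second name for one object; dup gives a new object.
def assignment_vs_dup
  original = [0, 1, 2, 3]
  same = original          # plain assignment: a second name, same object
  copy = original.dup      # dup: a new object with its own elements
  same[0] = 99             # visible through `original`, not through `copy`
  { original: original,
    copy: copy,
    same_object: same.equal?(original),
    copy_is_same_object: copy.equal?(original) }
end

p assignment_vs_dup
```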

Methods in ruby that act in-place (on the calling object) are often trailed with a !. The bang methods that fill up an NArray with values (indgen!, fill!, random!) also have undocumented non-bang counterparts (indgen, fill, random) that act in-place as well, which is not the behavior one would expect. [randomn is not advertised as a !, and it does not modify the object in-place, but returns a copy]

  
  # indgen! fill! and random! have a non-bang method that still acts in place:
  >> a = NArray[ 1,2,3,4 ]
  => NArray.int(4): 
  [ 1, 2, 3, 4 ]
  >> b = a.fill(7)
  => NArray.int(4): 
  [ 7, 7, 7, 7 ]
  >> a       # a changed too: fill acts in place and returns the same object, so b is a
  => NArray.int(4): 
  [ 7, 7, 7, 7 ]

  # randomn makes a deep copy
  >> x = NArray.float(4)
  => NArray.float(4): 
  [ 0.0, 0.0, 0.0, 0.0 ]
  >> x.randomn
  => NArray(ref).float(4): 
  [ -0.993153, -0.456061, 0.626678, 0.946439 ]
  >> x
  => NArray.float(4): 
  [ 0.0, 0.0, 0.0, 0.0 ]

  # [ NOTE: this should be implemented in the core ]
  # we can easily override the confusing default behavior to be less surprised:
  class NArray
    def indgen(*args) ; self.dup.indgen!(*args) end
    def fill(*args) ; self.dup.fill!(*args) end
    def random(*args) ; self.dup.random!(*args) end
  end

View or Shallow Copy

Different NArray objects can share the same data. The refer method creates a new NArray object that looks at the same data. The only other methods that make a reference are reshape and newdim (shape is a fairly superficial property of the underlying data).

  >> c = a.refer
  => NArray(ref).int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> c.equal?(a)
  => false
  # [NArray has no method equivalent to 'base' or 'owndata' ...
  #  should these be implemented?]
  >> c.reshape!(6,2)           # a's shape doesn't change
  => NArray(ref).int(6,2): 
  [ [ 0, 1, 2, 3, 4, 5 ], 
    [ 6, 7, 8, 9, 10, 11 ] ]
  >> a.shape
  => [4, 3]
  >> c[4,0] = 1234 ;           # a's data changes 
  >> a
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 1234, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]

To set values in part of an NArray, assign to the slice expression directly; a slice stored in a variable is a copy, so there are no shallow views of a slice:

  >> d = a.dup ;    # make a deep copy of 'a' for this demonstration
  >> d[1...3,true] = 10
  => 10
  >> d
  => NArray.int(4,3): 
  [ [ 0, 10, 10, 3 ], 
    [ 1234, 10, 10, 7 ], 
    [ 8, 10, 10, 11 ] ]

Deep Copy

Slicing an array returns a deep copy of the sliced data:

  >> s = a[1...3,true]   # s is a new, standalone copy
  => NArray.int(2,3): 
  [ [ 1, 2 ], 
    [ 5, 6 ], 
    [ 9, 10 ] ]
  >> s[] = 10 ;
  >> s
  => NArray.int(2,3): 
  [ [ 10, 10 ], 
    [ 10, 10 ], 
    [ 10, 10 ] ]
  >> a
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 1234, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]

The dup method (and [ ] without args) makes a deep copy of the NArray.

  >> d = a.dup         # equivalent to:  d = a[]
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 1234, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> a.equal?(d)
  => false
  >> d[0,0] = 9999
  => 9999
  >> a
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 1234, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]

Functions and Methods Overview

Here is a list of NArray method names grouped by category. The names link to the NArray Example List [to be created] so that you can see the functions in action.

NArray Creation

[TODO: need to fill this in]

Conversions

Manipulations

Questions

Ordering

Operations

Basic Statistics

Basic Linear Algebra

Less Basic

NMath module

The NMath module provides mathematical functions that operate on NArrays.

  >> b = NArray.int(3).indgen!
  => NArray.int(3): 
  [ 0, 1, 2 ]
  >> NMath.exp(b)
  => NArray.float(3): 
  [ 1.0, 2.71828, 7.38906 ]
  >> NMath.sqrt(b)
  => NArray.float(3): 
  [ 0.0, 1.0, 1.41421 ]
  >> c = NArray[2.0, -1.0, 4.0] ;
  >> b + c   # this is not part of NMath, but shown as the equivalent of Numpy's add
  => NArray.float(3): 
  [ 2.0, 0.0, 6.0 ]

Note that the NMath module (like any module) can be included into the Kernel:

  >> include NMath ;
  >> exp(b)
  => NArray.float(3): 
  [ 1.0, 2.71828, 7.38906 ]
  >> sqrt(b)
  => NArray.float(3): 
  [ 0.0, 1.0, 1.41421 ]
  >> sqrt(2)  # NMath functions also work on normal ruby numbers
  => 1.4142135623730951

Broadcasting rules

Broadcasting allows universal functions to deal in a meaningful way with inputs that do not have exactly the same shape.

The first rule of broadcasting is that if all input arrays do not have the same number of dimensions, then a "1" will be repeatedly appended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.

The second rule of broadcasting ensures that arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is assumed to be the same along that dimension for the "broadcasted" array.

After application of the broadcasting rules, the sizes of all arrays must match. EricsBroadcastingDoc provides a good explanation of broadcasting for Numpy, but it applies equally well for NArray, as shown below:

  >> a = NArray[1.0,2.0,3.0]
  => NArray.float(3): 
  [ 1.0, 2.0, 3.0 ]
  >> b = 2.0
  => 2.0
  >> a*b               # the simplest case of broadcasting, 3x1 and a scalar (1x1)
  => NArray.float(3): 
  [ 2.0, 4.0, 6.0 ]

  >> a = NArray[[ 0.0, 0.0, 0.0],[10.0,10.0,10.0],[20.0,20.0,20.0],[30.0,30.0,30.0]]
  => NArray.float(3,4): 
  [ [ 0.0, 0.0, 0.0 ], 
    [ 10.0, 10.0, 10.0 ], 
    [ 20.0, 20.0, 20.0 ], 
    [ 30.0, 30.0, 30.0 ] ]
  >> b = NArray[1.0,2.0,3.0]
  => NArray.float(3): 
  [ 1.0, 2.0, 3.0 ]
  >> a + b             # addition of a 3x4 and a 3x1 NArray
  => NArray.float(3,4): 
  [ [ 1.0, 2.0, 3.0 ], 
    [ 11.0, 12.0, 13.0 ], 
    [ 21.0, 22.0, 23.0 ], 
    [ 31.0, 32.0, 33.0 ] ]
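
The two broadcasting rules can be applied to shapes alone, without touching any data. Here is a plain-Ruby sketch that computes the resulting shape for NArray-style shapes (left-most dimension first); broadcast_shape is a hypothetical helper, not part of NArray.

```ruby
# Hypothetical helper applying the two rules above to a list of shapes.
def broadcast_shape(*shapes)
  rank = shapes.map(&:size).max
  # Rule 1: append 1s to shorter shapes until all have the same rank.
  padded = shapes.map { |shape| shape + [1] * (rank - shape.size) }
  # Rule 2: along each dimension, sizes of 1 stretch to the largest size;
  # any other mismatch is an error.
  padded.transpose.map do |sizes|
    target = sizes.max
    unless sizes.all? { |n| n == 1 || n == target }
      raise ArgumentError, "incompatible sizes #{sizes.inspect}"
    end
    target
  end
end

p broadcast_shape([3, 4], [3])   # the a + b example above
p broadcast_shape([3], [1])      # an array and a one-element array
```

Shapes such as [3, 4] and [4] do not match after padding ([4] becomes [4, 1]), so the helper raises an ArgumentError for them.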

Fancy indexing and index tricks

NArray offers more indexing facilities than regular ruby Arrays. Basically, NArrays can be indexed with integers and slices (as we saw before) and also with arrays of integers and arrays of booleans.

  >> a = NArray.int(12).indgen!**2          # the first 12 square numbers
  => NArray.int(12): 
  [ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121 ]
  >> i = NArray[1,1,3,8,5] ;                # an array of indices
  >> a[i]                                   # the elements of a at the positions i
  => NArray.int(5): 
  [ 1, 1, 9, 64, 25 ]
  >> j = NArray[[3,4],[9,7]] ;              # a bidimensional array of indices
  >> a[j]
  => NArray.int(2,2):                       # the same shape as j
  [ [ 9, 16 ], 
    [ 81, 49 ] ]
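
For comparison, indexing with a list of indices can be mimicked on a regular ruby Array. This is only an illustrative sketch; the helper pick is made up here and keeps the (possibly nested) shape of the index list, like the a[i] and a[j] results above.

```ruby
# The first 12 square numbers as a plain Ruby Array.
SQUARES = (0...12).map { |n| n**2 }.freeze

# Hypothetical helper: look up each index, keeping the nesting of `indices`.
def pick(values, indices)
  indices.map do |i|
    i.is_a?(Array) ? pick(values, i) : values.fetch(i)
  end
end

p pick(SQUARES, [1, 1, 3, 8, 5])
p pick(SQUARES, [[3, 4], [9, 7]])
```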

[ This is missing functionality in NArray. In NArray, a single array or NArray used as an index for #[ ] accesses the flattened NArray. It is easy to access the flattened NArray (narray.flatten[3]), why not reserve the functionality here to behave like Numpy???] When the indexed array a is multidimensional, a single array of indices refers to the first dimension of a. The following example shows this behaviour by converting an image of labels into a color image using a palette.

  # this doesn't work in NArray because #[] is reserved for working with the flattened NArray.
  # should this functionality be included in NArray??
  # not sure the easiest/best way to implement this...
  # could easily be implemented in a separate function...
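
Until such a function exists, the idea of the palette example can be sketched with plain ruby Arrays. The palette values and the label image below are illustrative assumptions, and nothing here uses NArray:

```ruby
# Illustrative palette: each entry is an [r, g, b] triple.
palette = [
  [0,   0,   0],     # 0: black
  [255, 0,   0],     # 1: red
  [0,   255, 0],     # 2: green
  [0,   0,   255],   # 3: blue
  [255, 255, 255]    # 4: white
]

# A small image of labels; each value is an index into the palette.
image = [[0, 1, 2, 0],
         [0, 3, 4, 0]]

# Replace every label with its color, giving a rows x columns x 3 structure.
color_image = image.map { |row| row.map { |label| palette[label] } }
p color_image[0][1]   # => [255, 0, 0]
p color_image[1][2]   # => [255, 255, 255]
```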

[ Not implemented ] We can also give indexes for more than one dimension. The arrays of indices for each dimension must have the same shape.

  # not implemented (#[ ] already reserved )

[ Not implemented ] Naturally, we can put i and j in a sequence (say a list) and then do the indexing with the list.

  # not implemented (#[ ] already reserved )

...

Indexing with Boolean Arrays

When we index arrays with arrays of (integer) indices we are providing the list of indices to pick. With boolean indices the approach is different; we explicitly choose which items in the array we want and which ones we don't. The most natural way one can think of for boolean indexing is to use boolean arrays that have the same shape as the original array:

  >> a = NArray.int(4,3).indgen! ;
  >> b = a > 4
  => NArray.byte(4,3): 
  [ [ 0, 0, 0, 0 ], 
    [ 0, 1, 1, 1 ], 
    [ 1, 1, 1, 1 ] ]
  >> a[b]
  => NArray.int(7): 
  [ 5, 6, 7, 8, 9, 10, 11 ]

This property can be very useful in assignments:

  >> a[b] = 0
  => 0
  >> a
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 0, 0, 0 ], 
    [ 0, 0, 0, 0 ] ]
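
What the mask does in both cases can be sketched on a flattened plain ruby Array. This is an illustration of the semantics only, not NArray code:

```ruby
a = (0...12).to_a                       # flattened 4x3 array: 0..11
mask = a.map { |x| x > 4 }              # one boolean per element

# Selection: keep the elements whose mask entry is true.
selected = a.zip(mask).select { |_, keep| keep }.map(&:first)
p selected                              # => [5, 6, 7, 8, 9, 10, 11]

# Assignment: overwrite the masked elements, leave the rest untouched.
zeroed = a.zip(mask).map { |x, keep| keep ? 0 : x }
p zeroed                                # => [0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0]
```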

You can look at the Mandelbrot set example to see how to use boolean indexing to generate an image of the Mandelbrot set.

[NOTE: this behavior is not implemented, should it be??] The second way of indexing with booleans is more similar to integer indexing; for each dimension of the array we give a 1D boolean array selecting the slices we want.

  # the nearest behavior I can think of is like this:
  >> a = NArray.int(4,3).indgen! ;
  >> a[true, [1,2]]
  => NArray.int(4,2): 
  [ [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> a[[0,2],true]
  => NArray.int(2,3): 
  [ [ 0, 2 ], 
    [ 4, 6 ], 
    [ 8, 10 ] ]
  ## [NOTE: I don't know how to duplicate the last example in this block]
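
One possible bridge between the two styles is to convert each 1D boolean mask into the list of positions that are true, and then pass those lists as in the example above. The helper name mask_to_indices is hypothetical and the sketch is plain Ruby:

```ruby
# Hypothetical helper: turn a 1-D boolean mask into the list of true positions.
def mask_to_indices(mask)
  mask.each_index.select { |k| mask[k] }
end

rows_mask = [false, true, true]           # keep rows 1 and 2
cols_mask = [true, false, true, false]    # keep columns 0 and 2

p mask_to_indices(rows_mask)   # => [1, 2]
p mask_to_indices(cols_mask)   # => [0, 2]
```

The two results are exactly the index lists used in a[true, [1,2]] and a[[0,2], true] above.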

The ix_() function

[NOTE: NArray has no equivalent to ix_ that I know of. Should this be implemented?]
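
As a rough illustration of what ix_ is used for in Numpy (selecting the sub-table at the cross product of some row indices and some column indices), here is a plain-Ruby sketch. The helper cross_select is hypothetical and works on an Array of rows, not on an NArray:

```ruby
# Rows of a 3x4 table holding 0..11 (plain Arrays, one Array per row).
a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]

# Hypothetical helper: select the sub-table at the cross product of
# the given row and column indices.
def cross_select(table, rows, cols)
  rows.map { |r| cols.map { |c| table[r][c] } }
end

p cross_select(a, [1, 2], [1, 3])   # => [[5, 7], [9, 11]]
```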

Indexing with strings

[NOTE: NArray does not have this functionality built in. Should it be implemented?? maybe...?]

Here are some attempts at equivalent(??) ways to achieve named access [I'm not sure I understand the use of 'Record Arrays'. Is it just named columns?]

  img = NArray[[[0,1], [0,0]], [[0,0], [1,0]], [[0,0], [0,1]]]
  => NArray.int(2,2,3): 
  [ [ [ 0, 1 ], 
      [ 0, 0 ] ], 
    [ [ 0, 0 ], 
      [ 1, 0 ] ], 
    [ [ 0, 0 ], 
      [ 0, 1 ] ] ]
  # 1) just define some methods on the object
  # (we could always subclass and define methods on the subclass, too)
  >> def img.red ; self[false,0] end ;
  >> def img.green ; self[false,1] end ;
  >> def img.blue ; self[false,2] end ;
  >> img.red
  => NArray.int(2,2): 
  [ [ 0, 1 ], 
    [ 0, 0 ] ]

  # 2) define a struct for keeping track of the indices
  >> Color = Struct.new(:red, :green, :blue) ;
  >> c = Color.new([false,0], [false,1], [false,2]) ;
  >> img[*c.red]
  => NArray.int(2,2): 
  [ [ 0, 1 ], 
    [ 0, 0 ] ]

Linear Algebra

Work in progress. Basic linear algebra to be included here.

Simple Array Operations

The best way to do linear algebra in NArray is to create an NMatrix (a subclass of NArray) (see How To: Convert from NArray to NMatrix):

  class NMatrix
    def self.eye(sz, typecode='int')
      self.new(typecode, sz, sz).unit
    end
  end

  >> a = NMatrix[ [1.0,2.0], [3.0, 4.0] ]
  => NMatrix.float(2,2): 
  [ [ 1.0, 2.0 ], 
    [ 3.0, 4.0 ] ]
  >> a.transpose
  => NMatrix.float(2,2): 
  [ [ 1.0, 3.0 ], 
    [ 2.0, 4.0 ] ]
  >> a.inverse
  => NMatrix.float(2,2): 
  [ [ -2.0, 1.0 ], 
    [ 1.5, -0.5 ] ]
  >> NMatrix.eye 2
  => NMatrix.int(2,2): 
  [ [ 1, 0 ], 
    [ 0, 1 ] ]
  >> j = NMatrix[ [0.0, -1.0], [1.0, 0.0] ]
  => NMatrix.float(2,2): 
  [ [ 0.0, -1.0 ], 
    [ 1.0, 0.0 ] ]
  >> j*j       # matrix multiplication
  => NMatrix.float(2,2): 
  [ [ -1.0, 0.0 ], 
    [ 0.0, -1.0 ] ]
  >> NArray.ref(j) * NArray.ref(j)  # cast a shallow NArray to get numerical multiply
  => NArray.float(2,2): 
  [ [ 0.0, 1.0 ], 
    [ 1.0, 0.0 ] ]

  # define some simple behavior to achieve trace

  class NMatrix
    # returns an NArray of the diagonal
    def get_diagonal
      # should be implemented in C
      min_dim = self.shape.min
      na = NArray.new(self.typecode, min_dim)
      (0...min_dim).each {|i| na[i] = self[i,i] }
      na
    end
    def trace
      get_diagonal.sum
    end
  end

  >> u = NMatrix.int(2,2).unit
  => NMatrix.int(2,2): 
  [ [ 1, 0 ], 
    [ 0, 1 ] ]
  >> u.trace
  => 2

  # define a convenience #solve that caches the LU factorization;
  # repeated solves against the same (unmodified) matrix then reuse it
  # instead of factoring again

  class NMatrix
    # stores the LU decomposition after the first computation
    def solve(other)
      @lu_decomposed ||= self.lu
      @lu_decomposed.solve(other)
    end
  end

  >> a.solve(y)    # y is the column vector [5.0, 7.0] built in the next section
  => NMatrix.float(1,2): 
  [ [ -3.0 ], 
    [ 4.0 ] ]
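
To cross-check the numbers above without NArray, the same system can be solved in plain Ruby. This sketch uses Gaussian elimination with partial pivoting rather than a stored LU factorization, and the name solve_linear is hypothetical:

```ruby
# Solve a * x = y for x by Gaussian elimination with partial pivoting.
# a is an Array of rows, y an Array; neither argument is modified.
def solve_linear(a, y)
  n = a.size
  # Augmented matrix [a | y], copied and converted to floats.
  m = a.each_with_index.map { |row, i| row.map(&:to_f) + [y[i].to_f] }

  n.times do |col|
    # Pick the row with the largest pivot to limit rounding error.
    pivot = (col...n).max_by { |r| m[r][col].abs }
    raise ArgumentError, "matrix is singular" if m[pivot][col].zero?
    m[col], m[pivot] = m[pivot], m[col]

    # Eliminate the entries below the pivot.
    ((col + 1)...n).each do |r|
      factor = m[r][col] / m[col][col]
      (col..n).each { |c| m[r][c] -= factor * m[col][c] }
    end
  end

  # Back substitution.
  x = Array.new(n, 0.0)
  (n - 1).downto(0) do |r|
    known = ((r + 1)...n).sum { |c| m[r][c] * x[c] }
    x[r] = (m[r][n] - known) / m[r][r]
  end
  x
end

x = solve_linear([[1.0, 2.0], [3.0, 4.0]], [5.0, 7.0])
p x.map { |v| v.round(6) }   # => [-3.0, 4.0]
```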

  # NOTE:
  # solving for the eigenvalues/eigenvectors can be done with the *na_geev*
  # package available from the NArray site: http://narray.rubyforge.org/
  # Unfortunately, it requires compiling.
  # A solution based on FFI would be great here...
  # or a gem that compiled things by itself
  # other solutions?
  # how does numpy solve for eigenvals/vecs?
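
One option that needs no compiling is Ruby's own matrix library, which is separate from NArray and works on its own Matrix class. The sketch below assumes the matrix library can be required (it ships with Ruby, as a bundled gem in recent versions) and uses a small symmetric matrix chosen only for illustration:

```ruby
require 'matrix'   # Ruby's own matrix library, not part of NArray

m = Matrix[[2.0, 1.0], [1.0, 2.0]]
eig = m.eigen                      # Matrix::EigenvalueDecomposition
values = eig.eigenvalues.sort      # sorted here so the order is predictable
p values.map { |v| v.round(6) }    # => [1.0, 3.0]
```

This is pure Ruby, so it is convenient for small matrices but is not a replacement for a LAPACK-backed routine such as na_geev on large ones.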

The NMatrix Class

  >> a = NMatrix[ [1.0, 2.0], [3.0, 4.0] ]
  => NMatrix.float(2,2): 
  [ [ 1.0, 2.0 ], 
    [ 3.0, 4.0 ] ]

  # no MATLAB-like shortcut for making matrices exists, but one can be defined
  class NMatrix
    # returns an NMatrix of type float based on a string like '1.0 2.0; 3.0 4.0'
    def self.from_matlab_string(string)
      # rows are separated by ';' and entries by whitespace; a string without
      # ';' yields a single row, so the result is always two dimensional
      arrs = string.strip.split(/\s*;\s*/).map {|st| st.split.map(&:to_f) }
      NMatrix.to_na(arrs)
    end
  end

  >> a = NMatrix.from_matlab_string('1.0 2.0; 3.0 4.0')
  => NMatrix.float(2,2): 
  [ [ 1.0, 2.0 ], 
    [ 3.0, 4.0 ] ]

  >> a.class
  => NMatrix
  >> a.transpose
  => NMatrix.float(2,2): 
  [ [ 1.0, 3.0 ], 
    [ 2.0, 4.0 ] ]

  >> x = NMatrix.from_matlab_string('5.0 7.0') ;
  >> y = x.transpose
  => NMatrix.float(1,2): 
  [ [ 5.0 ], 
    [ 7.0 ] ]

  >> a*y          # matrix multiplication
  => NMatrix.float(1,2): 
  [ [ 19.0 ], 
    [ 43.0 ] ]

  >> a.inverse    
  => NMatrix.float(2,2): 
  [ [ -2.0, 1.0 ], 
    [ 1.5, -0.5 ] ]

  >> a.lu.solve(y)  # or note previous implementation for a.solve(y)
  => NMatrix.float(1,2): 
  [ [ -3.0 ], 
    [ 4.0 ] ]
  >> y/a      # equivalent to:   a.lu.solve y    if a is square NMatrix
  => NMatrix.float(1,2): 
  [ [ -3.0 ], 
    [ 4.0 ] ]

Indexing: Comparing Matrices (NMatrix and NVector) and 2-d NArrays

NVector and NMatrix are merely subclasses of NArray that behave slightly differently in two ways: 1) NVector & NMatrix override some functions to be matrix-style functions (e.g., '*' is matrix multiplication or dot product rather than element-wise multiplication). 2) Indexing (slicing with '[ ]') still results in an NMatrix with 2 dimensions. NMatrices must always contain 2 dimensions!

  >> a = NArray.int(4,3).indgen!
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> m = NMatrix.ref(a.dup)
  => NMatrix(ref).int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> print a.class, " ", m.class
  NArray NMatrix=> nil

Now, let's take some simple slices. Basic slicing uses slice objects or integers (or nothing). For example, the evaluation of a[] and m[] will duplicate the objects.

  >> a[]
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> a[].shape
  => [4, 3]
  >> m[]
  => NMatrix.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]
  >> m.shape
  => [4, 3]

Now for something that differs from normal ruby Array indexing: you may use comma-separated indices to index along multiple axes at the same time.

  >> a[true,1].shape
  => [4]
  >> a[1,true]
  => NArray.int(3): 
  [ 1, 5, 9 ]
  >> a[1,true].shape
  => [3]
  >> m[1,true]
  => NMatrix.int(1,3): 
  [ [ 1 ], 
    [ 5 ], 
    [ 9 ] ]
  >> m[1,true].shape
  => [1, 3]

Notice the difference in the last two results. Using true (or false, which behaves like true but expands to cover as many remaining dimensions as necessary) on the 2-d NArray produces a 1-dimensional NArray, while on an NMatrix it produces a 2-dimensional NMatrix. A slice of an NMatrix will always produce an NMatrix. For example, the slice m[true,2] produces an NMatrix of shape [4, 1]. In contrast, a slice of an NArray will always produce an NArray of the lowest possible dimension. For example, if c were a three-dimensional NArray, c[1,false] produces a 2-d NArray while c[1,true,1] produces a 1-dimensional NArray. From this point on, we will show results only for the NArray slice if the results for the corresponding NMatrix slice are identical.
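
The shape bookkeeping described here can be sketched in plain Ruby. The helper sliced_shape is hypothetical; it only computes result shapes for the case of one index per axis, where each index is either true or an Integer:

```ruby
# Hypothetical helper: shape of the result of indexing, one index per axis.
# true selects a whole axis; an Integer selects a single position.
# keep_axes: false mimics the NArray behaviour (integer-indexed axes are
# dropped); keep_axes: true mimics NMatrix (they stay, with size 1).
def sliced_shape(shape, indices, keep_axes:)
  shape.zip(indices).flat_map do |size, idx|
    if idx == true
      [size]
    elsif keep_axes
      [1]
    else
      []
    end
  end
end

p sliced_shape([4, 3], [1, true], keep_axes: false)   # => [3]
p sliced_shape([4, 3], [1, true], keep_axes: true)    # => [1, 3]
```

The two printed shapes match a[1,true].shape and m[1,true].shape from the transcript above.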

Let's say that we wanted the columns at indices 1 and 3 of an array (indices are zero-based, so these are the 2nd and 4th columns). One way is to slice using a list:

  >> a[[1,3],true]
  => NArray.int(2,3): 
  [ [ 1, 3 ], 
    [ 5, 7 ], 
    [ 9, 11 ] ]

[NOTE: NArray does not implement the take method, should it??] A slightly more complicated way is to use the take() method:

  # not implemented (should it be?)

If we wanted to skip the first row, we could use:

  # not implemented (should it be?)
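
Since take() is not available, here is a plain-Ruby sketch of the idea on an Array of rows. The helper take_along is hypothetical, and its axis numbering (0 for rows, 1 for columns) follows the nested ruby Arrays, not NArray's shape order:

```ruby
a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]

# Hypothetical helper mimicking the idea of take(): pick the given
# positions along one axis of a table stored as an Array of rows.
def take_along(table, indices, axis:)
  case axis
  when 0 then table.values_at(*indices)                     # pick rows
  when 1 then table.map { |row| row.values_at(*indices) }   # pick columns
  else raise ArgumentError, "axis must be 0 or 1"
  end
end

cols = take_along(a, [1, 3], axis: 1)
p cols                                   # => [[1, 3], [5, 7], [9, 11]]
p take_along(cols, [1, 2], axis: 0)      # => [[5, 7], [9, 11]]  (first row skipped)
```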

Or we could use basic slicing rules:

  >> a[[1,3], 1..-1]
  => NArray.int(2,2): 
  [ [ 5, 7 ], 
    [ 9, 11 ] ]

Yet another way to slice the above is to use a cross product:

  # ix_ (or equivalent) is not implemented!

For the reader's convenience, here is our array again:

  >> a
  => NArray.int(4,3): 
  [ [ 0, 1, 2, 3 ], 
    [ 4, 5, 6, 7 ], 
    [ 8, 9, 10, 11 ] ]

Now let's do something a bit more complicated. Let's say that we want to retain all columns where the first row is greater than 1. One way is to create a boolean index:

  # I don't know what the equivalent would be in NArray (anyone??)
  ## <numpy>
  ## >>> A[0,:]>1
  ## array([False, False, True, True], dtype=bool)
  ## >>> A[:,A[0,:]>1]
  ## array([[ 2,  3],
  ##        [ 6,  7],
  ##        [10, 11]])
  ## </numpy>
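
The Numpy result above can at least be reproduced with plain ruby Arrays, which shows the intended semantics. This sketch does not use NArray and is not a claim about how NArray would express it:

```ruby
a = [[0, 1, 2, 3],
     [4, 5, 6, 7],
     [8, 9, 10, 11]]

first_row = a[0]
# Column positions where the first row is greater than 1.
keep = first_row.each_index.select { |c| first_row[c] > 1 }
p keep                                   # => [2, 3]

kept_columns = a.map { |row| row.values_at(*keep) }
p kept_columns                           # => [[2, 3], [6, 7], [10, 11]]
```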

[NOTE: section needs some work for showing similarity]

Tricks and Tips

Here we give a list of short and useful tips.

"Automatic" Reshaping

To change the dimensions of an array you can omit (using true as placeholder) one of the sizes which will then be deduced automatically:

  >> a = NArray.int(30).indgen!
  >> a.reshape(3,true,2)
  => NArray(ref).int(3,5,2): 
  [ [ [ 0, 1, 2 ], 
      [ 3, 4, 5 ], 
      [ 6, 7, 8 ], 
      [ 9, 10, 11 ], 
      [ 12, 13, 14 ] ], 
    [ [ 15, 16, 17 ], 
      [ 18, 19, 20 ], 
      [ 21, 22, 23 ], 
      [ 24, 25, 26 ], 
      [ 27, 28, 29 ] ] ]
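
The deduction itself is simple arithmetic: the missing size is the total number of elements divided by the product of the sizes that were given. A plain-Ruby sketch, with the hypothetical helper name infer_shape:

```ruby
# Hypothetical helper: fill in the one size given as true so that the
# product of all sizes equals the total number of elements.
def infer_shape(total, sizes)
  unknown = sizes.count(true)
  raise ArgumentError, "exactly one size may be true" unless unknown == 1

  known = sizes.reject { |s| s == true }.reduce(1, :*)
  unless known.positive? && (total % known).zero?
    raise ArgumentError, "#{total} elements cannot fill #{sizes.inspect}"
  end
  sizes.map { |s| s == true ? total / known : s }
end

p infer_shape(30, [3, true, 2])   # => [3, 5, 2]
```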

Vector Stacking

How do we construct a 2D array from a list of equally-sized row vectors? In MATLAB this is quite easy: if x and y are two vectors of the same length you only need to do m=[x;y]. In NArray this works via the methods cat, vcat or hcat, depending on the dimension along which the stacking is to be done [NOTE: these methods are defined earlier in the tutorial and should probably be included in the core]. For example:

  # NOTE: relies on methods defined earlier in this tutorial
  >> x = NArray.int(5).indgen!(0,2)
  => NArray.int(5): 
  [ 0, 2, 4, 6, 8 ]
  >> y = NArray.int(5).indgen!
  => NArray.int(5): 
  [ 0, 1, 2, 3, 4 ]
  >> x.vcat y
  => NArray.int(5,2): 
  [ [ 0, 2, 4, 6, 8 ], 
    [ 0, 1, 2, 3, 4 ] ]
  >> x.hcat(y)
  => NArray.int(10): 
  [ 0, 2, 4, 6, 8, 0, 1, 2, 3, 4 ]
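
For readers without those helper methods at hand, the two results can be mirrored with plain ruby Arrays. This is only an analogy for the shapes involved, not the vcat and hcat implementations themselves:

```ruby
x = (0...5).map { |k| 2 * k }   # [0, 2, 4, 6, 8]
y = (0...5).to_a                # [0, 1, 2, 3, 4]

stacked = [x, y]                # like vcat: one row per vector
p stacked                       # => [[0, 2, 4, 6, 8], [0, 1, 2, 3, 4]]

joined = x + y                  # like hcat: one longer vector
p joined                        # => [0, 2, 4, 6, 8, 0, 1, 2, 3, 4]
```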

Histograms

The histogram gem provides histogram functionality (for both NArrays and Arrays). [NOTE: should this be included in NArray? Histogramming is very basic and very useful]. No equivalent exists in ruby (yet) for Pylab histogram plotting.

To use, gem install histogram

  require 'histogram/narray'
  # Build a vector of 10000 normal deviates with variance 0.5^2 and mean 2
  >> mu, sigma = 2, 0.5 ;
  >> v = NArray.float(10000).randomn * sigma + mu  # scale standard normal deviates by sigma, then shift by mu
  => NArray.float(10000): 
  [ 2.58947, 1.22316, 1.39052, 1.53752, 1.41965, 3.02762, 2.24002, 2.29033, ... ]
  >> (bins, n) = v.histogram(50)
  => [NArray.float(50): 
  [ -0.150957, -0.0681237, 0.01471, 0.0975437, 0.180377, 0.263211, ... ], NArray.float(50): 
  [ 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 3.0, 3.0, 10.0, 6.0, 17.0, 36.0, 42.0, ... ]]
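
To make the binning step concrete, here is a plain-Ruby sketch of an equal-width histogram. The helper simple_histogram is hypothetical; it returns left bin edges and integer counts, and it does not claim to reproduce the histogram gem's own conventions or options:

```ruby
# Hypothetical helper: count values into nbins equal-width bins spanning
# [min, max]. Returns [left_edges, counts].
def simple_histogram(values, nbins)
  min, max = values.minmax
  width = (max - min).to_f / nbins
  counts = Array.new(nbins, 0)
  values.each do |v|
    k = width.zero? ? 0 : ((v - min) / width).floor
    k = nbins - 1 if k >= nbins        # the maximum falls in the last bin
    counts[k] += 1
  end
  edges = Array.new(nbins) { |k| min + k * width }
  [edges, counts]
end

edges, counts = simple_histogram([1, 2, 2, 3, 3, 3, 4, 4, 5], 4)
p edges    # => [1.0, 2.0, 3.0, 4.0]
p counts   # => [1, 2, 3, 3]
```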

References
