[release] NuMojo v0.6 for MAX and mojo 25.1 #223

forfudan · 2025-02-25T14:19:49Z

NuMojo v0.6 for MAX and mojo 25.1

…andint` (#199)

The `random` module was created at quite an early stage, and a lot of things should be reconsidered. This PR aims to refactor the random module by:

1. Aligning the functional behaviors as much as possible with `numpy` while keeping internal consistency of style: the `shape` always comes as the first argument.
2. For all functions, accepting `shape` as an `NDArrayShape`. Meanwhile, we still provide overloads for `*shape: Int`. (`shape: List[Int]` is also possible but not recommended.)
3. `rand` now generates uniformly distributed values and does not accept integral types.
4. `randint` is added to generate random integral values based on `low` and `high`. Tests are added for it.
5. `random_exponential` is renamed as `exponential` ([same as numpy](https://numpy.org/doc/stable/reference/random/generated/numpy.random.exponential.html#numpy.random.exponential)).

…dtype` + unify functions (#200)

This PR makes some updates to the `statistics` module:

1. Add `returned_dtype` to several functions (`mean`, `median`) which defaults to `f64`.
2. Add an overload of `mean` that calculates the average of all items and returns a scalar. Remove the function `cummean`.
3. Add `variance` and `std` functions. Remove `cumpvariance` and `cumpstd` functions (the formulae are not correct).
4. Incorporate the changes into the corresponding `NDArray`. Add `variance` and `std` methods.
5. Fix the current tests and add tests for the statistics module.
6. Add more detailed docstrings for functions.
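For reference, the textbook definitions behind `mean`, `variance`, and `std` can be sketched in plain Python. This is only an illustration and not NuMojo's implementation; the `ddof` parameter is an assumption borrowed from numpy, and the PR text does not say whether NuMojo exposes it.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty sequence, returned as a float."""
    return sum(values) / len(values)


def variance(values, ddof=0):
    """Mean squared deviation from the mean.

    ddof=0 divides by N (population variance); ddof=1 divides by N - 1.
    `ddof` is a hypothetical parameter used for illustration only.
    """
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - ddof)


def std(values, ddof=0):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values, ddof))
```

Returning a float regardless of the input type loosely mirrors the idea of a `returned_dtype` that defaults to `f64`.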

Implements `broadcast_to()` for `NDArray`. Add tests.

It can broadcast an ndarray of any shape to any compatible shape. The data will be copied into the new array.

An example goes as follows.

```mojo
from numojo.prelude import *
from python import Python

fn main() raises:
    var np = Python.import_module("numpy")
    var a = nm.random.rand(Shape(2, 3))
    print(a)
    print(nm.routines.manipulation.broadcast_to(a, Shape(2, 2, 3)))
    print(np.broadcast_to(a.to_numpy(), (2, 2, 3)))
```

```console
[[0.8073 0.5361 0.4442]
 [0.9378 0.1910 0.2421]]
2D-array Shape(2,3) Strides(3,1) DType: f64 C-cont: True F-cont: False own data: True
[[[0.8073 0.5361 0.4442]
  [0.9378 0.1910 0.2421]]
 [[0.8073 0.5361 0.4442]
  [0.9378 0.1910 0.2421]]]
3D-array Shape(2,2,3) Strides(6,3,1) DType: f64 C-cont: True F-cont: False own data: True
[[[0.8074 0.5361 0.4442]
  [0.9378 0.1911 0.2421]]
 [[0.8074 0.5361 0.4442]
  [0.9378 0.1911 0.2421]]]
```
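The copying behavior can be pictured with a small plain-Python sketch that works on a C-ordered flat buffer. It assumes numpy-style broadcasting rules (missing leading axes are treated as size 1, and size-1 axes repeat); the function below is hypothetical and is not the NuMojo implementation.

```python
from itertools import product


def broadcast_to(data, shape, new_shape):
    """Copy a C-ordered flat buffer of `shape` into a new flat buffer of `new_shape`."""
    ndim_diff = len(new_shape) - len(shape)
    if ndim_diff < 0:
        raise ValueError("cannot broadcast to fewer dimensions")
    # Pad the source shape on the left with 1s so both shapes have equal length.
    padded = (1,) * ndim_diff + tuple(shape)
    for old, new in zip(padded, new_shape):
        if old != new and old != 1:
            raise ValueError(
                f"shape {tuple(shape)} is not compatible with {tuple(new_shape)}"
            )
    # C-order strides of the padded source; size-1 axes get stride 0 so they repeat.
    strides = []
    step = 1
    for dim in reversed(padded):
        strides.append(0 if dim == 1 else step)
        step *= dim
    strides.reverse()
    # Walk the target indices in C order and copy the matching source item.
    return [
        data[sum(i * s for i, s in zip(index, strides))]
        for index in product(*(range(n) for n in new_shape))
    ]
```

Broadcasting a `(2, 3)` buffer to `(2, 2, 3)` simply repeats the six items twice, which matches the printed output in the example.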

…er docstrings (#205)

This PR aims to add all necessary boundary checks for `NDArrayShape` to ensure safe use.

- Add boundary checks for `ndim > 0` at initialization.
- Add boundary checks for `shape[i] > 0` at initialization.
- Add complete docstrings for all methods of the `Shape` type, e.g., `raises`, `args`, `returns`.

…better docstrings (#206)

This PR aims to add all necessary boundary checks for `NDArrayStrides` to ensure safe use.

- Add boundary checks for `ndim > 0` at initialization.
- Add complete docstrings for all methods of `Shape` type, e.g., `raises`, `args`, `returns`.
- Chain calling `__init__(shape: NDArrayShape, order: String)` for other list-like shape arguments.
- Fix `__eq__` method.
- Add new initialization method to create uninitialized strides with a given length.
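As a rough illustration of how strides follow from a shape and a memory order, together with the kind of boundary checks described in these two PRs, here is a plain-Python sketch. The function name and error messages are hypothetical; only the resulting stride values are taken from outputs elsewhere in this PR (`Strides(12,4,1)` for a C-order and `Strides(1,2,6)` for an F-order `Shape(2,3,4)`).

```python
def strides_from_shape(shape, order="C"):
    """Return item strides for `shape` in C (row-major) or F (column-major) order."""
    ndim = len(shape)
    if ndim <= 0:
        raise ValueError("number of dimensions must be positive")
    if any(dim <= 0 for dim in shape):
        raise ValueError("every dimension size must be positive")
    strides = [1] * ndim
    if order == "C":
        # The last axis is contiguous; each earlier axis spans all later ones.
        for i in range(ndim - 2, -1, -1):
            strides[i] = strides[i + 1] * shape[i + 1]
    elif order == "F":
        # The first axis is contiguous; each later axis spans all earlier ones.
        for i in range(1, ndim):
            strides[i] = strides[i - 1] * shape[i - 1]
    else:
        raise ValueError("order must be 'C' or 'F'")
    return tuple(strides)
```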

…207)

1. Allow calculating variance and std of an array by axis: `numojo.statistics.variance()` and `numojo.statistics.std()`.
2. Add corresponding methods for `NDArray`.
3. Add auxiliary function `numojo.manipulation._broadcast_back_to()`.
4. Add tests.
5. Remove unused imports.

This PR aims to improve the behaviors of 0-dimensional arrays (numojo scalars). Note that `a.item(0)` or `a[Item(0)]` is always preferred because the behavior is more deterministic, but we also allow some ***basic*** operations on 0darrays to make users' lives easier.

A 0-dimensional array cannot be constructed by users but can be obtained by array indexing and slicing. Printing this variable gives the scalar and a note that it is a 0darray instead of a mojo scalar. It is similar to `numpy` in that `a[0]` returns a numpy scalar and `a.item(0)` returns a Python scalar. For example,

```
>>> var a = nm.random.arange[f16](0, 3, 0.12)
>>> print(a[1])
0.11999512 (0darray[f16])
>>> print(a.item(1))
0.11999512
>>> var c = nm.array[f16]("[[1,2], [3,4]]")
>>> print(c[1, 1])
4.0 (0darray[f16])
```

A 0-dimensional array can be unpacked to get the corresponding mojo scalar either by `[]` or by `item()`. For example,

```
>>> var a = nm.random.arange[f16](0, 3, 0.12)
>>> var b = a[1]
>>> print(b)
0.11999512 (0darray[f16])
>>> print(b[])  # Unpack using []
0.11999512
>>> print(b.item())  # Unpack using item()
0.11999512
```

A 0-dimensional array can be used in arithmetic operations, just like a scalar.

```
>>> var a = nm.random.arange[f16](0, 3, 0.12)
>>> var b = a[1]
>>> var c = nm.array[f16]("[[1,2], [3,4]]")
>>> var d = c[1, 1]
>>> print(b - d)  # Arithmetic operations between two 0darrays
-3.8808594 (0darray[f16])
>>> print(b[] - d[])  # Arithmetic operations after unpacking
-3.8808594
>>> print(b < d)  # Comparison between 0darray and 0darray
True (0darray[boolean])
>>> print(b == d[])  # Comparison between 0darray and unpacked 0darray
False (0darray[boolean])
```

This PR:

- Updates the roadmap document according to our current progress.
- Remove the auto-generated `magic.lock` from the cache.
- Remove the `.readthedocs.yaml` from the cache.
- Update the toml file and update channels.

…ut (#210)

Adds the `Flags` type for storing information on memory layout. It replaces the current `Dict[String, Bool]` type.

The Flags object can also be accessed dictionary-like. Short names are available for convenience.

It is similar to the `numpy.flags` object.

Example:

```mojo
fn main() raises:
    var A = nm.random.rand(2, 3, 4)
    print(A.flags.C_CONTIGUOUS)
    print(A.flags["C_CONTIGUOUS"])
    print(A.flags["C"])
```

They all print `True`.
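The combination of attribute access, dictionary-like access, and short names can be sketched in plain Python as follows. The class is a hypothetical stand-in, not the NuMojo type; in particular the `"F"` alias and the `F_CONTIGUOUS` field are assumptions modeled on numpy, since the PR text only shows `C_CONTIGUOUS` and `"C"`.

```python
class Flags:
    """Memory-layout flags readable as attributes or by key."""

    # Short names mapped to full flag names ("F" is an assumed alias).
    _ALIASES = {"C": "C_CONTIGUOUS", "F": "F_CONTIGUOUS"}

    def __init__(self, c_contiguous, f_contiguous):
        self.C_CONTIGUOUS = c_contiguous
        self.F_CONTIGUOUS = f_contiguous

    def __getitem__(self, key):
        # Resolve a short name to its full name, then look up the attribute.
        name = self._ALIASES.get(key, key)
        if name not in self._ALIASES.values():
            raise KeyError(key)
        return getattr(self, name)
```

With `flags = Flags(True, False)`, the expressions `flags.C_CONTIGUOUS`, `flags["C_CONTIGUOUS"]`, and `flags["C"]` all evaluate to `True`, mirroring the three prints in the example.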

Updates the code to accommodate Mojo v25.1. The changes include but are not limited to:

- Change constructors.
  - `str(` -> `String(`
  - `int(` -> `Int(`
  - `float(` -> `Float64(`
- Change `index()` to `Int()`.
  - `index(T)` -> `Int(T)`
- The function `isdigit()` becomes a method.
  - `isdigit(a)` -> `a.isdigit()`
- Use explicit constructor for complex ndarray.

```mojo
self._re = NDArray[dtype](shape, order)
self._im = NDArray[dtype](shape, order)
```

…ate array by any axis (#212)

Adds the `NDAxisIter` struct and the `iter_by_axis` method, which iterate over an array along any axis. In each iteration, the iterator yields a 1-d array along the specified axis. It is useful when we want to write a universal function to reduce the array along a certain axis.

Example:

```mojo
from numojo.prelude import *

var a = nm.arange[i8](24).reshape(Shape(2, 3, 4))
print(a)
for i in a.iter_by_axis(axis=0):
    print(String(i))
```

This prints:

```console
[[[ 0 1 2 3]
  [ 4 5 6 7]
  [ 8 9 10 11]]
 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
3D-array Shape(2,3,4) Strides(12,4,1) DType: i8 C-cont: True F-cont: False own data: True
[ 0 12]
[ 1 13]
[ 2 14]
[ 3 15]
[ 4 16]
[ 5 17]
[ 6 18]
[ 7 19]
[ 8 20]
[ 9 21]
[10 22]
[11 23]
```

Another example:

```mojo
from numojo.prelude import *

var a = nm.arange[i8](24).reshape(Shape(2, 3, 4))
print(a)
for i in a.iter_by_axis(axis=2):
    print(String(i))
```

This prints:

```console
[[[ 0 1 2 3]
  [ 4 5 6 7]
  [ 8 9 10 11]]
 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
3D-array Shape(2,3,4) Strides(12,4,1) DType: i8 C-cont: True F-cont: False own data: True
[0 1 2 3]
[4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]
[16 17 18 19]
[20 21 22 23]
```
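The iteration order in these outputs can be reproduced with a short plain-Python sketch over a C-ordered flat buffer. The helper name `slices_along_axis` is hypothetical and the code is only an illustration of the idea, not the `NDAxisIter` implementation.

```python
from itertools import product


def slices_along_axis(data, shape, axis):
    """Yield 1-d lists taken along `axis` from a C-ordered flat buffer."""
    ndim = len(shape)
    if not -ndim <= axis < ndim:
        raise ValueError("axis out of range")
    axis %= ndim
    # C-order strides, e.g. (12, 4, 1) for shape (2, 3, 4).
    strides = [1] * ndim
    for i in range(ndim - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    other_axes = [i for i in range(ndim) if i != axis]
    # Visit every index of the remaining axes in C order.
    for index in product(*(range(shape[i]) for i in other_axes)):
        base = sum(k * strides[i] for k, i in zip(index, other_axes))
        yield [data[base + j * strides[axis]] for j in range(shape[axis])]
```

For `list(range(24))` with shape `(2, 3, 4)`, `axis=0` yields `[0, 12]`, `[1, 13]`, and so on, and `axis=2` yields `[0, 1, 2, 3]`, `[4, 5, 6, 7]`, and so on, as in the console output.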

… to work on any axis (#213)

1. Adds functions that can apply any function working on 1-d arrays to any axis, with or without dimension reduction.
   - Add `apply_func_on_array_with_dim_reduction()` and `apply_func_on_array_without_dim_reduction()`. They try to utilize parallelization as much as possible.
   - In the future, we only need to focus on writing (and optimizing) functions for 1-d arrays. Operating along a certain axis can be easily achieved by applying the function, e.g., `apply_func_on_array_with_dim_reduction[max_1d](array, axis=axis)`.
2. Implements this approach on functions in the `statistics.averages` module and on the `sort` function. The `sort` function is faster than the old method and is quicker than `numpy.sort` for large arrays.

…214)

## Changes

1. Refine the `argsort` function by applying the universal function. This improves the speed significantly (see below). It also fixes the problem that `argsort` does not work for F-order arrays.
2. Improve the speed of `sort` for 1d-arrays by adding partition functions which do not construct the indices array.
3. Update `_NDAxisIter` to allow an order argument.
4. Re-write the `ravel` function by means of `_NDAxisIter`, so that it will not break for F-order arrays.
5. Add many tests for the functions, covering C- or F-order operations on both C- and F-order arrays (4 different scenarios).
6. Add `FORC` attribute for the `Flags` type.

## Comparison

`argsort` numojo vs numpy:

```console
100000000 1-d array.
numojo 8.672953000001144
numpy 11.353579999995418
10_000 * 10_000 2-d array sorted by axis 0.
numojo 1.9524170000222512
numpy 4.66693300002953
10_000 * 10_000 2-d array sorted by axis 1.
numojo 0.5791429999517277
numpy 4.1895380000351
```

This PR changes the approach to determining the min and max values of the printable regions of an array. This significantly improves the speed of printing arrays, particularly for very large arrays, where the speedup can be on the order of 100,000x. See the following comparison on a (10000, 1000) array.

```console
# before the change
2D-array Shape(10000,1000) Strides(1000,1) DType: f64 C-cont: True F-cont: False own data: True
Time to print array: 19.190531999978703
# after the change
2D-array Shape(10000,1000) Strides(1000,1) DType: f64 C-cont: True F-cont: False own data: True
Time to print array: 0.0001010000123642385
```

…er and any axis + some optimization work (#216)

This PR updates the `numojo.math.extrema` module and performs some other optimization work:

- Update `max()` and `min()` to allow both C and F order arrays and by any axis.
- Unify all the overloads and function signatures. The `maxT()` and `minT()` are removed.
- Update the `max()` and `min()` methods for `NDArray` type.
- Some other optimization work, including:
  - Use `//` and `%` to replace `divmod()` in all cases.
  - Use `a.size` attribute to replace `a.num_elements()` method in all cases.
  - Remove unnecessary copy of memory in the `apply_func_over_axis` functions.
  - Increase the speed of `nditer` by not re-constructing the strides in every loop.

#217)

This PR:

- Implements the `diagonal` function in the `linalg.misc` module. Also includes it in the NDArray as a method. Add tests for it.
- Fix the `NDArray.sort()` method (in-place sort). The default axis is -1 rather than None. Add tests for it.

… module for functional programming (#218)

- Move private functions `_apply_func_on_array...()` from the `utility` module to a new, dedicated module `numojo.routines.functional` that is used for functional programming purposes.
- Rename the functions `_apply_func_on_array...()` as `apply_along_axis()`, making them public functions that users can call. The function fulfills the same goal as `numpy.apply_along_axis()`, which executes a function working on 1-d arrays on the input n-d array along the given axis.
- Rename `iter_by_axis()` as `iter_along_axis()` since the meanings of these two expressions are different. The former will be reserved for another purpose that will be implemented soon.
- Add unit tests for this function, e.g., C-order vs F-order, along axis 0, 1, and 2.
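The idea of writing a function once for 1-d arrays and then applying it along any axis can be sketched in plain Python as below. This is a serial, reducing variant over a C-ordered flat buffer, written only for illustration; the signature is hypothetical and does not reflect NuMojo's `apply_along_axis()` overloads or its parallelization.

```python
from itertools import product


def apply_along_axis(func, data, shape, axis):
    """Reduce a C-ordered flat buffer along `axis` with a 1-d -> scalar function.

    Returns the reduced flat buffer (C order) and its shape.
    """
    ndim = len(shape)
    if not -ndim <= axis < ndim:
        raise ValueError("axis out of range")
    axis %= ndim
    # C-order strides of the input.
    strides = [1] * ndim
    for i in range(ndim - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    other_axes = [i for i in range(ndim) if i != axis]
    out_shape = tuple(shape[i] for i in other_axes)
    out = []
    for index in product(*(range(shape[i]) for i in other_axes)):
        base = sum(k * strides[i] for k, i in zip(index, other_axes))
        # Gather the 1-d lane along `axis` and hand it to the 1-d function.
        lane = [data[base + j * strides[axis]] for j in range(shape[axis])]
        out.append(func(lane))
    return out, out_shape
```

Any 1-d reducer such as `max`, `min`, or `sum` can be passed as `func`, so only the 1-d function needs to be written and optimized.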

… + fix `__bool__()` (#219)

- Add `compress` function in indexing routine (`numojo.routines.indexing`) which returns selected slices of an array along a given axis or without the `axis` argument.
- Add the function as one of the ndarray methods.
- Enhance the `NDArrayIter` to allow iterating over any dimension.
- Add `ith()` method to `NDArrayIter` to get the i-th item.
- Fix the `NDArray.__bool__()` method which only returns a value if the size of the array is 1.
- Add a number of tests for `compress()`.

Example:

```mojo
print(a)
print(nm.indexing.compress(nm.array[boolean]("[1, 1, 1]"), a, axis=1))
print(np.compress(np.array([1, 1, 1]), anp, axis=1))
```

```console
[[[ 0 6 12 18]
  [ 2 8 14 20]
  [ 4 10 16 22]]
 [[ 1 7 13 19]
  [ 3 9 15 21]
  [ 5 11 17 23]]]
3D-array Shape(2,3,4) Strides(1,2,6) DType: i8 C-cont: False F-cont: True own data: True
[[[ 0 6 12 18]
  [ 2 8 14 20]
  [ 4 10 16 22]]
 [[ 1 7 13 19]
  [ 3 9 15 21]
  [ 5 11 17 23]]]
3D-array Shape(2,3,4) Strides(12,4,1) DType: i8 C-cont: True F-cont: False own data: True
[[[ 0 6 12 18]
  [ 2 8 14 20]
  [ 4 10 16 22]]
 [[ 1 7 13 19]
  [ 3 9 15 21]
  [ 5 11 17 23]]]
```

---------

Co-authored-by: MadAlex1997 <[email protected]>
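What `compress` does along an axis can be sketched in plain Python on a C-ordered flat buffer. The sketch assumes numpy-like semantics (a condition shorter than the axis simply drops the trailing slices) and uses a hypothetical signature; it is not the NuMojo implementation.

```python
from itertools import product


def compress(condition, data, shape, axis):
    """Keep the slices along `axis` whose entry in `condition` is truthy.

    Returns the selected items as a C-ordered flat buffer and the new shape.
    """
    ndim = len(shape)
    if not -ndim <= axis < ndim:
        raise ValueError("axis out of range")
    axis %= ndim
    if len(condition) > shape[axis]:
        raise ValueError("condition is longer than the axis")
    keep = [i for i, flag in enumerate(condition) if flag]
    # C-order strides of the input.
    strides = [1] * ndim
    for i in range(ndim - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    out_shape = tuple(len(keep) if i == axis else n for i, n in enumerate(shape))
    # Along `axis` only the kept positions are visited; other axes are walked fully.
    ranges = [keep if i == axis else range(n) for i, n in enumerate(shape)]
    out = [
        data[sum(k * s for k, s in zip(index, strides))]
        for index in product(*ranges)
    ]
    return out, out_shape
```

When every entry of the condition is true, as in the example above, the result contains the same items as the input.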

Implement `clip()` function for scalar a_min and a_max in `math.misc` module. Add corresponding method in `NDArray`. Add tests.
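For scalar bounds, clipping amounts to an element-wise clamp. The plain-Python sketch below only illustrates that behavior on a flat list; it is not the `math.misc` implementation, and the argument names are taken from the description above.

```python
def clip(values, a_min, a_max):
    """Limit every item to the closed interval [a_min, a_max]."""
    return [min(max(v, a_min), a_max) for v in values]
```

For example, `clip([-2, 0.5, 3], 0, 1)` gives `[0, 0.5, 1]`.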

…222)

1) This PR standardizes the docstring format according to the [Mojo docstring style guide](https://github.com/modular/mojo/blob/main/stdlib/docs/docstring-style-guide.md) (and to be aligned with numpy), which is as follows:

```
Description:
Parameters:
Args:
Returns:
Raises:
See Also:
Notes:
References:
Examples:
```

2) Add more descriptive errors in the internal functions of NDArray to give a better understanding of the errors and their source.

---------

Co-authored-by: ZHU Yuhao 朱宇浩 <[email protected]>

forfudan · 2025-02-25T14:24:52Z

@MadAlex1997 @shivasankarka, we could make a monthly release at the end of this month as v0.6 for MAX 25.1. This is a placeholder draft pull request. We can merge it when #221, #201, and other PRs are merged into branch pre-0.6.

Good to see that there is no conflict with the main branch.

- Improve `_NDIter` to allow traversal along an arbitrary axis.
- Add method `ith()` to get the i-th item of the iterator.
- Add `swapaxes()` for shape and strides.
- Add `offset()` for `Item` type to get offset.
- Constructor for `Item` from index and shape.
- Add tests for C or F array with `nditer` from C or F orders.
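The relationship between a multi-dimensional index, the strides, and the buffer offset can be sketched in plain Python. Both helpers are hypothetical illustrations of the arithmetic, not the `Item` API; the second one assumes that the flat index counts items in C order.

```python
def offset(index, strides):
    """Buffer offset of a multi-dimensional index: sum of index[i] * strides[i]."""
    return sum(i * s for i, s in zip(index, strides))


def item_from_flat_index(flat_index, shape):
    """Convert a C-order flat index into a multi-dimensional index."""
    index = []
    # Peel off the fastest-varying (last) axis first using % and //.
    for dim in reversed(shape):
        index.append(flat_index % dim)
        flat_index //= dim
    return tuple(reversed(index))
```

For shape `(2, 3, 4)`, the flat index 17 maps to `(1, 1, 1)`, whose offset is 17 with C-order strides `(12, 4, 1)` and 9 with F-order strides `(1, 2, 6)`.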

As title.

forfudan and others added 20 commits January 28, 2025 16:14

[doc] Update the Roadmap document + other cleansing (#208)

c6ad872

This PR:

- Updates the roadmap document according to our current progress.
- Remove the auto-generated `magic.lock` from the cache.
- Remove the `.readthedocs.yaml` from the cache.
- Update the toml file and update channels.

[routines] Implement clip() function (#220)

f390de3

Implement `clip()` function for scalar a_min and a_max in `math.misc` module. Add corresponding method in `NDArray`. Add tests.

forfudan added 2 commits February 25, 2025 14:00

[doc][changelog] Change log for the release v0.6 (#201)

a07e867

As title.

forfudan marked this pull request as ready for review February 28, 2025 08:38

forfudan requested review from MadAlex1997 and shivasankarka February 28, 2025 22:52

MadAlex1997 approved these changes Mar 1, 2025

MadAlex1997 merged commit 23f316c into main Mar 1, 2025
2 checks passed

MadAlex1997 deleted the pre-0.6 branch March 1, 2025 01:23
