@durnwalder durnwalder commented Apr 16, 2025

This PR optimizes the QR decomposition using SIMD and a configurable memory layout. When the input matrix is in row-major (C) layout, it is reordered to column-major (F) layout before the Householder transformations are applied. After factorization, Q and R are reordered back to the original layout of the input Matrix, so downstream operations see a consistent layout. An input matrix already in column-major (F) layout needs no reordering.
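The reordering step described above can be illustrated on a flat buffer. This is only a sketch in plain Python, not the code from this PR: the helper names to_column_major and to_row_major are hypothetical, and the matrix is assumed to be stored as a flat list of m * n elements. The point is that in column-major order each column is contiguous, which is the access pattern Householder reflections work on.

```python
def to_column_major(buf, m, n):
    """Reorder a flat row-major (C) buffer of an m x n matrix into column-major (F) order."""
    # Element (i, j) sits at i * n + j in C order and at j * m + i in F order.
    return [buf[i * n + j] for j in range(n) for i in range(m)]


def to_row_major(buf, m, n):
    """Inverse of to_column_major: reorder a flat column-major buffer into row-major order."""
    return [buf[j * m + i] for i in range(m) for j in range(n)]


# Hypothetical 2 x 3 example: rows are [1, 2, 3] and [4, 5, 6].
a_c = [1, 2, 3, 4, 5, 6]
a_f = to_column_major(a_c, 2, 3)   # columns [1, 4], [2, 5], [3, 6] laid end to end
a_back = to_row_major(a_f, 2, 3)   # restores the original C-order buffer
```

After the conversion, column j occupies the contiguous slice a_f[j * m : (j + 1) * m], so a loop over a column touches adjacent memory.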

Changes

  1. Vectorized Householder Transformations
    The updated algorithm uses SIMD, through calls to vectorize, to compute and apply Householder reflections.

  2. Reduced and Complete Modes
    Adds support for both “reduced” and “complete” modes, similar to NumPy’s QR. For an input Matrix A with shape (m, n):

    • “reduced”: Q has shape (m, min(m, n)) and R has shape (min(m, n), n).
    • “complete”: Q has shape (m, m) and R has shape (m, n).

  3. Additional tests
    Beyond the existing test with shape (20, 20) and default parameters, new tests verify the QR decomposition for non-square matrices (shapes (12, 5) and (5, 12)) in both reduced and complete modes.
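For readers unfamiliar with the technique, the following is a minimal Householder QR sketch with the two modes listed above. It is an assumption-laden illustration in plain Python, not the Mojo implementation from this PR: the function name householder_qr is hypothetical, the matrix is a list of rows, and the inner loops that the PR vectorizes with SIMD are written here as ordinary scalar loops.

```python
import math


def householder_qr(a, mode="reduced"):
    """QR-factor a list-of-rows matrix with Householder reflections (illustrative only)."""
    m, n = len(a), len(a[0])
    k = min(m, n)
    r = [[float(x) for x in row] for row in a]                   # working copy, becomes R
    q = [[float(i == j) for j in range(m)] for i in range(m)]    # accumulates Q, starts as I

    for j in range(min(m - 1, n)):
        # Build the Householder vector v from column j, rows j..m-1.
        x = [r[i][j] for i in range(j, m)]
        norm_x = math.sqrt(sum(t * t for t in x))
        if norm_x == 0.0:
            continue  # column is already zero below the diagonal
        v = x[:]
        v[0] += math.copysign(norm_x, x[0])  # sign choice avoids cancellation
        vtv = sum(t * t for t in v)

        # R <- H R with H = I - 2 v v^T / (v^T v), one column at a time.
        for c in range(j, n):
            s = 2.0 * sum(v[i] * r[j + i][c] for i in range(len(v))) / vtv
            for i in range(len(v)):
                r[j + i][c] -= s * v[i]

        # Q <- Q H, one row at a time.
        for row in q:
            s = 2.0 * sum(row[j + i] * v[i] for i in range(len(v))) / vtv
            for i in range(len(v)):
                row[j + i] -= s * v[i]

    if mode == "complete":
        return q, r                               # Q: (m, m), R: (m, n)
    if mode == "reduced":
        return [row[:k] for row in q], r[:k]      # Q: (m, k), R: (k, n)
    raise ValueError("mode must be 'reduced' or 'complete'")
```

Calling householder_qr(a, mode="reduced") on a 12 x 5 input gives Q with 12 rows and 5 columns and R with 5 rows and 5 columns, while mode="complete" gives a 12 x 12 Q and a 12 x 5 R, matching the shapes described in item 2.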

@durnwalder durnwalder changed the title vectorize qr decomposition [routines] Optimize QR Algorithm with SIMD and Column-Major Layout Apr 16, 2025
@durnwalder durnwalder changed the title [routines] Optimize QR Algorithm with SIMD and Column-Major Layout [routines] Optimized QR Algorithm with SIMD and Column-Major Layout Apr 16, 2025
@durnwalder durnwalder changed the title [routines] Optimized QR Algorithm with SIMD and Column-Major Layout [routines] Optimized QR Decomposition with SIMD and Column-Major Layout Apr 16, 2025
@shivasankarka shivasankarka merged commit c0e44fc into Mojo-Numerics-and-Algorithms-group:pre-0.7 Apr 18, 2025
2 checks passed