-
Notifications
You must be signed in to change notification settings - Fork 17
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #117 from raneamri/feature/hessian-jacobian
Hessian and Jacobian functions
- Loading branch information
Showing
18 changed files
with
1,625 additions
and
22 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
# Hessian | ||
|
||
## Overview | ||
|
||
XAD implements a set of methods to compute the Hessian matrix of a function in `XAD/Hessian.hpp`. | ||
|
||
Note that the Hessian header is not automatically included with `XAD/XAD.hpp`. | ||
Users must include it as needed. | ||
|
||
Hessians can be computed in `fwd_adj` or `fwd_fwd` higher-order mode. | ||
|
||
The `computeHessian()` method takes a set of variables packaged in a | ||
`std::vector<T>` and a function with signature `T foo(std::vector<T>)`. | ||
|
||
## Return Types | ||
|
||
If provided with `RowIterators`, `computeHessian()` will write directly to | ||
them and return `void`. If no `RowIterators` are provided, the Hessian will | ||
be written to a `std::vector<std::vector<T>>` and returned (`T` is | ||
usually `double`). | ||
|
||
## Specialisations | ||
|
||
### Forward over Adjoint Mode | ||
|
||
```c++ | ||
template <typename RowIterator, typename T> | ||
void computeHessian( | ||
const std::vector<AReal<FReal<T>>> &vec, | ||
std::function<AReal<FReal<T>>(std::vector<AReal<FReal<T>>> &)> foo, | ||
RowIterator first, RowIterator last, | ||
Tape<FReal<T>> *tape = Tape<FReal<T>>::getActive()) | ||
``` | ||
This mode uses a [Tape](ref/tape.md) to compute second derivatives. This Tape | ||
will be instantiated within the method or set to the current active Tape using | ||
`Tape::getActive()` if none is passed as argument. | ||
### Forward over Forward Mode | ||
```c++ | ||
template <typename RowIterator, typename T> | ||
void computeHessian( | ||
const std::vector<FReal<FReal<T>>> &vec, | ||
std::function<FReal<FReal<T>>(std::vector<FReal<FReal<T>>> &)> foo, | ||
RowIterator first, RowIterator last) | ||
``` | ||
|
||
This mode does not require a Tape which can help reduce the overhead that comes | ||
with it, at the expense of requiring more executions of the function given to | ||
determine the full Hessian. | ||
|
||
## Example Use | ||
|
||
Given $f(x, y, z, w) = sin(x y) - cos(y z) - sin(z w) - cos(w x)$, or | ||
|
||
```c++ | ||
auto foo = [](std::vector<AD> &x) -> AD | ||
{ | ||
return sin(x[0] * x[1]) - cos(x[1] * x[2]) | ||
- sin(x[2] * x[3]) - cos(x[3] * x[0]); | ||
}; | ||
``` | ||
|
||
with the derivatives of interest calculated at the point | ||
|
||
```c++ | ||
std::vector<AD> x_ad({1.0, 1.5, 1.3, 1.2}); | ||
``` | ||
we'd like to compute the Hessian | ||
$$ | ||
H = \begin{bmatrix} | ||
\frac{\partial^2 f}{\partial x^2} & | ||
\frac{\partial^2 f}{\partial x \partial y} & | ||
\frac{\partial^2 f}{\partial x \partial z} & | ||
\frac{\partial^2 f}{\partial x \partial w} \\ | ||
\frac{\partial^2 f}{\partial y \partial x} & | ||
\frac{\partial^2 f}{\partial y^2} & | ||
\frac{\partial^2 f}{\partial y \partial z} & | ||
\frac{\partial^2 f}{\partial y \partial w} \\ | ||
\frac{\partial^2 f}{\partial z \partial x} & | ||
\frac{\partial^2 f}{\partial z \partial y} & | ||
\frac{\partial^2 f}{\partial z^2} & | ||
\frac{\partial^2 f}{\partial z \partial w} \\ | ||
\frac{\partial^2 f}{\partial w \partial x} & | ||
\frac{\partial^2 f}{\partial w \partial y} & | ||
\frac{\partial^2 f}{\partial w \partial z} & | ||
\frac{\partial^2 f}{\partial w^2} | ||
\end{bmatrix} | ||
$$ | ||
First step is to setup the tape and active data types | ||
```c++ | ||
typedef xad::fwd_adj<double> mode; | ||
typedef mode::tape_type tape_type; | ||
typedef mode::active_type AD; | ||
tape_type tape; | ||
``` | ||
|
||
Note that if no tape is setup, one will be created when computing the Hessian. | ||
`fwd_fwd` mode is also supported in the same fashion. All that is left to do | ||
is define our input values and our function, then call `computeHessian()`: | ||
|
||
```c++ | ||
std::function<AD(std::vector<AD> &)> foo = [](std::vector<AD> &x) -> AD | ||
{ return sin(x[0] * x[1]) - cos(x[1] * x[2]) | ||
- sin(x[2] * x[3]) - cos(x[3] * x[0]); }; | ||
|
||
auto hessian = computeHessian(x_ad, foo); | ||
``` | ||
Note the signature of `foo()`. Any other signature will throw an error. | ||
This computes the relevant matrix | ||
$$ | ||
H = \begin{bmatrix} | ||
-1.72257 & -1.42551 & 0 & 1.36687 \\ | ||
-1.42551 & -1.6231 & 0.207107 & 0 \\ | ||
0 & 0.207107 & 0.607009 & 1.54911 \\ | ||
1.36687 & 0 & 1.54911 & 2.05226 | ||
\end{bmatrix} | ||
$$ | ||
and prints it | ||
```c++ | ||
for (auto row : hessian) | ||
{ | ||
for (auto elem : row) std::cout << elem << " "; | ||
std::cout << std::endl; | ||
} | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,143 @@ | ||
# Jacobian | ||
|
||
## Overview | ||
|
||
XAD implements a set of methods to compute the Jacobian matrix of a function in `XAD/Jacobian.hpp`. | ||
|
||
Note that the Jacobian header is not automatically included with `XAD/XAD.hpp`. | ||
Users must include it as needed. | ||
|
||
Jacobians can be computed in `adj` or `fwd` mode. | ||
|
||
The `computeJacobian()` method takes a set of variables packaged in a | ||
`std::vector<T>` and a function with signature | ||
`std::vector<T> foo(std::vector<T>)`, where `T` is either a forward-mode | ||
or adjoint-mode active type (`FReal` or `AReal`). | ||
|
||
## Return Types | ||
|
||
If provided with `RowIterators`, `computeJacobian()` will write directly to | ||
them and return `void`. If no `RowIterators` are provided, the Jacobian will be | ||
written to a `std::vector<std::vector<T>>` and returned, where `T` is the | ||
underlying passive type (usually `double`). | ||
|
||
## Specialisations | ||
|
||
### Adjoint Mode | ||
|
||
```c++ | ||
template <typename RowIterator, typename T> | ||
void computeJacobian( | ||
const std::vector<AReal<T>> &vec, | ||
std::function<std::vector<AReal<T>>(std::vector<AReal<T>> &)> foo, | ||
RowIterator first, RowIterator last, | ||
Tape<T> *tape = Tape<T>::getActive()) | ||
``` | ||
This mode uses a [Tape](ref/tape.md) to compute derivatives. This Tape will | ||
be instantiated within the method or set to the current active Tape using | ||
`Tape::getActive()` if none is passed as argument. | ||
### Forward Mode | ||
```c++ | ||
template <typename RowIterator, typename T> | ||
void computeJacobian( | ||
const std::vector<FReal<T>> &vec, | ||
std::function<std::vector<FReal<T>>(std::vector<FReal<T>> &)> foo, | ||
RowIterator first, RowIterator last) | ||
``` | ||
|
||
This mode does not require a Tape and can help reduce the overhead that | ||
comes with one. It is recommended for functions that have a higher number | ||
of outputs than inputs. | ||
|
||
## Example Use | ||
|
||
Given $f(x, y, z, w) = [sin(x + y) sin(y + z) cos(z + w) cos(w + x)]$, or | ||
|
||
```c++ | ||
std::function<std::vector<AD>(std::vector<AD>&)> foo = | ||
[](std::vector<AD> &x) -> std::vector<AD> { | ||
return {sin(x[0] + x[1]), | ||
sin(x[1] + x[2]), | ||
cos(x[2] + x[3]), | ||
cos(x[3] + x[0])}; | ||
}; | ||
``` | ||
|
||
with the derivatives calculated at the following point | ||
|
||
```c++ | ||
std::vector<AD> x_ad({1.0, 1.5, 1.3, 1.2}); | ||
``` | ||
we'd like to compute the Jacobian | ||
$$ | ||
J = \begin{bmatrix} | ||
\frac{\partial \sin(x + y)}{\partial x} & | ||
\frac{\partial \sin(x + y)}{\partial y} & | ||
\frac{\partial \sin(x + y)}{\partial z} & | ||
\frac{\partial \sin(x + y)}{\partial w} \\ | ||
\frac{\partial \sin(y + z)}{\partial x} & | ||
\frac{\partial \sin(y + z)}{\partial y} & | ||
\frac{\partial \sin(y + z)}{\partial z} & | ||
\frac{\partial \sin(y + z)}{\partial w} \\ | ||
\frac{\partial \cos(z + w)}{\partial x} & | ||
\frac{\partial \cos(z + w)}{\partial y} & | ||
\frac{\partial \cos(z + w)}{\partial z} & | ||
\frac{\partial \cos(z + w)}{\partial w} \\ | ||
\frac{\partial \cos(w + x)}{\partial x} & | ||
\frac{\partial \cos(w + x)}{\partial y} & | ||
\frac{\partial \cos(w + x)}{\partial z} & | ||
\frac{\partial \cos(w + x)}{\partial w} | ||
\end{bmatrix} | ||
$$ | ||
First step is to setup the tape and active data types | ||
```c++ | ||
typedef xad::adj<double> mode; | ||
typedef mode::tape_type tape_type; | ||
typedef mode::active_type AD; | ||
tape_type tape; | ||
``` | ||
|
||
Note that if no tape is setup, one will be created when computing the Jacobian. | ||
`fwd` mode is also supported in the same fashion. All that is left to do is | ||
define our input values and our function, then call `computeJacobian()`: | ||
|
||
```c++ | ||
std::function<std::vector<AD>(std::vector<AD>&)> foo = [](std::vector<AD>& x) -> std::vector<AD> | ||
{ return {sin(x[0] + x[1]), | ||
sin(x[1] + x[2]), | ||
cos(x[2] + x[3]), | ||
cos(x[3] + x[0])}; }; | ||
|
||
auto jacobian = computeJacobian(x_ad, foo); | ||
``` | ||
Note the signature of `foo()`. Any other signature will throw an error. | ||
This computes the relevant matrix | ||
$$ | ||
\begin{bmatrix} | ||
1 & 0.0707372 & 0 & 0 \\ | ||
0 & 1 & 0.267499 & 0 \\ | ||
0 & 0 & 1 & 0.362358 \\ | ||
0.540302 & 0 & 0 & 1 | ||
\end{bmatrix} | ||
$$ | ||
and prints it | ||
```c++ | ||
for (auto row : jacobian) | ||
{ | ||
for (auto elem : row) std::cout << elem << " "; | ||
std::cout << std::endl; | ||
} | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
############################################################################## | ||
# | ||
# Hessian sample CMake file | ||
# | ||
# This file is part of XAD, a comprehensive C++ library for | ||
# automatic differentiation. | ||
# | ||
# Copyright (C) 2010-2024 Xcelerit Computing Ltd. | ||
# | ||
# This program is free software: you can redistribute it and/or modify | ||
# it under the terms of the GNU Affero General Public License as published | ||
# by the Free Software Foundation, either version 3 of the License, or | ||
# (at your option) any later version. | ||
# | ||
# This program is distributed in the hope that it will be useful, | ||
# but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
# GNU Affero General Public License for more details. | ||
# | ||
# You should have received a copy of the GNU Affero General Public License | ||
# along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
# | ||
############################################################################## | ||
|
||
xad_add_sample(Hessian main.cpp) |
Oops, something went wrong.