This repository was archived by the owner on Nov 16, 2023. It is now read-only.
Merged
6 changes: 0 additions & 6 deletions .gitignore
@@ -348,9 +348,3 @@ data.csv
data.txt

/build/TestCoverageReport

# The folder generated by make_yaml.bat
*_build
*mymodeluci.zip
build/sphinxmdoutput-0.2.4.1-py3-none-any.whl
*build
2 changes: 1 addition & 1 deletion README.md
@@ -88,7 +88,7 @@ To build `nimbusml` from source please visit our [developer guide](docs/develope

## Contributing

The contributions guide can be found [here](CONTRIBUTING.md). Given the experimental nature of this project, support will be provided on a best-effort basis. We suggest opening an issue for discussion before starting a PR with big changes.
The contributions guide can be found [here](CONTRIBUTING.md).

## Support

4 changes: 2 additions & 2 deletions src/python/docs/docstrings/Dart.txt
@@ -7,9 +7,9 @@
<https://arxiv.org/abs/1505.01866>`_ is an
ensemble method of boosted regression trees. The Dropouts meet
Multiple Additive Regression
Trees (DART) employs dropouts in MART and overcomes the issues of over-
Member

This has been fixed in PR 369, right? Why is it coming up again?

Member

Or do you need to rebase again?

Member Author

@ganik Yes, this was fixed in PR 369 into master. This PR merges into the temp/docs branch, which doesn't have this change. temp/docs is where I want the change in order to auto-generate the API docs.

Trees (DART) employs dropouts in MART and overcomes the issues of over-
specialization of MART,
achiving better performance in many tasks.
achieving better performance in many tasks.


**Reference**
2 changes: 1 addition & 1 deletion src/python/docs/docstrings/FastLinearBinaryClassifier.txt
@@ -1,7 +1,7 @@
"""

A Stochastic Dual Coordinate Ascent (SDCA) optimization trainer
for linear binary classification and regression.
for linear binary classification.

.. remarks::
``FastLinearBinaryClassifier`` is a trainer based on the Stochastic
3 changes: 2 additions & 1 deletion src/python/docs/docstrings/FastLinearClassifier.txt
@@ -1,6 +1,7 @@
"""

Train an SDCA multi class model
A Stochastic Dual Coordinate Ascent (SDCA) optimization trainer for
multi class classification.

.. remarks::
``FastLinearClassifier`` is a trainer based on the Stochastic Dual
2 changes: 1 addition & 1 deletion src/python/docs/docstrings/FastLinearRegressor.txt
@@ -1,7 +1,7 @@
"""

A Stochastic Dual Coordinate Ascent (SDCA) optimization trainer
for linear binary classification and regression.
for linear regression.

.. remarks::
``FastLinearRegressor`` is a trainer based on the Stochastic Dual
3 changes: 1 addition & 2 deletions src/python/docs/docstrings/FromKey.txt
@@ -1,7 +1,6 @@
"""

Text transforms that can be performed on data before training
a model.
Converts the key types back to their original values.

.. remarks::
The ``FromKey`` transform converts a column of keys, generated using
4 changes: 2 additions & 2 deletions src/python/docs/docstrings/Goss.txt
@@ -5,9 +5,9 @@
.. remarks::
Gradient-based One-Side Sampling (GOSS) employs an adaptive sampling
named gradient-based
sampling. For datasets with large sample size, GOSS has considerable
sampling. For datasets with large sample size, GOSS has considerable
advantage in terms of
statistical and computational efficiency.
statistical and computational efficiency.



15 changes: 7 additions & 8 deletions src/python/docs/docstrings/Handler.txt
@@ -33,14 +33,13 @@
For more details see `Columns </nimbusml/concepts/columns>`_.

:param replace_with: The method to use to replace NaN values. The
following choices are available.

* Def: Replace with default value of that type, usually ``0``. If no
replace
method is specified, this is the default strategy.
* Mean: Replace NaN values with the mean of the values in that column.
* Min: Replace with minimum value in the column.
* Max: Replace with maximum value in the column.
following choices are available.

* Def: Replace with default value of that type, usually ``0``. If no
replace method is specified, this is the default strategy.
* Mean: Replace NaN values with the mean of the values in that column.
* Min: Replace with minimum value in the column.
* Max: Replace with maximum value in the column.

.. seealso::
:py:class:`Filter <nimbusml.preprocessing.missing_values.Filter>`,
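
The `replace_with` strategies listed in the Handler docstring above can be sketched in plain Python. This is an illustration only, not the nimbusml implementation: the function name `replace_missing`, the `STRATEGIES` table, and the fallback to `0.0` for an all-NaN column are assumptions made for this example.

```python
import math

# Hypothetical strategy table mirroring the docstring's Def/Mean/Min/Max choices.
STRATEGIES = {
    "Def": lambda present: 0.0,                       # default value of the type
    "Mean": lambda present: sum(present) / len(present),
    "Min": min,
    "Max": max,
}


def replace_missing(values, replace_with="Def"):
    """Return a copy of `values` with NaN entries filled by the chosen strategy."""
    if replace_with not in STRATEGIES:
        raise ValueError(f"unknown replace_with: {replace_with!r}")
    present = [v for v in values if not math.isnan(v)]
    # Assumption of this sketch: an all-NaN column falls back to 0.0.
    fill = STRATEGIES[replace_with](present) if present else 0.0
    return [fill if math.isnan(v) else v for v in values]


column = [1.0, math.nan, 3.0, 5.0]
filled_default = replace_missing(column)            # NaN becomes 0.0
filled_mean = replace_missing(column, "Mean")       # NaN becomes the column mean
```

Computing the fill value from the non-NaN entries only keeps the mean, minimum, and maximum well defined even when the column contains missing values.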
2 changes: 1 addition & 1 deletion src/python/docs/docstrings/Loader.txt
@@ -1,6 +1,6 @@
"""

Loaders image data.
Loads image data.

.. remarks::
``Loader`` loads images from paths.
2 changes: 1 addition & 1 deletion src/python/docs/docstrings/NGram.txt
@@ -1,6 +1,6 @@
"""

Extracts NGrams from text and convert them to vector using
Extracts NGrams from text and converts them to vector using
dictionary.

.. remarks::
2 changes: 1 addition & 1 deletion src/python/docs/docstrings/NgramHash.txt
@@ -1,6 +1,6 @@
"""

Extracts NGrams from text and convert them to vector using hashing
Extracts NGrams from text and converts them to vector using hashing
trick.

.. remarks::
44 changes: 44 additions & 0 deletions src/python/docs/docstrings/PrefixColumnConcatenator.txt
@@ -0,0 +1,44 @@
"""

Combines several columns into a single vector-valued column by prefix.

.. remarks::
``PrefixColumnConcatenator`` creates a single vector-valued column from
multiple
columns. It can be performed on data before training a model. The
concatenation
can significantly speed up the processing of data when the number of
columns
is as large as hundreds to thousands.

:param columns: a dictionary of key-value pairs, where key is the output
column name and value is a list of input column names.

* Only one key-value pair is allowed.
* Input column type: numeric or string.
* Output column type:
`Vector Type </nimbusml/concepts/types#vectortype-column>`_.

The << operator can be used to set this value (see
`Column Operator </nimbusml/concepts/columns>`_)

For example
* ColumnConcatenator(columns={'features': ['age', 'parity',
'induced']})
* ColumnConcatenator() << {'features': ['age', 'parity',
'induced']})

For more details see `Columns </nimbusml/concepts/columns>`_.

.. seealso::
:py:class:`ColumnDropper
<nimbusml.preprocessing.schema.ColumnDropper>`,
:py:class:`ColumnSelector
<nimbusml.preprocessing.schema.ColumnSelector>`.

.. index:: transform, schema

Example:
.. literalinclude:: /../nimbusml/examples/PrefixColumnConcatenator.py
:language: python
"""
6 changes: 3 additions & 3 deletions src/python/docs/docstrings/Resizer.txt
@@ -1,15 +1,15 @@
"""

Resizers an image to a specified dimension using a specified
Resizes an image to a specified dimension using a specified
resizing method.

.. remarks::
``Resizer`` resizers an image to the specified height and width
``Resizer`` resizes an image to the specified height and width
using a specified resizing method. The input variables to this
transforms must
be images, typically the result of the ``Loader`` transform.

:param columns: a dictionary of key-value pairs, where key is the output
:param columns: A dictionary of key-value pairs, where key is the output
column name and value is the input column name.

* Multiple key-value pairs are allowed.
3 changes: 1 addition & 2 deletions src/python/docs/docstrings/ToKey.txt
@@ -1,7 +1,6 @@
"""

Text transforms that can be performed on data before training
a model.
Converts input values (words, numbers, etc.) to index in a dictionary.

.. remarks::
The ``ToKey`` transform converts a column of text to key values
4 changes: 2 additions & 2 deletions src/python/nimbusml/_pipeline.py
@@ -119,10 +119,10 @@ class Pipeline:
for more details on how to select these.

:param steps: the list of operator or (name, operator) tuples that
are chained in the appropriate order.
are chained in the appropriate order.

:param model: the path to the model file (".zip") if want to load a
model directly from file (such as a trained model from ML.NET).
model directly from file (such as a trained model from ML.NET).

:param random_state: the integer used as the random seed.

4 changes: 2 additions & 2 deletions src/python/nimbusml/ensemble/booster/_dart.py
@@ -24,9 +24,9 @@ class Dart(core):
<https://arxiv.org/abs/1505.01866>`_ is an
ensemble method of boosted regression trees. The Dropouts meet
Multiple Additive Regression
Trees (DART) employs dropouts in MART and overcomes the issues of over-
Trees (DART) employs dropouts in MART and overcomes the issues of over-
specialization of MART,
achiving better performance in many tasks.
achieving better performance in many tasks.


**Reference**
4 changes: 2 additions & 2 deletions src/python/nimbusml/ensemble/booster/_goss.py
@@ -22,9 +22,9 @@ class Goss(core):
.. remarks::
Gradient-based One-Side Sampling (GOSS) employs an adaptive sampling
named gradient-based
sampling. For datasets with large sample size, GOSS has considerable
sampling. For datasets with large sample size, GOSS has considerable
advantage in terms of
statistical and computational efficiency.
statistical and computational efficiency.



2 changes: 1 addition & 1 deletion src/python/nimbusml/feature_extraction/image/_loader.py
@@ -20,7 +20,7 @@
class Loader(core, BaseTransform, TransformerMixin):
"""

Loaders image data.
Loads image data.

.. remarks::
``Loader`` loads images from paths.
6 changes: 3 additions & 3 deletions src/python/nimbusml/feature_extraction/image/_resizer.py
@@ -20,16 +20,16 @@
class Resizer(core, BaseTransform, TransformerMixin):
"""

Resizers an image to a specified dimension using a specified
Resizes an image to a specified dimension using a specified
resizing method.

.. remarks::
``Resizer`` resizers an image to the specified height and width
``Resizer`` resizes an image to the specified height and width
using a specified resizing method. The input variables to this
transforms must
be images, typically the result of the ``Loader`` transform.

:param columns: a dictionary of key-value pairs, where key is the output
:param columns: A dictionary of key-value pairs, where key is the output
column name and value is the input column name.

* Multiple key-value pairs are allowed.
@@ -18,7 +18,7 @@
class Ngram(core):
"""

Extracts NGrams from text and convert them to vector using
Extracts NGrams from text and converts them to vector using
dictionary.

.. remarks::
@@ -18,7 +18,7 @@
class NgramHash(core):
"""

Extracts NGrams from text and convert them to vector using hashing
Extracts NGrams from text and converts them to vector using hashing
trick.

.. remarks::
4 changes: 2 additions & 2 deletions src/python/nimbusml/internal/core/ensemble/booster/_dart.py
@@ -25,9 +25,9 @@ class Dart(Component):
<https://arxiv.org/abs/1505.01866>`_ is an
ensemble method of boosted regression trees. The Dropouts meet
Multiple Additive Regression
Trees (DART) employs dropouts in MART and overcomes the issues of over-
Trees (DART) employs dropouts in MART and overcomes the issues of over-
specialization of MART,
achiving better performance in many tasks.
achieving better performance in many tasks.


**Reference**
4 changes: 2 additions & 2 deletions src/python/nimbusml/internal/core/ensemble/booster/_goss.py
@@ -23,9 +23,9 @@ class Goss(Component):
.. remarks::
Gradient-based One-Side Sampling (GOSS) employs an adaptive sampling
named gradient-based
sampling. For datasets with large sample size, GOSS has considerable
sampling. For datasets with large sample size, GOSS has considerable
advantage in terms of
statistical and computational efficiency.
statistical and computational efficiency.



@@ -18,7 +18,7 @@
class Loader(BasePipelineItem, DefaultSignature):
"""

Loaders image data.
Loads image data.

.. remarks::
``Loader`` loads images from paths.
@@ -18,11 +18,11 @@
class Resizer(BasePipelineItem, DefaultSignature):
"""

Resizers an image to a specified dimension using a specified
Resizes an image to a specified dimension using a specified
resizing method.

.. remarks::
``Resizer`` resizers an image to the specified height and width
``Resizer`` resizes an image to the specified height and width
using a specified resizing method. The input variables to this
transforms must
be images, typically the result of the ``Loader`` transform.
@@ -18,7 +18,7 @@
class Ngram(Component):
"""

Extracts NGrams from text and convert them to vector using
Extracts NGrams from text and converts them to vector using
dictionary.

.. remarks::
@@ -18,7 +18,7 @@
class NgramHash(Component):
"""

Extracts NGrams from text and convert them to vector using hashing
Extracts NGrams from text and converts them to vector using hashing
trick.

.. remarks::
@@ -23,7 +23,7 @@ class FastLinearBinaryClassifier(
"""

A Stochastic Dual Coordinate Ascent (SDCA) optimization trainer
for linear binary classification and regression.
for linear binary classification.

.. remarks::
``FastLinearBinaryClassifier`` is a trainer based on the Stochastic
@@ -22,7 +22,8 @@ class FastLinearClassifier(
DefaultSignatureWithRoles):
"""

Train an SDCA multi class model
A Stochastic Dual Coordinate Ascent (SDCA) optimization trainer for
multi class classification.

.. remarks::
``FastLinearClassifier`` is a trainer based on the Stochastic Dual
@@ -23,7 +23,7 @@ class FastLinearRegressor(
"""

A Stochastic Dual Coordinate Ascent (SDCA) optimization trainer
for linear binary classification and regression.
for linear regression.

.. remarks::
``FastLinearRegressor`` is a trainer based on the Stochastic Dual
3 changes: 1 addition & 2 deletions src/python/nimbusml/internal/core/preprocessing/_fromkey.py
@@ -19,8 +19,7 @@
class FromKey(BasePipelineItem, DefaultSignature):
"""

Text transforms that can be performed on data before training
a model.
Converts the key types back to their original values.

.. remarks::
The ``FromKey`` transform converts a column of keys, generated using
3 changes: 1 addition & 2 deletions src/python/nimbusml/internal/core/preprocessing/_tokey.py
@@ -19,8 +19,7 @@
class ToKey(BasePipelineItem, DefaultSignature):
"""

Text transforms that can be performed on data before training
a model.
Converts input values (words, numbers, etc.) to index in a dictionary.

.. remarks::
The ``ToKey`` transform converts a column of text to key values
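
The new `ToKey` and `FromKey` summaries describe a dictionary encoding and its inverse. A rough sketch of that round trip in plain Python follows; the functions `to_key` and `from_key` are hypothetical names for this example, and the 1-based key numbering is an assumption of the sketch rather than something this PR states.

```python
def to_key(values):
    """Map each distinct value to an integer key in order of first appearance."""
    vocabulary = {}
    keys = []
    for value in values:
        if value not in vocabulary:
            # Assumption of this sketch: keys start at 1.
            vocabulary[value] = len(vocabulary) + 1
        keys.append(vocabulary[value])
    return keys, vocabulary


def from_key(keys, vocabulary):
    """Convert keys back to their original values using the same vocabulary."""
    reverse = {key: value for value, key in vocabulary.items()}
    return [reverse[key] for key in keys]


words = ["cat", "dog", "cat", "bird"]
keys, vocabulary = to_key(words)
restored = from_key(keys, vocabulary)
```

The inverse step only works with the vocabulary produced by the forward step, which is why the two transforms are documented as a pair.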