This repository was archived by the owner on Nov 16, 2023. It is now read-only.
Merged
1 change: 1 addition & 0 deletions .gitignore
@@ -346,3 +346,4 @@ _doc_report.txt
.pytest_cache/
data.csv
data.txt

2 changes: 1 addition & 1 deletion LICENSE
@@ -1,4 +1,4 @@
ML.NET for Python
NimbusML
Copyright (c) Microsoft Corporation
All rights reserved.

4 changes: 2 additions & 2 deletions README.md
@@ -6,7 +6,7 @@ ML.NET was originally developed in Microsoft Research and is used across many pr

This package enables training ML.NET pipelines or integrating ML.NET components directly into Scikit-Learn pipelines (it supports `numpy.ndarray`, `scipy.sparse_cst`, and `pandas.DataFrame` as inputs).

Documentation can be found [here](https://docs.microsoft.com/en-us/nimbusml/overview) with additional [notebook samples](https://github.com/Microsoft/ML.NET-for-Python-Samples).
Documentation can be found [here](https://docs.microsoft.com/en-us/NimbusML/overview) with additional [notebook samples](https://github.com/Microsoft/NimbusML-Samples).

## Installation

@@ -53,7 +53,7 @@ results = pipeline.predict(data)



Many additional examples and tutorials can be found in the [documentation](https://docs.microsoft.com/en-us/nimbusml/overview).
Many additional examples and tutorials can be found in the [documentation](https://docs.microsoft.com/en-us/NimbusML/overview).


## Building
3 changes: 2 additions & 1 deletion build/ci/phase-template.yml
@@ -24,6 +24,7 @@ phases:
- script: $(_buildScript) --configuration $(_configuration) --runTests
# Mac phases
- ${{ if eq(parameters.name, 'Mac') }}:
- script: brew install gcc
- script: chmod 777 $(_buildScript) && $(_buildScript) --configuration $(_configuration) --runTests
# Linux phases
- ${{ if ne(parameters.testDistro, '') }}:
@@ -48,5 +49,5 @@ phases:
displayName: Publish wheel file to VSTS artifacts
inputs:
pathToPublish: $(Build.SourcesDirectory)/target
artifactName: Mlnet Wheels
artifactName: NimbusML Wheels
artifactType: container
4 changes: 2 additions & 2 deletions build/signed_build_phase.yml
@@ -55,13 +55,13 @@ phases:
displayName: Copy wheel file to Staging Directory in preparation for publishing
inputs:
SourceFolder: $(Build.SourcesDirectory)/target
Contents: mlnet-*.whl
Contents: nimbusml-*.whl
TargetFolder: $(Build.StagingDirectory)/artifacts

- task: PublishBuildArtifacts@1
condition: and(always(), ne(variables['Build.Reason'], 'PullRequest'))
displayName: Publish wheel file to VSTS artifacts
inputs:
pathToPublish: $(Build.StagingDirectory)/artifacts
artifactName: Mlnet Wheels
artifactName: NimbusML Wheels
artifactType: container
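
The `Contents` pattern in the copy step has to change together with the package name, otherwise the renamed wheel is silently skipped. A minimal sketch of that effect, assuming a hypothetical wheel file name and using Python's `fnmatch` as a stand-in for the pipeline's own pattern matcher (which may differ in details):

```python
from fnmatch import fnmatch

# Hypothetical wheel file name, for illustration only; the real
# version and platform tags are not given in this diff.
wheel = "nimbusml-0.6.0-cp36-none-win_amd64.whl"

OLD_PATTERN = "mlnet-*.whl"      # pattern before this change
NEW_PATTERN = "nimbusml-*.whl"   # pattern after this change


def matches(filename, pattern):
    """Return True if the file name matches the shell-style pattern."""
    return fnmatch(filename, pattern)


print("old pattern matches:", matches(wheel, OLD_PATTERN))
print("new pattern matches:", matches(wheel, NEW_PATTERN))
```

With the old pattern the renamed wheel would not be selected, so nothing would be copied to the staging directory for the publish step to pick up.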
10 changes: 5 additions & 5 deletions docs/README.md
@@ -1,15 +1,15 @@
Documents Index
===============

Intro to mlnet
Intro to NimbusML
===============

`mlnet` provides state-of-the-art ML algorithms, transforms and components, aiming to make them useful for all developers, data scientists, and information workers and helpful in all products, services and devices.
NimbusML provides state-of-the-art ML algorithms, transforms and components, aiming to make them useful for all developers, data scientists, and information workers and helpful in all products, services and devices.

Project Docs
============

- [API](https://docs.microsoft.com/en-us/mlnet/overview)
- [Tutorials](https://docs.microsoft.com/en-us/mlnet/tutorials)
- [API](https://docs.microsoft.com/en-us/nimbusml/overview)
- [Tutorials](https://docs.microsoft.com/en-us/nimbusml/tutorials)
- [Developer Guide](developers/developer-guide.md)
- [Contributing to ML.NET](project-docs/contributing.md)
- [Contributing to ML.NET](project-docs/contributing.md)
6 changes: 3 additions & 3 deletions docs/developers/developer-guide.md
@@ -1,13 +1,13 @@
Developer Guide
===============

`mlnet` runs on Windows, Linux, and macOS and supports Python 3.6, 3.5, and 2.7, 64 bit versions only. It has been tested on Windows 10, MacOS 10.13, Ubuntu 14.04, Ubuntu 16.04, Ubuntu 18.04, CentOS 7, and RHEL 7.
NimbusML runs on Windows, Linux, and macOS and supports Python 3.6, 3.5, and 2.7, 64 bit versions only. It has been tested on Windows 10, MacOS 10.13, Ubuntu 14.04, Ubuntu 16.04, Ubuntu 18.04, CentOS 7, and RHEL 7.

Building the repository
=======================

The `mlnet` repo can be built directly from a terminal or cmd prompt. See the platform-specific build instructions for your dev environment:
The NimbusML repo can be built directly from a terminal or cmd prompt. See the platform-specific build instructions for your dev environment:

| [Windows](windows-build.md) | [Linux](linux-build.md) | [Mac](mac-build.md) |

`mlnet` official builds are produced in Azure Dev Ops, as specified by the file `.vsts-ci.yml`.
Nimbus official builds are produced in Azure Dev Ops, as specified by the file `.vsts-ci.yml`.
6 changes: 3 additions & 3 deletions docs/developers/linux-build.md
@@ -1,4 +1,4 @@
Building `mlnet` from source on Linux
Building NimbusML from source on Linux
==========================================
## Prerequisites
1. gcc >= 5.4
@@ -12,9 +12,9 @@ Building `mlnet` from source on Linux
## Build
Run `./build.sh`

This downloads dependencies (.NET SDK, specific versions of Python and Boost), builds native code and managed code, and packages `mlnet` into a pip-installable wheel. This produces debug binaries by default, and release versions can be specified by `./build.sh --configuration RlsLinPy3.6` for examle.
This downloads dependencies (.NET SDK, specific versions of Python and Boost), builds native code and managed code, and packages NimbusML into a pip-installable wheel. This produces debug binaries by default, and release versions can be specified by `./build.sh --configuration RlsLinPy3.6` for examle.

For additional options including running tests and building components independently, see `./build.sh -h`.

### Known Issues
The LightGBM estimator fails on Linux when building from source. The official `mlnet` Linux wheel package on Pypi.org has a working version of LightGBM.
The LightGBM estimator fails on Linux when building from source. The official NimbusML Linux wheel package on Pypi.org has a working version of LightGBM.
4 changes: 2 additions & 2 deletions docs/developers/mac-build.md
@@ -1,4 +1,4 @@
Building `mlnet` from source on Mac
Building NimbusML from source on Mac
==========================================
## Prerequisites
1. Xcode Command Line Tools (for Clang compiler)
@@ -7,7 +7,7 @@ Building `mlnet` from source on Mac
## Build
Run `./build.sh`

This downloads dependencies (.NET SDK, specific versions of Python and Boost), builds native code and managed code, and packages `mlnet` into a pip-installable wheel. This produces debug binaries by default, and release versions can be specified by `./build.sh --configuration RlsMacPy3.6` for examle.
This downloads dependencies (.NET SDK, specific versions of Python and Boost), builds native code and managed code, and packages NimbusML into a pip-installable wheel. This produces debug binaries by default, and release versions can be specified by `./build.sh --configuration RlsMacPy3.6` for examle.

For additional options including running tests and building components independently, see `./build.sh -h`.

4 changes: 2 additions & 2 deletions docs/developers/windows-build.md
@@ -1,4 +1,4 @@
Building `mlnet` from source on Windows
Building NimbusML from source on Windows
==========================================
## Prerequisites
1. Visual Studio 2015 or higher
@@ -7,6 +7,6 @@ Building `mlnet` from source on Windows
## Build
Run `build.cmd`

This downloads dependencies (.NET SDK, specific versions of Python and Boost), builds native code and managed code, and packages `mlnet` into a pip-installable wheel. This produces debug binaries by default, and release versions can be specified by `build.cmd --configuration RlsWinPy3.6` for examle.
This downloads dependencies (.NET SDK, specific versions of Python and Boost), builds native code and managed code, and packages NimbusML into a pip-installable wheel. This produces debug binaries by default, and release versions can be specified by `build.cmd --configuration RlsWinPy3.6` for examle.

For additional options including running tests and building components independently, see `build.cmd -?`.
6 changes: 3 additions & 3 deletions docs/project-docs/contributing.md
@@ -1,14 +1,14 @@
# Welcome!

If you are here, it means you are interested in helping us out. A hearty welcome and thank you! There are many ways you can contribute to the `mlnet` project:
If you are here, it means you are interested in helping us out. A hearty welcome and thank you! There are many ways you can contribute to the NimbusML project:

* Offer PRs to fix bugs or implement new features.
* Give us feedback and bug reports regarding the software or the documentation.
* Improve our examples, tutorials, and documentation.

### mlnet and ML.NET
### NimbusML and ML.NET

`mlnet` provides Python bindings for the [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) library of machine learning algorithms. If you would like to contribute to the underlying library of algorithms, please check out [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet). If you would like to contribute to the `mlnet` python bindings project, please read on.
NimbusML provides Python bindings for the [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) library of machine learning algorithms. If you would like to contribute to the underlying library of algorithms, please check out [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet). If you would like to contribute to the NimbusML python bindings project, please read on.

## New Contributers

4 changes: 2 additions & 2 deletions docs/project-docs/style-guide.md
@@ -1,12 +1,12 @@
Contributing to Machine Learning
======================

This document describes contribution guidelines that are specific to `mlnet`. Please read [Python Style Guide](https://www.python.org/dev/peps/pep-0008/) for more general Python style guidelines.
This document describes contribution guidelines that are specific to NimbusML. Please read [Python Style Guide](https://www.python.org/dev/peps/pep-0008/) for more general Python style guidelines.

Coding Style Changes
--------------------

We intend to bring `mlnet` into full conformance with the style guidelines described in [Python Style Guide](https://www.python.org/dev/peps/pep-0008/). We plan to do that with tooling, in a holistic way. In the meantime, please:
We intend to bring NimbusML into full conformance with the style guidelines described in [Python Style Guide](https://www.python.org/dev/peps/pep-0008/). We plan to do that with tooling, in a holistic way. In the meantime, please:

* **DO NOT** send PRs for style changes. For example, do not send PRs that are focused on changing usage of ```Int32``` to ```int```.
* **DO NOT** send PRs for upgrading code to use newer language features, though it's ok to use newer language features as part of new code that's written. For example, it's ok to use expression-bodied members as part of new code you write, but do not send a PR focused on changing existing properties or methods to use the feature.
2 changes: 1 addition & 1 deletion src/DotNetBridge/Bridge.cs
@@ -310,7 +310,7 @@ private static unsafe int GenericExec(EnvironmentBlock* penv, sbyte* psz, int cd
using (var env = new RmlEnvironment(MarshalDelegate<CheckCancelled>(penv->checkCancel), penv->seed,
verbose: penv != null && penv->verbosity > 3, conc: penv != null ? penv->maxThreadsAllowed : 0))
{
var host = env.Register("MlNetExecution");
var host = env.Register("ML.NET_Execution");
env.ComponentCatalog.RegisterAssembly(typeof(TextLoader).Assembly); // ML.Data
env.ComponentCatalog.RegisterAssembly(typeof(LinearPredictor).Assembly); // ML.StandardLearners
env.ComponentCatalog.RegisterAssembly(typeof(CategoricalTransform).Assembly); // ML.Transforms
8 changes: 4 additions & 4 deletions src/NativeBridge/dllmain.cpp
@@ -9,7 +9,7 @@
#define PARAM_SEED "seed"
#define PARAM_GRAPH "graph"
#define PARAM_VERBOSE "verbose"
#define PARAM_MLNET_PATH "nimbusmlPath"
#define PARAM_NIMBUSML_PATH "nimbusmlPath"
#define PARAM_DATA "data"

#define WIN_FOLDER L"\\Win"
@@ -70,13 +70,13 @@ bp::dict pxCall(bp::dict& params)
try
{
bp::extract<std::string> graph(params[PARAM_GRAPH]);
bp::extract<std::string> nimbusmlPath(params[PARAM_MLNET_PATH]);
bp::extract<std::string> nimbusmlPath(params[PARAM_NIMBUSML_PATH]);
bp::extract<std::int32_t> verbose(params[PARAM_VERBOSE]);
std::int32_t i_verbose = std::int32_t(verbose);
std::string s_nimbusmlPath = std::string(nimbusmlPath);
std::string s_graph = std::string(graph);
const char *path = s_nimbusmlPath.c_str(); // nimbusmlPath
const char *coreclrpath = s_nimbusmlPath.c_str(); // mlnet core clr
const char *path = s_nimbusmlPath.c_str();
const char *coreclrpath = s_nimbusmlPath.c_str();

GENERICEXEC exec = EnsureExec(path, coreclrpath);
if (exec == nullptr)
4 changes: 2 additions & 2 deletions src/python/docs/sphinx/concepts/columns.rst
@@ -17,7 +17,7 @@ How To Select Columns to Transform
``transform()`` and ``fit_transform()`` methods of trainers and transforms. By default, all
columns are transformed equally.

``nimbusml`` additionally provides a syntax to transform only a subset of columns. This is a useful
NimbusML additionally provides a syntax to transform only a subset of columns. This is a useful
feature for many transforms, especially when the dataset containts columns of mixed types. For
example, a dataset with both numeric features and free text features. Similarly for trainers, the
concept of :ref:`roles` provides a mechanism to select which columns to use as labels and features.
@@ -55,7 +55,7 @@ What if we only want to encode one of the columns? We simply use the ``<<`` oper
transform to restrict operations to the columns of interest. The ``<<`` operatator is syntactic
sugar for setting the ``columns`` argument of the transform.

All transforms in ``nimbusml`` have an implicit ``columns`` parameter to tell which columns to process,
All transforms in NimbusML have an implicit ``columns`` parameter to tell which columns to process,
and optionally how to name the output columns, if any. Refer to the reference sections for each
transform to see what format is allowed for the ``columns`` argument.

2 changes: 1 addition & 1 deletion src/python/docs/sphinx/concepts/experimentvspipeline.rst
@@ -64,7 +64,7 @@ operations.
Optimized Chaining of Trainers/Transforms
"""""""""""""""""""""""""""""""""""""""""

Using ``nimbusml``, trainers and transforms within a :py:class:`nimbusml.Pipeline` will
Using NimbusML, trainers and transforms within a :py:class:`nimbusml.Pipeline` will
generally result in better performance compared to using them in a
`sklearn.Pipeline <http://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html>`_.
Data copying is minimized when processing is limited to within the C# libraries, and if all
4 changes: 2 additions & 2 deletions src/python/docs/sphinx/concepts/roles.rst
@@ -14,7 +14,7 @@ Column Roles for Trainers
Roles and Learners
------------------

Columns play different roles in the context of trainers. ``nimbusml`` supports the following roles, as defined in :py:class:`nimbusml.Role`
Columns play different roles in the context of trainers. NimbusML supports the following roles, as defined in :py:class:`nimbusml.Role`

* Role.Label - the column representing the dependent variable.
* Role.Feature - the column(s) representing the independent variable(s).
@@ -126,7 +126,7 @@ Example of GroupId Role

Same goes for the group. Rankers needs the GroupId to link rows to rank. A ranker for search engine needs a
dataset with a row per displayed result. The GroupId is ued to tell the learner which results belong to the
same query, to group together the candidate set of documents for a single query. ``nimbusml`` needs features,
same query, to group together the candidate set of documents for a single query. NimbusML needs features,
a target (relevance label of the result) and a GroupId.

Below is an example of using GroupId at the trainer.
6 changes: 3 additions & 3 deletions src/python/docs/sphinx/concepts/schema.rst
@@ -16,13 +16,13 @@ Schema
Introduction to Schema
----------------------

The ``nimbusml`` data framework relies on a schema to understand the column names and mix of column
The NimbusML data framework relies on a schema to understand the column names and mix of column
types in the dataset, which may originate from any of the supported :ref:`datasources`. It is
automatically inferred when a :py:class:`nimbusml.FileDataStream` or :py:class:`nimbusml.DataSchema` is created.

Transforms have the ability to operate on subsets of columns in the dataset, as well as alter the
resulting output schema, which effects other transforms downstream. For users, it would be very useful to
understand how ``nimbusml`` processes the data in a pipeline for debugging purposes or training the model with :py:class:`nimbusml.FileDataStream`.
understand how NimbusML processes the data in a pipeline for debugging purposes or training the model with :py:class:`nimbusml.FileDataStream`.

The schema comes with two formats for its representation, (1) object representation and (2) string format. After generating a :py:class:`nimbusml.FileDataStream`, users can view the
object representation of the schema by using ``repr()`` function:
@@ -168,7 +168,7 @@ all of types R8, I8 and TX, with column names *X1*, *X2* and *X3*.
Example of Schema for a File
""""""""""""""""""""""""""""""""""""""

The transforms and trainers in ``nimbusml`` support various :ref:`datasources` as inputs.
The transforms and trainers in NimbusML support various :ref:`datasources` as inputs.
When the data is in a ``pandas.DataFrame``, the schema is inferred automatically from the
``dtype`` of the columns.

4 changes: 2 additions & 2 deletions src/python/docs/sphinx/concepts/types.rst
@@ -15,7 +15,7 @@ Types
Column Types
------------

``nimbusml`` wraps a library written in C#, which is a strongly typed language. Columns of the input data sources are ascribed a type, which is used by
NimbusML wraps a library written in C#, which is a strongly typed language. Columns of the input data sources are ascribed a type, which is used by
transforms and trainers to decide if they can operate on that column. Some transforms may only allow
text data types, while others only numeric. Trainers almost exclusively require the features and
labels to be of a numeric type.
@@ -41,7 +41,7 @@ VectorType Columns
A VectorType column contains a vector of values of a homogenous type, and is associated with a
``column_name``.

The following table shows how ``nimbusml`` processes a dataset:
The following table shows how NimbusML processes a dataset:

.. image:: ../_static/images/table_car.png
The third column is a VectorType column named *Features* with 10 ``slots``. A VectorType column can