Adding documentation about entry points, and entry points graphs: EntryPoints.md and GraphRunner.md #295

sfilipi · 2018-06-04T18:47:33Z

Adding the EntryPoints.md and the GraphRunner.md files.
EntryPoints.md introduces the entry points, the entry points manifests and classes that are associated with them.
GraphRunner.md introduces and describes the entry points graph structure.

Addresses #390

dnfclas · 2018-06-04T18:47:44Z

All CLA requirements met. #Resolved

TomFinley · 2018-06-04T20:09:01Z

docs/code/EntryPoints.md

@@ -0,0 +1,188 @@
+# Overview
+
+An 'entry point', is a representation of a ML.Net type in json format and it is used to serialize and deserialize an ML.Net type in JSON. 


ML.Net

I think the branding is not the lower case ML.Net but ML.NET. #Closed

TomFinley · 2018-06-04T20:09:11Z

docs/code/EntryPoints.md

@@ -0,0 +1,188 @@
+# Overview
+
+An 'entry point', is a representation of a ML.Net type in json format and it is used to serialize and deserialize an ML.Net type in JSON. 


json

JSON is typically capitalized. #Closed

TomFinley · 2018-06-04T20:09:47Z

docs/code/EntryPoints.md

@@ -0,0 +1,188 @@
+# Overview
+
+An 'entry point', is a representation of a ML.Net type in json format and it is used to serialize and deserialize an ML.Net type in JSON. 


it

Ambiguous, when we say "it" what are we referring to? Entry-points? JSON? An ML.NET type? #Closed

TomFinley · 2018-06-04T20:10:29Z

docs/code/EntryPoints.md

+
+An 'entry point', is a representation of a ML.Net type in json format and it is used to serialize and deserialize an ML.Net type in JSON. 
+It is also one of the ways ML.Net uses to deserialize experiments, and the recommended way to interface with other languages. 
+In terms defining experiments w.r.t entry points, experiments are entry points DAGs, and respectively, entry points are experiment graph nodes.


w.r.t

If we want to use this initialism it would be "w.r.t." not "w.r.t". #Closed

TomFinley · 2018-06-04T20:11:09Z

docs/code/EntryPoints.md

+			"OutputData": "$Output_1528136517433",
+			"Model": "$TransformModel_1528136517433"
+		}
+	}


Be consistent with usage of spaces vs. tabs above. Prefer spaces. #Closed

TomFinley · 2018-06-04T20:12:50Z

docs/code/GraphRunner.md

@@ -0,0 +1,123 @@
+# JSON Graph format
+
+The entry point graph in TLC is an array of _nodes_. Each node is an object with the following fields:


entry point

This might be a good place to have a link to EntryPoints.md. #Closed

TomFinley · 2018-06-04T20:13:56Z

docs/code/GraphRunner.md

+- _array_ of the above. Represented as a JSON array, maps to a C# array.
+- _dictionary_. Currently not implemented. Represented as a JSON object, maps to a C# `Dictionary<string,T>`.
+- _component_. Currently not implemented. Represented as a JSON object with 2 fields: _name_:string and _settings_:object.
+


Is this information current? I thought I saw some support for these. Certainly components are supported (not as SubComponent type specifically, but we can use dependency injection through the component factories). #Resolved

Ah, my edits to this file are not reflected. Fixing that. #Closed

corrected the component part. Double-checking on the dictionaries and indexing in arrays. I don't think we do that yet.

In reply to: 192866131

TomFinley · 2018-06-04T20:14:35Z

docs/code/GraphRunner.md

+- _TransformModel_
+- _PredictorModel_
+
+These must be passed as _variables_. The variable is represented as a JSON string that begins with "$". 


"$" [](start = 99, length = 3)

For code like this I might prefer `$` to "$". #Closed
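
As a minimal sketch of the rule quoted above (a variable is a JSON string that begins with `$`), the following JavaScript separates variable references from plain values in a block of inputs. The helper names `isVariable` and `variableNames` are hypothetical and not part of ML.NET, and the sample field names and values are made up for illustration.

```javascript
// Hypothetical helper: a value is treated as a variable reference when it is
// a JSON string that begins with "$" (the rule stated in the quoted text).
function isVariable(value) {
  return typeof value === "string" && value.startsWith("$");
}

// Collect the variable names (without the leading "$") used in an object of
// inputs or outputs. Only top-level values are inspected in this sketch.
function variableNames(fields) {
  return Object.values(fields)
    .filter(isVariable)
    .map((value) => value.slice(1));
}

// Illustrative input block; the field names and values are assumptions.
const inputs = { Data: "$data1", Column: "Features", NumFolds: 2 };
console.log(variableNames(inputs)); // [ 'data1' ]
```

Nested arrays or objects that contain variables are not handled here; whether the real graph format allows them is not stated in this thread.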

TomFinley · 2018-06-04T20:18:41Z

docs/code/GraphRunner.md

+
+## Example of a JSON entry point manifest object, and the respective entry point graph node
+Let's consider the following manifest snippet, describing an entry point _'CVSplit.Split'_:
+```


You have in the other file been using the javascript type on these code blocks. This is probably a good practice to carry over to this file. #Closed

TomFinley · 2018-06-04T20:20:15Z

JSON Graph format

Maybe "Entry Points JSON Graph Format" might be a more unambiguous title. #Closed

Refers to: docs/code/GraphRunner.md:1 in 5da49a3.

TomFinley · 2018-06-04T20:21:13Z

docs/code/GraphRunner.md

+## Input and output types
+The following types are supported in JSON graphs:
+
+- _string_. Represented as a JSON string, maps to a C# string.


string

Should these types when listed here be listed as `string` vs. plain old string, since we are using C# keywords to describe them? (E.g.: `string`, `float`, `double`, `bool`, `enum`, `int`, `long`, etc.) This comment would not apply to things that are actually meant to be interpreted as prose descriptions of the type, e.g., "array." #WontFix

TomFinley · 2018-06-04T20:22:34Z

docs/code/GraphRunner.md

+
+## Variables
+The following input/output types can not be represented as a JSON value:
+- _DataView_


DataView

Is this usage intentional? There is no `DataView`, but there is an `IDataView`. Similar for file handles, the models, etc. #Closed

Also if these are meant to be actual types, should they not be in backticks, since they're meant to be interpreted as code?

In reply to: 192868598

TomFinley · 2018-06-04T20:23:49Z

docs/code/EntryPoints.md

+                    "src"
+                  ],
+                  "Required": false,
+                  "SortOrder": 150.0,


"SortOrder": 150.0, [](start = 18, length = 19)

These are kind of poor examples... SortOrder is identical between the two properties here, and in the enclosing scope they are also identical with sort order of 1. :) #Resolved

Leaving the 150 sort order intact, since it seems to be the de facto (not sure if intentional, though) default for advanced properties.

Updating the transform used for the example to a better one.

In reply to: 192868923

TomFinley · 2018-06-04T21:33:12Z

docs/code/EntryPoints.md

+This document briefly describes the structure of the entry points, the structure of an entry point manifest, and mentions the ML.Net classes that help construct an entry point
+graph.
+
+## `EntryPoint manifest - the definition of an entry point`


EntryPoint manifest - the definition of an entry point

This was put in code formatting. Was that intentional? #Closed

TomFinley · 2018-06-04T21:41:04Z

Overview

The header structure of this document is interesting. There is one top-level header, `# Overview`, but that seems to comprise the entire document, with the remainder of the document being `##` headers. (Which is odd, since we'd expect an overview to be a summary rather than the entire document.) Was the intention that there be another top-level header somewhere? #Closed

Refers to: docs/code/EntryPoints.md:1 in 5da49a3.

TomFinley · 2018-06-08T17:16:29Z

docs/code/EntryPoints.md

+
+## Overview
+
+An 'entry point', is a representation of a ML.NET type in JSON format. Entry points are used to serialize and deserialize an ML.NET type in JSON. 


An 'entry point', is a representation of a ML.NET type in JSON format.

I am not entirely enthusiastic about that description. I think the primary reason why I don't like it is, I think the phrase ML.NET type is misleading, or at least vague. If I were asked what an ML.NET type is I might say something like VBuffer or IDataView, and to me a representation as JSON makes me think that thing is being serialized, which is not the point of entry-points at all.

So maybe, we could replace a lot of this language with something like this (I don't insist on this exact wording):

Entry-points are a way to interface with ML.NET components, by specifying an execution graph of connected inputs and outputs of those components. Both the manifest describing available components and their inputs/outputs, and an "experiment" graph description, are expressed in JSON. Etc. Etc. #Closed

TomFinley · 2018-06-08T17:17:14Z

docs/code/EntryPoints.md

+
+An 'entry point', is a representation of a ML.NET type in JSON format. Entry points are used to serialize and deserialize an ML.NET type in JSON. 
+It is also the recommended way to interface with other languages. 
+Defined based on entry points, experiments are entry points DAGs, and respectively, entry points are experiment graph nodes.


Defined based on entry points, experiments are entry points DAGs, and respectively...

Could this be rephrased? I'm not quite sure what it is meant to express.

Experiments #Closed

TomFinley · 2018-06-08T17:17:19Z

docs/code/EntryPoints.md

+
+An example of an entry point manifest object, specifically for the MissingValueIndicator transform, is:
+
+```javascript


This is how it is actually written out, but I wonder if we could just format it a bit to make it a bit more tolerable. The document is dominated by this ~180 line monstrosity. I think it could be improved significantly by just deleting a bunch of whitespace. For example, we could make the stuff from lines 40 through 65 look more like this to save a bunch of lines.

"Values": ["I1", "U1", "I2", "U2", "I4", "U4", "I8", "U8", "R4", "Num", "R8", "TX", "Text", "TXT", "BL", "Bool", "TimeSpan", "TS", "DT", "DateTime", "DZ", "DateTimeZone", "UG", "U16"]

Basically I suppose I'd say if it looked more like someone actually wrote it vs. code-generated it would be a lot easier to appreciate and comprehend. I think we can get it to all fit on one page. Sometimes more lengthy cannot be helped, but in general and especially for the first example, I think it's important that it fit on one page. #Closed

TomFinley · 2018-06-12T15:05:36Z

docs/code/EntryPoints.md

+                            "Name": "ResultType",
+                            "Type": {
+                                "Kind": "Enum",
+                                "Values": [ "I1","I2","U2","I4","U4","I8","U8","R4","Num","R8","TX","Text","TXT","BL","Bool","TimeSpan","TS","DT","DateTime","DZ","DateTimeZone","UG","U16"]


"Values": [ "I1","I2","U2","I4","U4","I8","U8","R4","Num","R8","TX","Text","TXT","BL","Bool","TimeSpan","TS","DT","DateTime","DZ","DateTimeZone","UG","U16"] [](start = 32, length = 156)

Having judicious linebreaks is fine, just that one per element was a bit much.

I condensed all the '[ ]' to be on the same line as the element. Most of the arrays contain one element.

Keeping this in-line as well for consistency and to keep the graph shorter. I'll fix the spacing before/after the '[' and ']' to be consistent.

In reply to: 194775271

A comprehensible document understandable by its reader is the goal. Syntactic "consistency" isn't a goal. One way this can be incomprehensible is for it to be so long that the reader gets lost in the weeds, as was the case previously (and, frankly, is still the case). The other way it can be incomprehensible is to put everything on one line so structure can't be appreciated.

Think of it in these terms. If you were personally writing out this yourself, I doubt you would structure code in this way.

In reply to: 194776883

TomFinley · 2018-06-12T15:06:18Z

docs/code/EntryPoints.md

+
+Entry-points are a way to interface with ML.NET components, by specifying an execution graph of connected inputs and outputs of those components.
+Both the manifest describing available components and their inputs/outputs, and an "experiment" graph description, are expressed in JSON. 
+The recommended way of interacting with ML.NET through other programming languages is by composing, and exchanging pipeline or experiment graphs.  


through other programming languages

Specifically, non-.NET programming languages. #Closed

TomFinley

Thanks @sfilipi ! Still not wild about the example, but that's OK. If that's the most confusing thing people find about entry-points we ought to consider ourselves lucky I guess. :)

TomFinley · 2018-06-12T15:37:04Z

docs/code/EntryPoints.md

+
+## EntryPoint manifest - the definition of an entry point
+
+An example of an entry point manifest object, specifically for the MissingValueIndicator transform, is:


MissingValueIndicator

Consider using code formatting for class names. #Resolved

shauheen · 2018-06-13T21:36:14Z

Is this PR related to #160 ? #Resolved

sfilipi · 2018-06-14T17:17:00Z

Addresses part of it. I keep logging bugs about Entry Points, need to give everybody context about what they are.

In reply to: 397095572

GalOshri · 2018-06-18T17:30:34Z

docs/code/EntryPoints.md

+
+## Overview
+
+Entry-points are a way to interface with ML.NET components, by specifying an execution graph of connected inputs and outputs of those components.


Should we choose either "entry-points" or "entry points"? #Resolved

GalOshri · 2018-06-18T17:31:09Z

docs/code/EntryPoints.md

+Both the manifest describing available components and their inputs/outputs, and an "experiment" graph description, are expressed in JSON. 
+The recommended way of interacting with ML.NET through other, non-.NET programming languages, is by composing, and exchanging pipeline or experiment graphs.  
+
+Through the documentaiton, we also refer to them as 'entry points nodes', and not just entry points, and that is because they are used as nodes of the experiemnt graphs. 


Typo on "documentation" #Resolved

GalOshri · 2018-06-18T17:31:33Z

docs/code/EntryPoints.md

+Both the manifest describing available components and their inputs/outputs, and an "experiment" graph description, are expressed in JSON. 
+The recommended way of interacting with ML.NET through other, non-.NET programming languages, is by composing, and exchanging pipeline or experiment graphs.  
+
+Through the documentaiton, we also refer to them as 'entry points nodes', and not just entry points, and that is because they are used as nodes of the experiemnt graphs. 


Typo on "experiment" #Resolved

GalOshri · 2018-06-18T17:32:22Z

docs/code/EntryPoints.md

+
+Through the documentaiton, we also refer to them as 'entry points nodes', and not just entry points, and that is because they are used as nodes of the experiemnt graphs. 
+The graph 'variables', the various values of the experiment graph JSON properties serve to describe the relationship between the entry point nodes. 
+The 'variables' are therefore the edges of the DAG. 


Introduce the acronym on first use: DAG stands for "directed acyclic graph". #Resolved
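
To illustrate the quoted claim that variables act as the edges of the graph, here is a minimal JavaScript sketch that derives node-to-node edges from shared variables. The node shape (`Name`, `Inputs`, `Outputs`) and the node names are assumptions made for this example, and `variableEdges` is a hypothetical helper, not an ML.NET API.

```javascript
// Assumed node shape: { Name, Inputs, Outputs }, where a string value that
// begins with "$" refers to a graph variable.
const isVariable = (v) => typeof v === "string" && v.startsWith("$");

// Returns edges of the form { from, to, variable }: one per variable that is
// produced by one node's outputs and consumed by another node's inputs.
function variableEdges(nodes) {
  const producer = new Map(); // variable -> name of the node that outputs it
  for (const node of nodes) {
    for (const value of Object.values(node.Outputs || {})) {
      if (isVariable(value)) producer.set(value, node.Name);
    }
  }
  const edges = [];
  for (const node of nodes) {
    for (const value of Object.values(node.Inputs || {})) {
      if (isVariable(value) && producer.has(value)) {
        edges.push({ from: producer.get(value), to: node.Name, variable: value });
      }
    }
  }
  return edges;
}

// Two illustrative nodes; the names and fields are made up.
const graph = [
  { Name: "Example.Load", Inputs: { File: "$file" }, Outputs: { Data: "$data" } },
  { Name: "Example.Train", Inputs: { TrainingData: "$data" }, Outputs: { Model: "$model" } },
];
console.log(variableEdges(graph));
```

In this sample only `$data` yields an edge, because it is the only variable that one node outputs and another node consumes.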

GalOshri · 2018-06-18T17:35:02Z

docs/code/EntryPoints.md

+
+## EntryPoint manifest - the definition of an entry point
+
+An example of an entry point manifest object, specifically for the `MissingValueIndicator` transform, is:


I think the example is not the MissingValueIndicator transform #Resolved

GalOshri · 2018-06-18T17:38:14Z

docs/code/GraphRunner.md

+- If the variable is present only in _inputs_, but never in _outputs_, it is a _graph input_. All graph inputs must be provided before
+a graph can be run.
+- The variable has a type, which is the type of inputs (and, optionally, output) that it appears in. If the type of the variable is 
+ambiguous, TLC throws an exception.


Change to ML.NET? #Resolved
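
The graph-input rule in the quoted text (a variable that appears only in inputs and never in outputs) can be sketched in JavaScript as follows. The node shape (`Name`, `Inputs`, `Outputs`) and the node names are assumptions for illustration, and `graphInputs` is a hypothetical helper rather than anything the runner exposes.

```javascript
// Assumed node shape: { Name, Inputs, Outputs }. A string value that begins
// with "$" is treated as a variable reference.
const isVar = (value) => typeof value === "string" && value.startsWith("$");

// Returns the variables that appear in some node's inputs but in no node's
// outputs, i.e. the values a caller would have to supply before running.
function graphInputs(nodes) {
  const consumed = new Set();
  const produced = new Set();
  for (const node of nodes) {
    Object.values(node.Inputs || {}).filter(isVar).forEach((v) => consumed.add(v));
    Object.values(node.Outputs || {}).filter(isVar).forEach((v) => produced.add(v));
  }
  return [...consumed].filter((v) => !produced.has(v));
}

// Illustrative two-node graph; names and fields are made up.
const nodes = [
  { Name: "Example.Split", Inputs: { Data: "$data" }, Outputs: { TrainData: "$train" } },
  { Name: "Example.Train", Inputs: { TrainingData: "$train" }, Outputs: { Model: "$model" } },
];
console.log(graphInputs(nodes)); // [ '$data' ]
```

Type checking of variables, which the quoted text also mentions, is left out of this sketch.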

GalOshri · 2018-06-18T17:40:21Z

Thank you for creating this! Would it be useful to include a bit more information on how to turn an ML.NET component into an entrypoint (the C# code that needs to be added) and how the manifest is created? #Resolved

GalOshri · 2018-06-19T06:33:40Z

docs/code/EntryPoints.md

+
+## How to create an entry point for an existing ML.Net component
+
+The steps to take, to create an entry point for an existing ML.Net component, are:


"ML.Net" -> ML.NET (also in the header) #Resolved

GalOshri · 2018-06-19T06:34:38Z

docs/code/EntryPoints.md

-parameter.
+parameter.
+
+## How to create an entry point for an existing ML.Net component


Might be worth linking to an example somewhere in the code. #Resolved

shauheen

Adding EntryPoints.md and GraphRunner.md

5da49a3

sfilipi requested review from Ivanidzo4ka, TomFinley and yaeldekel June 4, 2018 18:48

TomFinley reviewed Jun 4, 2018

sfilipi added 2 commits June 5, 2018 09:35

addressing PR feedback

63b9fe8

Updating the title of the GraphRunner.md file

73cb7c8

sfilipi requested a review from GalOshri June 5, 2018 17:24

TomFinley reviewed Jun 8, 2018

adressing Tom's feedback

a962fe9

TomFinley reviewed Jun 12, 2018

adressing feedback

12d3537

TomFinley approved these changes Jun 12, 2018

TomFinley reviewed Jun 12, 2018

code formatting for class names

cca4f43

GalOshri reviewed Jun 18, 2018

Addressing Gal's comments

55174f3

GalOshri reviewed Jun 19, 2018

sfilipi added 2 commits June 19, 2018 08:10

Adding an example of an entry point. Fixing casing on ML.NET

14a727c

fixing link

e9b3a11

shauheen approved these changes Jun 21, 2018

sfilipi merged commit ecc6857 into dotnet:master Jun 21, 2018

sfilipi mentioned this pull request Jun 22, 2018

Add documentation about Entry Points #390

Closed

sfilipi deleted the entrypointdoc branch July 5, 2018 18:18

ghost locked as resolved and limited conversation to collaborators Mar 30, 2022
