
Xconfigs #1170

Closed

danpovey wants to merge 13 commits into kaldi-asr:master from danpovey:xconfig

Conversation

@danpovey
Contributor

danpovey commented Nov 4, 2016

This PR is a start towards getting xconfig files working. I am hoping @vijayaditya will finish it...
Note: see the bottom of xconfig_to_configs.py for examples of usage.

This PR just establishes what the framework looks like and adds enough layer types to support TDNNs. But it's not well tested (other than to the extent that it runs and generates superficially plausible-looking config files). I'm sure there are many issues with the Python that should be fixed or reorganized.

TODOs:

  • Create example scripts based on this, for TDNNs in regular and chain models. For now, I think example scripts should just create an xconfig file manually from the script in local/, and generate the config files from that. Then, if we think it's needed, we can create a nice script to generate the xconfig files themselves; we can do that at our leisure, when we're not rushed. E.g. for now do something like:
cat >xconfig <<EOF
input dim=100 name=ivector
input dim=40 name=input
fixed-affine-layer name=lda input=Append(-2,-1,0,1,2,ReplaceIndex(ivector, t, 0)) affine-transform-file=foo/bar/lda.mat
relu-renorm-layer name=layer1 input=Append(-1,0,1) dim=1024
relu-renorm-layer name=layer2 input=Append(-3,0,3) dim=1024
relu-renorm-layer name=layer3 input=Append(-3,0,3) dim=1024
relu-renorm-layer name=layer4 input=Append(-3,0,3) dim=1024
output-layer name=output dim=5431
EOF

(and obviously you can use variables for things like the hidden-dim and output-dim, but don't overdo it with variables for things like the MFCC dim, as this is only a temporary patch...)
I think it might be a good idea to expose the xconfig format like this, because it's something that users can easily edit, and putting it in their face like this might encourage experimentation.

  • Extend to LSTMs and BLSTMs and of course create example scripts for those.

There are some unfinished aspects to how this script creates the config/ directory, in particular the 'vars' file:

  • It does not write the model_left_context and model_right_context. These should be worked out by xconfig_to_configs.py using nnet3-init | nnet3-info; that's why the 'ref.config' is generated. I think a function for this should be added in xconfig_to_configs.py (see the sketch below this list). We can assume the path is set up correctly, and just print an error message about the path if it is not.
  • The script should write num_hidden_layers=1 to vars. This will mean the current training script just works. Eventually we'll make the training script ignore this value, and remove support for layer-by-layer discriminative pretraining.
  • The script should write 'num_targets' as the output dim of whatever layer is called 'output': just go over all_layers, find a layer whose name is 'output', and inspect its member foo.configs['dim'].
  • 'include_log_softmax' and 'objective_type' can be obtained in exactly the same way as 'num_targets' above. [If there is no layer named 'output', just don't write these variables.]

So obviously there is a bunch of testing that needs to be done, from comparing the generated config files, to running the recipes without layerwise pretraining.
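A rough sketch of what such a function in xconfig_to_configs.py could look like (the helper name write_vars_file and the exact 'left-context:'/'right-context:' fields in the nnet3-info output are assumptions here, not part of this PR):

from __future__ import print_function
import re, subprocess

def write_vars_file(config_dir, all_layers):
    # Compile ref.config and read the model's context from nnet3-info
    # (assumes the Kaldi nnet3 binaries are on the PATH).
    try:
        info = subprocess.check_output(
            "nnet3-init {0}/ref.config - | nnet3-info -".format(config_dir),
            shell=True).decode()
    except Exception:
        raise RuntimeError("Error running nnet3-init | nnet3-info; "
                           "check that your path is set up correctly.")
    left = re.search(r'left-context: (-?\d+)', info).group(1)
    right = re.search(r'right-context: (-?\d+)', info).group(1)
    with open(config_dir + '/vars', 'w') as f:
        print('model_left_context=' + left, file=f)
        print('model_right_context=' + right, file=f)
        # Keeps the current training scripts working; to be removed once
        # layer-wise discriminative pretraining is dropped.
        print('num_hidden_layers=1', file=f)
        # num_targets etc. come from the layer named 'output', if any.
        for layer in all_layers:
            if layer.Name() == 'output':
                print('num_targets=' + str(layer.configs['dim']), file=f)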

@vijayaditya
Contributor

OK, I am looking at this now.


# f = open(config_dir + 'vars', 'w')
# print('model_left_context=' + str(left_context), file=f)
# print('model_right_context=' + str(right_context), file=f)
# print('num_hidden_layers=' + str(num_hidden_layers), file=f)
Contributor

If we are not supporting layer-wise pretraining, this would be an unnecessary variable. Also, it does not have a proper interpretation in a non-linear network.

Contributor Author

I'm OK with removing it; we'd just have to modify the training scripts to do the right thing if it was not there.
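A minimal sketch of what that might look like in a training-script helper (read_vars and its default here are hypothetical, just to illustrate):

def read_vars(vars_file):
    # Parse key=value lines from 'vars', defaulting num_hidden_layers
    # to 1 when the file no longer writes it.
    variables = {'num_hidden_layers': '1'}
    for line in open(vars_file):
        line = line.strip()
        if line and not line.startswith('#') and '=' in line:
            key, value = line.split('=', 1)
            variables[key] = value
    return variables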


# Note: currently we just copy all lines that were going to go to 'all', into
# 'layer1', to avoid propagating this nastiness to the code in xconfig_layers.py
ans['layer1'] = ('# This file was created by the command:\n'
Contributor

As we are planning to make pretraining optional in the training scripts, I think this would be unnecessary.

Contributor Author

Maybe modifying the training scripts is the right thing to do. We could have them look for 'all.config' if layer1.config is absent.
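For instance, something along these lines (a sketch only; choose_config is an invented name):

import os

def choose_config(config_dir):
    # Prefer the old layer-wise first config if present; otherwise fall
    # back to the single 'all.config' written by the xconfig setup.
    config_file = os.path.join(config_dir, 'layer1.config')
    if not os.path.exists(config_file):
        config_file = os.path.join(config_dir, 'all.config')
    return config_file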

# returns true if 'name' is valid as the name of a line (input, layer or output);
# this is the same as IsValidName() in the nnet3 code.
def IsValidLineName(name):
    return isinstance(name, str) and re.match(r'^[a-zA-Z_][-a-zA-Z_0-9.]*', name) is not None
Contributor

A reminder to myself: Add support for "." in C++ code. (See #944)
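For reference, the Python regex above already accepts dotted names; these illustrative checks (not part of the PR) pass against the function as quoted:

assert IsValidLineName('lstm2.memory_cell')   # '.' is accepted here
assert not IsValidLineName('2lstm')           # must not start with a digit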

# [utility function used in xconfig_layers.py]
# this converts a layer-name like 'ivector' or 'input', or a sub-layer name like
# 'lstm2.memory_cell', into a dimension. 'all_layers' is a vector of objects
# inheriting from XconfigLayerBase. 'current_layer' is provided so that the
# function can make sure not to look in layers that appear *after* this layer
# (because that's not allowed).
Contributor

But what if we are trying to access the value of a layer which appears after this layer, at a previous time step? Google has published architectures like these at previous conferences, and they seem to benefit the model.

@danpovey
Contributor Author

danpovey commented Nov 7, 2016

> A reminder to myself: Add support for "." in C++ code. (See #944)

I thought this had been done, but it looks like it has not. Anyway, it's just minor changes to nnet-parse.cc:IsValidName().


@danpovey
Contributor Author

danpovey commented Nov 7, 2016

> But what if we are trying to access the value of a layer which appears after this layer, at a previous time step?

Let's worry about handling that case later on. It's always possible to define layer types in the xconfig file that actually add several layers at once, to handle this kind of thing. The current code assumes this cannot happen; it could be reworked in principle, but I don't want this to be a blocker, as it's a very obscure use-case.

# Returns a list of all qualifiers (meaning auxiliary outputs) that this
# layer supports. These are either 'None' for the regular output, or a
# string (e.g. 'projection' or 'memory_cell') for any auxiliary outputs that
# the layer might provide. Most layer types will not need to override this.
def Qualifiers(self):
Contributor

I am a bit confused about the choice of the name Qualifiers for auxiliary outputs. Could you please give a brief explanation?

@danpovey
Contributor Author

danpovey commented Nov 8, 2016

It was just the name that came to mind; the dictionary definition is "a word or phrase, especially an adjective, used to attribute a quality to another word, especially a noun." That function could be renamed AuxiliaryOutputNames() if you feel that's unclear.
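To make the intent concrete, a hypothetical LSTM layer (not part of this PR) might override it like this:

class XconfigLstmLayer(XconfigLayerBase):
    # None stands for the layer's regular output; the strings name
    # auxiliary outputs, so descriptors can refer to e.g. 'lstm2.memory_cell'.
    def Qualifiers(self):
        return [None, 'memory_cell', 'projection']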


@jtrmal
Contributor

jtrmal commented Nov 8, 2016

For me, as a non-native speaker, AuxiliaryOutputNames sounds more self-explanatory.
Just FYI :) no pressure.
y.


def __init__(self, first_token, key_to_value, prev_names = None):
    # Here we just list some likely combinations.. you can just add any
    # combinations you want to use, to this list.
    assert first_token in [ 'relu-layer', 'relu-renorm-layer', 'sigmoid-layer',
Contributor

These names were a bit confusing for me. I would like to change them to affine-relu-layer, affine-relu-renorm-layer, ...

#
# Configuration values that we might one day want to add here, but which we
# don't yet have, include target-rms (affects 'renorm' component).
class XconfigSimpleLayer(XconfigLayerBase):
Contributor

I am wondering if readers would assume that SimpleLayer is a special name, like SimpleComponent in the C++ code, which has a special meaning. I would recommend changing this to XconfigAffineNonlinLayer.

@danpovey
Contributor Author

danpovey commented Nov 8, 2016

Maybe BasicLayer would be better than SimpleLayer.

I considered names of the form *-affine-layer, but the reason I didn't go for it in the end is that all known forms of neural net have some kind of affine transform in there; it's inherent in the concept of a layer, so I figured it didn't add information. We can reserve affine-layer for a "bare" layer that has no nonlinearity in it.


# Parse Descriptors and get their dims and their 'final' string form.
# Put them as 4-tuples (descriptor, string, normalized-string, final-string)
# in self.descriptors[key]
for key in self.GetDescriptorConfigs():
Contributor

This function just returns the names of the input descriptors used by a layer, right? Or do you plan to use it to store other things?

I would like to call it GetInputDescriptorNames(), as it will just be returning the expected input names and not the actual configs.

@danpovey
Contributor Author

Yes, that's right. Your name makes sense.
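For context, a layer with an auxiliary input would override this method (under either name) along these lines; 'aux-input' is an invented config key for illustration:

def GetInputDescriptorNames(self):
    # Both 'input' and 'aux-input' config values will be parsed as
    # Descriptors by the base class.
    return ['input', 'aux-input']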


@vijayaditya
Contributor

Continued in #1197.
