OK, I am looking at this now.
```python
# f = open(config_dir + 'vars', 'w')
# print('model_left_context=' + str(left_context), file=f)
# print('model_right_context=' + str(right_context), file=f)
# print('num_hidden_layers=' + str(num_hidden_layers), file=f)
```
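For context, the 'vars' file that this commented-out code would write contains lines like the following; the keys come from the print statements above, and the values here are just placeholders:

```
model_left_context=16
model_right_context=12
num_hidden_layers=5
```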
If we are not supporting layer-wise pretraining, this would be an unnecessary variable. Also, it does not have a proper interpretation in a network that is not a linear chain of layers.
I'm OK to remove it; we'd just have to modify the training scripts to do the right thing if it was not there.
```python
# Note: currently we just copy all lines that were going to go to 'all', into
# 'layer1', to avoid propagating this nastiness to the code in xconfig_layers.py
ans['layer1'] = ('# This file was created by the command:\n'
```
As we are planning to make pretraining optional in the training scripts, I think this would be unnecessary.
Maybe modifying the training scripts is the right thing to do. We could have them look for 'all.config' if layer1.config is absent.
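A minimal sketch of that fallback in the training scripts (first_config_path is a hypothetical helper, not code from this PR):

```python
import os

def first_config_path(config_dir):
    # Prefer layer1.config (layer-wise pretraining); otherwise fall back to
    # all.config, which holds the whole network in one config.
    for name in ['layer1.config', 'all.config']:
        path = os.path.join(config_dir, name)
        if os.path.exists(path):
            return path
    raise RuntimeError('Neither layer1.config nor all.config found in ' + config_dir)
```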
```python
# returns true if 'name' is valid as the name of a line (input, layer or output);
# this is the same as IsValidName() in the nnet3 code.
def IsValidLineName(name):
    return isinstance(name, str) and re.match(r'^[a-zA-Z_][-a-zA-Z_0-9.]*', name) != None
```
A reminder to myself: add support for "." in the C++ code (see #944).
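To illustrate what the quoted check accepts, including the '.' that the C++ side does not yet support, here is a small usage example (the function is repeated so the snippet is self-contained; the expected output is shown in comments):

```python
import re

def IsValidLineName(name):
    return isinstance(name, str) and re.match(r'^[a-zA-Z_][-a-zA-Z_0-9.]*', name) != None

for name in ['tdnn1', 'lstm2.memory_cell', '_hidden-3', '2bad']:
    print(name, IsValidLineName(name))
# tdnn1 True
# lstm2.memory_cell True
# _hidden-3 True
# 2bad False
```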
```python
# 'lstm2.memory_cell', into a dimension. 'all_layers' is a vector of objects
# inheriting from XconfigLayerBase. 'current_layer' is provided so that the
# function can make sure not to look in layers that appear *after* this layer
# (because that's not allowed).
```
But what if we are trying to access the value of a layer which appears after this layer, at a previous time step? Google has published architectures like this at recent conferences, and they seem to benefit the model.
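Concretely, the case being asked about would look something like the following hypothetical xconfig line, in which layer2 reads the output of the later layer3 at an earlier time step through an Offset descriptor (this is exactly what the code as written disallows):

```
relu-renorm-layer name=layer2 dim=512 input=Append(layer1, Offset(layer3, -3))
```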
I thought this had been done, but it looks like it has not. Anyway, it's …
```python
# layer supports. These are either 'None' for the regular output, or a
# string (e.g. 'projection' or 'memory_cell') for any auxiliary outputs that
# the layer might provide. Most layer types will not need to override this.
def Qualifiers(self):
```
I am a bit confused about the choice of the name Qualifiers for auxiliary outputs. Could you please give a brief explanation?
It was just the name that came to mind; the dictionary definition is "a word or …"
For me, as a non-native speaker, AuxiliaryOutputNames sounds more self-explanatory.
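For concreteness, a layer with auxiliary outputs would override this method roughly as follows (a hypothetical LSTM-style layer; only the method name and the example qualifier strings come from the quoted comment):

```python
class XconfigLstmLayer(XconfigLayerBase):
    def Qualifiers(self):
        # None stands for the regular output; the strings name auxiliary
        # outputs, so that e.g. 'lstm2.memory_cell' can be referenced elsewhere.
        return [None, 'projection', 'memory_cell']
```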
```python
def __init__(self, first_token, key_to_value, prev_names = None):
    # Here we just list some likely combinations.. you can just add any
    # combinations you want to use, to this list.
    assert first_token in [ 'relu-layer', 'relu-renorm-layer', 'sigmoid-layer',
```
These names were a bit confusing for me. I would like to change them to affine-relu-layer, affine-relu-renorm-layer, ...
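For reference, a line using one of these layer types in an xconfig file looks something like this (the layer name, dimension and splicing offsets are illustrative):

```
relu-renorm-layer name=tdnn2 dim=512 input=Append(-1,0,1)
```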
```python
#
# Configuration values that we might one day want to add here, but which we
# don't yet have, include target-rms (affects 'renorm' component).
class XconfigSimpleLayer(XconfigLayerBase):
```
I am wondering if readers would assume that SimpleLayer is a special name with a special meaning, like SimpleComponent in the C++ code. I would recommend changing this to XconfigAffineNonlinLayer.
Maybe BasicLayer would be better than SimpleLayer. I considered names of the form *-affine-layer, but the reason I didn't go …
```python
# Parse Descriptors and get their dims and their 'final' string form.
# Put them as 4-tuples (descriptor, string, normalized-string, final-string)
# in self.descriptors[key]
for key in self.GetDescriptorConfigs():
```
This function just returns the names of the input descriptors used by a layer, right? Or do you plan to use it to store other things?
I would like to call it GetInputDescriptorNames(), as it will just be returning the expected input names and not the actual configs.
Yes, that's right. Your name makes sense.
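Under the proposed rename, the method for a typical layer might look roughly like this (a sketch, not code from the PR; the actual default could live in the base class):

```python
def GetInputDescriptorNames(self):
    # Most layer types read a single input descriptor, configured via the
    # 'input' key of the xconfig line; layers such as LSTMs may list more.
    return ['input']
```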
Continued in #1197.
This PR is a start towards getting xconfig files working. I am hoping @vijayaditya will finish it...
Note: see the bottom of xconfig_to_configs.py for examples of usage.
This PR just establishes what the framework looks like and adds enough layer types to support TDNNs. But it's not well tested (other than to the extent that it runs and generates superficially plausible-looking config files). I'm sure there are many issues with the Python that should be fixed or reorganized.
TODOs:
(and obviously you can use variables for things like the hidden-dim and output-dim, but don't overdo it with variables for things like the MFCC dim, as this is only a temporary patch...)
I think it might be a good idea to expose the xconfig format like this, because it's something that users can easily edit, and putting it in their face like this might encourage experimentation.
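As a sketch of what such a user-facing xconfig file might look like for a small TDNN (the layer names, dimensions and splicing offsets are illustrative, not taken from this PR):

```
input name=input dim=40
relu-renorm-layer name=tdnn1 dim=512 input=Append(-2,-1,0,1,2)
relu-renorm-layer name=tdnn2 dim=512 input=Append(-1,0,1)
output-layer name=output dim=3000
```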
There are some unfinished aspects to how this script creates the config/ directory, in particular the 'vars' file:
So obviously there is a bunch of testing that needs to be done, from comparing the generated config files to running the recipes without layer-wise pretraining.