deep matrix factorization question #41

Open
jmschrei opened this issue Feb 23, 2017 · 1 comment

jmschrei commented Feb 23, 2017

The code currently looks like this:

def get_one_layer_mlp(hidden, k):
    # input
    user = mx.symbol.Variable('user')
    item = mx.symbol.Variable('item')
    score = mx.symbol.Variable('score')
    # user latent features
    user = mx.symbol.Embedding(data = user, input_dim = max_user, output_dim = k)
    user = mx.symbol.Activation(data = user, act_type="relu")
    user = mx.symbol.FullyConnected(data = user, num_hidden = hidden)
    # item latent features
    item = mx.symbol.Embedding(data = item, input_dim = max_item, output_dim = k)
    item = mx.symbol.Activation(data = item, act_type="relu")
    item = mx.symbol.FullyConnected(data = item, num_hidden = hidden)
    # predict by the inner product
    pred = user * item
    pred = mx.symbol.sum_axis(data = pred, axis = 1)
    pred = mx.symbol.Flatten(data = pred)
    # loss layer
    pred = mx.symbol.LinearRegressionOutput(data = pred, label = score)
    return pred

My understanding is that the embedding layer should be able to learn anything that a single dense layer on top of it could learn, since each id's embedding is itself a free parameter vector and can take essentially any value. I had thought a deep matrix factorization would look something more like this:

def get_one_layer_mlp(hidden, k):
    # input
    user = mx.symbol.Variable('user')
    item = mx.symbol.Variable('item')
    score = mx.symbol.Variable('score')
    # user latent features
    user = mx.symbol.Embedding(data = user, input_dim = max_user, output_dim = k)

    # item latent features
    item = mx.symbol.Embedding(data = item, input_dim = max_item, output_dim = k)

    # predict with an MLP on the concatenated latent features
    pred = mx.symbol.Concat(user, item, dim = 1)
    pred = mx.symbol.FullyConnected(data = pred, num_hidden = hidden)
    pred = mx.symbol.Activation(data = pred, act_type="relu")
    pred = mx.symbol.FullyConnected(data = pred, num_hidden = 1)

    # loss layer
    pred = mx.symbol.LinearRegressionOutput(data = pred, label = score)
    return pred

Basically, the network should take a concatenation of the latent vectors and put layers on top of that, instead of putting layers on top of each embedding separately before the inner product.
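
For what it's worth, the point about the embedding already being fully flexible can be checked numerically. Below is a small NumPy sketch (shapes and variable names are made up for illustration, not taken from the repo; biases are omitted): applying a per-tower ReLU and dense layer to an embedding still maps each id to one fixed vector, so the composition can be folded back into a single embedding table.

import numpy as np

max_user, k, hidden = 5, 4, 3
rng = np.random.default_rng(1)

E = rng.normal(size=(max_user, k))   # user embedding table
W = rng.normal(size=(k, hidden))     # dense layer applied to each user's embedding

ids = np.array([0, 2, 4])

# Embedding -> ReLU -> FullyConnected, as in the first architecture's user tower
tower_out = np.maximum(E[ids], 0) @ W

# The same mapping folded into a single, pre-computed embedding table:
# every user id still maps to one fixed vector, so the per-tower layers
# add no expressive power beyond what an embedding alone could represent.
E_equiv = np.maximum(E, 0) @ W
assert np.allclose(tower_out, E_equiv[ids])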

mli commented Mar 3, 2017

The second one also makes sense, but you can think of the first one as a special case of the second: the former applies a fully connected layer to the user and item embeddings separately, which corresponds to a fully connected layer on the concatenation whose weight matrix has a block structure.
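
To make the block-structure remark concrete, here is a small NumPy sketch (variable names and shapes are illustrative, not from the repo; biases, the ReLU, and the final inner product are omitted) showing that separate dense layers on the user and item towers are equivalent to one dense layer on the concatenated embeddings with a block-diagonal weight matrix. The general concatenated layer simply allows the off-diagonal blocks to be non-zero, which is why the first architecture is a special case of the second.

import numpy as np

k, hidden = 4, 3
rng = np.random.default_rng(0)

user_emb = rng.normal(size=(2, k))   # a batch of user latent vectors
item_emb = rng.normal(size=(2, k))   # a batch of item latent vectors
W_u = rng.normal(size=(k, hidden))   # dense layer on the user side
W_i = rng.normal(size=(k, hidden))   # dense layer on the item side

# First architecture: separate dense layers on each tower
u_out = user_emb @ W_u
i_out = item_emb @ W_i

# Second architecture restricted to a block-diagonal weight matrix:
# one dense layer acting on the concatenated embeddings
W_block = np.block([[W_u, np.zeros((k, hidden))],
                    [np.zeros((k, hidden)), W_i]])
concat_out = np.concatenate([user_emb, item_emb], axis=1) @ W_block

# The concatenated-input layer reproduces the two towers side by side
assert np.allclose(concat_out, np.concatenate([u_out, i_out], axis=1))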
