Hi, I am confused about the specific role of gr_output_length.
The length of HSTU's input sequence is (max_sequence_length + gr_output_length + 1).
For the 'gr_output_length' part, the corresponding historical_ids are 0, as shown in the code below:
Since these historical_ids are 0, this part seems to have no effect on the final result. So what is its purpose?
Is it related to your paper, where gr_output_length would correspond to the number of vectors in the user's multi-vector representation?
Hi, the zero padding in the id sequence comes from our specific implementation of the relative position bias. In our public experiments for the sequential recommender (retrieval) setting, we use timestamp[j+1] - timestamp[i] to derive the time delta from i to j (note that here we autoregressively predict the (j+1)-th element).
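To make the indexing above concrete, here is a minimal sketch in plain Python, not the repository's actual code. The helper names `pad_sequence` and `relative_time_deltas` are hypothetical, and repeating the last timestamp in the padded slots is an assumption made only for illustration. It shows why timestamp[j+1] - timestamp[i] needs at least one slot past the last real item: without it, j+1 would index out of range for the final position.

```python
def pad_sequence(ids, timestamps, extra_slots):
    """Append `extra_slots` zero ids to the history.

    Assumption for this sketch: padded slots reuse the last real timestamp.
    """
    padded_ids = list(ids) + [0] * extra_slots
    padded_ts = list(timestamps) + [timestamps[-1]] * extra_slots
    return padded_ids, padded_ts


def relative_time_deltas(padded_ts, n):
    """Return an n x n table with delta[i][j] = padded_ts[j + 1] - padded_ts[i].

    padded_ts must hold at least n + 1 entries so that j + 1 is always valid.
    """
    if len(padded_ts) < n + 1:
        raise ValueError("need at least one padded slot beyond the n real items")
    return [[padded_ts[j + 1] - padded_ts[i] for j in range(n)] for i in range(n)]


# Three real items plus one padded slot (illustrative values only).
ids = [11, 12, 13]
timestamps = [100, 130, 190]
padded_ids, padded_ts = pad_sequence(ids, timestamps, extra_slots=1)
deltas = relative_time_deltas(padded_ts, n=len(ids))
```

In this sketch the padded id 0 carries no item information; the extra slot exists only so that the lookup at j+1 stays in bounds for the last real position.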
https://arxiv.org/pdf/2306.04039