
Conversation

@abheesht17 (Collaborator) commented Dec 30, 2022

@mattdangerw (Member) left a comment


This looks good to me!! I only see super minor cosmetic stuff, so I will just fix directly and merge.


@keras.utils.register_keras_serializable(package="keras_nlp")
class AlbertPreprocessor(keras.layers.Layer):
"""A ALBERT preprocessing layer which tokenizes and packs inputs.
@mattdangerw (Member) commented on this line:

A -> An
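
For orientation, here is a minimal usage sketch of the layer under review, assuming the keras_nlp preset API; the preset name and example text are illustrative and not taken from this PR:

import keras_nlp

# Load a preprocessor from a preset (preset name assumed for illustration).
preprocessor = keras_nlp.models.AlbertPreprocessor.from_preset("albert_base_en_uncased")

# Tokenize and pack a single segment; the result is a feature dictionary.
features = preprocessor("The quick brown fox jumped.")
print(sorted(features.keys()))  # ['padding_mask', 'segment_ids', 'token_ids']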

- Pack the inputs together using a `keras_nlp.layers.MultiSegmentPacker`.
with the appropriate `"[CLS]"`, `"[SEP]"` and `"<pad>"` tokens.
- Construct a dictionary with keys `"token_ids"`, `"segment_ids"` and
`"padding_mask"`, that can be passed directly to a ALBERT model.
@mattdangerw (Member) commented on this line:

probably preexisting, but might be better to be specific here.

that can be passed directly to keras_nlp.models.AlbertBackbone
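
A minimal sketch of the wiring this suggestion refers to, assuming the keras_nlp preset API; the preset name is illustrative and not part of this PR's diff:

import keras_nlp

preprocessor = keras_nlp.models.AlbertPreprocessor.from_preset("albert_base_en_uncased")
backbone = keras_nlp.models.AlbertBackbone.from_preset("albert_base_en_uncased")

# The preprocessor's output dictionary can be passed directly to the backbone.
inputs = preprocessor(["The quick brown fox jumped."])
outputs = backbone(inputs)  # dict with "sequence_output" and "pooled_output"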

@mattdangerw merged commit 802c7ef into keras-team:master on Jan 5, 2023
Successfully merging this pull request may close the issue: Add AlbertTokenizer and AlbertPreprocessor
