Labels
stat:contributions welcome — Add this label to feature request issues so they are separated out from bug reporting issues.
Description
The code should go into keras_nlp/benchmarks.
We can use the IMDB sentiment analysis task; guidance for it can be found here.
One challenging point is that we want this script to evaluate all of our Classifier models without writing custom code. Since every Classifier has a matching Preprocessor, and they follow the unified naming format {model_name}Classifier / {model_name}Preprocessor (e.g., BertClassifier / BertPreprocessor), we should be able to make the code reusable by adding a model_name flag.
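The naming convention above suggests the script can resolve both classes dynamically from the flag value. A minimal sketch of that lookup, using a stand-in namespace in place of the real keras_nlp.models module (the stand-in classes, the `capitalize()` mapping from flag value to class prefix, and the helper name are all illustrative assumptions):

```python
import types


# Stand-in classes; in the real script these would come from keras_nlp.models.
class BertClassifier:
    pass


class BertPreprocessor:
    pass


# Stand-in for the keras_nlp.models namespace.
models = types.SimpleNamespace(
    BertClassifier=BertClassifier,
    BertPreprocessor=BertPreprocessor,
)


def resolve_model_classes(model_name, namespace):
    """Map a --model flag value like "bert" to its Classifier/Preprocessor pair.

    Assumes the {ModelName}Classifier / {ModelName}Preprocessor convention;
    a simple capitalize() works for "bert" but multi-word names would need a
    real lookup table.
    """
    prefix = model_name.capitalize()  # "bert" -> "Bert"
    classifier_cls = getattr(namespace, f"{prefix}Classifier")
    preprocessor_cls = getattr(namespace, f"{prefix}Preprocessor")
    return classifier_cls, preprocessor_cls
```

With this, adding a new model to the benchmark requires no script changes as long as it follows the naming convention.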
Here are the requirements in more detail:
- example file name: keras_nlp/benchmarks/sentiment_analysis.py
- example running command:

      python keras_nlp/benchmarks/sentiment_analysis.py \
          --model="bert" \
          --preset="bert_small_en_uncased" \
          --learning_rate=5e-5 \
          --num_epochs=5 \
          --batch_size=32

  --model specifies the model name, and --preset specifies the preset under testing. --preset could be None, while --model is required. The other flags are common training flags.
- output: print out a few metrics, including:
- validation accuracy/F1 for each epoch.
- testing accuracy/F1 after training is done.
- total elapsed time (in seconds).
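The flag behavior described above (required --model, optional --preset, training flags with the defaults from the example command) can be sketched with argparse; the actual script may well use a different flag library, so treat this as an illustration only:

```python
import argparse


def parse_args(argv=None):
    """Parse the benchmark flags shown in the example running command.

    Defaults mirror the example invocation; --preset defaults to None and
    --model is required, per the requirements above.
    """
    parser = argparse.ArgumentParser(
        description="Sentiment analysis benchmark for KerasNLP classifiers."
    )
    parser.add_argument("--model", required=True,
                        help="model name, e.g. 'bert'")
    parser.add_argument("--preset", default=None,
                        help="preset under testing (optional)")
    parser.add_argument("--learning_rate", type=float, default=5e-5)
    parser.add_argument("--num_epochs", type=int, default=5)
    parser.add_argument("--batch_size", type=int, default=32)
    return parser.parse_args(argv)
```

Keeping all tunables behind flags is what lets one script benchmark every Classifier/Preprocessor pair without per-model code.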