We are interested in building out our offering of seq2seq-capable models. One higher-priority model here is [BART](https://arxiv.org/abs/1910.13461), which is reasonably small, useful for both discriminative and generative tasks, and quite popular. The first step will be to implement a backbone. Here's a template PR -> https://github.com/keras-team/keras-nlp/pull/622