The current text extraction benchmark says nothing about how well newline characters are recognized. We need a new benchmark that measures this.
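
One possible way to score this, as a rough sketch rather than a settled design: treat each newline as a boundary identified by the number of non-whitespace characters before it, then compare the boundaries in the extracted text against a ground-truth text. The names `newline_positions` and `newline_scores` are hypothetical, and the approach assumes the non-whitespace content of both texts is (nearly) identical.

```python
from __future__ import annotations


def newline_positions(text: str) -> set[int]:
    """Return newline boundaries as counts of preceding non-whitespace characters.

    Consecutive newlines share the same count, so blank lines collapse
    into a single boundary in this simplified sketch.
    """
    positions: set[int] = set()
    count = 0
    for ch in text:
        if ch == "\n":
            positions.add(count)
        elif not ch.isspace():
            count += 1
    return positions


def newline_scores(expected: str, extracted: str) -> tuple[float, float, float]:
    """Return (precision, recall, F1) of newline boundaries in `extracted`."""
    exp = newline_positions(expected)
    got = newline_positions(extracted)
    hits = len(exp & got)
    # With no boundaries on one side, that side's score is defined as 1.0 here.
    precision = hits / len(got) if got else 1.0
    recall = hits / len(exp) if exp else 1.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    expected = "Hello world\nSecond line\nThird"
    extracted = "Hello world Second line\nThird"  # first newline was lost
    print(newline_scores(expected, extracted))
```

In this example the extractor keeps one of the two expected newlines and adds none of its own, so precision is 1.0 and recall is 0.5. Counting only non-whitespace characters keeps the score independent of how spaces are extracted, but it would need a proper alignment step if the extracted characters themselves differ from the ground truth.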