This simple library separates quotations from the useful content of email messages. Its main goal is to handle as many different quotation formats as possible. It is also independent of the language used in the email.
Efficiency estimate obtained during testing: more than 97.5% of emails processed correctly.
For now it works only with the text/plain Content-Type. Other content types may be added later (pull requests are welcome).
You can download the library sources and add them to your project.
Clone the project via git and change directory to email-parser.
Enter gradlew runProcessing -PemlFile="path" in the console, where path is the path to the eml file.
Output format
The header of the quotation is printed in uppercase.
The quotation is marked with the '>' symbol, from the first line of the quotation header to the end of the message.
The processing time is also reported.
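To make the format concrete, here is a minimal sketch that renders a message in the way described above. The helper renderQuoted and its parameters are hypothetical and not part of the library. The space after '>' is also an assumption, so the actual runProcessing output may differ in detail.

```kotlin
// Hypothetical helper, NOT part of the library: renders lines in the
// output format described above.
// headerStart: index of the first line of the quotation header (inclusive)
// headerEnd:   index just past the last header line (exclusive)
fun renderQuoted(lines: List<String>, headerStart: Int, headerEnd: Int): List<String> =
    lines.mapIndexed { i, line ->
        when {
            i < headerStart -> line                   // useful content, left unchanged
            i < headerEnd -> "> " + line.uppercase()  // header: marked and uppercased
            else -> "> $line"                         // quoted text: marked only
        }
    }

fun main() {
    val email = listOf(
        "Thanks, that works.",
        "",
        "On Mon, Jan 2, 2017, Bob wrote:",
        "Did you try the new build?"
    )
    renderQuoted(email, headerStart = 2, headerEnd = 3).forEach(::println)
}
```

In this sketch the header position is passed in by hand. In the real tool it is the parser that finds it.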
Enter gradlew test in the console.
To get documentation in dokka format, enter gradlew dokka in the console. Then open build/dokka/email-parser/index.html
To get documentation in Javadoc format, enter gradlew dokkaJavadoc in the console. Then open build/javadoc/index.html
The main package of the library is quoteParser. Its main class is QuoteParser.
To use the parser, call the quoteParserObj.parse() method. It returns a Content object with a separate body, the header of the quote if it exists, and the quotation itself if it exists.
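Because the header and the quotation are present only when the message actually contains a quote, calling code has to handle their absence. The sketch below illustrates one way to do that with Kotlin's null-safety operators. ParsedEmail is a hypothetical stand-in for the library's Content class, and its field names and types are assumptions, so consult the generated documentation for the real ones.

```kotlin
// Hypothetical stand-in for the library's Content class. The real field
// names and types may differ; this only mirrors the three parts described above.
data class ParsedEmail(
    val body: List<String>,
    val header: List<String>?,  // null when no quotation header was found
    val quote: List<String>?    // null when the message has no quotation
)

// Summarizes a parse result, handling the "if it exists" cases explicitly.
fun describe(parsed: ParsedEmail): String {
    val headerInfo = parsed.header?.let { "header: ${it.size} line(s)" } ?: "no header"
    val quoteInfo = parsed.quote?.let { "quote: ${it.size} line(s)" } ?: "no quote"
    return "body: ${parsed.body.size} line(s), $headerInfo, $quoteInfo"
}

fun main() {
    val withQuote = ParsedEmail(
        body = listOf("Thanks, that works."),
        header = listOf("On Mon, Jan 2, 2017, Bob wrote:"),
        quote = listOf("Did you try the new build?")
    )
    val plain = ParsedEmail(body = listOf("Hello!"), header = null, quote = null)
    println(describe(withQuote))
    println(describe(plain))
}
```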
For more information, read the documentation.
Simple usage example:
val content = QuoteParser.Builder()
    .build()
    .parse(file)
You can also parse a list of strings and customize the parser parameters via the builder methods:
val content = QuoteParser.Builder()
    .deleteQuoteMarks(true)
    .recursive(false)
    .build()
    .parse(emlText.lines())
A Kotlin-style builder is also supported:
val content = QuoteParser.Builder()
    .build {
        deleteQuoteMarks = true
        recursive = false
    }.parse(emlText.lines())
The complete code of the examples is available here.