This function retrieves all unread e-mails from an Exchange Web Services (EWS) endpoint, stores them in a Google Cloud Storage bucket, and then posts a message to a Pub/Sub topic. The Google Cloud Storage location for each e-mail is defined by the e-mail's received timestamp, e.g. base/path/2019/11/01/20191101120000Z
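As a minimal sketch of how such a location could be derived from the received timestamp: the helper name blob_path and its parameters are hypothetical and not taken from the function's source, and the timestamp is assumed to be timezone-aware.

```python
from datetime import datetime, timezone


def blob_path(base_path: str, received_on: datetime) -> str:
    """Build a storage path such as base/path/2019/11/01/20191101120000Z."""
    # Convert to UTC so the trailing "Z" is accurate.
    # (A naive datetime would be interpreted as local time by astimezone.)
    utc = received_on.astimezone(timezone.utc)
    return f"{base_path.rstrip('/')}/{utc:%Y/%m/%d}/{utc:%Y%m%d%H%M%S}Z"


received = datetime(2019, 11, 1, 12, 0, 0, tzinfo=timezone.utc)
print(blob_path("base/path", received))  # base/path/2019/11/01/20191101120000Z
```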
The config.py file (see config.example.py for an example) defines which configuration will be used.
- Make sure a config.py file exists within the function directory, based on config.example.py, with the correct configuration:
  - EXCHANGE_URL = The Exchange service endpoint for the mailbox
  - EXCHANGE_VERSION = An object containing the 'major' and 'minor' Exchange server versions
  - TOPIC_PROJECT_ID = The GCP project which houses the Pub/Sub topic defined next
  - TOPIC_NAME = The GCP Pub/Sub topic all processed e-mail meta-info will be posted to
  - BUCKET_NAME = The (Google Cloud Platform) bucket name where the e-mail attachments (if there are any) will be uploaded to
  - ATTACHMENTS_TO_STORE = List of possible mime-types for e-mail attachments. These mime-types determine which types of files will be stored in the associated bucket. Unknown mime-types will be ignored.
  - ALLOWED_HTML_BODY_TAGS = Tags that should not be filtered out of the body of the (HTML) e-mail
  - ERROR_EMAIL_ADDRESS = E-mail address an error message will be sent to if an e-mail cannot be sent
  - ERROR_EMAIL_MESSAGE = Message that will be sent to the aforementioned e-mail address
  - EMAIL_ADDRESSES = A dictionary containing the mailboxes to be read by the mail ingest function. Each mailbox should contain the secret_id of its secret in the secret manager, and the email. Optionally, an alias field can be added (ews-mail-ingest will publish e-mails as if they were received by the alias), as well as a folder field, which instructs ews-mail-ingest to read a different folder than the default inbox. Subfolders are separated with slashes; 'inbox/today' is a valid folder.

  If no attachments are ever sent along with the e-mail, BUCKET_NAME should be an empty string and ATTACHMENTS_TO_STORE should be an empty list.
  If no body tags other than the standard ones used in the bleach Cleaner should be allowed, ALLOWED_HTML_BODY_TAGS can be an empty list.
  If no error message should be sent to an e-mail address when an e-mail cannot be sent, ERROR_EMAIL_ADDRESS and ERROR_EMAIL_MESSAGE can be empty strings.
- For each mailbox, provision a secret in the project's secret manager containing the password in plain text.
- Make sure the GCP project and Cloud Build accounts have write access to the specific GCS bucket and GCP Pub/Sub topic.
- Deploy the function to GCP as an HTTP-triggered function, as shown in cloudbuild.example.yaml.
- Deploy a GCP Cloud Scheduler job to call the function, as shown in cloudbuild.example.yaml. For each e-mail address in EMAIL_ADDRESSES, schedule a call to the function, passing the key in the dictionary as a GET argument. You can also use schedule_email_address_functions.sh to do this for you; see cloudbuild.example.yaml for an example usage of this script.
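As an illustration of the configuration step, a config.py could look roughly like the sketch below. Every value is a made-up placeholder, and both the dictionary shape of EXCHANGE_VERSION and the mailbox key "support" are assumptions; config.example.py remains the authoritative template.

```python
# config.py (illustrative only; all values below are hypothetical placeholders)
EXCHANGE_URL = "https://mail.example.com/EWS/Exchange.asmx"
EXCHANGE_VERSION = {"major": 15, "minor": 1}  # assumed shape: a dict with both keys

TOPIC_PROJECT_ID = "example-project"
TOPIC_NAME = "example-mail-topic"

# Use "" and [] here if attachments are never sent along with the e-mail.
BUCKET_NAME = "example-mail-bucket"
ATTACHMENTS_TO_STORE = ["application/pdf", "image/png"]

# Empty list: only the standard tags of the bleach Cleaner are allowed.
ALLOWED_HTML_BODY_TAGS = []

# Empty strings: no error message is sent.
ERROR_EMAIL_ADDRESS = ""
ERROR_EMAIL_MESSAGE = ""

EMAIL_ADDRESSES = {
    # The key ("support") is what the scheduled call passes as its GET argument.
    "support": {
        "secret_id": "support-mailbox-password",
        "email": "support@example.com",
        "alias": "helpdesk@example.com",  # optional
        "folder": "inbox/today",          # optional
    },
}
```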
The ews-mail-ingest works as follows:
- The Exchange account password is retrieved from a secret in Google Cloud Secret Manager
- A connection is made to the configured EWS endpoint URL with the user credentials from the config file and the retrieved password
- All unread e-mails are listed and looped over
- If e-mail attachments are configured, each e-mail attachment is uploaded as a blob to the specified GCS bucket
- The body and meta-info of each e-mail are posted to the specified Pub/Sub topic
- The e-mail is marked as read.
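The per-e-mail part of these steps can be sketched as follows. This is not the function's actual code: the Mail and Attachment classes, the ingest function, and the injected upload and publish callables are hypothetical stand-ins for the EWS, Cloud Storage, and Pub/Sub clients, and the secret retrieval and EWS connection are left out.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class Attachment:
    name: str
    mime_type: str
    content: bytes


@dataclass
class Mail:
    subject: str
    body: str
    attachments: List[Attachment] = field(default_factory=list)
    is_read: bool = False


def ingest(
    unread: Iterable[Mail],
    upload: Callable[[Attachment], str],
    publish: Callable[[dict], None],
    attachments_to_store: List[str],
) -> int:
    """Process unread mails in the order listed above; return how many were handled."""
    processed = 0
    for mail in unread:
        stored = []
        for att in mail.attachments:
            # Attachments whose mime-type is not configured are ignored.
            if att.mime_type in attachments_to_store:
                stored.append(upload(att))
        publish({"mail": {"subject": mail.subject, "body": mail.body, "attachments": stored}})
        # Reached only if publish did not raise, so a failed mail stays unread.
        mail.is_read = True
        processed += 1
    return processed


# Usage with in-memory stand-ins for the bucket and the topic.
uploaded: List[str] = []
published: List[dict] = []


def fake_upload(att: Attachment) -> str:
    uploaded.append(att.name)
    return f"gs://example-bucket/{att.name}"


mails = [
    Mail(
        "Hello",
        "<p>Hi</p>",
        [
            Attachment("a.pdf", "application/pdf", b"%PDF"),
            Attachment("b.bin", "application/octet-stream", b"\x00"),
        ],
    )
]
count = ingest(mails, fake_upload, published.append, ["application/pdf"])
```

Marking the e-mail as read last means a mail whose publish step fails is picked up again on the next scheduled run.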
The meta-info object posted to a GCP Pub/Sub topic is defined as described below. For the gobits field, refer to this repository.
{
  "gobits": [gobits],
  "mail": {
    "sent_on": "",
    "recipient": "",
    "subject": "",
    "body": "",
    "received_on": "",
    "attachments": []
  }
}
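A minimal sketch of how an object of this shape could be assembled and serialized: Pub/Sub message data is a byte string, so the JSON is encoded as UTF-8 here. The build_message helper, the empty gobits placeholder, and all field values are hypothetical.

```python
import json


def build_message(gobits: dict, mail: dict) -> bytes:
    """Serialize the meta-info object to UTF-8 encoded JSON bytes."""
    payload = {"gobits": [gobits], "mail": mail}
    return json.dumps(payload).encode("utf-8")


data = build_message(
    gobits={},  # placeholder; the real gobits content is defined elsewhere
    mail={
        "sent_on": "2019-11-01T11:59:00Z",
        "recipient": "support@example.com",
        "subject": "Hello",
        "body": "<p>Hi</p>",
        "received_on": "2019-11-01T12:00:00Z",
        "attachments": [],
    },
)
```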
This function is licensed under the GPL-3 License