forked from FutureDays/thm

video post-processing for The History Makers


brnco/thm

 
 


The History Makers

This repository contains scripts to process preservation files, generate checksums, and create and move derivatives of The History Makers oral history interviews.

Installation

Prerequisites

Git

Download official Windows build here: https://git-scm.com/download/win

Install using the Git-[version].exe file, accepting the default (pre-filled) options

Python

Download official Python 3.x build for Windows here: https://www.python.org/downloads/windows/

Open Downloads folder, locate python3.x.exe file, right-click and select "Run as Administrator" from pop-up menu

IMPORTANT - during install, select "Add Python to environment variables" option

test the Python install

open a new instance of PowerShell, type "python" and hit enter (type "exit()" and hit enter to exit the Python interpreter shell that was opened)

connect to GitHub

Git is version control software for developers - GitHub is a website that integrates Git with other features that developers find handy. The code for this project is hosted on GitHub, and we'll use Git to download a copy of that code to the machine running the video processing, and to upload any changes back to GitHub.

GitHub no longer accepts account passwords for Git operations, so this project uses SSH authentication. Follow their guide for setting that up here: https://docs.github.com/en/authentication/connecting-to-github-with-ssh

Once SSH is set up, do an SSH clone of the repo to the video processing machine.

Configuration

  1. open video-post-processing-config.txt in the text editor of your choice

  2. fill out fields per your local specifications

General Configuration Notes

general format is:

[section_header]
variable_name = variable value

do not enclose paths with quotes, even if they have spaces - do not escape whitespace either
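The format above is the INI style that Python's standard configparser module reads. A minimal sketch of reading one value follows, assuming the config file is parsed with configparser (the source does not say which parser the scripts use). The section name `transcode` and the sample path are hypothetical.

```python
import configparser

# Hypothetical config text. The section name and path are illustrative only.
# Note that the path has spaces, no quotes, and no escaped whitespace.
SAMPLE = r"""
[transcode]
raw_captures = D:\raw captures\incoming
"""

def read_config(text):
    """Parse INI-style config text and return the parser object."""
    # interpolation=None keeps a literal % in a path from being treated specially
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser

config = read_config(SAMPLE)
raw_captures = config["transcode"]["raw_captures"]
print(raw_captures)
```

Because the parser returns everything after the `=` as the value (with surrounding whitespace stripped), quotes around a path would become part of the path, which is why they should be left out.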

Configuration Fields Reference

Filetypes

Input

comma-separated list of acceptable file extensions for input files; each extension is enclosed in quotes

e.g. ".mov",".MOV"
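As a rough illustration of how a value in this format could be turned into a list, here is a sketch. The function names `parse_extensions` and `is_accepted` are hypothetical and are not taken from the repository.

```python
def parse_extensions(raw_value):
    """Split a value like '".mov",".MOV"' into ['.mov', '.MOV']."""
    extensions = []
    for item in raw_value.split(","):
        # Drop surrounding whitespace, then the enclosing quotes
        cleaned = item.strip().strip('"').strip("'")
        if cleaned:
            extensions.append(cleaned)
    return extensions

def is_accepted(filename, extensions):
    """Case-sensitive check, which is why the example lists both .mov and .MOV."""
    return any(filename.endswith(ext) for ext in extensions)
```

With the example value above, `is_accepted("01275001.MOV", parse_extensions('".mov",".MOV"'))` is true, while an `.XML` sidecar file is rejected.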

Transcode

this section contains filepaths for assets which are required in order to transcode derivative files

White Watermark

raw_captures

specifies the path to the main ingest directory. This directory can be considered "hot": when the script is run with no arguments, it attempts to process every subfolder. Individual accessions should be saved at this path in a folder named with the accession number. Alternatively, a folder can have any name if an alternative accession number is supplied at runtime (see the Usage section of this document)

Example folder setup, tree view

/raw_captures
├── A2022_034_001_001
│   ├── DOH_HEJ_006_000.mov
│   ├── DOH_HEJ_006_001.mov
│   ├── DOH_HEJ_006_002.mov
│   └── DOH_HEJ_006.XML
├── A2022_034_001_002
│   ├── DOH_HEJ_007_000.mov
│   ├── DOH_HEJ_007_001.mov
│   ├── DOH_HEJ_007_002.mov
│   └── DOH_HEJ_007.XML
├── A2022_047_001_001
│   ├── 01275001.MOV
│   ├── 01275002.MOV
│   ├── 01275003.MOV
│   └── 01275004.MOV
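Given a layout like the tree above, discovering what there is to process might look like the sketch below. The function name `find_accessions` is hypothetical, and the actual script may walk the directory differently.

```python
from pathlib import Path

def find_accessions(raw_captures, extensions):
    """Map each accession folder name to a sorted list of its input files."""
    accessions = {}
    for folder in sorted(Path(raw_captures).iterdir()):
        if not folder.is_dir():
            continue  # loose files at the top level are ignored
        files = sorted(
            item.name
            for item in folder.iterdir()
            if item.is_file() and item.suffix in extensions
        )
        accessions[folder.name] = files
    return accessions
```

Files whose extension is not in the accepted list (the .XML files in the example tree) are left out of the result.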

File Destinations

This section describes folder paths for derivatives

Email

This section contains info for email notifications from the script

Logs

This section contains the folder path for the directory containing the logs, as well as the path of the lockfile that makevideos creates in order to allow only a single instance of the script to run at a time
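For readers unfamiliar with lockfiles, here is a minimal sketch of the general technique. It is an assumption about how such a lock could work, not a copy of what makevideos does, and the name `single_instance` is hypothetical.

```python
import os
from contextlib import contextmanager

@contextmanager
def single_instance(lockfile_path):
    """Hold a lockfile for the duration of a with-block."""
    try:
        # O_EXCL makes creation fail if the file already exists
        fd = os.open(lockfile_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(
            f"lockfile exists, another instance may be running: {lockfile_path}"
        ) from None
    try:
        # Record the process id so a stale lock can be investigated by hand
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))
        yield
    finally:
        os.remove(lockfile_path)
```

Usage would be `with single_instance(path): ...` around the processing run. If the script is killed before cleanup, the lockfile stays behind and has to be deleted manually.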

MediaConch

This section delineates the folderpath for MediaConch policies

Usage

General

ingest.py --options accession_number(s)

Help

ingest.py -h

virtual environment

this script uses Python's venv module to manage dependencies ("venv" is short for "virtual environment"). The virtual environment must be activated before the script is run. THM staff shouldn't have to do this too often, but it may be necessary after closing cmd.exe or after a restart.

you can tell you're in the virtual environment by looking to the left of the command prompt. For the THM processing machine, the prompt is D:\Users\archadmin\code\thm - if that line is preceded by (venv), you are in the virtual environment

This is what you want:

(venv) D:\Users\archadmin\code\thm:

This means you gotta activate it:

D:\Users\archadmin\code\thm:

To activate the virtual environment, run the below command in cmd.exe:

venv\Scripts\activate.bat

once that command completes, you should be good to go

the script will error and close if it is not being run in the virtual environment
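One common way for a Python script to detect a virtual environment is to compare `sys.prefix` with `sys.base_prefix`. The sketch below shows that check; whether the script uses this exact test is an assumption, and both function names are hypothetical.

```python
import sys

def in_virtual_environment():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv folder,
    # while sys.base_prefix still points at the base Python install.
    return sys.prefix != sys.base_prefix

def require_virtual_environment():
    """Exit with a message if the virtual environment is not active."""
    if not in_virtual_environment():
        sys.exit(
            "not running in the virtual environment; "
            "run venv\\Scripts\\activate.bat first"
        )
```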

Examples

ingest everything in raw_captures directory, as configured in config file

ingest.py

ingest a single accession, A2022_012_001_001

ingest.py A2022_012_001_001

ingest multiple accessions

ingest.py A2022_012_001_001 A2022_033_001_001

file validation

ingest without validating input files

ingest.py --no_input_validation A2022_012_001_001

changing terminal output

you can run this script with more or less output to the terminal

note that these settings don't change what is logged, just what is printed

run in verbose mode

ingest.py -v A2022_012_001_001

run in quiet mode

ingest.py -q A2022_012_001_001

changing notification settings

you can run this script without sending emails using the --no_email flag

ingest.py --no_email

changing file copy setting

you can run the script without copying files to the connected drives using the --no_copy flag

ingest.py --no_copy

using multiple flags

these options can be strung together in a single command. the command below will process two accessions without input validation, printing every log entry to the terminal window, without copying files and without emailing anyone

ingest.py -v --no_input_validation --no_copy --no_email A2022_999_001_001 A2017_088_001_001
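To make the flag combinations concrete, here is a sketch of a command-line parser that accepts the options shown in this section. It uses Python's argparse module; treating -v and -q as mutually exclusive, and the attribute names `verbose`, `quiet`, and `accessions`, are assumptions rather than details from the script.

```python
import argparse

def build_parser():
    """Build a parser for the flags documented in the Usage section."""
    parser = argparse.ArgumentParser(prog="ingest.py")
    # Assumption: verbose and quiet cannot be combined
    verbosity = parser.add_mutually_exclusive_group()
    verbosity.add_argument("-v", dest="verbose", action="store_true",
                           help="print every log entry to the terminal")
    verbosity.add_argument("-q", dest="quiet", action="store_true",
                           help="print less to the terminal")
    parser.add_argument("--no_input_validation", action="store_true",
                        help="skip validation of input files")
    parser.add_argument("--no_copy", action="store_true",
                        help="do not copy files to the connected drives")
    parser.add_argument("--no_email", action="store_true",
                        help="do not send email notifications")
    # Zero accessions means: process everything in raw_captures
    parser.add_argument("accessions", nargs="*",
                        help="accession numbers to process")
    return parser
```

Parsing the command above with this sketch yields `verbose`, `no_input_validation`, `no_copy`, and `no_email` set to true, and `accessions` holding both accession numbers.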

Script Descriptions

makevideos

this script takes the raw video captures delivered by THM personnel and:

  1. concatenates the < 4GB files into 1 long file

  2. transcodes that file to flv, mp4, and mpeg

  3. embeds timecode and watermarks where appropriate

  4. hashmoves (see below) them to their destinations

  5. triggers script to embed those hashes into a FileMaker db named PBCore_Catalog
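The term "hashmove" in step 4 usually means a move that is verified with checksums. A minimal sketch of that idea follows, assuming a copy, compare, then delete sequence. The choice of SHA-256 and the function names are assumptions; the repository's own implementation may use a different algorithm or order of operations.

```python
import hashlib
import os
import shutil

def file_hash(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large video files are not read into memory at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def hashmove(source, destination, algorithm="sha256"):
    """Copy source to destination (a full file path), verify, then remove source."""
    source_hash = file_hash(source, algorithm)
    shutil.copy2(source, destination)
    destination_hash = file_hash(destination, algorithm)
    if source_hash != destination_hash:
        # Leave the source in place so nothing is lost on a bad copy
        raise OSError(f"hash mismatch after copying {source} to {destination}")
    os.remove(source)
    return destination_hash
```

The returned hash is the kind of value that step 5 could then record in the database.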

makevideos also checks to make sure that everything is plugged in and that all necessary files (like watermarks) are in their expected locations.

makevideos is triggered every 15 minutes, M-F, 7am-9pm local time by cron

makevideos can also be run manually by cd'ing into the repo directory (look for that in the config.txt file) and running "python makevideos.py"

startup

this script checks the values in the config file against the configuration currently present on the workstation running the script. Predominantly, it verifies that filepaths specified in the config actually exist.
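A sketch of that kind of path check, assuming the config has been loaded with configparser and that the caller knows which sections hold paths. The function name `missing_paths` and the idea of passing section names explicitly are hypothetical.

```python
import configparser
import os

def missing_paths(config, sections):
    """Return (section, name, value) for each configured path that does not exist."""
    missing = []
    for section in sections:
        for name, value in config[section].items():
            if not os.path.exists(value):
                missing.append((section, name, value))
    return missing
```

An empty list means every path in the given sections was found on this workstation. Anything else can be reported to the operator before processing starts.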

file_validation

this script uses MediaConch validation to ensure that only valid input files are passed to the script for preservation/ transcode. MediaConch policies are managed in the directory specified in the config file. For each input file, this script checks it against available file policies in the MediaConch policies folder - if a match is found, that policy is used to validate all other input and output files for the accession.

MediaConch GUI

if a file doesn't pass validation, follow these steps to find out why:

  1. open MediaConch

  2. in the "Checker" tab, use the dropdown menu to select the policy to check against -- see log for list of policies attempted

  3. still in the "Checker" tab, select a file to check against the policy from step 2

  4. select "check file"

  5. MediaConch will analyze the file and add it to a list at the bottom of the window

  6. to view pass/ fail for each field, click the eyeball icon

for more info, see official how-to's at this link

filemaker_handler

this script handles all calls to FileMaker database, requires ODBC

send_email

this script sends emails per info in config file

util

utility functions required by other scripts in this repository

venv

This script uses Python's venv module to create a virtual environment. The venv folder contains configuration info for this virtual environment, and should not need to be modified
