TEAM Engine Provenance Review

dstenger edited this page Feb 28, 2024 · 17 revisions

This page documents the TEAM Engine code provenance review, covering the issues raised for each module in TEAM Engine. The goal is to check the source file headers (filling them in where needed) and to confirm that the information they contain is correct. Any inconsistencies discovered can also be recorded as GitHub issues.

Acknowledgement: This review is based on the template provided by the GeoServer Provenance Review.

Key Definition
not checked yet
check in progress
❓ ❗ check is stuck, header or license requires developer attention
✅ ✔️ checked, all clear
✅ ❗ checked, warning (missing information)
‼️ checked, fix me! requires developer attention

There are two main steps to this review. The first step is to run an automated review. The second step is to carefully and manually review the files flagged by the automated review. The second step also serves to verify that the automated review worked correctly.

Step 1: Automated Review

Modules

Some files are missing source headers.

Step 2: Manual Review

We then carefully check the Java source file headers and explore the git history to double-check where files came from.

  • Multiple organisations have contributed code to the project.

  • The source code was previously available under a Mozilla licence. See the early commits.

  • The source code is now published under an Apache 2.0 licence.
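
One way to explore the git history for a single file is to ask git for the commit that first added it. The following is a minimal sketch, assuming `git` is on the PATH and the script is run against a local clone. The helper names `parse_earliest` and `first_commit` are illustrative and are not part of the review tooling.

```python
import subprocess


def parse_earliest(log_output):
    """Return (hash, author, date, subject) from the last line of tab-separated log output."""
    lines = log_output.strip().splitlines()
    if not lines:
        return None
    # git log lists the newest commit first, so the last line is the earliest one
    return tuple(lines[-1].split('\t', 3))


def first_commit(repo_dir, file_path):
    """Return details of the earliest commit that added file_path, or None if there is none."""
    # --follow tracks the file across renames; --diff-filter=A keeps only
    # commits in which the file was added.
    result = subprocess.run(
        ['git', 'log', '--follow', '--diff-filter=A',
         '--format=%H%x09%an%x09%ad%x09%s', '--date=short', '--', file_path],
        cwd=repo_dir, capture_output=True, text=True, check=True)
    return parse_earliest(result.stdout)
```

Calling `first_commit` with the path of a local clone and a file path relative to it returns the hash, author, date and subject of the earliest recorded addition, which can then be compared with the copyright statement in the file header.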

The following list of copyright holders was compiled manually with the help of this search query:

  • Alex Gorbatchev
  • Asir S Vedamuthu of WebMethods Inc.
  • Fabrizio Vitale jointly with the Institute of Methodologies for Environmental Analysis (IMAA) part of the Italian National Research Council (CNR)
  • Intecs SPA
  • Northrop Grumman Corporation
  • OGC
  • W3C (MIT, INRIA, Keio)
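
A list like the one above can be cross-checked with a small scan of the source tree. The following is a minimal sketch and not the search query that was actually used. It assumes that copyright notices appear on lines containing the word "Copyright", and the names `extract_holder` and `collect_holders` are illustrative.

```python
import os
import re
from collections import Counter

# Matches e.g. "Copyright (c) 2010 Some Holder" or "Copyright 2010-2012 Some Holder"
COPYRIGHT_RE = re.compile(
    r'copyright\s*(?:\(c\)|©)?\s*(?:\d{4}(?:\s*[-,]\s*\d{4})*)?[\s,]*(.+)',
    re.IGNORECASE)


def extract_holder(line):
    """Return the text following a copyright notice on this line, or None."""
    m = COPYRIGHT_RE.search(line)
    if not m:
        return None
    # Trim trailing comment markers and punctuation such as ' */' or '.'
    holder = m.group(1).strip(' .*/')
    return holder or None


def collect_holders(root, suffix='.java'):
    """Count copyright notice lines per holder in all files under root ending in suffix."""
    holders = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            with open(os.path.join(dirpath, name), 'r', errors='replace') as f:
                for line in f:
                    holder = extract_holder(line)
                    if holder:
                        holders[holder] += 1
    return holders
```

The counts are per notice line rather than per file, and differently worded notices for the same holder are counted separately, so the result still needs a manual pass to merge variants.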

Tools

The following Python script was used to help detect source files that are missing headers.

import os

# This script is used to help generate a Provenance Review

modules = []


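def findMavenLibraries(d):
    """Recursively collect the path of every pom.xml under d into modules."""
    for path in os.listdir(d):
        full_path = os.path.join(d, path)
        if os.path.isfile(full_path):
            # Compare the file name exactly; a substring test would also match e.g. 'pom.xml.bak'
            if path == 'pom.xml':
                modules.append(full_path)
        elif os.path.isdir(full_path):
            findMavenLibraries(full_path)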



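def findSourceHeader(d):
    """Recursively report .java files under d that start directly with a package declaration."""
    for path in os.listdir(d):
        full_path = os.path.join(d, path)
        if os.path.isfile(full_path):
            if full_path.endswith('.java'):
                # errors='replace' avoids a crash on files that are not valid in the default encoding
                with open(full_path, 'r', errors='replace') as file:
                    data = file.read()
                # A file whose first non-blank text is 'package' has no header comment
                if data.strip().startswith('package'):
                    print('No source header found in ' + full_path)
        elif os.path.isdir(full_path):
            findSourceHeader(full_path)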

# Local checkout of the repository and the GitHub URL it maps to
root = '/Users/ogc/Documents/GitHub/teamengine'
repo_url = 'https://github.com/opengeospatial/teamengine'

findMavenLibraries(root)
with open('output3.txt', 'w') as outfile:
    for module in modules:
        moduledir = os.path.dirname(module)
        # The root module maps to the repository URL itself;
        # sub-modules map to tree/master/<relative path>
        if moduledir == root:
            modulepath = repo_url
        else:
            modulepath = repo_url + '/tree/master/' + os.path.relpath(moduledir, root)
        print('== Module ' + modulepath)
        outfile.write('-   :grey_question: [[' + modulepath + '](' + modulepath + ')] \n \n')
        print('\n')
        findSourceHeader(moduledir)
        print('\n')
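
The script above only flags files whose first non-blank text is `package`, so a file that opens with an unrelated comment would pass unnoticed. The following is a hedged sketch of a slightly stricter check that could support the manual step. The function name `header_status`, the status labels and the keyword list are assumptions made for illustration, not part of the original tooling.

```python
import re

# Assumption: a usable header mentions at least one of these words
HEADER_KEYWORDS = ('copyright', 'license', 'licence')


def header_status(source):
    """Classify Java source text as 'ok', 'no-header' or 'no-licence-text'."""
    # Everything before the first line starting with package/import is the header region
    match = re.search(r'^\s*(?:package|import)\b', source, re.MULTILINE)
    preamble = source[:match.start()] if match else source
    # No comment marker before the first declaration means no header at all
    if '/*' not in preamble and '//' not in preamble:
        return 'no-header'
    # A comment is present; check that it mentions a copyright or licence
    if any(word in preamble.lower() for word in HEADER_KEYWORDS):
        return 'ok'
    return 'no-licence-text'
```

Files reported as `no-licence-text` have a leading comment that does not mention a copyright or licence, which makes them candidates for the manual review in Step 2.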