Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CUMULUS-1748: support relative paths in recursion.js #1485

Merged
merged 64 commits into from
Feb 22, 2020
Merged
Show file tree
Hide file tree
Changes from 41 commits
Commits
Show all changes
64 commits
Select commit Hold shift + click to select a range
4842ee5
support relative paths in recursion
nemreid Feb 13, 2020
e0a25b2
serial FtpProviderclient tests
nemreid Feb 13, 2020
76c5806
disable ftp list test
nemreid Feb 14, 2020
00357be
disable ftp filter test
nemreid Feb 14, 2020
38cd57f
disable all but original 3 ftp tests
nemreid Feb 14, 2020
fcbd968
try old recursion
nemreid Feb 14, 2020
e154980
Revert "try old recursion"
nemreid Feb 14, 2020
82961d3
remove now-unnecessary path checks
nemreid Feb 14, 2020
39319b6
re-enable all ftpProviderClient tests
nemreid Feb 14, 2020
7304555
try old recursion again
nemreid Feb 14, 2020
3dca8e3
update FtpProviderClient expectations
nemreid Feb 14, 2020
1a4f1cd
Revert "try old recursion again"
nemreid Feb 14, 2020
384c587
recursion debug logging
nemreid Feb 14, 2020
a431c50
log memory usage
nemreid Feb 14, 2020
b0c822a
cache ftpClient
nemreid Feb 14, 2020
a5ed80e
move client.destroy to error handlers
nemreid Feb 14, 2020
1ec9f67
misc cleanup
nemreid Feb 14, 2020
de8def6
slow down recursion
nemreid Feb 14, 2020
724e2f4
fix sleep
nemreid Feb 14, 2020
9407390
deploy-step set envs before checking them
nemreid Feb 14, 2020
906f2db
remove sleep from recursion
nemreid Feb 14, 2020
ed92f11
converge on ftp error handler and log errors
nemreid Feb 14, 2020
dd91f5a
un-serial FtpProviderClient tests
nemreid Feb 14, 2020
e051b3e
more testing
nemreid Feb 14, 2020
922ddd4
stop manipulating provider_path
nemreid Feb 14, 2020
e6771e3
normalize prefix in S3ProviderClient
nemreid Feb 14, 2020
3ec880b
DEBUG: log dependency versions
nemreid Feb 14, 2020
7596d9c
dump ftp log to bamboo
nemreid Feb 18, 2020
5846725
debugging
nemreid Feb 18, 2020
c989fe3
Revert "DEBUG: log dependency versions"
nemreid Feb 18, 2020
0f9f930
remove -i
nemreid Feb 18, 2020
68b57e3
more debug logging
nemreid Feb 18, 2020
e4226d3
Update unit test deployment to *not* chmod keys in the test dir
Jkovarik Feb 18, 2020
015f019
Merge branch 'CUMULUS-1748' of https://github.com/nasa/cumulus into C…
Jkovarik Feb 18, 2020
1feb8bd
Really fix the unit tests
Jkovarik Feb 18, 2020
7e481d4
handle file permissions failures correctly
nemreid Feb 19, 2020
1e242ab
remove debug logging
nemreid Feb 19, 2020
4f37cc9
default value for path in S3ProviderClient.list
nemreid Feb 19, 2020
22fd08a
comment out failure unit test
nemreid Feb 19, 2020
4f2c913
properly skip failure test
nemreid Feb 19, 2020
da6fdfa
actually really properly skip failure test for realsies
nemreid Feb 19, 2020
5ed85ab
providerClient bug fixing
nemreid Feb 19, 2020
7da1033
thought better of automagic for preventing users from doing bad things
nemreid Feb 19, 2020
fd51f65
remove unused import
nemreid Feb 19, 2020
afb9113
tribute to the lint gods
nemreid Feb 19, 2020
2048bb6
Merge branch 'master' into CUMULUS-1748
nemreid Feb 19, 2020
249c752
changelog
nemreid Feb 19, 2020
030c0d2
rephrase 1748 changelog
nemreid Feb 19, 2020
d1cc86a
jsdoc formatting
nemreid Feb 19, 2020
f725ee2
Merge branch 'master' into CUMULUS-1748
nemreid Feb 19, 2020
330a5e9
try cmr-client alpha version
nemreid Feb 19, 2020
a2f407f
destroy ftp connection when done using
nemreid Feb 20, 2020
a545798
do the last one again but better this time
nemreid Feb 20, 2020
85b2cf5
Merge branch 'master' into CUMULUS-1748
nemreid Feb 20, 2020
5ac3795
errorHandler var naming
nemreid Feb 20, 2020
7e9d0ed
test error message against regex
nemreid Feb 21, 2020
f92ab5f
test to ensure we don't pick up files along the way
nemreid Feb 21, 2020
4900319
Revert "stop manipulating provider_path"
nemreid Feb 21, 2020
4a1835d
deprecate normalizeProviderPath
nemreid Feb 21, 2020
f524a84
changelog deprecation note
nemreid Feb 21, 2020
dcc4286
converge on path.normalize
nemreid Feb 21, 2020
a41a2c9
fix import
nemreid Feb 21, 2020
ca69b9c
stub failure test setup
nemreid Feb 21, 2020
795f969
Merge branch 'master' into CUMULUS-1748
nemreid Feb 21, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions bamboo/bootstrap-unit-tests.sh
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,7 @@ while ! docker container inspect ${container_id}\_build_env_1; do
done

## Setup the build env container once it's started
$docker_command "npm install --error --no-progress -g nyc; cd $UNIT_TEST_BUILD_DIR; git fetch --all; git checkout $GIT_SHA; npm install --error --no-progress; npm run bootstrap-no-build-quiet || true; npm run bootstrap-no-build-quiet; chmod 0400 -R $UNIT_TEST_BUILD_DIR/packages/test-data/keys"
$docker_command "npm install --error --no-progress -g nyc; cd $UNIT_TEST_BUILD_DIR; git fetch --all; git checkout $GIT_SHA; npm install --error --no-progress; npm run bootstrap-no-build-quiet || true; npm run bootstrap-no-build-quiet"

# Wait for the FTP server to be available
while ! $docker_command 'curl --connect-timeout 5 -sS -o /dev/null ftp://testuser:[email protected]/README.md'; do
Expand All @@ -45,10 +45,11 @@ while ! $docker_command 'curl --connect-timeout 5 -sS -o /dev/null http://127.0
done
echo 'HTTP service is available'

$docker_command "mkdir /keys;cp $UNIT_TEST_BUILD_DIR/packages/test-data/keys/ssh_client_rsa_key /keys/; chmod -R 400 /keys"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea behind making it a root directory (/keys) instead of in the test-data directory to avoid it being crawled by the FTP discovery tests and having issues due to directory permissions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is both that and the fact that SFTP requires the key permissions to be 400.
Changing the way we react to running into a directory we can't access seemed out of scope for this PR but would be required for everything not to bomb out on encountering a directory that does not allow reads.

# Wait for the SFTP server to be available
while ! $docker_command "sftp \
-P 2222\
-i $UNIT_TEST_BUILD_DIR/packages/test-data/keys/ssh_client_rsa_key\
-i /keys/ssh_client_rsa_key\
-o 'ConnectTimeout=5'\
-o 'StrictHostKeyChecking=no'\
-o 'UserKnownHostsFile=/dev/null'\
Expand Down
2 changes: 1 addition & 1 deletion bamboo/deploy-dev-integration-test-stack.sh
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
#!/bin/bash
set -ex
. ./bamboo/set-bamboo-env-variables.sh
. ./bamboo/abort-if-not-pr-or-redeployment.sh
. ./bamboo/abort-if-skip-integration-tests.sh
. ./bamboo/set-bamboo-env-variables.sh

npm config set unsafe-perm true

Expand Down
5 changes: 2 additions & 3 deletions packages/common/sftp.js
Original file line number Diff line number Diff line change
Expand Up @@ -163,10 +163,9 @@ class Sftp {
* @returns {Promise.<Object>} - list of file object
*/
async list(remotePath) {
const normalizedPath = (remotePath.trim() === '' ? '/' : remotePath);
if (!this.connected) await this.connect();
return new Promise((resolve, reject) => {
this.sftp.readdir(normalizedPath, (err, list) => {
this.sftp.readdir(remotePath, (err, list) => {
if (err) {
if (err.message.includes('No such file')) {
return resolve([]);
Expand All @@ -175,7 +174,7 @@ class Sftp {
}
return resolve(list.map((i) => ({
name: i.filename,
path: normalizedPath,
path: remotePath,
type: i.longname.substr(0, 1),
size: i.attrs.size,
time: i.attrs.mtime * 1000
Expand Down
62 changes: 38 additions & 24 deletions packages/ingest/FtpProviderClient.js
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@ const log = require('@cumulus/common/log');
const omit = require('lodash.omit');
const S3 = require('@cumulus/aws-client/S3');
const { S3KeyPairProvider } = require('@cumulus/common/key-pair-provider');
const { isNil } = require('@cumulus/common/util');
const recursion = require('./recursion');
const { lookupMimeType } = require('./util');

Expand Down Expand Up @@ -51,13 +52,31 @@ class FtpProviderClient {
}

async buildFtpClient() {
return new JSFtp({
host: this.host,
port: get(this.providerConfig, 'port', 21),
user: await this.getUsername(),
pass: await this.getPassword(),
useList: get(this.providerConfig, 'useList', false)
});
if (isNil(this.ftpClient)) {
markdboyd marked this conversation as resolved.
Show resolved Hide resolved
this.ftpClient = new JSFtp({
host: this.host,
port: get(this.providerConfig, 'port', 21),
user: await this.getUsername(),
pass: await this.getPassword(),
useList: get(this.providerConfig, 'useList', false)
});
}
return this.ftpClient;
}

errorHandler(rejectFn, e) {
markdboyd marked this conversation as resolved.
Show resolved Hide resolved
let err;
if (!e.message && e.text) {
const message = (e.code
? `FTP Code ${e.code}: ${e.text}`
: `FTP error: ${e.text}`) + ' This may be caused by user permissions disallowing the listing.';
err = new Error(message);
}
if (!isNil(this.ftpClient)) {
this.ftpClient.destroy();
}
log.error('FtpProviderClient encountered error: ', err);
return rejectFn(err);
}

/**
Expand All @@ -74,15 +93,13 @@ class FtpProviderClient {
const client = await this.buildFtpClient();

return new Promise((resolve, reject) => {
client.on('error', reject);
client.on('error', this.errorHandler.bind(this, reject));
markdboyd marked this conversation as resolved.
Show resolved Hide resolved
client.get(remotePath, localPath, (err) => {
client.destroy();

if (err) reject(err);
else {
log.info(`Finishing downloading ${remoteUrl}`);
resolve(localPath);
if (err) {
return this.errorHandler(reject, err);
}
log.info(`Finishing downloading ${remoteUrl}`);
return resolve(localPath);
});
});
}
Expand All @@ -91,19 +108,19 @@ class FtpProviderClient {
let counter = _counter;
const client = await this.buildFtpClient();
return new Promise((resolve, reject) => {
client.on('error', reject);
client.on('error', this.errorHandler.bind(this, reject));
client.ls(path, (err, data) => {
client.destroy();
if (err) {
if (err.message.includes('Timed out') && counter < 3) {
const message = err.message || err.text;
if (message && message.includes('Timed out') && counter < 3) {
log.error(`Connection timed out while listing ${path}. Retrying...`);
counter += 1;
return this._list(path, counter).then((r) => {
log.info(`${counter} retry suceeded`);
log.info(`${counter} retry succeeded`);
return resolve(r);
}).catch((e) => reject(e));
}).catch(this.errorHandler.bind(this, reject));
markdboyd marked this conversation as resolved.
Show resolved Hide resolved
}
return reject(err);
return this.errorHandler(reject, err);
}

return resolve(data.map((d) => ({
Expand Down Expand Up @@ -153,8 +170,7 @@ class FtpProviderClient {
const readable = await new Promise((resolve, reject) => {
client.get(remotePath, (err, socket) => {
if (err) {
client.destroy();
return reject(err);
return this.errorHandler(reject, err);
}
return resolve(socket);
});
Expand All @@ -172,8 +188,6 @@ class FtpProviderClient {
await S3.promiseS3Upload(params);
log.info('Uploading to s3 is complete(ftp)', s3uri);

client.destroy();

return s3uri;
}
}
Expand Down
13 changes: 10 additions & 3 deletions packages/ingest/S3ProviderClient.js
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,12 @@ const aws = require('@cumulus/common/aws');
const log = require('@cumulus/common/log');
const errors = require('@cumulus/common/errors');
const isString = require('lodash.isstring');
const { basename, dirname } = require('path');
const {
basename,
dirname,
isAbsolute,
normalize
} = require('path');

class S3ProviderClient {
constructor({ bucket }) {
Expand Down Expand Up @@ -41,11 +46,13 @@ class S3ProviderClient {
* @returns {Promise<Array>} a list of files
* @private
*/
async list(path) {
async list(path = '') {
// absolute paths are not valid S3 prefixes
const prefix = isAbsolute(path) ? normalize(path.slice(1)) : normalize(path);
const objects = await aws.listS3ObjectsV2({
Bucket: this.bucket,
FetchOwner: true,
Prefix: path
Prefix: prefix
});

return objects.map(({ Key, Size, LastModified }) => ({
Expand Down
1 change: 0 additions & 1 deletion packages/ingest/SftpProviderClient.js
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,6 @@ class SftpProviderClient {
privateKey: await this.getPrivateKey()
});
}

return this.sftpClientInstance;
}

Expand Down
49 changes: 23 additions & 26 deletions packages/ingest/recursion.js
Original file line number Diff line number Diff line change
@@ -1,20 +1,8 @@
'use strict';

const path = require('path');
const log = require('@cumulus/common/log');

/**
* Insert leading and remove terminating slashes into/from the path string
*
* @param {string} path - path string
* @returns {string} normalized path
*/
const normalizeSlashes = (path) => {
let output = path.replace(/[\/]{2,}/g, '/');
if (!output.startsWith('/')) output = `/${output}`;
if (output.endsWith('/')) output = output.slice(0, -1);
return output;
};

/**
* Recur on directory, list all files, and recur into any further directories,
* as specified regex segments allow.
Expand All @@ -40,7 +28,7 @@ async function recurOnDirectory(fn, currentPath, segments, position) {
if (['-', 0].includes(item.type)) {
files.push(item);
} else if (['d', 1].includes(item.type)) {
const nextDir = (currentPath === '' ? item.name : `${currentPath}/${item.name}`);
const nextDir = (currentPath === '/' ? `/${item.name}` : `${currentPath}/${item.name}`);
nemreid marked this conversation as resolved.
Show resolved Hide resolved
// eslint-disable-next-line no-await-in-loop
files = files.concat(await recurOnDirectory(fn, nextDir, segments, position + 1));
}
Expand All @@ -54,28 +42,37 @@ async function recurOnDirectory(fn, currentPath, segments, position) {
* It requests a promisified list function that returns contents of
* a directory on a server, filtering on provided regex segments.
*
* Note that calls to the list function will not have leading or terminating slashes.
* Initially an empty string is passed as the path to list the default directory. Following calls
* based on items discovered will be of the format `fn('path/to/files')`, again with no leading or
* terminating slashes.
* Note that calls to the list function will use either a relative or absolute path, corresponding
* to the `configuredPath` passed into this function. The list function will initially be called
* with '.' for a relative path or '/' for an absolute path. List functions will need to be able to
* normalize or correct these paths as appropriate for their protocol.
*
* Further calls to the list functions will append the current path to that starting path, such
* that all calls will start with either '.' or '/', regardless of additional characters, e.g.
* `fn('./path')` vs. `fn('path')`.
*
* List functions will need to be able to normalize or correct these paths as appropriate for their
* protocol.
* In the case of failure during the recursive list, this function will only apply `path.normalize`
* to the `configuredPath` and then call the list function with the entire normalizedPath.
*
* @param {function} fn - the promisified function for listing a directory
* @param {string} originalPath - path string which may contain regexes for filtering
* @param {string} configuredPath - path string configured by operator, which may contain
* regexes for filtering
* @returns {Promise} the promise of an object that has the path is the key and
* list of files as values
*/
async function recursion(fn, originalPath) {
const normalizedPath = normalizeSlashes(originalPath);
async function recursion(fn, configuredPath) {
const normalizedPath = path.normalize(configuredPath);
const isAbsolutePath = path.isAbsolute(normalizedPath);
try {
const segments = normalizedPath.split('/'); // split on divider
return await recurOnDirectory(fn, segments[0], segments, 0);
const segments = normalizedPath
.split('/') // split on divider
.filter((segment) => segment.trim() !== ''); // filter out empty strings from split
const startingPath = isAbsolutePath ? '/' : '.';
markdboyd marked this conversation as resolved.
Show resolved Hide resolved
return await recurOnDirectory(fn, startingPath, segments, -1);
} catch (e) {
log.error(`Encountered error during recursive list filtering: ${e}`);
log.info('Falling back to unfiltered directory listing...');
return recurOnDirectory(fn, normalizedPath.slice(1), [], 0);
return recurOnDirectory(fn, normalizedPath, [], 0);
}
}

Expand Down
33 changes: 17 additions & 16 deletions packages/ingest/test/test-FtpProviderClient.js
Original file line number Diff line number Diff line change
Expand Up @@ -87,21 +87,22 @@ test('FtpProviderClient supports plaintext usernames and passwords', async (t) =
t.true(fileNames.includes('index.html'));
});

test('FtpProviderClient supports S3-keypair-encrypted usernames and passwords', async (t) => {
const ftpClient = new FtpProviderClient({
host: '127.0.0.1',
encrypted: true,
username: await S3KeyPairProvider.encrypt('testuser'),
password: await S3KeyPairProvider.encrypt('testpass'),
useList: true
test('FtpProviderClient supports S3-keypair-encrypted usernames and passwords',
async (t) => {
const ftpClient = new FtpProviderClient({
host: '127.0.0.1',
encrypted: true,
username: await S3KeyPairProvider.encrypt('testuser'),
password: await S3KeyPairProvider.encrypt('testpass'),
useList: true
});

const files = await ftpClient.list('/');
const fileNames = files.map((f) => f.name);

t.true(fileNames.includes('index.html'));
});

const files = await ftpClient.list('/');
const fileNames = files.map((f) => f.name);

t.true(fileNames.includes('index.html'));
});

test('FtpProviderClient supports KMS-encrypted usernames and passwords', async (t) => {
const ftpClient = new FtpProviderClient({
host: '127.0.0.1',
Expand Down Expand Up @@ -132,7 +133,7 @@ test('useList is present and true when assigned', async (t) => {

await myFtpProviderClient.list('');

t.true(jsftpSpy.callCount > 1);
t.true(jsftpSpy.callCount > 0);
markdboyd marked this conversation as resolved.
Show resolved Hide resolved
t.is(jsftpSpy.getCall(0).args[0].useList, true);
});

Expand All @@ -150,11 +151,11 @@ test('useList defaults to false when not assigned', async (t) => {

await myFtpProviderClient.list('');

// TODO figure out why STAT does not list any results on our local FTP server
t.is(jsftpSpy.callCount, 1);
t.true(jsftpSpy.callCount > 0);
t.is(jsftpSpy.getCall(0).args[0].useList, false);
});


test('Download remote file to s3 with correct content-type', async (t) => {
const myFtpProviderClient = new FtpProviderClient({
host: '127.0.0.1',
Expand Down
45 changes: 45 additions & 0 deletions packages/ingest/test/test-FtpProviderClientFailure.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
'use strict';

const test = require('ava');
const JSFtp = require('jsftp');

const FtpProviderClient = require('../FtpProviderClient');

const executeRawFtpCommand = async (cmd) => {
const jsftp = new JSFtp({
host: '127.0.0.1',
port: 21,
user: 'testuser',
pass: 'testpass'
});
return new Promise((resolve, reject) => {
jsftp.raw(cmd, (err, data) => {
if (err) reject(err);
resolve(data);
});
});
};

test.before(async () => {
// await when fixed
executeRawFtpCommand('chmod 400 -R forbidden/file.txt');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fear that the answer is obvious, but why aren't we awaiting?

Copy link
Contributor Author

@nemreid nemreid Feb 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the only way to allow this file to satisfy both ava and eslint while setting up a test that can be easily enabled in the future. If we await this command throws an error because our server's vsftpd.conf does not have chmod_enabled=true.

});

test.after.always(async () => {
// await when fixed
executeRawFtpCommand('chmod 644 -R forbidden/file.txt');
});

test.skip('FtpProviderClient throws an error when listing a non-permitted directory', async (t) => {
markdboyd marked this conversation as resolved.
Show resolved Hide resolved
// TODO: update cumuluss/vsftpd to allow `chmod` to work in the setup for this test.
nemreid marked this conversation as resolved.
Show resolved Hide resolved
const myFtpProviderClient = new FtpProviderClient({
host: '127.0.0.1',
username: 'testuser',
password: 'testpass',
useList: true
});

await t.throwsAsync(myFtpProviderClient.list('forbidden/file.txt'),
'FTP Code 451: Could not retrieve a file listing for forbidden/file.txt.'
nemreid marked this conversation as resolved.
Show resolved Hide resolved
+ ' This may be caused by user permissions disallowing the listing.');
});
Loading