- compatibility with ancient versions of node
- avoid xxx..html (double dot)
- fixed looping search for start of doc when end not found (very large docs)
- bunyan for logging
- fixed bug where xml was not cleared when processed
- refactored to move all anomaly handling into the doOneFile file splitter
- handling timing issues
- improve info message
- avoid double call of getonefile
- make getonefile artificially wait until processing can be assumed done before going on
- correct errors in bad xml doc start
- destroy stream on error
- on error processing for stream
- cheap-logger fix
- clean up error messages.
- clean up checkStart
- logic for input.resume
- typo fix (log->logger)
- handle case where input is empty (xml translation problem?)
- handle preprocess returning error in callback
- handle file containing multiple html files
- updated superagent to 1.8.3
- new cheap-logger
- preprocess gets json from config.json
- Bug if infile list ends in comma, drill finds random files from null "directory".
- Error on no closing element visible and no more input
- Error from ElasticIndexer submitObject gives file in http error message
- processFiles takes a second optional argument for a progress callback (config, msg)
- changed engine to Node v4.2.2+ (0.10.29 gives errors on the tests for odd reasons)
- did not handle \n<?xml...
- handle file with just whitespace in it
- fixes missed assignment on previous fix
- handle <?xml in embedded documents
- Was failing on large files (~1 GB) with multiple xml docs. Fixed by keeping the buffer small.
- fix for new bug single file fileExt missing
- Fixed generation (for html typically) to produce correct output filename
- added options to new xml2js: attrkey: @, charkey: #, explicitArray: false (children are not put into an array by default)
- switch to xml2js in order to have bi-directional. (good performance!)
- fix ref to adm-zip in the wrong dependency group
- can handle single-entry zip files. todo: multiple entries
- Fix to drill for list of input directories: do not exit if one file is missing, just make a note.
- update superagent requirement
- report empty xml file instead of throwing error
- add format to String.prototype ("{0}=={1}".format("this","that") ==> "this==that")
- make html generator async safe so that it doesn't exit before writing is finished.
- fixed gz not initializing xml, which caused documents to accumulate
- add README.md to npm dist
- used setImmediate to drop stack and hopefully buffer
- simplified gz
- fixed test to handle gz filename properly
- can handle .gz files now
- bug when no mapping: ElasticIndexer:224
- bug when no src in xml-to-es:deepExtend: 366
- Refixed bug where fileExt is more than just the extension
- Broke out collectFiles from parseOptions processing to make multi-threaded (Cluster) use easier.
- Demoted ElasticIndexer.js log message from INFO to DEBUG.
- Fixed bug where fileExt is applied not to end of filename but only to formal extension.
- Fixed resolving generator fn in nofile case
- use setImmediate to trim stack on callbacks
- Fix bug in Generation.js, line 69, createOneDocPerFile not running callback.
- Add and correct added generator test
- fix: make generators available in Generation.js
- Fixed setGenerator produced incorrect structure.
- make generators available in Generation.js
- handle filename properly when id is "" or " ", etc.
- BREAKING CHANGE: config.input.preProcess takes a callback argument cb(json). This enables async processing in preprocess (doh).
- Check that database has been created and attempt to create if not.
- If config.index exists, initialize the indexer.
- make test of fmt and type case-insensitive by using regexp
- throw exception if ignoring user's generator
- fix logger npe in Generation.js
- handle setConfig property in output.generator so that generator can ensure that it has the final version of the config object.
- allow parser and indexer to take config as an object as well as a filename
- fixes to ElasticIndexer
- change lewis-config to lewis-input-config
- added cheerio for parsing html input
- substituted config.input.currentFile for config.inputFiles[0] where feasible
- fixed bug where giving a list of input file names (not comma delimited) gives an error.
- added recursive processing to subdirectories of directories submitted
- put input file path (or URI, or dbRef) on config.input.currentFile during runs
- pass config to input-config preProcess function.
- fixed dbconfig to run callback correctly
- added
  - ability to define your own generator in output config
  - ability to specify a callback in output config to be called when all files are processed
  - example to save json output to mongodb
- changed requires in examples/convert.js and examples/indexFiles.js so that they can be run in place. Added comment for developer to change them for use outside the examples dir.
- Skip indexing tests if ES is not running.
- for input-config
  - added preProcess for gross output manipulation before other processing
  - added delete to enable property deletion
  - added rename to enable property key renaming (useful after promoting)
- Changed require calls in examples scripts to suit npm usage, so that examples run correctly when copied to a top level directory (above node_modules/xml-to-es).
- Create index.js to collect API exports into one require.
- Modify tests and examples to load index.js.
- API using lib/xml-to-es and lib/indexFiles backward compatible.
- Fixed bug on --clean, request.delete --> request.del; adjusted tests to exercise
- Fixed serious bug where attempt to get the mapping was creating the index with wrong settings because of URL misspelling
- cheap-logger log method now logs absolutely; it ignores log levels
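
For the String.prototype format entry above ("{0}=={1}".format("this","that") ==> "this==that"), a positional formatter of that shape might look roughly like the sketch below. This is a hypothetical illustration and not the package's actual source: only the call shape and the example result come from the changelog entry, while the handling of out-of-range placeholders is an assumption.

```javascript
// Hypothetical sketch of a positional String.prototype.format helper.
// Assumption: placeholders are {0}, {1}, ... and are replaced by the
// argument at that position; this is NOT the package's real implementation.
if (typeof String.prototype.format !== 'function') {
  Object.defineProperty(String.prototype, 'format', {
    value: function () {
      var args = arguments;
      // Replace each {n} with the n-th argument.
      // Assumption: placeholders without a matching argument are left as-is.
      return this.replace(/\{(\d+)\}/g, function (match, digits) {
        var index = Number(digits);
        return index < args.length ? String(args[index]) : match;
      });
    },
    writable: true,
    configurable: true
  });
}

console.log("{0}=={1}".format("this", "that")); // prints "this==that"
```

Defining the method with Object.defineProperty keeps it non-enumerable, so it should not show up in for...in loops over strings; whether the package does the same is not stated in the changelog.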