Coding challenge for Insight Data Engineering Fellowship
This program was developed and tested with Python 2.7.10 (GCC 4.2.1 Compatible Apple LLVM 6.0, clang-600.0.39) on OS X Yosemite 10.10.5. It should be compatible with most mainstream OS platforms, including macOS and Linux/Unix. I haven't tested it on Windows, but I guess it should work :)
To run it, simply pull the whole folder onto your local machine and launch it with the run.sh script.
python ./src/process_log.py <input_log.txt> <output_top_ten_hosts.txt> <output_top_ten_resources.txt> <output_top_ten_busy_hour.txt> <output_block_list.txt>
This program follows an object-oriented design, with each feature wrapped in an independent service class. When the program runs, each line of the data stream is first parsed into a log data structure. Next, this log structure is processed by four service classes, one per feature. Finally, each service class writes its own result.
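The design described above can be illustrated with a minimal sketch. It is not the code in src/process_log.py: the names LogEntry, parse_line, TopHostsService, TopResourcesService and run are hypothetical, the input is assumed to be in Common Log Format, and ranking resources by bytes sent is also an assumption. Only two of the four services are shown; the others would expose the same process/result interface.

```python
import re
from collections import Counter, namedtuple

# Hypothetical record type; field names are assumptions, not the author's.
LogEntry = namedtuple("LogEntry", ["host", "timestamp", "resource", "status", "size"])

# Assumes Common Log Format: host - - [timestamp] "request" status bytes
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(.*)" (\d{3}) (\S+)$')


def parse_line(line):
    """Parse one raw line into a LogEntry, or return None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    host, timestamp, request, status, size = m.groups()
    parts = request.split()
    # The resource is the second token of "METHOD /path PROTOCOL" when present.
    resource = parts[1] if len(parts) >= 2 else request
    # A size of "-" means no bytes were sent.
    nbytes = int(size) if size.isdigit() else 0
    return LogEntry(host, timestamp, resource, int(status), nbytes)


class TopHostsService(object):
    """Counts requests per host."""

    def __init__(self):
        self.counts = Counter()

    def process(self, entry):
        self.counts[entry.host] += 1

    def result(self, n=10):
        return self.counts.most_common(n)


class TopResourcesService(object):
    """Sums bytes sent per resource (ranking by bandwidth is an assumption)."""

    def __init__(self):
        self.bytes_sent = Counter()

    def process(self, entry):
        self.bytes_sent[entry.resource] += entry.size

    def result(self, n=10):
        return self.bytes_sent.most_common(n)


def run(lines, services):
    """Parse each line once and hand the parsed entry to every service."""
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that do not match the assumed format
        for service in services:
            service.process(entry)
    return [service.result() for service in services]
```

Because each line is parsed once and the same entry is passed to every service, adding a feature in this sketch means adding a class with process and result methods, without touching the parser or the other services.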
Shanshan Qin