This repository contains two services: api_service and log_service. log_service processes and stores logs in PostgreSQL, and api_service provides an API for querying Apache log data.
Clone the repository:
sudo git clone https://github.com/shershunov/apache_log2PG
cd apache_log2PG
Create a Python virtual environment named apache_env and activate it:
sudo apt-get install python3-venv
python3 -m venv apache_env
source apache_env/bin/activate
Install the required Python libraries from the requirements.txt file:
pip install -r requirements.txt
You may also need to install psycopg2-binary separately:
pip install psycopg2-binary
Specify the PostgreSQL connection settings for each service:
[log_service]
DB_HOST = localhost
DB_NAME = postgres
DB_USER = watchdog_agent
DB_PASSWORD = Pa$$w0rd
DB_PORT = 5432
LOG_FILE_PATH = /var/log/apache2/access.log
LAST_POSITION_FILE = last_position.temp
[api_service]
DB_HOST = localhost
DB_NAME = postgres
DB_USER = api_agent
DB_PASSWORD = Pa$$w0rd
DB_PORT = 5432
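As a rough illustration of how settings in this layout might be consumed, the sketch below parses INI text with Python's standard configparser and turns one section into keyword arguments for a PostgreSQL connection. The helper name db_params and the inline sample text are assumptions made for illustration; the services in this repository may load their settings differently.

```python
import configparser

# Sample settings in the same layout as above (values are placeholders).
SAMPLE = """
[log_service]
DB_HOST = localhost
DB_NAME = postgres
DB_USER = watchdog_agent
DB_PASSWORD = Pa$$w0rd
DB_PORT = 5432
"""


def db_params(config_text, section):
    """Return connection keyword arguments for one section (hypothetical helper)."""
    # Interpolation is disabled so '%' or '$' in a password is kept literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    settings = parser[section]
    return {
        "host": settings["DB_HOST"],
        "dbname": settings["DB_NAME"],
        "user": settings["DB_USER"],
        "password": settings["DB_PASSWORD"],
        "port": settings.getint("DB_PORT"),
    }


params = db_params(SAMPLE, "log_service")
# With psycopg2 installed, these could be passed as psycopg2.connect(**params).
print(params["host"], params["port"])
```

Keeping a separate section and database user per service, as the settings above do, lets each service connect with its own credentials.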
Set the log format:
nano /etc/apache2/apache2.conf
LogFormat "%h %l %t \"%r\" %>s %b" combined
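To show what a line in this format looks like once parsed, here is a minimal sketch using a regular expression with one named group per directive. The pattern, the parse_line helper, and the sample line are illustrative assumptions, not the parser that log_service actually uses. The sketch does not handle escaped quotes inside the request field.

```python
import re
from datetime import datetime

# One named group per directive in: %h %l %t "%r" %>s %b
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_line(line):
    """Parse one access-log line; return a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Apache writes %t as [day/month/year:hour:minute:second zone].
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # %b is "-" when no bytes were sent.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Made-up sample line in the format configured above.
sample = '192.168.1.1 - [10/Jun/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326'
print(parse_line(sample))
```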
Run the installation scripts to configure the services:
sudo bash install_api_service.sh
sudo bash install_log_service.sh
The api_service provides an API endpoint at http://host:5000/logs to query Apache log data. The endpoint accepts the following parameters:
ip_address: Filter logs by IP address.
status_code: Filter logs by status code.
start_time: Filter logs by timestamp after the specified time.
end_time: Filter logs by timestamp before the specified time.
group_by_ip: If true, groups results by IP address.
curl -X GET "http://host:5000/logs"
curl -X GET "http://host:5000/logs?group_by_ip=true"
curl -X GET "http://host:5000/logs?ip_address=192.168.1.1"
curl -X GET "http://host:5000/logs?start_time=2024-06-01T00:00:00&end_time=2024-06-10T23:59:59"
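The same requests can be made from Python. The sketch below builds the query URL from the documented parameters using only the standard library. The helper names build_logs_url and fetch_logs are hypothetical, and fetch_logs assumes the endpoint returns JSON, which is not stated above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_logs_url(base_url, **filters):
    """Build the query URL, skipping filters left as None."""
    query = {}
    for key, value in filters.items():
        if value is None:
            continue
        # Send booleans the way the curl examples do ("true"/"false").
        query[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{base_url}?{urlencode(query)}" if query else base_url


def fetch_logs(base_url, **filters):
    """Call the endpoint and decode the body (assumes a JSON response)."""
    with urlopen(build_logs_url(base_url, **filters)) as response:
        return json.load(response)


url = build_logs_url(
    "http://host:5000/logs",
    start_time="2024-06-01T00:00:00",
    end_time="2024-06-10T23:59:59",
)
print(url)
```

urlencode percent-encodes the colons in the timestamps, which a server decodes back to the same values as in the curl example.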
Specify the host, port, and URL path of the API server:
[API]
host = 0.0.0.0
port = 5000
url = /logs
Records can be selected by a specific IP address or status code.
Records can be selected within a date range (from and to a specific date).
Grouping by IP displays the number of records and the dates of the first and last access.
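The grouping described above can be pictured with a small pure-Python sketch. It only illustrates the aggregation (record count, first access, last access per IP); the service itself presumably computes this in the database, and the function and the sample records here are made up for illustration.

```python
from datetime import datetime


def group_by_ip(records):
    """Summarise (ip, timestamp) pairs as count, first and last access per IP."""
    summary = {}
    for ip, ts in records:
        entry = summary.setdefault(ip, {"count": 0, "first": ts, "last": ts})
        entry["count"] += 1
        entry["first"] = min(entry["first"], ts)
        entry["last"] = max(entry["last"], ts)
    return summary


# Illustrative records only, not real log data.
records = [
    ("192.168.1.1", datetime(2024, 6, 1, 8, 0)),
    ("192.168.1.2", datetime(2024, 6, 2, 9, 30)),
    ("192.168.1.1", datetime(2024, 6, 5, 17, 45)),
]
for ip, stats in group_by_ip(records).items():
    print(ip, stats["count"], stats["first"], stats["last"])
```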