note 1: Purging tuyathome.messages table
The `tuyathome.messages` table can grow very quickly, leading to performance degradation.
Bear in mind that some devices (e.g. BreakerDIN dP 6) can generate up to one measurement per second, i.e. up to 86400 records in 24 hours, while `core._heartbeat`, fired every 30 s, generates 2880 records/day, and so on.
The first measure to take, therefore, depending on the devices present and the objectives to be achieved, is to control the throughput of the devices, possibly in the device itself. The user has several means at their disposal:
- Set the sampling frequency directly in the device (see PM_detector dP 1) when possible.
- Use `REFRESH` for devices that accept this command (e.g. ACmeter).
If it is not possible to intervene on the device, you can act at the driver level:
- Use the new `SET_EVENT_MODE` command in the `tuya_device` node ver 5.2.0 (see ISSUE#118) to control data reception.
Otherwise, there are some general, less efficient solutions:
- Use the `RT/AVG` subflow, which decimates the data by providing average values at pre-set intervals.
- If the message is useful and not purgeable (like `_heartbeat`), you can still decide not to save it to the DB by adding `hide = "RT"` in `global.alldevices`, or by updating the code of the DB filter nodes in CORE.
- Otherwise, when a device doesn't PUSH data (see Smart Socket), send repeated GET or SCHEMA requests.
However, you can always overcome the HW limits with a distributed system, using multiple DBs and tuyaDAEMON servers.
For the user, the goal is to overcome the limits of the various devices, to obtain behavior tailored to the needs of the applications.
Evaluate whether two regimes should be implemented: a faster one when data is being displayed on a user interface (2-10 s; smartLife uses 5 s), and a slower one when the data only needs to be stored (20-120 s).
For an application of these guidelines, see the daemon.energy application.
Since the `tuyathome.messages` table tends to grow over time, a strategy is needed to reduce performance degradation and maintain high TuyaDAEMON efficiency:
- Data Purging: the primary objective of the messages table is the common one of all logs: to keep a trace of the operations performed, so that events can be reconstructed in case of malfunction. To keep its size acceptable, data persistence can be limited over time.
- Data Archiving: preserving the historical series of data from some devices is equally important. This can be done by creating tables specialized by area of interest. As an example, this is the layout of the `meteo data` table, which combines the data of two devices, `weather-station` and `PM_detector` (a minimal CREATE TABLE sketch follows below the list):
`timex|day|temp|hum|rain|wind|gust|wind dir|HI|DP|WC|PMtemp|PMhum|PM10|PM2.5|PM1|AQI`
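A minimal sketch of such an archive table, assuming MySQL/MariaDB; the column names and types below are illustrative only (adapted from the layout above, with spaces and dots replaced by underscores), not the actual tuyaDAEMON schema:

```sql
-- Hypothetical archive table for the 'meteo data' layout shown above.
-- All names and types are assumptions for illustration.
CREATE TABLE IF NOT EXISTS `tuyathome`.`meteo_data` (
  `timex`    DATETIME     NOT NULL,  -- measurement timestamp
  `day`      DATE         NOT NULL,  -- day key, handy for daily aggregates
  `temp`     DECIMAL(5,1) NULL,      -- weather-station data
  `hum`      DECIMAL(5,1) NULL,
  `rain`     DECIMAL(6,1) NULL,
  `wind`     DECIMAL(5,1) NULL,
  `gust`     DECIMAL(5,1) NULL,
  `wind_dir` SMALLINT     NULL,
  `HI`       DECIMAL(5,1) NULL,      -- heat index
  `DP`       DECIMAL(5,1) NULL,      -- dew point
  `WC`       DECIMAL(5,1) NULL,      -- wind chill
  `PMtemp`   DECIMAL(5,1) NULL,      -- PM_detector data
  `PMhum`    DECIMAL(5,1) NULL,
  `PM10`     DECIMAL(6,1) NULL,
  `PM2_5`    DECIMAL(6,1) NULL,
  `PM1`      DECIMAL(6,1) NULL,
  `AQI`      SMALLINT     NULL,
  PRIMARY KEY (`timex`)
) ENGINE=InnoDB;
```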
Other solutions, like table partitioning or file storage, don't seem suitable for TuyaDAEMON.
To keep the table at an acceptable size, we can limit data persistence over time. The right limit depends on the specific situation: maintenance strategies, amount of daily raw data, etc. A reasonable limit for my applications is 1-3 days.
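To gauge the amount of daily raw data before choosing a limit, a simple count per day is enough; this sketch only relies on the `timestamp` column used by the procedures below:

```sql
-- How many rows does the messages table accumulate per day?
SELECT DATE(`timestamp`) AS `day`, COUNT(*) AS `rows_per_day`
FROM `tuyathome`.`messages`
GROUP BY DATE(`timestamp`)
ORDER BY `day` DESC;
```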
Here are two 'stored procedures' for a minimally invasive purge, to be chosen depending on the case:
```sql
-- Stored procedure: do_purge_01
-- Description: purges the 'tuyathome.messages' table by batch-deleting records older than a certain number of days.
-- Internal variables:
-- - days: the number of days that determines the maximum age of records to delete.
-- - rows_deleted: the total number of records deleted, reported at the end.
-- Notes:
-- - The purge is performed in blocks of 5000 records at a time.
-- - Calibrate this value according to the HW, so that each block takes < 1 second.
-- - A 3-second delay is introduced between blocks to avoid a blocking effect.
-- - Set this delay (0.5-5 s) according to the load.
-- - The 'timestamp' column (not unique) is used to determine the age of the records to delete, but 'id' (pk) is used in the deletions.
DELIMITER //
CREATE PROCEDURE do_purge_01()
BEGIN
-- Declaration of variables
DECLARE days INT;
DECLARE endid INT;
DECLARE var_rows INT;
DECLARE rows_deleted INT;
-- Set the number of days for the maximum age of records to be deleted
SET days := 2;
-- Calculate the maximum ID of records older than 'days' days
SET endid := (
SELECT MAX(`id`)
FROM `tuyathome`.`messages`
WHERE `timestamp` < DATE_SUB(CURRENT_TIMESTAMP, INTERVAL days DAY)
);
-- Initialize variables
SET var_rows := 1;
SET rows_deleted := 0;
-- Loop to delete records in blocks
WHILE var_rows > 0 DO
-- Delete a block of older records
DELETE IGNORE FROM `tuyathome`.`messages`
WHERE `id` < endid
LIMIT 5000;
-- Get the number of records deleted in the current block
SET var_rows := ROW_COUNT();
-- Update the total number of deleted records
SET rows_deleted := rows_deleted + var_rows;
-- Introduces a 3-second delay between elimination blocks to avoid a blocking effect
DO SLEEP(3);
END WHILE;
-- Returns the total number of records deleted as a result of the stored procedure
SELECT rows_deleted AS "Rows Deleted";
END //
DELIMITER ;
```
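A quick way to test the batch procedure manually (assuming the procedure has been created and the account has DELETE privileges on the table):

```sql
-- One-off manual run: the procedure reports the total rows deleted.
CALL `do_purge_01`();

-- Optional sanity check: how many rows older than 2 days are left?
SELECT COUNT(*) AS `old_rows`
FROM `tuyathome`.`messages`
WHERE `timestamp` < DATE_SUB(CURRENT_TIMESTAMP, INTERVAL 2 DAY);
```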
As an alternative, this second procedure is more granular, deleting one record at a time:
```sql
-- Stored procedure: do_purge_02
-- Description: purges the 'tuyathome.messages' table by deleting records older than 3 days using a cursor.
-- Notes:
-- - The purge deletes records one by one using a cursor.
-- - A delay is introduced between deletions to avoid a blocking effect.
-- - The 'timestamp' column is used to determine the age of the records to delete; 'id' is the pk used in the deletions.
DELIMITER //
CREATE PROCEDURE do_purge_02()
BEGIN
-- Declaration of variables
DECLARE done INT DEFAULT FALSE;
DECLARE purge_id INT;
-- Create a cursor to select IDs of records to delete
DECLARE cur CURSOR FOR
SELECT `id`
FROM `tuyathome`.`messages`
-- Set the number of days for the maximum age of records to be deleted
WHERE `timestamp` < NOW() - INTERVAL 3 DAY
ORDER BY `id`
-- Adjust as required
LIMIT 5000;
-- Handler for the case where no records are found at the cursor
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
-- Open the cursor
OPEN cur;
-- Loop to read and delete records
read_loop: LOOP
-- Read record ID from the cursor
FETCH cur INTO purge_id;
-- Check if the cursor has finished reading all records
IF done THEN
LEAVE read_loop;
END IF;
-- Perform the delete operation for the single record
DELETE IGNORE FROM `tuyathome`.`messages` WHERE `id` = purge_id;
-- Introduces a delay between eliminations to avoid a blocking effect
-- You can adjust the length of the delay at your discretion
DO SLEEP(0.1); -- 0.1-second delay (100 milliseconds)
END LOOP;
-- Close the cursor
CLOSE cur;
END //
DELIMITER ;
```
Note that using a cursor and deleting records one by one can take longer than batch deletion with the LIMIT clause. This can affect performance if the table contains a large number of records to delete. Carefully evaluate the performance effect and system impact before running the purge in a production environment.
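Both procedures filter on the `timestamp` column to find old records. If that column is not already indexed (an assumption; check your schema first), adding an index keeps the age lookup cheap even on a large table:

```sql
-- Only needed if `timestamp` is not already indexed.
-- On a very large table this statement itself can take some time.
ALTER TABLE `tuyathome`.`messages`
  ADD INDEX `idx_messages_timestamp` (`timestamp`);
```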
Whichever procedure you choose, one implementation possibility in tuyaDAEMON is to use `core._heartbeat` as a CRON. The advantage is that you only work on `global.alldevices`, using a `share`, without having to write code or add new nodes: you have to modify `core._heartbeat` as follows:
```json
{
    "dp": "_heartbeat",
    "capability": "RO",
    "typefield": "TSTAMP2TIME",
    "__comment": "simple cron, +-15s, share implementation",
    "__test_a": "test time everyday: (msg.info.value >= '17:52:00' && msg.info.value < '17:52:30')",
    "__test_b": "limit day of week [0=sun]: ([0, 1, 6].includes(new Date().getDay()))",
    "__test_c": "limit month [0=jan]: ([0, 1, 11].includes(new Date().getMonth()))",
    "share": [
        {
            "__comment": "job_dopurge: everyday purge the messages table in the DBlocal at 6:23:15",
            "test": [
                "(msg.info.value >= '06:23:00' && msg.info.value < '06:23:30')"
            ],
            "action": [
                {
                    "device": "_system",
                    "property": "_dopurge",
                    "value": "gonow"
                }
            ]
        }
    ]
},
```
and add a new property (`_dopurge`) to SYSTEM; the CALL statement must invoke whichever procedure you created (e.g. `do_purge_01`):
```json
{
    "dp": "_dopurge",
    "capability": "SKIP",
    "__comment": "uses the stored procedure `do_purge` to trim local messages table",
    "__parameters": "'value':'any' (trigger: required but not used)",
    "share": [
        {
            "test": [
                "tuyastatus.core._DBase"
            ],
            "action": [
                {
                    "property": "_sqlDBlocal",
                    "value": "CALL `do_purge`()"
                }
            ]
        }
    ]
}
```
- The new SYSTEM dP (or method) `_dopurge()` is autonomous: it checks the connection to the DB, so it can be invoked by the user at any time.
- The `SKIP` capability denies GETs and accepts SETs, but doesn't call the device CORE: it only executes the 'share'.
- The `share` added to `core._heartbeat` acts as a CRON, calling `_dopurge()` once a day.
- At the end, you can also execute the SQL command "OPTIMIZE TABLE `tuyathome`.`messages`" (optional; see the sketch below).
- Make a backup of `alldevices` (in CORE_devices, 'EXPORT alldevices.json' input node) before any modification.
- Since one new dP has been added to SYSTEM, update your documentation (custom SYSTEM datasheet).
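As a sketch of that optional final step, assuming the batch variant `do_purge_01` was chosen, the purge call and the table optimization can be run back to back; note that OPTIMIZE TABLE rebuilds the table and can lock it briefly, so run it in a quiet period:

```sql
-- Optional maintenance run: purge old rows, then reclaim space.
CALL `do_purge_01`();
OPTIMIZE TABLE `tuyathome`.`messages`;
```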