
note 1: Purging tuyathome.messages table


limit messages

The tuyathome.messages table can grow very quickly, leading to performance degradation. Keep in mind that some devices (e.g. BreakerDIN dP 6) can generate up to one measurement per second, i.e. up to 86400 records in 24 hours, while 'core._heartbeat' alone, firing every 30 s, generates 2880 records/day. The first measure to take, therefore, depending on the devices present and the objectives to be achieved, is to control the throughput of the devices, preferably in the device itself. The user has several means at their disposal:

  • Set the sampling frequency directly in the device (see PM_detector dP 1) when possible.
  • Use REFRESH for devices that accept this command (e.g. ACmeter)

If it is not possible to intervene on the device, it is possible to act at the driver level:

  • Use the new SET_EVENT_MODE command in tuya_device node ver 5.2.0 (see ISSUE#118) to control data reception.

Otherwise, there are some general, less efficient solutions:

  • Use the RT/AVG subflow which decimates the data by providing average values at pre-set intervals.
  • If the message is useful and not purgeable (like _heartbeat) you can still decide not to save it to the DB by adding hide = "RT" in global.alldevices (see the sketch after this list), or by updating the code of the DB filter nodes in CORE.
  • Otherwise, when a device doesn't PUSH data (see Smart Socket), send repeated GET or SCHEMA requests.
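
As an illustration, a minimal alldevices entry with the hide attribute might look like the sketch below, following the dP entry format shown in EXAMPLE#3 further down; the exact placement of the attribute should be checked against your own alldevices structure:

        {
          "dp": "_heartbeat",
          "capability": "RO",
          "typefield": "TSTAMP2TIME",
          "__comment": "assumption: 'hide':'RT' tells the CORE DB-filter nodes not to store this dP",
          "hide": "RT"
        },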

However, you can always overcome the HW limits with a distributed system, using multiple DBs and tuyaDAEMON servers.

For the user, the goal is to work around the limits of the various devices and obtain behavior tailored to the needs of the applications.

Consider implementing two regimes: a faster one when displaying data on a user interface (2-10 s; smartLife uses 5 s), and a slower one when the data only goes to storage (20-120 s).

For an application of these guidelines, see the daemon.energy application.

db management strategies

Since the tuyathome.messages table tends to grow over time regardless, a strategy is needed to limit performance degradation and maintain high tuyaDAEMON efficiency:

  • Data Purging: The primary objective of the messages table is the common one of all logs: to have a trace of the operations performed in order to be able to reconstruct the events in case of malfunctions. To maintain an acceptable size, the persistence of data can be limited over time.

  • Data Archiving: The preservation of historical series of data from some devices is equally important. This operation can be done by creating tables specialized by area of interest. As an example, this is the layout of the meteo data table, which combines the data of two devices: weather-station and PM_detector:

    timex | day | temp | hum | rain | wind | gust | wind dir | HI | DP | WC | PMtemp | PMhum | PM10 | PM2.5 | PM1 | AQI
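
A minimal sketch of such an archive table is shown below; the column types, the primary key choice and the reading of HI/DP/WC as heat index, dew point and wind chill are assumptions, to be adapted to the actual devices:

    CREATE TABLE `tuyathome`.`meteo` (
      `timex`    TIMESTAMP NOT NULL,   -- measurement time
      `day`      DATE NOT NULL,        -- measurement day, for fast grouping
      `temp`     DECIMAL(4,1),         -- weather-station data
      `hum`      DECIMAL(4,1),
      `rain`     DECIMAL(5,1),
      `wind`     DECIMAL(4,1),
      `gust`     DECIMAL(4,1),
      `wind dir` VARCHAR(3),
      `HI`       DECIMAL(4,1),         -- heat index (derived)
      `DP`       DECIMAL(4,1),         -- dew point (derived)
      `WC`       DECIMAL(4,1),         -- wind chill (derived)
      `PMtemp`   DECIMAL(4,1),         -- PM_detector data
      `PMhum`    DECIMAL(4,1),
      `PM10`     DECIMAL(5,1),
      `PM2.5`    DECIMAL(5,1),
      `PM1`      DECIMAL(5,1),
      `AQI`      SMALLINT,
      PRIMARY KEY (`timex`)
    );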

Other solutions like table partition or file storage don't seem suitable for TuyaDAEMON.

purging db tuyathome.messages

To keep the table at an acceptable size, we can limit data persistence over time. The right limit depends on the specific situation: maintenance strategies, amount of daily raw data, etc. A reasonable limit for my applications is 1-3 days.

EXAMPLE#1 do_purge_01

Here are two stored procedures for a minimally invasive purge; choose one depending on the case:

    -- Stored procedure: do_purge_01
    -- Description: purges the 'tuyathome.messages' table by batch-deleting records older than a given number of days.
    -- Internal variables:
    --    - days: the number of days that determines the maximum age of records to keep.
    --    - rows_deleted: the total number of records deleted, returned by a final SELECT.
    -- Notes:
    --    - The purge is performed in blocks of 5000 records at a time.
    --    - Calibrate this value according to the HW, so that each block takes < 1 second.
    --    - A 3-second delay is introduced between blocks to avoid a blocking effect.
    --    - Set this delay (0.5-5 s) according to the load.
    --    - The 'timestamp' column (not unique) determines the age of the records to delete, but 'id' (PK) is used in the deletions.

        DELIMITER //

        CREATE PROCEDURE do_purge_01()
        BEGIN
        -- Declaration of variables
          DECLARE days INT;
          DECLARE endid INT;
          DECLARE var_rows INT;
          DECLARE rows_deleted INT;

        -- Set the number of days: records older than this are deleted
          SET days := 2;

        -- Calculate the maximum id among records older than 'days' days
        -- (NULL when there is nothing to delete: the loop then exits at once)
          SET endid := (
             SELECT MAX(`id`)
                FROM `tuyathome`.`messages`
                WHERE `timestamp` < DATE_SUB(CURRENT_TIMESTAMP, INTERVAL days DAY)
             );

        -- Initialize variables
          SET var_rows := 1;
          SET rows_deleted := 0;

        -- Loop to delete records in blocks
          WHILE var_rows > 0 DO
        -- Delete a block of old records ('<=' so that record 'endid' is deleted too)
             DELETE IGNORE FROM `tuyathome`.`messages`
                WHERE `id` <= endid
                LIMIT 5000;
        -- Get the number of records deleted in the current block
             SET var_rows := ROW_COUNT();
        -- Update the total number of deleted records
             SET rows_deleted := rows_deleted + var_rows;
        -- Introduce a 3-second delay between blocks to avoid a blocking effect
             DO SLEEP(3);
          END WHILE;

        -- Return the total number of records deleted by the stored procedure
          SELECT rows_deleted AS "Rows Deleted";
        END //

        DELIMITER ;
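
Once created, the procedure can be tested from any MySQL client (or sent via the _sqlDBlocal property, as in EXAMPLE#4 below):

        -- run the purge and report the total number of deleted records
        CALL do_purge_01();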

EXAMPLE#2 do_purge_02

As an alternative, this second procedure is finer-grained, deleting one record at a time:

    -- Stored procedure: do_purge_02
    -- Description: purges the 'tuyathome.messages' table by deleting records older than 3 days using a cursor.
    -- Notes:
    --   - The purge is performed by deleting records individually using a cursor.
    --   - A delay is introduced between deletions to avoid a blocking effect.
    --   - The `messages`.`timestamp` column determines the age of the records to delete; `messages`.`id` is the PK used in the deletions.

        DELIMITER //

        CREATE PROCEDURE do_purge_02()
        BEGIN
    -- Declaration of variables
        DECLARE done INT DEFAULT FALSE;
        DECLARE purge_id INT;

    -- Create a cursor to select IDs of records to delete
        DECLARE cur CURSOR FOR
           SELECT `id`
              FROM `tuyathome`.`messages`
    -- Set the number of days for the maximum age of records to be deleted
            WHERE `timestamp` < NOW() - INTERVAL 3 DAY
              ORDER BY `id`
    -- Adjust as required
              LIMIT 5000;

    -- Handler for when the cursor has no more records
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

    -- Open the cursor
        OPEN cur;

    -- Loop to read and delete records
        read_loop: LOOP
    -- Read record ID from the cursor
           FETCH cur INTO purge_id;

    -- Check if the cursor has finished reading all records
           IF done THEN
              LEAVE read_loop;
           END IF;

    -- Perform the delete operation for the single record
           DELETE IGNORE FROM `tuyathome`.`messages` WHERE `id` = purge_id;

    -- Introduce a delay between deletions to avoid a blocking effect
    -- You can adjust the length of the delay at your discretion
           DO SLEEP(0.1); -- 0.1-second delay (100 milliseconds)
        END LOOP;

    -- Close the cursor
        CLOSE cur;
        END //

        DELIMITER ;

Note that using a cursor and deleting records one by one can take longer than batch deleting with the "LIMIT" clause. This could affect performance if the table contains a large number of records to be deleted. It is important to carefully evaluate the performance effect and system impact before running the purge operation in a production environment.
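
Both procedures filter on the `timestamp` column, so if the messages table does not already have an index on it, adding one keeps the initial MAX(id)/cursor SELECT cheap. A sketch, assuming no such index exists yet:

        -- hypothetical: index the age filter used by both purge procedures
        CREATE INDEX idx_messages_timestamp
           ON `tuyathome`.`messages` (`timestamp`);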

EXAMPLE#3 CORE._heartbeat as cron

Whichever procedure you choose, one implementation possibility in tuyaDAEMON is to use CORE._heartbeat as a CRON. The advantage is that you only work on global.alldevices, using a share, without having to write code or add new nodes. Since _heartbeat fires every 30 s, a 30-second test window matches exactly one beat per day. Modify core._heartbeat as follows:

        {
          "dp": "_heartbeat",
          "capability": "RO",
          "typefield": "TSTAMP2TIME",

          "__comment": "simple cron, +-15s, share implementation",
          "__test_a": "test time every day:       (msg.info.value >= '17:52:00' && msg.info.value < '17:52:30')",
          "__test_b": "limit day of week [0=sun]: (new Date().getDay() in [0,1,   6])",
          "__test_c": "limit month [0=jan]:       (new Date().getMonth() in [0,1,  11])",
          "share": [
            {
              "__comment": "job_dopurge: every day purge the messages table in the DBlocal at 6:23:15",
              "test": [
                "(msg.info.value >= '06:23:00' && msg.info.value < '06:23:30')"
              ],
              "action": [
                {
                  "device": "_system",
                  "property": "_dopurge",
                  "value": "gonow"
                }
              ]
            }
          ]
        },

EXAMPLE#4 SYSTEM._dopurge()

and add a new property (_dopurge) to SYSTEM:

        {
          "dp": "_dopurge",
          "capability": "SKIP",
          "__comment": "uses the stored procedure `do_purge` (i.e. do_purge_01 or do_purge_02, whichever you chose, renamed accordingly) to trim the local messages table",
          "__parameters": "'value':'any' (trigger: required but not used)",
          "share": [
            {
              "test": [
                "tuyastatus.core._DBase"
              ],
              "action": [
                {
                  "property": "_sqlDBlocal",
                  "value": "CALL `do_purge`()"
                }
              ]
            }
          ]
        }
  • The new SYSTEM dP (or method) _dopurge() is autonomous: it checks the connection to the DB, so it can be invoked by the user at any time (see the sketch after this list).
  • The SKIP capability denies GETs and accepts SETs but doesn't call the device CORE: it only executes the 'share'.
  • The share added to core._heartbeat acts as a CRON calling _dopurge() once a day.
  • At the end, you can also execute the SQL command "OPTIMIZE TABLE tuyathome.messages" (optional).
  • Make a backup of alldevices (in CORE_devices, 'EXPORT alldevices.json' input node) before any modification.
  • Since one new dP has been added to SYSTEM, update your documentation (custom SYSTEM datasheet).
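
For a manual run, the payload mirrors the cron action of EXAMPLE#3; a minimal sketch, assuming the same tuyaDAEMON command format (e.g. injected from a Node-RED inject node):

        {
          "device": "_system",
          "property": "_dopurge",
          "value": "gonow"
        }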