Skip to content

Conversation

@lbudai
Copy link
Collaborator

@lbudai lbudai commented Jul 5, 2018

@MrAnno : please check whether it supersedes #2045 or not.

  • suspend is set by log_source_flow_control_suspend
  • suspend is unset by log_source_wakeup
  • free_to_send blocks reading until suspend bit is set or the window size is 0
  • when window size was 0 before log_source_flow_control_adjust is called, then we call log_source_wakeup
  • when ack_tracker receives an ACK, then it checks if source is suspended or not (and based on the acknowledge type it makes a decision whether we need to suspend/wakeup the source)

@lbudai lbudai force-pushed the fix-suspend-race-cond branch from 422b974 to ad3f796 Compare July 5, 2018 15:53
@lbudai lbudai requested a review from MrAnno July 5, 2018 15:56
@kira-syslogng
Copy link
Contributor

Build FAILURE

Copy link
Collaborator

@MrAnno MrAnno Jul 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log_source_wakeup is unnecessary here. log_source_flow_control_adjust should schedule the wakeup based on its knowledge (window size + suspend bit).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after my change it's not...
Maybe it's a wrong direction but trying to make suspend/wakeup more symmetrical:
suspend bit is set by suspend and unset by wakeup...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a small change:

  • suspend is set by log_source_flow_control_suspend
  • suspend is unset by log_source_wakeup
  • free_to_send blocks reading until suspend bit is set or the window size is 0
  • when window size was 0 before log_source_flow_control_adjust is called, then we call log_source_wakeup
  • when ack_tracker receives an ACK, then it checks if source is suspended or not (and based on the acknowledge type it makes a decision whether we need to suspend/wakeup the source)

Copy link
Collaborator

@MrAnno MrAnno Jul 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log_source_wakeup is unnecessary here. log_source_flow_control_adjust should schedule the wakeup based on its knowledge (window size + suspend bit).

lib/logsource.c Outdated
Copy link
Collaborator

@MrAnno MrAnno Jul 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new (incremented) window size should be used here instead of the old one.

Instead of calling window_size_counter_get here, we could write old_window_size + window_size_increment == ... which is guaranteed to preserve the value of the previous window_size_counter_add call (preserves atomicity).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, you are right

lib/logsource.c Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not add this condition to log_source_wakeup.
The name of this function tells me that it will wake the source up without condition.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well... someone should check this suspend bit...
This is not the best and not the final place (this is why I added the wip flag :) )

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, okay, sorry. I'll check it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also playing with this a little bit.

@kira-syslogng
Copy link
Contributor

Build FAILURE

@lbudai lbudai force-pushed the fix-suspend-race-cond branch 2 times, most recently from ce03904 to 2a676d0 Compare July 5, 2018 17:14
@kira-syslogng
Copy link
Contributor

Build FAILURE

@lbudai lbudai force-pushed the fix-suspend-race-cond branch from 2a676d0 to 47c3259 Compare July 5, 2018 17:32
@kira-syslogng
Copy link
Contributor

Build FAILURE

1 similar comment
@kira-syslogng
Copy link
Contributor

Build FAILURE

@kira-syslogng
Copy link
Contributor

Build SUCCESS

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove these 3 racy operations. We had only problems with them in the past.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and I would not, not part of this PR. I agree with you that we should consider to remove all the racy atomic operations, but... but I also know that I don't want to discuss it it as part of this PR.
This is why I wanted to atomic-gssize PR handled separately. Please, add these notes to there, or, after that PR merged, we could open a new PR where we remove all these operations and explain why we don't need them anymore.

If no other issues, don't forget to approve the requested changes :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I've added my comment to #2159.

@lbudai lbudai mentioned this pull request Jul 6, 2018
@lbudai lbudai force-pushed the fix-suspend-race-cond branch from a298db9 to dfd022b Compare July 6, 2018 11:17
@kira-syslogng
Copy link
Contributor

Build SUCCESS

Copy link
Collaborator

@MrAnno MrAnno Jul 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not check free_to_send() here. We might skip suspend() while another thread is waking up the source.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The free_to_send() call is a problem here too.
Instead of calling wakeup here, I would reset the suspend bit in _flow_control_window_size_adjust(), where we can do it safely.

@kira-syslogng
Copy link
Contributor

Build FAILURE

@lbudai lbudai force-pushed the fix-suspend-race-cond branch from 178af2d to b8fc53a Compare July 6, 2018 15:49
@lbudai lbudai changed the title [wip] Fix suspend race cond Fix suspend race cond Jul 6, 2018
@kira-syslogng
Copy link
Contributor

Build SUCCESS

MrAnno
MrAnno previously approved these changes Jul 6, 2018
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

atomic_gssize is implemented with atomic pointers, so I would add a static_assert to compare its size with gsize if we want to use G_MAXSIZE on it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not it be 1ULL << ...? I think the cast would be unnecessary this way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be the type of the expression unsigned long long? I'd stick to the gsize type.

lib/logsource.c Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you like to leave this debug message in the final patch?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well.. I found it very useful during development... but it is possible to print out a better message, I'll try to do that. But I'd keep a debug message here.

lib/logsource.c Outdated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider adding this new evt tag to log_source_flow_control_suspend as well. We have the same debug message there.

I don't know about our evt tag conventions, but it seems we prefer lowercase tag names (function vs. Function).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, will check it.

@MrAnno
Copy link
Collaborator

MrAnno commented Jul 6, 2018

@kira-syslogng do perftest

@kira-syslogng
Copy link
Contributor

Build SUCCESS

@kira-syslogng
Copy link
Contributor

Build FAILURE

@lbudai lbudai force-pushed the fix-suspend-race-cond branch from 5f43934 to 7c17cc7 Compare July 9, 2018 11:08
@kira-syslogng
Copy link
Contributor

Build SUCCESS

MrAnno
MrAnno previously approved these changes Jul 9, 2018
This is an atomic counter for tracking window size and suspend state.
On 64 bit systems the max. window size is 2^63-1, while on 32 bit
systems 2^31-1.
Code asserts in case of overflow/underflow.

Signed-off-by: Laszlo Budai <[email protected]>
A window-size leak also fixed: when ack_type is AT_SUSPENDED,
we have to count the window size even if there is no continuous range
in ringbuffer.

Signed-off-by: Laszlo Budai <[email protected]>
@lbudai lbudai force-pushed the fix-suspend-race-cond branch from fe7c692 to d5dc3bd Compare July 10, 2018 17:44
@kira-syslogng
Copy link
Contributor

Build SUCCESS

1 similar comment
@kira-syslogng
Copy link
Contributor

Build SUCCESS

Copy link
Collaborator

@MrAnno MrAnno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔲 (This is a window.)

@furiel furiel merged commit 8779d8b into syslog-ng:master Jul 12, 2018
@Kokan Kokan removed the in progress label Jul 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants