Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Only load floating point registers when needed #1981

Merged
merged 1 commit into from
May 13, 2020

Conversation

aarzilli
Copy link
Member

@aarzilli aarzilli commented Apr 2, 2020

proc/*: only load floating point registers when needed

Changes implementations of proc.Registers interface and the
op.DwarfRegisters struct so that floating point registers can be loaded
only when they are needed.
Removes the floatingPoint parameter from proc.Thread.Registers.
This accomplishes three things:

1. it simplifies the proc.Thread.Registers interface
2. it makes it impossible to accidentally create a broken set of saved
registers or of op.DwarfRegisters by accidentally calling
Registers(false)
3. it improves general performance of Delve by avoiding to load
floating point registers as much as possible

Floating point registers are loaded under two circumstances:

1. When the Slice method is called with floatingPoint == true
2. When the Copy method is called

Benchmark before:

BenchmarkConditionalBreakpoints-4   	       1	4327350142 ns/op

Benchmark after:

BenchmarkConditionalBreakpoints-4   	       1	3852642917 ns/op

Updates #1549

proc: do not convert floating point registers until needed

@aarzilli aarzilli force-pushed the delayfpregs branch 2 times, most recently from cee08e1 to dd595b8 Compare April 2, 2020 17:50
@aarzilli
Copy link
Member Author

aarzilli commented Apr 2, 2020

Looks like I broke arm64, I'll have to look into it.

@aarzilli aarzilli changed the title Only load floating point registers when needed Only load floating point registers when needed [WIP] Apr 2, 2020
@aarzilli aarzilli force-pushed the delayfpregs branch 2 times, most recently from 2aa9905 to d2c01be Compare April 2, 2020 18:27
@chainhelen
Copy link
Contributor

En, about

Benchmark before:

BenchmarkConditionalBreakpoints-4   	       1	4327350142 ns/op

Benchmark after:

BenchmarkConditionalBreakpoints-4   	       1	3852642917 ns/op

When I do benchmark (previous similar optimizations), the fluctuation is very big every time and the value is inconspicuous. Of course it must be optimized in theory.

I just want to konw how to generate reliable data. Maybe remove the maximum, remove the minimum, take the average?

@aarzilli aarzilli force-pushed the delayfpregs branch 10 times, most recently from bdbed27 to c2a5212 Compare April 4, 2020 09:34
@aarzilli
Copy link
Member Author

aarzilli commented Apr 4, 2020

Looks like I broke arm64, I'll have to look into it.

Fixed it.

When I do benchmark (previous similar optimizations), the fluctuation is very big every time and the value is inconspicuous. Of course it must be optimized in theory.

I just want to konw how to generate reliable data. Maybe remove the maximum, remove the minimum, take the average?

I don't get a lot of variance. If you are using 1.14 disable async preemption, that adds a lot of noise, otherwise change the number of iterations in issue1549.go.

@aarzilli aarzilli changed the title Only load floating point registers when needed [WIP] Only load floating point registers when needed Apr 4, 2020
}

type DwarfRegister struct {
Uint64Val uint64
Bytes []byte
}

// NewDwarfRegisters returns a new DwarfRegisters object.
func NewDwarfRegisters(staticBase uint64, regs []*DwarfRegister, byteOrder binary.ByteOrder, pcRegNum, spRegNum, bpRegNum, lrRegNum uint64) *DwarfRegisters {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a thin wrapper around initializing the struct. All function args become struct members with no processing done by the function. Maybe we just let the callers initialize the struct themselves?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I unexported the Regs field so that consumers of DwarfRegisters can't access it directly and have to go through the appropriate methods (that will call loadMoreCallback when needed). Philosophical reasons aside, this also made sure that I converted all accesses to Regs. However now users of DwarfRegisters can't instantiate it because they can't assign to regs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack, makes perfect sense.

if idx >= uint64(len(regs.Regs)) {
return nil
if idx >= uint64(len(regs.regs)) {
if regs.loadMoreCallback != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this would read better as:

if regs.loadMoreCallback == nil {
        return nil
}
regs.loadMoreCallback()
regs.loadMoreCallback = nil
if idx >= uint64(len(regs.regs)) {
	return nil
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return nil
if idx >= uint64(len(regs.regs)) {
if regs.loadMoreCallback != nil {
regs.loadMoreCallback()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should put this into a regs.loadMore method which can wrap the logic of checking if the callback is not nil and then setting it to nil once it's been called.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, see what you think.

}

// ClearRegisters clears all registers except for PC, SP and BP
func (regs *DwarfRegisters) ClearRegisters() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name should reflect that this is selective in which registers it clears (e.g. ClearGeneralPurposeRegisters or something). Also, why be selective?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General purpose registers usually refers to rax, rbx, etc... this cleans all of them (including xmm, etc) except for PC, SP and BP. It's like this because the only place we need this is in the function call protocol.

However, you are right, this is pretty weird and ugly and I've thought of a way to do it better.

Changes implementations of proc.Registers interface and the
op.DwarfRegisters struct so that floating point registers can be loaded
only when they are needed.
Removes the floatingPoint parameter from proc.Thread.Registers.
This accomplishes three things:

1. it simplifies the proc.Thread.Registers interface
2. it makes it impossible to accidentally create a broken set of saved
   registers or of op.DwarfRegisters by accidentally calling
   Registers(false)
3. it improves general performance of Delve by avoiding to load
   floating point registers as much as possible

Floating point registers are loaded under two circumstances:

1. When the Slice method is called with floatingPoint == true
2. When the Copy method is called

Benchmark before:

BenchmarkConditionalBreakpoints-4   	       1	4327350142 ns/op

Benchmark after:

BenchmarkConditionalBreakpoints-4   	       1	3852642917 ns/op

Updates go-delve#1549
@derekparker derekparker merged commit 200994b into go-delve:master May 13, 2020
cgxxv pushed a commit to cgxxv/delve that referenced this pull request Mar 25, 2022
Changes implementations of proc.Registers interface and the
op.DwarfRegisters struct so that floating point registers can be loaded
only when they are needed.
Removes the floatingPoint parameter from proc.Thread.Registers.
This accomplishes three things:

1. it simplifies the proc.Thread.Registers interface
2. it makes it impossible to accidentally create a broken set of saved
   registers or of op.DwarfRegisters by accidentally calling
   Registers(false)
3. it improves general performance of Delve by avoiding to load
   floating point registers as much as possible

Floating point registers are loaded under two circumstances:

1. When the Slice method is called with floatingPoint == true
2. When the Copy method is called

Benchmark before:

BenchmarkConditionalBreakpoints-4   	       1	4327350142 ns/op

Benchmark after:

BenchmarkConditionalBreakpoints-4   	       1	3852642917 ns/op

Updates go-delve#1549
abner-chenc pushed a commit to loongson/delve that referenced this pull request Mar 1, 2024
Changes implementations of proc.Registers interface and the
op.DwarfRegisters struct so that floating point registers can be loaded
only when they are needed.
Removes the floatingPoint parameter from proc.Thread.Registers.
This accomplishes three things:

1. it simplifies the proc.Thread.Registers interface
2. it makes it impossible to accidentally create a broken set of saved
   registers or of op.DwarfRegisters by accidentally calling
   Registers(false)
3. it improves general performance of Delve by avoiding to load
   floating point registers as much as possible

Floating point registers are loaded under two circumstances:

1. When the Slice method is called with floatingPoint == true
2. When the Copy method is called

Benchmark before:

BenchmarkConditionalBreakpoints-4   	       1	4327350142 ns/op

Benchmark after:

BenchmarkConditionalBreakpoints-4   	       1	3852642917 ns/op

Updates go-delve#1549
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants