Packet optimisation and small tweaks #45

Merged
merged 10 commits on Jan 18, 2023

Conversation

@k-danil (Contributor) commented Jan 4, 2023

Good day and Happy New Year!

This patch introduces changes to the packet struct and some minor optimisations:

  • The packet header is now embedded in the packet by value instead of by reference. The header is a mandatory part of every packet, so we can cut the cost of an additional heap allocation (see the sketch after this list).
  • Fields in PacketAdaptationField are sorted by memory size. Better field alignment saves a few bytes on every allocation of a PacketAdaptationField.
  • The custom parser (PacketsParser) is now used in isPSIComplete(). It may be useful for someone, so be it.
  • In packetPool.add, multiple map accesses are replaced with a single one.
  • In hasDiscontinuity & isSameAsPrevious, len(ps) is called only once.
  • programMap now uses an RWMutex, since we read much more often than we write.
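
A minimal sketch of the first two points, with simplified stand-in types (illustration only, the real astits structs carry many more fields):

type PacketHeader struct {
	PID                       uint16
	HasPayload                bool
	PayloadUnitStartIndicator bool
}

type PacketAdaptationField struct{ /* fields sorted largest to smallest */ }

type Packet struct {
	Header          PacketHeader           // by value: no extra heap allocation per parsed packet
	AdaptationField *PacketAdaptationField // optional parts stay behind pointers
	Payload         []byte
}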

Current benchmarks:

BenchmarkParsePacket
BenchmarkParsePacket-8   	 5675138	       201.8 ns/op	     416 B/op	       9 allocs/op
PASS

BenchmarkDemuxer_NextData
BenchmarkDemuxer_NextData-8   	  231079	      4485 ns/op	    6666 B/op	     125 allocs/op
PASS

After this patch:

BenchmarkParsePacket
BenchmarkParsePacket-8   	 5838757	       193.7 ns/op	     388 B/op	       8 allocs/op
PASS

BenchmarkDemuxer_NextData
BenchmarkDemuxer_NextData-8   	  248532	      4371 ns/op	    6458 B/op	     122 allocs/op
PASS

Also, I'd like to ask for your advice.
Right now the hottest part of the demuxer (according to the profiler) is the packet parser: it allocates slices for the payloads and structs for the packets themselves.
First, I think we can make use of sync.Pool in the packet creation process: we get a new packet from the pool and put it back via a Packet.Close() method. When NextData is used, we can put unused packets back in the pool automatically; if the end user prefers NextPacket, Close() can be called manually.
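Roughly like this (a sketch only; Close() is a proposed method, not the current API, and the pool variable is made up):

import "sync"

var pktPool = sync.Pool{
	New: func() interface{} { return &Packet{} },
}

func newPacket() *Packet {
	p := pktPool.Get().(*Packet)
	*p = Packet{} // reset state left over from a previous use
	return p
}

// Close returns the packet to the pool so a later parse can reuse it.
func (p *Packet) Close() {
	pktPool.Put(p)
}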
Second, we can allocate a packet payload slice of size n*188 (where n is configurable via options) and then hand out parts of it to sequential packets on demand. BytesIterator.DumpTo() would be implemented to copy into an existing slice. This minimises the allocation count, but the underlying slice will not be collected by the GC as long as even one of its parts is still referenced.
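For the second idea, something along these lines (type and method names are made up for illustration; DumpTo would copy into a chunk handed out by the slab instead of allocating a fresh slice per packet):

// Hypothetical payload slab: one n*188-byte allocation handed out in chunks.
type payloadSlab struct {
	buf  []byte
	next int
}

func newPayloadSlab(n int) *payloadSlab {
	return &payloadSlab{buf: make([]byte, n*188)}
}

// take returns the next size bytes of the slab without copying. When the slab
// is exhausted a fresh one is allocated; the old backing array stays alive
// until every packet referencing it is dropped (the GC caveat mentioned above).
func (s *payloadSlab) take(size int) []byte {
	if s.next+size > len(s.buf) {
		s.buf = make([]byte, len(s.buf))
		s.next = 0
	}
	b := s.buf[s.next : s.next+size : s.next+size] // full slice expression so appends can't spill into the next chunk
	s.next += size
	return b
}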
Third, the packet pool uses a sync.Mutex. Do we provide thread-safety guarantees for end users? Maybe we can get rid of it?

What do you think?

k-danil and others added 8 commits September 5, 2022 10:49
…lation func and corresponding tests/benchmarks.
…onField to minimise overhead on alignment. Return custom parser in isPSIComplete. isPSIComplete & parseData small tweaks. In packetPool.add replace two map access with one. In hasDiscontinuity & isSameAsPrevious call len(ps) ones. In programMap use RWMutex.
data.go Outdated
@@ -50,15 +50,22 @@ func parseData(ps []*Packet, prs PacketsParser, pm *programMap) (ds []*DemuxerDa

 	// Get payload length
 	var l int
-	for _, p := range ps {
-		l += len(p.Payload)
+	for i := range ps {
Contributor
Just curious: does that really make it run faster? If so, do you know why? :)

Contributor Author
The results are hardly noticeable. We can drop it from the current PR.

Contributor Author
Rolled back

@@ -42,22 +42,22 @@ type PacketHeader struct {

 // PacketAdaptationField represents a packet adaptation field
 type PacketAdaptationField struct {
 	AdaptationExtensionField *PacketAdaptationExtensionField
 	OPCR                     *ClockReference // Original Program clock reference. Helps when one TS is copied into another
Contributor
Do I get it right: you moved it to have better alignment?
Have you used some tool to help you with that optimisation? (e.g. aligo)

Contributor Author
Yes, it's an alignment-based optimisation.
At first it was hand-written, but I double-checked with structslop and it didn't find any problem with my current layout.
For reference, it gave this result on current master:
struct has size 104 (size class 112), could be 96 (size class 96), you'll save 14.29%
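
To illustrate the effect, a tiny self-contained program (not the library's types) showing how field order changes the size the compiler reports:

package main

import (
	"fmt"
	"unsafe"
)

// Two structs with identical fields, ordered differently.
type badOrder struct {
	a bool
	b *int
	c bool
	d *int
}

type goodOrder struct {
	b *int
	d *int
	a bool
	c bool
}

func main() {
	fmt.Println(unsafe.Sizeof(badOrder{}))  // 32 on 64-bit: 7 bytes of padding after each bool
	fmt.Println(unsafe.Sizeof(goodOrder{})) // 24: the two bools share one padded word
}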

k-danil commented Jan 16, 2023

Hello, @asticode! Is there any problem with this PR?

data.go Outdated
@@ -118,7 +118,16 @@ func isPESPayload(i []byte) bool {
 }

 // isPSIComplete checks whether we have sufficient amount of packets to parse PSI
-func isPSIComplete(ps []*Packet) bool {
+func isPSIComplete(ps []*Packet, prs PacketsParser) bool {
Owner
I don't really get the point of using the PacketsParser here 🤔 It shouldn't be used here

Contributor Author

The idea is to pass unwanted packets out of the accumulator as fast as possible (our particular use case), but I have a much more efficient solution for that. If you're sure this part is unnecessary or somehow dangerous, I will exclude it from the commit and clean the PacketsParser-related code out of packetAccumulator.

Owner
Mmmm yeah I'd rather see it removed 👍

Contributor Author
Done.
For the next PR I'm thinking of an optional packet filter based on PacketHeader. This would cut unnecessary packets at an early stage of the pipeline.
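Roughly like this (purely hypothetical, none of these names exist in astits yet):

// A filter is a predicate over the already-parsed header; the demuxer would
// apply it right after header parsing and skip the rest of the 188-byte
// packet when it returns false.
type PacketFilter func(h *PacketHeader) bool

// keepPID builds a filter that keeps a single PID and drops everything else.
func keepPID(pid uint16) PacketFilter {
	return func(h *PacketHeader) bool { return h.PID == pid }
}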

@asticode (Owner)
I'm sorry, I've been really busy at work recently. I have only one question (I've made a comment).

@asticode merged commit db51df8 into asticode:master on Jan 18, 2023