-
Notifications
You must be signed in to change notification settings - Fork 218
Bin pack first fit #448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bin pack first fit #448
Conversation
The Cell index can be used by rep to provide the auctioneer information about its cell index. Auctioneer can use this information to do resource efficient scheduling of LRPs. For example with memory being the limiting factor on a CF cluster, you might want to fill diego-cells to the max regarding memory, diego-cells with a lower index first.
|
We have created an issue in Pivotal Tracker to manage this: https://www.pivotaltracker.com/story/show/171108206 The labels on this github issue will be updated when the story is started. |
|
This PR looks good to me with just small comments that we can address ourselves. The only issue I have with it is that BOSH does not rely on instance index as a permanent value attached to an instance. If instance is deleted all instance indexes will be reshuffled starting from 0 to max. That was the reason why they introduced instance ID. In case of such events with bin pack weight set that affects the placement LRPs might be reshuffled when they don't need to. BOSH most likely is not going to delete instance index given that many releases rely on it but they encourage everyone to use instance ID for unique instance identification. Given that this is not a common situation and that weight is defaulted to 0 it should be OK. |
|
Hello @pommi - just wanted to let you know we're not ignoring this PR. It's still being considered/discussed. It may be helpful to have a discussion with you. Would you be open to that? |
|
Hi @heyjcollins - no problem. I'm open for a discussion. Shall we plan a video conference call? Could you drop an email / send some possible dates/times via the address used in my commit messages? |
|
Great @pommi - I just sent you an email & look forward to chatting with you. |
Submodule src/code.cloudfoundry.org/credhub-cli 907ed5401..fd43d7668: > Bump go modules Submodule src/garden d3ba7afc2..f8558dc30: > Update go.mod dependencies > Update go.mod dependencies Submodule src/grootfs 1fe91e4a2..8b717454b: > Merge pull request #266 from cloudfoundry/fix-test-new-golang > Update go.mod dependencies > Update go.mod dependencies Submodule src/guardian 4127d0dff..79721e0b5: > Update go.mod dependencies > Update go.mod dependencies > Merge pull request #448 from cloudfoundry/remove-pinned-packages > Update go.mod dependencies > Merge pull request #447 from cloudfoundry/pin-opentelemetry Submodule src/idmapper d9ef54a0d..71b869609: > Update go.mod dependencies
…vizzini garden grootfs guardian idmapper Submodule src/code.cloudfoundry.org/auctioneer 84b15846c..127aac4 (rewind): < Struct changes due to new protobuf Submodule src/code.cloudfoundry.org/bbs 38d6fdf18..9be69c0 (rewind): < Change to match new protobufs < Regenerate all other protos < Resolve merge conflicts. Regenerate desired_lrp.proto < Remove debug log statements < Change interface{} to explicit type so we can use ToProto() and send to newTestRequest < We can use d.Routes directly here since it's a pointer < Replace eventfakes with new FakeEvent object < Make FakeEvent a proper proto object Regenerate protos < Routable is a pointer and needs to be dereferenced < Need to store the deserializeModel result in a variable to use later < Change responses to use pointers so we can use them after the method calls < Lint ignore for unnecessary nil check Change routes to use pointers < Ginkgo unfocus + lint ignore for deprecated method testing < WIP: Make integration tests use new protobuf < Struct and naming convention changes; Trueing up lint deprecation < Regenerate protos with deprecated markers < Added support for deprecated things to be marked as such in the output files < Struct and naming changes due to new protobuf < Struct and naming changes due to new protobuf < Merge branch 'main' into protobuf-with-plugin < Naming convention change due to new protobuf < Updated loggregator to v9 < expectedEvent doesn't need to be converted to a Proto here < Add conversions for protobuf messages < Add ToEventProto for models.Event interface < Replace gogo protobuf with google protobuf < Use google.golang protobuf instead of gogo protobuf Remove Unmarshal from MessageValidator interface (use proto.Marshal instead) < Structure and naming changes due to Protobuf upgrades < Add Validate functions for Proto structs < Have to use Proto structs (.ToProto()) in proto.Marshal < Naming convention changes due to protobuf updates < Have to pass a Proto struct in order to populate correctly, then use FromProto for the return value < Naming convention changes due to protobuf updates < Remove usage of protojson, encoding/json package is enough for our use case < copy() is destination first < Regenerate protos with improved optional support and AlwaysEmit changes < Protos: Add AlwaysEmit options to Messages to match old protobuf results < Codegen: Handle optional fields better: Make sure we're checking for nils Set default values where appropriate When a Map has a value of Message type, dereference for Equal < Codegen: Action no longer needs special Get/SetValue in accessors < Codegen: alwaysEmit->jsonEmit < Add JsonAlwaysEmit and CustomType features; Regenerate protos for codegen < Redirect output from pushd/popd to /dev/null < WIP: What if Routes wasn't a special case? < Add explicit json_name for static and dynamic < Codegen: Remove extra log messages, use Desc.Name() for ignoredMessages < Update Task test for protobuf and protojson changes < Regenerate protos for json_name changes < Add json_names for fields with underscores < Update SecurityGroups test for protobuf and protojson changes < Regenerate protos for Action special case < Add special case for Action types (they need GetValue and SetValue) < Add ExcludeFromEqual to Error.Message; regenerate error.proto < Codegen: Add support for ExcludeFromEqual feature and regenerate codegen protos < Update tests to account for protojson and protobuf changes < Add GetProto for Actions so we can marshal/unmarshal correctly < Regenerate actions.proto for json_name changes < Add json_name tags to prevent camelCase from happening < Use protojson for Marshal/Unmarshal functionality < Update pointers to reflect new ByValue types < Regenerate protos with new ByValue types < Add ByValue attributes to match old protobuf declarations < Codegen: Don't generate a Getter for By Value types < Regenerate protos for nil checks on ProtoSlice/ProtoMap methods and bbs_by_value support < Pointer changes / Proto conversion for Routes bbs_by_value < Add nil checks on ToProto/FromProto Custom Routes type requires json.RawMessage but we only have []byte, so conversion has to be done manually < Make custom Routes type use bbs_by_value < Codegen: Add nil checks for ProtoSlice and ProtoMap methods < Codegen: Add support for bbs_by_value < Codegen: Regenerate bbs.proto for bbs_by_value support < Codegen: Add bbs_by_value option (allows fields to be passed by value instead of reference) < Use generate_protos script < Update generate_protos script to handle plugin < Regenerate protos with nil checks for ToProto and From Proto < CodeGen: add nil checks for ToProto and FromProto methods < Extract structs into variables and use ToProto to give back a proto.Message < Add ToEventProto to Event interface and use it for Un/Marshalling < Remove old gogoproto generated files < Use copy instead of loop (staticcheck compliance) < Test changes to reflect proto and model changes < Add FromProto to Routes special case < Regenerate protos with new Slice/Map methods and FromProto methods < Differentiate between Slices and Maps Add FromProto methods < Bring back old functionality after ignoring ProtoRoutes/Routes in our protoc plugin < Regenerate desired_lrp proto (ignore ProtoRoutes in favor of routes.go) < Handles Routes special case for Equal < Regenerate protos (was using RC release) Make bbs.pb.go a full file (try to resolve this later) < Revert package change and remove softlinks < Change package of protogen plugin to models and softlink .proto and .pb.go files into models Always ignore bbs.proto when generating our plugin's files < Weird Routes issue (assume because we aren't using the one defined in routes.go any more) < Standardization of enum value names (include containing struct) < Standardization of enum value names (include containing struct) < Lack of protobuf embed support requires using the structs directly < Pointer changes due to standard protobuf updates < Standardization of ActualLRP=>ActualLrp < Remove gogoproto proto.Message < Standardization of DesiredLRP=>DesiredLrp < Lack of protobuf embed support requires using the structs directly < Pointer changes due to standard protobuf updates < Regenerate protos for maps with a value type of protogen message < Fix type on enumvalue_customname < Don't need Proto prefix for string types < Add support for map values with a protogen'd message type Add ProtoMap function for conversion between our map and the default protogen map < Default value for repeated fields (arrays/slices) is nil < Use type name instead of field name when generating an enum cast < Pointer changes due to optional field support < Field name standardization on ActualLrp* < Pointer changes due to optional field support < Regenerate protos with standardized LRP fieldName modifications (i.e. no modifications from standard protoc output) < Remove getFieldName overrides for Actual/DesiredLRP. This wasn't consistent so we're going to standardize on the output from basic protoc (ActualLrp instead of ActualLRP) < Regenerate protos after ToProto field name fix < Incorrect field name used when generating ToProto function < Regenerate protos after oneof removal < Remove support for oneof since we weren't using it (previous use of oneof was incorrect and essentially a non-use of oneof) < Regenerate protos with optional field support < Add support for optional fields < Remove duplicate methods generated by plugin < Regenerate protos for Routable fix < These Routables aren't messages, so don't need the 'Proto' prefix < CellId exists on ActualLRPInstanceKey. Golang protobuf doesn't support embedded types, so we need to reference this directly. Potential NRE if ActualLRPInstanceKey is null, but this function doesn't look like it's being used. < Regenerate protos with getFieldName overrides for Actual/DesiredLRP < Add getFieldName func: For adding additional logic to generated field name (e.g. ActualLrp => ActualLRP) This is intended to be reverted after confirming bbs still works after the plugin is finished. It's here now because we want to make as few changes to the bbs code as possible. < Regenerate protos with Equal function < No longer ignoring ProtoRoutes < Add Equal function for struct generation < Regenerate protos with Enum String func < Add String function to enum generation Remove unnecessary check on len(msg.Enums) < Regenerate protos with enum name/value maps < Add name/value maps to enum generation < Remove duplicate methods generated by our protoc plugin < Remove duplicate methods generated by our protoc plugin < Remove duplicate methods generated by our protoc plugin < Remove gogoproto proto.Message from ActionInterface < Remove duplicate methods generated by our protoc plugin < Regenerate protos with accessors < Import bbs.proto for enum customname extension < Support generating accessors (Get/Set) Rearrange some code for scope changes < Regenerate protos with new plugin < Make use of enumvalue_customvalue extension Remove extra Proto for ProtoRoutes < Move bbs.proto to protoc plugin folder < Add new protoc plugin for bbs code generation < WIP: proto changes < WIP: New protoc plugin for BBS < Move gogoproto files to their own directory < Protos: Change oneof optionals to actual optionals (oneofs weren't being used correctly) < Protos: omitempty no longer needed (protobuf always adds it) < Protos: Change option go_package to code.cloudfoundry.org/bbs/models < Protos: Move optional to beginning of line Remove json_name definitions where not needed < Protos: Remove gogoproto.equal_all < Protos: Remove string_name from enums (does not work) < Protos: Remove gogoproto.goproto_enum_prefix_all < Protos: gogoproto.enumvalue_customname => string_name < Protos: Remove gogoproto.customtype < Protos: Remove gogoproto.embed < Protos: Remove gogoproto.nullable false (required fields are not valid in protobuf3) < Protos: gogoproto.nullable true => optional < Protos: gogoproto.jsontag => json_name < Protos: Add option go_package < Protos: Remove import of gogoproto < Preserve gogoproto generated files Submodule src/code.cloudfoundry.org/cfdot e6e5eb7b7..864c4a6 (rewind): < Update to use new protobufs Submodule src/code.cloudfoundry.org/diego-ssh e4ae099fb..0e1b7b1 (rewind): < Merge branch 'main' into protobuf-with-plugin < Update to use the latest protobufs Submodule src/code.cloudfoundry.org/executor a9ee7fae8..0a01dbe (rewind): < Resolve merge conflict: Change ShouldNot(Receive()) to Should(Receive(&nilOjbect)) < Merge main into protobuf-with-plugin < Change Should.Receive(nil) to ShouldNot(Receive()) since Gomega requires a pointer and this should be equivalent < Resolve staticcheck issues with Errorf and testing deprecated functionality < SetNofile and SetNproc take pointers now due to protobuf changes Submodule src/code.cloudfoundry.org/inigo 121564129..18074ea (rewind): < Merge branch 'main' into protobuf-with-plugin < Update struct references to match new protobufs Submodule src/code.cloudfoundry.org/rep 77ed89ca9..057388d (rewind): < Resolve merge conflict: Change ShouldNot(Receive()) to Should(Receive(&nilOjbect)) < Protobuf change after merge < Merge main into protobuf-with-plugin < Change Should.Receive(nil) to ShouldNot(Receive()) since Gomega requires a pointer and this should be equivalent < Struct changes due to new protobuf Submodule src/code.cloudfoundry.org/route-emitter 58e003eb9..1ed8242 (rewind): < Resolve merge conflict: Add additional ports for TLS < Instances has to be set explicitly to 0 since it's a pointer now < Update to use new protobufs Submodule src/code.cloudfoundry.org/vizzini 14b91b0b1..56193ba (rewind): < Update to use new protobufs < Update struct references to match new protobufs Submodule src/garden d3ba7afc2..fba22f3dc: > Update go.mod dependencies > Update go.mod dependencies > Update go.mod dependencies Submodule src/grootfs 1fe91e4a2..1dd33356b: > Update go.mod dependencies > Merge pull request #266 from cloudfoundry/fix-test-new-golang > Update go.mod dependencies > Update go.mod dependencies Submodule src/guardian 4127d0dff..d7f54dca3: > Update go.mod dependencies > Update go.mod dependencies > Update go.mod dependencies > Merge pull request #448 from cloudfoundry/remove-pinned-packages > Update go.mod dependencies > Merge pull request #447 from cloudfoundry/pin-opentelemetry Submodule src/idmapper d9ef54a0d..34b682d85: > Update go.mod dependencies > Update go.mod dependencies
Submodule src/code.cloudfoundry.org/credhub-cli 907ed5401..fd43d7668: > Bump go modules Submodule src/garden d3ba7afc2..fba22f3dc: > Update go.mod dependencies > Update go.mod dependencies > Update go.mod dependencies Submodule src/grootfs 1fe91e4a2..1dd33356b: > Update go.mod dependencies > Merge pull request #266 from cloudfoundry/fix-test-new-golang > Update go.mod dependencies > Update go.mod dependencies Submodule src/guardian 4127d0dff..d7f54dca3: > Update go.mod dependencies > Update go.mod dependencies > Update go.mod dependencies > Merge pull request #448 from cloudfoundry/remove-pinned-packages > Update go.mod dependencies > Merge pull request #447 from cloudfoundry/pin-opentelemetry Submodule src/idmapper d9ef54a0d..34b682d85: > Update go.mod dependencies > Update go.mod dependencies
This PR is part of multiple PRs across rep, auctioneer and auction to add an optional weighted bin pack first fit component to the scheduling algorithm of Cloud Foundry Diego for scheduling LRPs.
PRs/issues involved:
What is this change about?
These PRs combined introduce a new setting for auctioneer:
When
bin_pack_first_fit_weightis set to a value > 0, it will make diego-cells with a lower BOSH instance index number more attractive to deploy LRPs to by adding "weight x diego-cell index" to the score of a diego-cell. Diego-cells will be filled up more instead spreading the LRPs across all diego-cells equally.Setting
bin_pack_first_fit_weightto 0.0 (the default) will effectively disable the optional weighted bin pack first fit component. Everything still keeps working as it was previously.What problem it is trying to solve?
With the current deployment algorithm in diego it spreads the LRPs across all diego-cell instances equally. At Mendix we have 64GB memory diego-cells. We need to have the possibility for our customers to deploy 16G containers at all times. This means that we need to have at least 1 diego-cell with 16G memory available at all times. When the LRPs are spread equally you end up with a situation where on average 25% (16/64) of the memory on all our diego-cells is not used.
What is the impact if the change is not made?
In our case 25% of our diego-cell resources are wasted. We have been running a diego-release with these changes on top for the past months in our production Cloud Foundry foundations. We are saving ten-thousands of $$ monthly on AWS EC2 costs.
How should this change be described in diego-release release notes?
auctioneer
Please provide any contextual information.
Blog post with more information: https://cloud-infra.engineer/saving-costs-with-a-new-scheduler-in-cloud-foundry-diego/
Tag your pair, your PM, and/or team!
I'm not sure who to tag.