Internet Engineering Task Force B. Campbell
Internet-Draft S. Donovan, Ed.
Intended status: Informational Oracle
Expires: September 4, 2015 JJ. Trottin
Alcatel-Lucent
March 3, 2015
Architectural Considerations for Diameter Load Information
draft-campbell-dime-load-considerations-01
Abstract
RFC 7068 describes requirements for Overload Control in Diameter.
This includes a requirement to allow Diameter nodes to send "load"
information, even when the node is not overloaded. The Diameter
Overload Information Conveyance (DOIC) solution describes a mechanism
meeting most of the requirements, but does not currently include the
ability to send load information. This document explores some
architectural considerations for a mechanism to send Diameter load
information.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on September 4, 2015.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Campbell, et al. Expires September 4, 2015 [Page 1]
Internet-Draft Abbreviated Title March 2015
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.
Table of Contents
1. Introduction
2. Differences between Load and Overload information
3. How is Load Information Used?
4. Piggy-Backing vs a Dedicated Application.
5. Which Nodes Exchange Load Information?
6. Scope of Load Information
7. Frequency of Sending Load Information
8. Load Information Semantics
9. Is Negotiation of Support Needed?
10. Topology Scenarios
10.1. No Agent
10.2. Single Agent
10.3. Multiple Agents
10.4. Linked Agents
10.5. Shared Server Pools
10.6. Agent Chains
10.7. Fully Meshed Layers
10.8. Partitions
10.9. Active-Standby Nodes
10.10. Addition and removal of Nodes
11. Security Considerations
12. IANA Considerations
13. References
13.1. Normative References
13.2. Informative References
Authors' Addresses
1. Introduction
[RFC7068] describes requirements for Overload Control in Diameter
[RFC6733]. At the time of this writing, the DIME working group is
working on the Diameter Overload Information Conveyance (DOIC)
mechanism [I-D.ietf-dime-ovli]. As currently specified, DOIC
fulfills some, but not all, of the requirements.
In particular, DOIC does not fulfill Req 24, which requires a
mechanism where Diameter nodes can indicate their current load, even
if they are not currently overloaded. DOIC also does not fulfill Req
23, which requires that nodes that divert traffic away from
overloaded nodes be provided with sufficient information to select
targets that are most likely to have sufficient capacity.
There are several other requirements in RFC 7068 that mention both
overload and load information that are only partially fulfilled by
DOIC.
The DIME working group explicitly chose not to fulfill these
requirements in DOIC for several reasons. A principal reason was
that the working group did not agree on a general approach for
conveying load information. It chose to progress the rest of DOIC,
and defer load information conveyance to a DOIC extension or a
separate mechanism.
This document describes some high-level architectural decisions that
the working group will need to consider in order to solve the load-
related requirements from RFC 7068.
At the time of this writing, there have been several attempts to
create mechanisms for conveyance of both load and overload control
information that were not adopted by the DIME working group. While
these drafts are not expected to progress, they may be instructive
when considering these decisions.
o [I-D.tschofenig-dime-dlba] proposed a dedicated Diameter
application for exchanging load balancing information.
o [I-D.roach-dime-overload-ctrl] described a strictly peer-to-peer
exchange of both load and overload information in new AVPs piggy-
backed on existing Diameter messages.
o [I-D.korhonen-dime-ovl] described a dedicated Diameter application
for exchanging both load and overload information.
2. Differences between Load and Overload information
Previous discussions of how to solve the load-related requirements in
[RFC7068] have shown that people do not have an agreed-upon concept
of how "load" information differs from "overload" information. The
two concepts are highly interrelated, and so far the working group
has not defined a bright line between what constitutes load
information and what constitutes overload information.
In the opinion of the authors, there are two primary differences.
First, a Diameter node always has a load. At any given time that
load may be effectively zero, effectively fully loaded, or somewhere
in between. In contrast, overload is an exceptional condition. A
node only has overload information when it is in an overloaded state.
Furthermore, the relationship between a node's load level and
overload state at any given time may be vague. For example, a node
may normally operate at a "fully loaded" level, but still not be
considered overloaded. Another node may declare itself to be
"overloaded" even though it might not be fully "loaded".
Second, overload information, in the form of a DOIC Overload Report
(OLR) [I-D.ietf-dime-ovli], indicates an explicit request for action
on the part of the reacting node. That is, the OLR requests that the
reacting node reduce the offered load -- the actual traffic sent to
the reporting node after overload abatement and routing decisions are
made -- by an indicated amount or to an indicated level.
Effectively, DOIC provides a contract between the reporting node and
the reacting node.
In contrast, load is informational. That is, load information can be
considered a hint to the recipient node. That node may use the load
information for load balancing purposes, as an input to certain
overload abatement techniques, to make inferences about the
likelihood that the sending node becomes overloaded in the immediate
future, or for other purposes.
None of this prevents a Diameter node from deciding to reduce the
offered load based on load information. The fundamental difference
is that an overload report requires that reduction. It is also
reasonable for a Diameter node to decide to increase the offered load
based on load information.
3. How is Load Information Used?
[RFC7068] contemplates two primary uses for load information. Req 23
discusses how load information might be used when performing
diversion as an overload abatement technique, as described in
[I-D.ietf-dime-ovli]. When a reacting node diverts traffic away from
an overloaded node, it needs load information for the other
candidates for that traffic in order to effectively load balance the
diverted load between potential candidates. Otherwise, diversion has
a greater potential to drive other nodes into overload.
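As a minimal sketch of the diversion case, assume each candidate
reports its load as a fraction between 0.0 and 1.0. That
representation, the function name, and the node names are assumptions
made for illustration; neither RFC 7068 nor DOIC defines them. A
reacting node might then split diverted traffic in proportion to each
candidate's spare capacity:

```python
def diversion_shares(candidate_loads):
    """Split diverted traffic across candidates by spare capacity.

    candidate_loads maps a node name to a load fraction in [0.0, 1.0]
    (an assumed representation, not one defined by any specification).
    Returns a dict mapping each node name to its share of the traffic.
    """
    spare = {node: max(0.0, 1.0 - load)
             for node, load in candidate_loads.items()}
    total = sum(spare.values())
    if total == 0:
        # No candidate has spare capacity, so diverting would only
        # move the overload somewhere else.
        return {node: 0.0 for node in candidate_loads}
    return {node: s / total for node, s in spare.items()}


# S2 has twice the spare capacity of S3, so it gets twice the share.
shares = diversion_shares({"S2": 0.5, "S3": 0.75})
```

Without load information the reacting node could only split the
diverted traffic evenly, which is the situation Req 23 is meant to
avoid.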
Req 24 discusses how Diameter load information might be used when no
overload condition currently exists. Diameter nodes can use the load
information to make decisions to try to avoid overload conditions in
the first place. Normal load-balancing falls into this category. A
node might also take other proactive steps to reduce offered load
based on load information, so that the loaded node never goes into
overload in the first place.
If the loaded nodes are Diameter servers (or clients in the case of
server-to-client transactions), both of these uses are most
effectively accomplished by a Diameter node that performs server
selection. Typically, server selection is performed by a node (a
client or an agent) that is an immediate peer of the server.
However, there are scenarios (see Section 10) where a client or proxy
that is not the immediate peer to the selected servers performs
server selection. In this case, the client or proxy enforces the
server selection by inserting a Destination-Host AVP.
For example, a Diameter node (e.g. client) can use a redirect
agent to get candidate destination host addresses. The redirect
agent might return several destination host addresses, from which
the Diameter node selects one. The Diameter node can use load
information received from these hosts to make the selection.
Just as load information can be used as part of server selection, it
can also be used as input to the selection of the next-hop peer to
which a request is to be routed.
One area that requires thought is how load information is used, if at
all, in the presence of an overload report from the same Diameter
node. It might be that the load information from that Diameter node
is ignored for the duration of the time that the overload report is
in effect. It might also be possible that the load information can
aid in the routing of non-abated requests targeted for the overloaded
Diameter node.
4. Piggy-Backing vs a Dedicated Application.
[I-D.roach-dime-overload-ctrl] embeds load and overload information
in messages of existing applications. This is known as a
"piggy-back" approach. Such an approach has the advantage of not requiring
new messages to carry load information. It has an additional
advantage of scaling with load; that is, the more the transaction
load, the more opportunities to send load information.
DOIC [I-D.ietf-dime-ovli] also uses a piggy-backed approach to send
OLRs. Given the potentially tight connection between load and
overload information, there may be advantages to maintaining
consistency with DOIC.
[I-D.tschofenig-dime-dlba] used a dedicated application to carry load
information. This application has quasi-subscription semantics,
where a client requests updates according to a cadence. The server
can send unsolicited updates if the load level changes between
updates in the cadence.
[I-D.korhonen-dime-ovl] also used a dedicated application, but
allowed nodes to send unsolicited reports containing load and
overload information. The mechanism has an issue that the sender of
load information may not know which other nodes need the information.
It may be possible to infer that information from other application
messages handled by the sender.
Another potential approach is that of a dedicated Diameter
application with a slightly different subscription semantic than that
of [I-D.tschofenig-dime-dlba]. In such an application, a node that
consumes load information sends a Diameter request to the source of
the load information. This request indicates that the consumer
wishes to receive load information for some period of time. The load
source would send periodic Diameter requests indicating the current
load level, until such time that the subscription period expired, or
the subscriber explicitly unsubscribed. After the initial
notification, the sender would only send updates when the load level
changed.
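The subscription semantic described in the last paragraph can be
sketched as follows. This is an illustration under stated
assumptions, not a proposed design: the class name, the integer
timestamps, and the integer load levels are all hypothetical.

```python
class LoadSubscription:
    """One consumer's subscription to a load source (illustrative)."""

    def __init__(self, expires_at):
        self.expires_at = expires_at  # time after which nothing is sent
        self.last_sent = None         # last level notified, if any

    def next_notification(self, now, current_load):
        """Return the load level to notify, or None to stay silent."""
        if now >= self.expires_at:
            return None               # subscription period has expired
        if self.last_sent is None or current_load != self.last_sent:
            self.last_sent = current_load
            return current_load       # first notification, or a change
        return None                   # level unchanged


sub = LoadSubscription(expires_at=100)
samples = [(0, 3), (10, 3), (20, 5), (100, 7)]
sent = [sub.next_notification(t, load) for t, load in samples]
```

In this run the consumer is notified of level 3 at time 0 and level 5
at time 20. The unchanged sample at time 10 and the sample at expiry
produce no message.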
5. Which Nodes Exchange Load Information?
Section 10 illustrates a number of Diameter network topologies where
load information may be useful. However, there are potentially
limitless configurations where load information might be used to make
peer and server selection choices. Nodes may be unaware of the
topology beyond their immediate peers, which may limit the utility of
load information for nodes beyond that peer.
There may in fact be scenarios where a peer-selection decision is
impacted by the load of non-adjacent nodes, or where a node needs to
force selection of a particular non-adjacent server. While explicit
knowledge of the load of such non-adjacent nodes may be useful in
such decisions, the working group should consider whether this
utility is worth the added complexity.
For instance, one approach would be to support two types of load
reports, endpoint load reports and peer load reports. In this
scenario, load reports would likely require an AVP indicating the
Diameter node to which the report applies. This would be needed
to differentiate between endpoint load reports and next hop load
reports. This would imply that a single message will likely have
two load reports, one for the endpoint and one for the next hop.
This would also add complexity in agents, sometimes needing to
strip next hop load reports and sometimes not.
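To make the agent-side complexity concrete, the following sketch
shows an agent relaying a message that carries both report types.
The report structure, the "endpoint" and "peer" labels, the 0-100
scale, and the host names are assumptions for illustration only. The
sketch covers just the case where the agent strips the previous
hop's peer report and substitutes its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadReport:
    kind: str    # "endpoint" or "peer" (hypothetical labels)
    origin: str  # Diameter identity the report applies to
    load: int    # load value on an assumed 0-100 scale


def relay_reports(reports, agent_identity, agent_load):
    """Return the reports an agent would forward to the next node.

    Endpoint reports pass through untouched. Peer reports describe
    only the hop they arrived on, so the agent drops them and adds a
    peer report of its own.
    """
    forwarded = [r for r in reports if r.kind == "endpoint"]
    forwarded.append(LoadReport("peer", agent_identity, agent_load))
    return forwarded


incoming = [
    LoadReport("endpoint", "s1.example.com", 40),
    LoadReport("peer", "s1.example.com", 40),
]
outgoing = relay_reports(incoming, "a1.example.com", 20)
```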
Previous load-related efforts have made different assumptions about
which Diameter nodes exchange load information.
[I-D.roach-dime-overload-ctrl] operated in a strictly peer-to-peer
mode. Each node would only learn the load (and overload) information
from its immediate peers.
[I-D.korhonen-dime-ovl] and [I-D.tschofenig-dime-dlba] are each
effectively any-to-any. That is, they each allowed any node to send
load information to any other node that supported the dedicated
overload or load application, respectively.
In the latter case, load is effectively sent between clients and
servers of the dedicated application, but those roles may not match
the client and server roles for the "main" Diameter applications in
use. For example, a pair of adjacent Diameter agents might be
"client" and "server" for the dedicated "load" application,
effectively creating a peer-to-peer relationship similar to that of
[I-D.roach-dime-overload-ctrl].
Each approach has advantages. Since server selection is typically
done by immediate peers to the servers, peer-to-peer transmission
covers most cases. Additionally, selection of non-terminal nodes is
exclusively done on a peer-to-peer basis. If the loaded node is an
agent, for example, the load information is only useful to immediate
peers. Peer-to-peer transmission is the easiest to negotiate. (See
Section 9.)
Any-to-any transmission offers more flexibility, and could
potentially cover the case where server selection is done by nodes
that are not peers to the candidate servers.
6. Scope of Load Information
Load information could refer to several different scopes:
o Load of a Node -- The load information refers to the load for an
entire Diameter host, that is a Client, Agent, or Server described
by a Diameter Identity.
o Load of an Application -- The load for a specific Diameter node
that supports multiple Diameter applications might differ between
applications.
o Load of a set of nodes -- The load would likely be the aggregated
load of the nodes in the set. This would likely require a
separate Diameter identity be assigned to the set of nodes and the
load information would be associated with that Diameter identity.
o Aggregate Load -- Different paths via different agents may exist
between a node making a peer selection decision and the final
destination of the request. The least loaded destination may only
be reachable via certain peers.
o Load of an agent plus load of a Diameter endpoint -- Different
paths via different Diameter agents may exist between the node
doing the server selection and the targeted Diameter endpoint.
The load information on the Diameter endpoint might be used for
server selection and the load information on the agent might be
used for selecting the next hop in the route to the Diameter
endpoint.
The "scope" of load information defines what the load indication
applies to. For example, load could apply to a whole Diameter node,
or a node could report different load for different applications. It
might be possible to have a load value for a whole realm, or a group
of nodes.
[I-D.roach-dime-overload-ctrl] has a very expressive concept of
scope, which applies both to load and overload information. It
defines the scopes of "Destination-Realm", "Application-ID",
"Destination-Host", "Host", "Connection", "Session", and "Session-
Group". Scopes can be combined.
[I-D.tschofenig-dime-dlba] does not have an explicit concept of
scope. Load information describes the load of a server for all
Diameter purposes.
[I-D.korhonen-dime-ovl] defines several scopes for overload
information. However, load information applies to a whole node.
One view is that the load level of a Diameter node will usually apply
to the whole node. In this case, the working group should consider a
single "whole node" scope for load information. Alternatively, a
"per-connection" scope could simulate "whole node" scope without
requiring the recipient to pay attention to whether multiple
transport connections terminate at the same peer.
Other scopes might also be considered based on the analysis of the
use cases identified for the use of load information.
7. Frequency of Sending Load Information
While it is true that a node always has a discrete load, a
determination needs to be made as to the frequency with which load
information is sent.
This interacts with the method for transporting load information --
piggy-backed versus a dedicated application -- discussed in
Section 4.
With a piggy-backed approach the following alternatives exist:
1. Send load information in every message.
2. Send load information when it changes by some amount. For
instance, only send a new load report when the load value has
changed by some percentage.
3. Send load information at a fixed interval of time. With this
approach, load information would be sent once every set number of
seconds.
With alternatives 2 and 3 there would need to be a mechanism for the
sender of the load information to ensure that all consumers of the
load information receive the periodic load information. This is more
straightforward if the load information is sent only to peers. It
becomes more difficult if the load information is sent to
non-adjacent nodes. This might require alternative 1 if the load
mechanism supports sending load information to non-adjacent nodes.
If a dedicated application is used for transporting of load
information then part of the application definition would need to
define the frequency of sending load information. Options 2 and 3 in
the above list would be the likely alternatives.
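The three alternatives can be compared with a small decision
function. This is a sketch under assumptions: the policy names, the
0-100 load scale, and the default thresholds are hypothetical, and
alternative 2 is approximated here as a change of at least
min_change points on that scale.

```python
def should_send(policy, now, load, last_sent_time, last_sent_load,
                min_change=10, interval=5.0):
    """Decide whether to attach a load report to the next message.

    policy is "every_message", "on_change", or "interval", which are
    hypothetical names for alternatives 1, 2, and 3. Times are in
    seconds; load values use an assumed 0-100 scale.
    """
    if last_sent_time is None or policy == "every_message":
        return True   # nothing sent yet, or alternative 1
    if policy == "on_change":
        # Alternative 2: the value moved by at least min_change points.
        return abs(load - last_sent_load) >= min_change
    if policy == "interval":
        # Alternative 3: at least `interval` seconds since the last report.
        return now - last_sent_time >= interval
    raise ValueError(f"unknown policy: {policy}")
```

For example, with a last report of 50 sent at time 0.0, a load of 55
at time 1.0 triggers a report only under "every_message", while a
load of 60 also triggers one under "on_change".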
8. Load Information Semantics
Both [I-D.tschofenig-dime-dlba] and [I-D.korhonen-dime-ovl] define
load level to be a range between zero and some maximum value, where
zero means no load at all and the max value means fully loaded. The
former uses a range of 0-10, while the latter uses 0-100.
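Because both of these scales run from "no load" to "fully loaded",
values on either can be compared after mapping them to a common
fraction. The helper below is a hypothetical illustration and is not
taken from either draft.

```python
def normalize_load(value, max_value):
    """Map a load level on a 0..max_value scale to a fraction in [0, 1]."""
    if not 0 <= value <= max_value:
        raise ValueError("load value outside its declared range")
    return value / max_value


a = normalize_load(7, 10)     # a level reported on a 0-10 scale
b = normalize_load(70, 100)   # a level reported on a 0-100 scale
```

Here a and b describe the same relative load. The coarser 0-10 scale
simply cannot express intermediate levels such as 73 out of 100.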
[I-D.roach-dime-overload-ctrl] treats load information as a strictly
relative weighting factor. The weight is only meaningful when load-
balancing across multiple destinations. That is, a maximum load
value does not necessarily imply that the node cannot handle more
traffic. The load level scale is zero to 65535. That scale was
chosen to match the resolution of the weight field from a DNS SRV
record [RFC2782].
9. Is Negotiation of Support Needed?
The working group should discuss whether a load conveyance mechanism
requires negotiation or declaration of support. Several
considerations apply to this discussion.
If load information is treated as a hint, it can be safely ignored by
nodes that don't understand it. However, security considerations may
apply if load information is accidentally leaked across a non-
supporting node to a node that is not authorized to receive it.
If load information is conveyed using a dedicated Diameter
application, the normal mechanisms for negotiating support for
Diameter applications apply. However, the Diameter Capabilities
Exchange [RFC6733] mechanism is inherently peer-to-peer. If there is
a need to convey load information across a node that does not
understand the mechanism, the standard Diameter mechanism would
involve probing for support by sending load requests and watching for
error answers with a result code of DIAMETER_APPLICATION_UNSUPPORTED.
If the probe request also includes load information, there is again a
potential for leaking load information to unauthorized parties.
If load information were treated in a strictly peer-to-peer fashion,
there would be no need to probe to see if non-adjacent nodes support
the mechanism. However, there would still be a need to control
whether a non-supporting node would leak load information. Such a
leak could be prevented if adjacent peers declared support, and never
sent load information to a peer that did not declare support.
A peer-to-peer mechanism would also need a way to make sure that, if
load information leaked across a non-supporting node, the receiving
node would not mistakenly think the information came from the non-
supporting node. This could be mitigated with a mechanism to declare
support as in the previous paragraph, or with a mechanism to identify
the origin of the load information. In the latter case, the
receiving node would treat any load information as invalid if the
origin of that information did not match the identity of the peer
node.
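The origin check described above amounts to a single comparison. In
this sketch the function name and the exact string comparison of
identities are assumptions; a real mechanism would need to specify
how Diameter identities are compared.

```python
def is_valid_peer_load(report_origin, peer_identity):
    """Return True if load information may be attributed to the peer.

    report_origin is the identity carried with the load information.
    peer_identity is the identity of the adjacent node that delivered
    the message. A mismatch means the information passed through the
    peer from some other node and is treated as invalid.
    """
    return report_origin == peer_identity


# Load information from s1 that leaked across a non-supporting agent a1:
leaked = is_valid_peer_load("s1.example.com", "a1.example.com")
# Load information that a1 itself generated:
direct = is_valid_peer_load("a1.example.com", "a1.example.com")
```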
10. Topology Scenarios
This section presents a number of Diameter topology scenarios, and
discusses how load information might be used in each scenario.
Nothing in this section should be construed to mean that a given
scenario is in scope for this effort, or even a good idea. Some
scenarios might be considered as not relevant in practice and
subsequently discarded.
10.1. No Agent
Figure 1 shows a simple client-server scenario, where a client picks
from a set of candidate servers available for a particular realm and
application. The client selects the server for a given transaction
using the load information received from each server.
      ------S1
     /
    C
     \
      ------S2
Figure 1: Basic Client Server Scenario
Open Issue: Will a Diameter node include potential peers that it
is not currently connected to as part of the candidate set? It is
unlikely the client would have load information from peers that it
is not currently connected to.
Note: The use of dynamic connections needs to be considered.
10.2. Single Agent
Figure 2 shows a client that sends requests to an agent. The agent
selects the request destination from a set of candidate servers,
using load information received from each server. The client does
not need to receive load information, since it does not select
between multiple agents.
           ------S1
          /
    C----A
          \
           ------S2
Figure 2: Simple Agent Scenario
10.3. Multiple Agents
Figure 3 shows a client selecting between multiple agents, and each
agent selecting from multiple servers. The client selects an agent
based on the load information received from each agent. Each agent
selects a server based on the load information received from its
servers.
This scenario adds a complication that one set of servers may be more
loaded than the other set. If, for example, S4 was the least loaded
server, C would need to know to select agent A2 to reach S4. This
might require C to receive load information from the servers as well
as the agents. Alternatively, each agent might use the load of its
servers as an input into calculating its own load, in effect
aggregating upstream load.
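One possible aggregation rule is sketched below. The rule itself
(report the larger of the agent's own load and the load of its least
loaded server) is an assumption chosen for illustration, as are the
load values. This document does not propose a specific formula.

```python
def aggregated_load(own_load, server_loads):
    """Load an agent might report after folding in its servers' load.

    The least loaded server stands in for the upstream load, since it
    is the best destination the agent can offer. The result is never
    lower than the agent's own load.
    """
    if not server_loads:
        return own_load
    return max(own_load, min(server_loads.values()))


a1 = aggregated_load(20, {"S1": 70, "S3": 80})
a2 = aggregated_load(30, {"S2": 60, "S4": 35})
reported = {"A1": a1, "A2": a2}
chosen = min(reported, key=reported.get)
```

With these values C selects A2, which is the agent that can reach
S4, without C needing any load information from the servers
themselves.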
Similarly, if C sends a host-routed request [I-D.ietf-dime-ovli], it
needs to know which agent can deliver requests to the selected
server. Without some special, potentially proprietary, knowledge of
the topology upstream of A1 and A2, C would select the agent based on
the normal peer selection procedures for the realm and application,
and perhaps consider the load information from A1 and A2. If C sends
a request to A1 that contains a Destination-Host AVP with a value of
S4, A1 will not be able to deliver the request.
             -----S3
            /
        ---A1------S1
       /
      C
       \
        ---A2------S2
            \
             ---- S4
Figure 3: Multiple Agents and Servers
10.4. Linked Agents
Figure 4 shows a scenario similar to that of Figure 3, except that
the agents are linked, so that A1 can forward a request to A2, and
vice-versa. Each agent could receive load information from the
linked agent, as well as its connected servers.
This somewhat simplifies the complication from Figure 3, due to the
fact that C does not necessarily need to choose a particular agent to
reach a particular server. But it creates a similar question of how,
for example, A1 might know that S4 was less loaded than S1 or S3.
Additionally, it creates the opportunity for sub-optimal request
paths, for example [C,A1,A2,S4] instead of [C,A2,S4].
A likely application for linked agents is when each agent prefers one
set of servers and only forwards requests to another agent under
exceptional circumstances. For example, A1 might not forward
requests to A2 unless both S1 and S3 are overloaded. In this case,
A1 might use the load information from S1 and S3 to select between
those, and only consider the load information from A2 (and other
connected agents) if it needs to divert requests to different agents.
             -----S3
            /
        ---A1------S1
       /   |
      C    |
       \   |
        ---A2------S2
            \
             ---- S4
Figure 4: Linked Agents
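The preference described for A1 in Figure 4 can be sketched as a
two-stage selection. The function name, the use of plain integers
where lower means less loaded, and the representation of overload as
a set of server names are assumptions for illustration.

```python
def pick_next_hop(local_servers, overloaded, linked_agents):
    """Choose a next hop for an agent that prefers its own servers.

    local_servers and linked_agents map node names to load values
    (lower is less loaded). overloaded is the set of local servers
    that currently have an overload report in effect.
    """
    usable = {server: load for server, load in local_servers.items()
              if server not in overloaded}
    if usable:
        # Normal case: only the local servers' load is considered.
        return min(usable, key=usable.get)
    if linked_agents:
        # Exceptional case: every local server is overloaded, so the
        # load of the linked agents is consulted for diversion.
        return min(linked_agents, key=linked_agents.get)
    return None


servers = {"S1": 70, "S3": 40}
normal = pick_next_hop(servers, set(), {"A2": 10})
diverted = pick_next_hop(servers, {"S1", "S3"}, {"A2": 10})
```

Note that A2's low load value does not attract traffic in the normal
case. It is consulted only once both S1 and S3 are overloaded.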
Figure 5 is a variant of Figure 4. In this case, C1 sends all
traffic through A1 and C2 sends all traffic through A2. By default,
A1 will load balance traffic between S1 and S3 and A2 will load
balance traffic between S2 and S4.
Now, if S1 and S3 are significantly more loaded than S2 and S4, A1
may route some C1 traffic to A2. This is a non-optimal path, but it
allows better load balancing between the servers. To achieve this,
A1 needs to receive load information from A2 about the load of S2 and
S4.
              -----S3
             /
    C1----A1------S1
          |
          |
          |
    C2----A2------S2
             \
              ---- S4
Figure 5: Linked Agents
10.5. Shared Server Pools
Figure 6 is similar to Figure 4, except that instead of a link
between agents, each agent is linked to all servers. (The links to
each set of servers should be interpreted as a link to each server.
The links are not shown separately due to the limitations of ASCII
art.)
In this scenario, each agent can select among all of the servers,
based on the load information from the servers. The client need only
be concerned with the load information of the agents.
      ---A1---S[1], S[2]...S[p]
     /     \ /
    C       x
     \     / \
      ---A2---S[p+1], S[p+2] ...S[n]
Figure 6: Shared Server Pools
10.6. Agent Chains
The scenario in Figure 7 is similar to that of Figure 3, except that,
instead of the client possibly needing to select an agent that can
route requests to the least loaded server, in this case A1 and A2
need to make similar decisions when selecting between A3 and A4. As in
the former scenario, this could be mitigated if A3 and A4 aggregate
upstream loads into the load information they report downstream.
      ---A1---A3----S[1], S[2]...S[p]
     /   | \ /
    C    |  x
     \   | / \
      ---A2---A4----S[p+1], S[p+2] ...S[n]
Figure 7: Agent Chains
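The aggregation idea can be illustrated with a small sketch. The
choice of aggregation function is an open question; taking the larger
of the agent's own load and the mean of its upstream server loads is
used here purely as an assumption, as are the 0-100 load scale and the
name aggregate_load.

```python
# Hypothetical sketch: an agent such as A3 folds the load of its upstream
# servers into the single load value it reports downstream to A1 and A2.
from statistics import mean


def aggregate_load(own_load, upstream_loads):
    """Return the load value an agent reports downstream.

    own_load is the agent's own load; upstream_loads is a list of the
    loads most recently reported by the servers behind it.
    """
    if not upstream_loads:
        # Nothing behind this agent has reported yet.
        return own_load
    # The agent is only as useful as the servers it fronts, so report
    # whichever is worse: its own load or the average upstream load.
    return max(own_load, mean(upstream_loads))
```

With these assumptions, an A3 at load 20 in front of servers at 80 and
60 reports 70, while an A4 at load 35 in front of servers at 30 and 40
reports 35. A1 would then prefer A4 even though A3 itself is less
loaded.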
10.7. Fully Meshed Layers
Figure 8 extends the scenario in Figure 6 by adding an extra layer of
agents. But since each layer of nodes can reach any node in the next
layer, each node only needs to consider the load of its next-hop
peer.
      ---A1---A3---S[1], S[2]...S[p]
     /   | \ / |\ /
    C    |  x  | x
     \   | / \ |/ \
      ---A2---A4---S[p+1], S[p+2] ...S[n]
Figure 8: Full Mesh
10.8. Partitions
A Diameter network with multiple servers is said to be "partitioned"
when only a subset of available servers can serve a particular
realm-routed request. For example, one group of servers may handle users
whose names start with "A" through "M", and another group may handle
"N" through "Z".
In such a partitioned network, nodes cannot load-balance requests
across partitions, since not all servers can handle the request. A
client, or an intermediate agent, may still be able to load-balance
between servers inside a partition.
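A minimal sketch of partition-aware selection, using the user name
example above, might look like the following. The partition keys, the
server names, the load values, and the function name select_server are
hypothetical and exist only for illustration.

```python
# Hypothetical sketch: pick the partition from the user name first, then
# load-balance only among the servers inside that partition.

# Assumed example data: partition key -> {server name: reported load}.
PARTITIONS = {
    "A-M": {"S1": 60, "S2": 35},
    "N-Z": {"S3": 10, "S4": 20},
}


def select_server(user_name, partitions):
    """Return the least loaded server able to serve this user."""
    first = user_name[:1].upper()
    if "A" <= first <= "M":
        key = "A-M"
    elif "N" <= first <= "Z":
        key = "N-Z"
    else:
        raise ValueError("user name does not map to a known partition")
    servers = partitions[key]
    # Load is compared only within the partition that can serve the user.
    return min(servers, key=servers.get)
```

Here a request for user "alice" goes to S2 even though S3 reports a
lower load, because S3 belongs to a partition that cannot serve the
request.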
10.9. Active-Standby Nodes
The previous scenarios assume that traffic can be load balanced among
all peers that are eligible to handle a request. That is, the peers
operate in an "active-active" configuration. In an "active-standby"
configuration, traffic would be load-balanced among active peers.
Requests would only be sent to peers in a "standby" state if the
active peers became unavailable. For example, requests might be
diverted to a standby peer if one or more active peers become
overloaded.
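The active-standby behavior can be sketched as a two-tier selection.
The Peer structure, the 0-100 load scale, the OVERLOAD_THRESHOLD
value, and the decision to treat an overloaded peer as unavailable are
assumptions made for this example.

```python
# Hypothetical sketch: load-balance among usable active peers and fall
# back to standby peers only when no active peer is usable.
from dataclasses import dataclass

OVERLOAD_THRESHOLD = 90  # assumed cut-off, not defined by this document


@dataclass
class Peer:
    name: str
    standby: bool
    load: int            # assumed 0-100 scale
    reachable: bool = True


def usable(peer):
    """A peer is usable if it is reachable and not overloaded."""
    return peer.reachable and peer.load < OVERLOAD_THRESHOLD


def select_peer(peers):
    """Return the name of the chosen peer, or None if none is usable."""
    active = [p for p in peers if not p.standby and usable(p)]
    if active:
        return min(active, key=lambda p: p.load).name
    standby = [p for p in peers if p.standby and usable(p)]
    if standby:
        return min(standby, key=lambda p: p.load).name
    return None
```

Note that in this sketch a standby peer receives no traffic while any
active peer remains usable, regardless of how lightly loaded the
standby peer is.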
10.10. Addition and Removal of Nodes
When a Diameter node is added, the new node will start by advertising
its load. Downstream nodes will need to factor the new load
information into load balancing decisions. The downstream nodes
should attempt to ensure a smooth increase of the traffic to the new
node, avoiding an immediate spike of traffic to the new node. It
should be determined if this use case is in the scope of the load
control mechanism.
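One possible way for a downstream node to smooth the traffic increase
is to scale the new node's selection weight by a ramp factor. The
linear ramp, the 60 second default, the 0-100 load scale, and the
function names below are assumptions for illustration; this document
does not specify a ramp-up mechanism.

```python
# Hypothetical sketch: a downstream node discounts a newly added node's
# spare capacity during a ramp-up period so that traffic grows smoothly.


def ramp_factor(seconds_since_added, ramp_seconds=60.0):
    """Return a value from 0.0 to 1.0 that grows linearly over the ramp."""
    if ramp_seconds <= 0:
        return 1.0
    return min(1.0, max(0.0, seconds_since_added / ramp_seconds))


def selection_weight(load, seconds_since_added, ramp_seconds=60.0):
    """Return the weight used when distributing requests to a node.

    The weight is the node's spare capacity (100 - load), scaled down
    while the node is still inside its ramp-up period.
    """
    spare = max(0.0, 100.0 - load)
    return spare * ramp_factor(seconds_since_added, ramp_seconds)
```

A brand-new node that advertises zero load starts with weight 0.0,
reaches 50.0 halfway through the ramp, and is treated like any other
node once the ramp period has elapsed.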
When removing a node in a controlled way (e.g., for maintenance
purposes rather than because of a failure), it might be appropriate
to progressively reduce the traffic to this node by routing traffic
to other nodes. Simple load information (load percentage) would not
be sufficient. It should be determined if this use case is in the
scope of the load control mechanism.
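The following sketch illustrates why a load percentage alone does not
support controlled removal. The "draining" hint is a hypothetical
piece of extra information, shown as a simple flag for brevity; it is
not defined by this document, and a real mechanism would likely need
to express a gradual reduction rather than an on/off state.

```python
# Hypothetical sketch: compare selection based on load alone with
# selection that also honors an assumed "draining" hint.


def pick_by_load_only(nodes):
    """Choose the node with the lowest reported load."""
    return min(nodes, key=lambda n: n["load"])["name"]


def pick_with_drain_hint(nodes):
    """Choose the least loaded node that is not being drained."""
    candidates = [n for n in nodes if not n.get("draining", False)]
    # If every node is draining, fall back to considering all of them.
    pool = candidates or nodes
    return min(pool, key=lambda n: n["load"])["name"]


# Assumed example: S1 is being taken out of service, so its load drops.
NODES = [
    {"name": "S1", "load": 15, "draining": True},
    {"name": "S2", "load": 55},
]
```

With the example data, selection by load alone picks S1, because a
node that is shedding traffic looks increasingly attractive, while the
selection that honors the hint picks S2.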
11. Security Considerations
Load information may be sensitive information in some cases.
Depending on the mechanism, an unauthorized recipient might be able
to infer the topology of a Diameter network from load information.
Load information might be useful in identifying targets for Denial of
Service (DoS) attacks, where a node known to be already heavily
loaded might be a tempting target. Load information might also be
useful as feedback about the success of an ongoing DoS attack.
Any load information conveyance mechanism will need to allow
operators to avoid sending load information to nodes that are not
authorized to receive it. Since Diameter currently only offers
authentication of nodes at the transport level, any solution that
sends load information to non-peer nodes might require a transitive-
trust model.
12. IANA Considerations
This document makes no requests of IANA.
13. References
13.1. Normative References
[I-D.ietf-dime-ovli]
Korhonen, J., Donovan, S., Campbell, B., and L. Morand,
"Diameter Overload Indication Conveyance", draft-ietf-
dime-ovli-03 (work in progress), July 2014.
[RFC6733] Fajardo, V., Arkko, J., Loughney, J., and G. Zorn,
"Diameter Base Protocol", RFC 6733, October 2012.
[RFC7068] McMurry, E. and B. Campbell, "Diameter Overload Control
Requirements", RFC 7068, November 2013.
13.2. Informative References
[I-D.korhonen-dime-ovl]
Korhonen, J. and H. Tschofenig, "The Diameter Overload
Control Application (DOCA)", draft-korhonen-dime-ovl-01
(work in progress), February 2013.
[I-D.roach-dime-overload-ctrl]
Roach, A. and E. McMurry, "A Mechanism for Diameter
Overload Control", draft-roach-dime-overload-ctrl-03 (work
in progress), May 2013.
[I-D.tschofenig-dime-dlba]
Tschofenig, H., "The Diameter Load Balancing Application
(DLBA)", draft-tschofenig-dime-dlba-00 (work in progress),
July 2013.
[RFC2782] Gulbrandsen, A., Vixie, P., and L. Esibov, "A DNS RR for
specifying the location of services (DNS SRV)", RFC 2782,
February 2000.
Authors' Addresses
Ben Campbell
Oracle
7460 Warren Parkway # 300
Frisco, Texas 75034
USA
Email: [email protected]
Steve Donovan (editor)
Oracle
7460 Warren Parkway # 300
Frisco, Texas 75034
United States
Email: [email protected]
Jean-Jacques Trottin
Alcatel-Lucent
Route de Villejust
91620 Nozay
France
Email: [email protected]
Campbell, et al. Expires September 4, 2015 [Page 17]