The BESS Scheduler
Let's take a look at a simple pipeline from a sample script (bess/bessctl/conf/samples/acl.bess):
localhost:10514 $ show pipeline
+---------+                    +----------+                    +-----+                    +-------+
| source0 |  :0 200285248 0:   | rewrite0 |  :0 200257248 0:   | fw  |  :0 100117232 0:   | sink0 |
| Source  | -----------------> | Rewrite  | -----------------> | ACL | -----------------> | Sink  |
+---------+                    +----------+                    +-----+                    +-------+
The module classes above can be divided into two categories, based on their behavior:
- The Rewrite, ACL and Sink classes are only called by their left neighbor when there are packets to process.
- The Source module class is called periodically and generates packets on its own.
The Source module class behaves differently because its instances create a task.
The QueueInc class behaves a lot like Source: it registers a task that periodically gets called to read packets from a Port rxq. Remember: BESS ports operate in polling mode, to avoid interrupt overhead.
The PortInc module is very similar to the QueueInc module, except that it may register more tasks (one for each rxq of the Port).
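The difference between the two categories can be sketched with a toy model in plain Python. This is not BESS code: the class and method names (Module, Source, process_batch, run_task) and the burst size of 32 are assumptions made for illustration. It only shows that downstream modules run as a consequence of a call from their left neighbor, while a task-owning module is entered directly by the scheduler.

```python
class Module:
    """Toy module: it only runs when its upstream neighbor hands it a batch."""

    def __init__(self, name):
        self.name = name
        self.downstream = None
        self.packets_seen = 0

    def connect(self, downstream):
        self.downstream = downstream
        return downstream  # allows chaining: a.connect(b).connect(c)

    def process_batch(self, batch):
        # Count the packets, then hand the batch to the right neighbor (if any).
        self.packets_seen += len(batch)
        if self.downstream is not None:
            self.downstream.process_batch(batch)


class Source(Module):
    """Toy source: it owns a task, so a scheduler can call it directly."""

    def __init__(self, name, burst=32):
        super().__init__(name)
        self.burst = burst  # assumed batch size, chosen for the example

    def run_task(self):
        # Generate a batch of dummy 60-byte packets and push it downstream.
        batch = [b"\x00" * 60 for _ in range(self.burst)]
        self.downstream.process_batch(batch)
        return len(batch)


# Same shape as the acl.bess pipeline: source0 -> rewrite0 -> fw -> sink0
source0 = Source("source0")
rewrite0, fw, sink0 = Module("rewrite0"), Module("fw"), Module("sink0")
source0.connect(rewrite0).connect(fw).connect(sink0)

# Stand-in for the scheduler: call the only task three times.
for _ in range(3):
    source0.run_task()

print(sink0.packets_seen)  # 96
```

Only source0 has an entry point that the scheduler can call; every other module in the chain runs inside that call.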
A module class is not forced to choose between receiving packets and generating them: there are classes that mix the two behaviors. Let's take a look at another pipeline (bess/bessctl/conf/samples/queue.bess):
localhost:10514 $ show pipeline
+--------+                   +----------+                   +-----------+                  +-------------------+                  +-------+
| src    |                   | rewrite0 |                   | queue     |                  | vlan_push0        |                  | sink0 |
| Source |  :0 26043968 0:   | Rewrite  |  :0 26040608 0:   | Queue     |  :0 2897536 0:   | VLANPush          |  :0 2898400 0:   | Sink  |
|        | ----------------> |          | ----------------> | 1023/1024 | ---------------> | PCP=0 DEI=0 VID=2 | ---------------> |       |
+--------+                   +----------+                   +-----------+                  +-------------------+                  +-------+
The queue module above (an instance of Queue, which is very different from QueueInc!) receives packets generated by the task that src registers, but it also registers its own task. It doesn't forward received packets immediately; instead, it stores them in a ring buffer. The task created by queue later reads packets from the ring buffer and forwards them to its right neighbor.
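The two roles of the queue module can be sketched as follows. This is a toy model, not the BESS Queue implementation: the names (Queue, Sink, process_batch, run_task), the burst size and the drop-when-full behavior are assumptions for illustration, since the text above does not say what happens when the ring is full.

```python
from collections import deque


class Sink:
    """Toy sink that just remembers what it received."""

    def __init__(self):
        self.received = []

    def process_batch(self, batch):
        self.received.extend(batch)


class Queue:
    """Toy queue: stores packets when called, forwards them from its own task."""

    def __init__(self, size, downstream, burst=32):
        self.ring = deque()
        self.size = size
        self.downstream = downstream
        self.burst = burst
        self.dropped = 0

    def process_batch(self, batch):
        # Called by the left neighbor: store the packets, do not forward them.
        for pkt in batch:
            if len(self.ring) < self.size:
                self.ring.append(pkt)
            else:
                self.dropped += 1  # assumption: a full ring drops new packets

    def run_task(self):
        # Called by the scheduler: drain up to `burst` packets and forward them.
        count = min(self.burst, len(self.ring))
        batch = [self.ring.popleft() for _ in range(count)]
        if batch:
            self.downstream.process_batch(batch)
        return count


sink0 = Sink()
queue = Queue(size=4, downstream=sink0, burst=2)

queue.process_batch(["p1", "p2", "p3", "p4", "p5"])  # what the src task would trigger
received_before_task = list(sink0.received)          # still empty at this point
forwarded = queue.run_task()                         # what the queue task does

print(received_before_task, forwarded, sink0.received, queue.dropped)
# [] 2 ['p1', 'p2'] 1
```

Nothing reaches the sink until the queue's own task runs, which is why the two tasks can be scheduled at different rates.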
There are two different tasks in the pipeline. How often does the src task get called? How often does the queue task get called?
BESS implements a fully hierarchical task scheduler that supports different policies. The job of the scheduler is to decide which task needs to be executed next. The tasks in the scheduler are organized in a tree-like data structure, where the leaf nodes are the tasks themselves and the other nodes represent particular policies.
When the user or the author of the BESS script doesn't configure the scheduler, the execution of all the tasks is interleaved in a round robin fashion. The scheduler tree can be examined using the show tc command:
localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- !leaf_src:0          leaf
      +-- !leaf_queue:0        leaf
The above command shows that we have only one thread (worker 0) with a very simple tree: there's a root node (called !default_rr_0) of type round_robin with two children, !leaf_src:0 and !leaf_queue:0, which are the two tasks registered by the src and queue modules. In this case the scheduler's behavior is straightforward: it alternates the execution of src and queue over and over.
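The alternation described above can be pictured with a few lines of Python. This is only a toy stand-in for a round_robin node; the RoundRobin class and its pick method are hypothetical names, not part of BESS.

```python
class RoundRobin:
    """Toy round_robin node: each visit picks the next child in turn."""

    def __init__(self, children):
        self.children = list(children)
        self.next_index = 0

    def pick(self):
        child = self.children[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.children)
        return child


root = RoundRobin(["!leaf_src:0", "!leaf_queue:0"])
order = [root.pick() for _ in range(6)]
print(order)
# ['!leaf_src:0', '!leaf_queue:0', '!leaf_src:0', '!leaf_queue:0', '!leaf_src:0', '!leaf_queue:0']
```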
The rate of the execution of a task can be throttled with a 'rate_limit' node in the scheduler tree. We can create the node with a line in the BESS configuration script:
bess.add_tc('fast', policy='rate_limit', resource='packet', limit={'packet': 9000000})
If we inspect the tree now we see:
localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- !leaf_src:0          leaf
      +-- !leaf_queue:0        leaf
      +-- fast                 rate_limit          9.000 Mpps
The newly created node doesn't have any effect on the src or queue tasks, because they're still under the round_robin policy. To actually enforce the limit on a task, we have to make it a child of the rate_limit node, using this code:
src.attach_task('fast')
With the above line we tell the module src to attach its task under the 'fast' policy. Now the scheduler tree looks like:
localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- !leaf_queue:0        leaf
      +-- fast                 rate_limit          9.000 Mpps
          +-- !leaf_src:0      leaf
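What attaching a task does to the tree can be mimicked with a small toy model. The Node class, its adopt method and the render helper are hypothetical names used only for this sketch (the real tree lives inside the BESS daemon). The two checks in adopt encode rules stated on this page: a leaf represents a task and has no children, and a rate_limit node has at most one child.

```python
class Node:
    """Toy scheduler-tree node."""

    def __init__(self, name, policy):
        self.name = name
        self.policy = policy
        self.parent = None
        self.children = []

    def adopt(self, child):
        """Make `child` a child of this node, detaching it from its old parent."""
        if self.policy == "leaf":
            raise ValueError("a leaf represents a task and cannot have children")
        if self.policy == "rate_limit" and self.children:
            raise ValueError("a rate_limit node can have at most one child")
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.append(child)
        return child


def render(node, depth=0):
    """Return the tree as indented lines, loosely imitating `show tc`."""
    lines = ["    " * depth + "+-- " + node.name + "  " + node.policy]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


root = Node("!default_rr_0", "round_robin")
leaf_src = root.adopt(Node("!leaf_src:0", "leaf"))
leaf_queue = root.adopt(Node("!leaf_queue:0", "leaf"))
fast = root.adopt(Node("fast", "rate_limit"))

fast.adopt(leaf_src)  # toy equivalent of src.attach_task('fast')

print("\n".join(render(root)))
# +-- !default_rr_0  round_robin
#     +-- !leaf_queue:0  leaf
#     +-- fast  rate_limit
#         +-- !leaf_src:0  leaf
```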
Similarly, we can also limit the execution of the task registered by the queue module to a slower rate with these two lines:
bess.add_tc('slow', policy='rate_limit', resource='packet', limit={'packet': 1000000})
queue.attach_task('slow')
The final tree looks like this:
localhost:10514 $ show tc
<worker 0>
  +-- !default_rr_0            round_robin
      +-- fast                 rate_limit          9.000 Mpps
      |   +-- !leaf_src:0      leaf
      +-- slow                 rate_limit          1.000 Mpps
          +-- !leaf_queue:0    leaf
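To get a feel for what this final tree implies, here is a small simulation of two rate-limited children under a round robin root. It is a sketch under loud assumptions and not the BESS algorithm: tasks take zero time to run, every run handles exactly 10 packets, max_burst is ignored, and a node simply stays blocked until enough time has passed to pay for the packets of its last run. The names RateLimit and simulate are hypothetical. Exact fractions are used so that the simulated clock has no rounding error.

```python
from fractions import Fraction


class RateLimit:
    """Toy rate_limit node with a single child task."""

    def __init__(self, name, limit_pps, burst=10):
        self.name = name
        self.burst = burst                          # packets per run (assumed)
        self.interval = Fraction(burst, limit_pps)  # seconds "paid" per run
        self.blocked_until = Fraction(0)
        self.packets = 0

    def run_child(self):
        self.packets += self.burst
        self.blocked_until += self.interval         # block for a while


def simulate(nodes, duration):
    """Round robin over `nodes` for `duration` seconds of simulated time."""
    now = Fraction(0)
    while now < duration:
        ran = False
        for node in nodes:
            if node.blocked_until <= now:           # node is not blocked
                node.run_child()
                ran = True
        if not ran:
            # Every child is blocked: jump to the earliest unblock time.
            now = min(node.blocked_until for node in nodes)
    return {node.name: node.packets for node in nodes}


fast = RateLimit("fast", limit_pps=9_000_000)
slow = RateLimit("slow", limit_pps=1_000_000)
result = simulate([fast, slow], duration=Fraction(1, 1000))  # one millisecond
print(result)  # {'fast': 9000, 'slow': 1000}
```

In one simulated millisecond the child of fast handles 9000 packets and the child of slow handles 1000, which matches the 9 Mpps and 1 Mpps limits shown in the tree.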
The following functions are used to interact with the scheduler from a BESS script:
- bess.add_tc(name, policy, wid=-1, parent='', resource=None, priority=None, share=None, limit=None, max_burst=None)
  Creates a new node of type policy, called name, in the scheduler tree. The name argument must be a unique string that identifies the node (it cannot start with '!'). The policy argument can be one of the following:
  - 'round_robin': Each time this node is visited by the scheduler, a child is picked in a round robin fashion.
  - 'weighted_fair': The children of this node are executed in proportion to their share.
  - 'rate_limit': The node can have at most one child. When the child's execution exceeds the limits imposed by limit and max_burst, the node is put in a blocked state (it will be unblocked after an appropriate amount of time).
  - 'priority': The node always schedules the child with the highest priority that's not blocked (in the sense described for the rate_limit policy). If a node has no children, or if all its children are temporarily blocked, it is considered blocked itself.
  - 'leaf': Nodes with this policy represent a task. They cannot be created with add_tc; they're added by the modules when a task is registered.

  The wid and parent arguments control where to place the new node in the tree. They're mutually exclusive, i.e. only one of them can have a non-default value. If parent is specified, its value must be the name of an existing node in the tree: the newly added node will become one of its children. If wid is specified, the newly added node will become the root of the tree on worker wid; if there's a root already, the roots will be placed under a round robin node named '!default_rr_<wid>'. If wid is also unspecified (i.e. -1), the worker will be chosen in a round robin fashion.

  The weighted_fair and rate_limit policies can, respectively, share among children or limit different types of resources. The resource parameter must be set when creating a node of either type, to choose the resource. It can be:
  - count: The new node will share fairly or limit the number of times a child is scheduled.
  - cycle: The new node will share fairly or limit the number of cycles (as measured by the TSC) a child's execution takes.
  - packet: The new node will schedule its children so as to share fairly or limit the number of packets generated.
  - bit: The new node will schedule its children so as to share fairly or limit the number of bits generated.

  The next two parameters are only used when attaching to certain types of parent nodes. The priority parameter controls the priority that this node has among the children of a priority parent (a lower number means higher priority). The share parameter controls the relative share of the resource among the children of a weighted_fair parent.

  limit and max_burst are only used when creating rate_limit nodes: they control the rate of the resource used and the excess allowed. They must be in the form of an object with the resource as key (which must be the same as the resource parameter) and an integer as value (e.g. {'packet': 1000}). The node will schedule its children to consume no more than limit units per second.
- <module>.attach_task(parent='', wid=-1, module_taskid=0, priority=None, share=None)
  bess.attach_task(module_name, parent='', wid=-1, module_taskid=0, priority=None, share=None)
  Moves a task in the scheduler tree. The two forms are equivalent.

  The task to move is the task numbered module_taskid (it is usually 0), registered by a module. The module is identified by the <module> object in the first form, or by module_name in the second form. parent and wid behave as in bess.add_tc(), and so do priority and share.
- update_tc_params(self, name, resource=None, limit=None, max_burst=None)
  Updates the parameters of the node named name in the scheduler tree. Only weighted_fair and rate_limit nodes have parameters that can be updated. resource, limit and max_burst behave as in bess.add_tc().
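The argument rules listed above for bess.add_tc() can be collected into a small pre-check, for example to catch mistakes in a script before it talks to the daemon. This helper is hypothetical and not part of BESS: check_add_tc_args, is_valid and the existing parameter (the set of node names already in the tree) are names invented for this sketch, and BESS itself may accept or reject arguments differently.

```python
POLICIES = {"round_robin", "weighted_fair", "rate_limit", "priority"}
RESOURCES = {"count", "cycle", "packet", "bit"}


def check_add_tc_args(name, policy, wid=-1, parent="", resource=None,
                      limit=None, max_burst=None, existing=()):
    """Raise ValueError if the arguments break a rule described above."""
    if not isinstance(name, str) or not name or name.startswith("!"):
        raise ValueError("name must be a non-empty string not starting with '!'")
    if name in existing:
        raise ValueError("name must be unique")
    if policy == "leaf":
        raise ValueError("leaf nodes are added by modules, not by add_tc")
    if policy not in POLICIES:
        raise ValueError("unknown policy: %r" % (policy,))
    if wid != -1 and parent != "":
        raise ValueError("wid and parent are mutually exclusive")
    if parent != "" and parent not in existing:
        raise ValueError("parent must name an existing node")
    if policy in ("weighted_fair", "rate_limit") and resource not in RESOURCES:
        raise ValueError("resource must be one of count, cycle, packet, bit")
    if policy == "rate_limit":
        for label, value in (("limit", limit), ("max_burst", max_burst)):
            if value is None:
                continue
            # Expect a dict such as {'packet': 1000}, keyed by the resource.
            if (not isinstance(value, dict) or set(value) != {resource}
                    or not isinstance(value[resource], int)):
                raise ValueError("%s must look like {%r: <int>}" % (label, resource))


def is_valid(*args, **kwargs):
    """Convenience wrapper: True if the check passes, False otherwise."""
    try:
        check_add_tc_args(*args, **kwargs)
        return True
    except ValueError:
        return False


print(is_valid("fast", "rate_limit", resource="packet", limit={"packet": 9000000}))  # True
print(is_valid("!fast", "round_robin"))                                              # False
print(is_valid("slow", "rate_limit", resource="packet", limit={"bit": 1000000}))     # False
```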
Copyright (c) 2014-2016, The Regents of the University of California. All rights reserved.
Copyright (c) 2016-2022, Nefeli Networks, Inc. All rights reserved.
Copyright (c) 2022-Present, Open Networking Foundation. All rights reserved.