[AP][Legalization] Updated the Mass Abstraction #3123
Within the partial legalizer of the AP flow, a mass abstraction is used to abstract the complex logical block constraints of the device. Originally, this abstraction created a primitive vector which kept a count of each model used by the atom and the capacity of each model in a given tile. This worked, but it was very inaccurate because it did not capture differences between models with different numbers of inputs/outputs.

Updated the mass abstraction in three ways:
1) Used the number of bits stored in a memory as the mass for memory primitives.
2) Used the number of pins used for all other types of primitives (clamped by the capacity of pins for the given logical block).
3) Created the concept of "one-hot" primitives to assign multiple models to the same dim if they are part of a primitive which can only ever implement one of them.

These changes improved the ability of the partial legalizer to understand the details of the complex blocks in the architecture, so that it can better legalize during analytical placement.

Experiments showed that this greatly improved the post-FL (post-Full Legalization) quality of the placement with practically no cost in run time, since it does not change the complexity of the legalizer; it just spends slightly more time deciding the mass of the atoms and tiles.
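To make the first two rules concrete, here is a minimal sketch of how an atom's mass could be derived. It is not the code in this PR: `PrimitiveInfo`, `MassVector`, and `compute_atom_mass` are hypothetical stand-ins for the real pb_type and `PrimitiveVector` structures, and the fields are assumed to be known up front.

```cpp
#include <algorithm>
#include <cassert>
#include <map>

// Hypothetical, simplified description of the primitive an atom maps to.
// This is NOT VTR's t_pb_type; it only carries what the sketch needs.
struct PrimitiveInfo {
    int dim;             // Dimension of the mass vector this model maps to.
    bool is_memory;      // True for memory primitives.
    int num_memory_bits; // Bits stored (only meaningful when is_memory).
    int num_used_pins;   // Pins used by the atom on this primitive.
    int pin_capacity;    // Pin capacity of the enclosing logical block.
};

// Sparse mass vector (dimension -> mass); a stand-in for PrimitiveVector.
using MassVector = std::map<int, float>;

// Mass of one atom: stored bits for memories, clamped pin count otherwise.
inline MassVector compute_atom_mass(const PrimitiveInfo& prim) {
    assert(prim.dim >= 0);
    float mass = 0.0f;
    if (prim.is_memory) {
        // Rule 1: memories are weighed by the number of bits they store.
        mass = static_cast<float>(prim.num_memory_bits);
    } else {
        // Rule 2: everything else is weighed by pins used, clamped by the
        // pin capacity of the logical block.
        mass = static_cast<float>(std::min(prim.num_used_pins, prim.pin_capacity));
    }
    MassVector result;
    result[prim.dim] = mass;
    return result;
}
```

With this sketch, a 7-pin atom in a block with only 5 pins of capacity gets a mass of 5, while a memory gets a mass equal to its bit count regardless of its pin usage.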
Results on the largest VTR circuits with no fixed IOs (not timing driven):
The important metric to look at here is the post-Full Legalization HPWL, which improved by 20%. This allowed the detailed placement to run for less time (24% faster), but it got a worse final solution. The quality of the final solution is hard to control, but this result does suggest that there is something wrong with the initial temperature.
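For readers less familiar with the metric: HPWL (half-perimeter wirelength) sums, over all nets, the half perimeter of the bounding box around each net's pins. A minimal sketch follows; `Point`, `net_hpwl`, and `total_hpwl` are illustrative names, not VTR functions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Illustrative pin location; not a VTR type.
struct Point {
    double x;
    double y;
};

// Half-perimeter of the bounding box enclosing all pins of one net.
inline double net_hpwl(const std::vector<Point>& pins) {
    assert(!pins.empty());
    double min_x = pins[0].x, max_x = pins[0].x;
    double min_y = pins[0].y, max_y = pins[0].y;
    for (const Point& p : pins) {
        min_x = std::min(min_x, p.x);
        max_x = std::max(max_x, p.x);
        min_y = std::min(min_y, p.y);
        max_y = std::max(max_y, p.y);
    }
    return (max_x - min_x) + (max_y - min_y);
}

// Total HPWL of a placement: the sum over all nets.
inline double total_hpwl(const std::vector<std::vector<Point>>& nets) {
    double total = 0.0;
    for (const auto& net : nets) {
        total += net_hpwl(net);
    }
    return total;
}
```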
Running Titan now, but this is ready for review.
Titan results are in. These results have no fixed IOs and are not timing driven:
The post-Full Legalization HPWL improved by 24%, which is fantastic. This yielded a 2% decrease in post-detailed placement HPWL, which ultimately led to a 2% improvement in post-route HPWL. It also reduced the run time of the detailed placer, since it started from a better placement. However, global placement run time increased slightly, which washed out any run time gains. Overall, this improvement is nothing but good. I would like to further investigate the target density in the partial legalizer. Currently it just aims to fill all bins to 100% capacity, but based on some small-scale experiments I have been working on, I think we can see even further gains if the target density is set based on properties of the physical tiles (e.g. making CLB tiles less filled).
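As a rough illustration of the target density idea (purely hypothetical, since nothing like this is in the PR yet), a bin could be considered overfilled once its utilization exceeds a per-tile fraction of its capacity rather than the full capacity. The function name and its parameters are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical: how much mass in a bin exceeds its target fill level.
// target_density = 1.0 reproduces the current "fill to 100%" behaviour;
// a lower value (e.g. for CLB tiles) marks the bin as overfilled sooner.
inline float bin_overfill(float utilization, float capacity, float target_density) {
    assert(target_density > 0.0f && target_density <= 1.0f);
    return std::max(0.0f, utilization - target_density * capacity);
}
```

Under this sketch, a bin with capacity 100 holding a mass of 90 is legal at a target density of 1.0 but overfilled by 40 at a target density of 0.5, so the partial legalizer would spread that excess to neighbouring bins.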
// Get the mass of this atom (i.e. how much resources on the device this atom
// will likely use).
float mass = 0.0f;
if (!is_primitive_memory_pb_type(primitive->pb_type)) {
You might be able to shorten the code by calling your physical routine, and then just blending in two different proportions based on whether it is a memory or not.
capacity.set_dim_val(dim, std::min(capacity.get_dim_val(dim), total_curr_score));
}

// A pb_type may contain multiple copies of the same PB. Multiply the capcity
capcity -> capacity
capacity = PrimitiveVector::max(capacity, mode_capacity);
}

// The current pb only has a set number of pins. Therefore, each dimension
Not super clear to me ... not sure if the comment can be clarified.
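One possible reading of the capacity computation touched on in the snippets above, written as a minimal sketch. `PbNode`, `Capacity`, and `compute_capacity` are hypothetical and much simpler than the real pb_type hierarchy and `PrimitiveVector`. The sketch also assumes that children within a mode add up, that modes are mutually exclusive (hence the element-wise max), and that the pin clamp is applied before multiplying by the number of copies; the PR may order or define these steps differently.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Sparse capacity vector (dimension -> capacity); stand-in for PrimitiveVector.
using Capacity = std::map<int, float>;

// Hypothetical, simplified pb_type node.
struct PbNode {
    int num_pb = 1;   // Copies of this pb inside its parent.
    int num_pins = 0; // Total pins on one copy of this pb.
    int dim = -1;     // Mass dimension; only used for primitives.
    std::vector<std::vector<PbNode>> modes; // Children per mode; empty for primitives.
};

// Element-wise maximum of two capacity vectors.
inline Capacity elementwise_max(const Capacity& a, const Capacity& b) {
    Capacity out = a;
    for (const auto& kv : b) {
        out[kv.first] = std::max(out[kv.first], kv.second);
    }
    return out;
}

inline Capacity compute_capacity(const PbNode& pb) {
    assert(pb.num_pb >= 1);
    Capacity capacity;
    if (pb.modes.empty()) {
        // Primitive: capacity is its pin count in its own dimension.
        assert(pb.dim >= 0);
        capacity[pb.dim] = static_cast<float>(pb.num_pins);
    } else {
        for (const auto& mode : pb.modes) {
            // Assumption: children of one mode coexist, so their capacities add.
            Capacity mode_capacity;
            for (const PbNode& child : mode) {
                for (const auto& kv : compute_capacity(child)) {
                    mode_capacity[kv.first] += kv.second;
                }
            }
            // Only one mode is active at a time, so take the best case per dim.
            capacity = elementwise_max(capacity, mode_capacity);
        }
        // This pb only has num_pins pins, so no dimension can hold more
        // pin-mass than that, no matter what its children could accept.
        for (auto& kv : capacity) {
            kv.second = std::min(kv.second, static_cast<float>(pb.num_pins));
        }
    }
    // A pb_type may contain multiple copies of the same pb.
    for (auto& kv : capacity) {
        kv.second *= static_cast<float>(pb.num_pb);
    }
    return capacity;
}
```

In this reading, the pin clamp says that a parent block cannot route more signals to its children than it has pins, so the children's summed pin capacity is an overestimate whenever it exceeds the parent's pin count.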
*/
struct OneHotPbType {
/// @brief The root pb type which contains the modes which act in a one-hot
/// fasion.
fasion -> fashion
log_verbosity);

// For each model, label them with their shared model ID if they are part
// of one, -1 otherwise.
of one -> of a one_hot_pb_type
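The labelling step in this snippet could look roughly like the sketch below. `OneHotGroup`, `label_one_hot_models`, and `assign_dims` are hypothetical names; the sketch assumes models are identified by dense integer IDs and that a model belongs to at most one one-hot group.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical: the set of model IDs implemented by one one-hot pb type.
struct OneHotGroup {
    std::vector<int> model_ids;
};

// For each model, the index of the one-hot group it belongs to, or -1.
inline std::vector<int> label_one_hot_models(int num_models,
                                             const std::vector<OneHotGroup>& groups) {
    std::vector<int> label(static_cast<std::size_t>(num_models), -1);
    for (std::size_t g = 0; g < groups.size(); g++) {
        for (int model_id : groups[g].model_ids) {
            assert(model_id >= 0 && model_id < num_models);
            // Assumption: a model is part of at most one one-hot group.
            assert(label[model_id] == -1);
            label[model_id] = static_cast<int>(g);
        }
    }
    return label;
}

// Models in the same group share one dim; all other models get their own.
inline std::vector<int> assign_dims(const std::vector<int>& label, int num_groups) {
    std::vector<int> dim(label.size());
    int next_dim = num_groups; // Dims [0, num_groups) are reserved for groups.
    for (std::size_t m = 0; m < label.size(); m++) {
        dim[m] = (label[m] >= 0) ? label[m] : next_dim++;
    }
    return dim;
}
```

Sharing a dim means that two models which can never be implemented at the same time by the same primitive compete for the same capacity, instead of each appearing to have capacity of its own.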
/**
* @brief Add the given model ID to the given primitive vector dim.
*
* It is assumed that the given model is not part of any other dimensions.
dimensions -> dimension
is_one_hot = false;
break;
}
// Check if the mode is unique.
Note to self: This is missing a condition to check. What is being checked is that it is singular and is a primitive. It is not checking uniqueness.
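The distinction raised here can be illustrated with a small sketch. It is only one interpretation: `ModeInfo` is a hypothetical summary of a mode, and "unique" is assumed to mean that no two modes of the same root pb type implement the same model, which may not be the condition the author has in mind.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of one mode of a candidate one-hot pb type.
struct ModeInfo {
    int num_children;        // Number of child pb types in the mode.
    bool child_is_primitive; // Whether the child is a primitive.
    std::string model_name;  // Model implemented by that child.
};

// What the existing check is described as doing: singular and primitive.
inline bool is_singular_primitive(const ModeInfo& mode) {
    assert(mode.num_children >= 0);
    return mode.num_children == 1 && mode.child_is_primitive;
}

// Adds the assumed missing condition: each mode's model is distinct.
inline bool is_one_hot(const std::vector<ModeInfo>& modes) {
    std::set<std::string> seen_models;
    for (const ModeInfo& mode : modes) {
        if (!is_singular_primitive(mode)) {
            return false;
        }
        // insert() reports false in .second if the model was already seen.
        if (!seen_models.insert(mode.model_name).second) {
            return false;
        }
    }
    return !modes.empty();
}
```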