Inf packed bag l #104
base: khanin-upstream
added packing L optimization
packed bag L optimization
added packedMode_L optimization
The issues I see so far:
- The PR fails to pass the tests.
- It seems that not all of the issues with the history have been resolved yet.
- There is some dead code that shouldn't go upstream.
- There is copy-pasted code.
I am still reviewing the main algorithm.
fbgemm_gpu/fbgemm_gpu/split_table_batched_embeddings_ops_inference.py
fbgemm_gpu/codegen/inference/embedding_forward_quantized_split_nbit_kernel_template.cu
    {% if not nobag %}
    VecNT<{{ (32 // emb_weight_type.bit_width) }}, PrimitiveType::{{ emb_weight_type.primitive_type }}> accumulators[OutputRowsPerThread][AccumulateStoreRequests];
    {% endif %}
    if constexpr (PackedMode_L){
Can we avoid copy-pasting here and embed your algorithm into the existing code path instead?
This one might be hard because some of the for loops are fused together (input_row_in_flights + packedBagL), so we might still end up with two if/else conditions, just on later lines of the code.
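For discussion, here is a minimal host-side C++ sketch of one possible middle ground. It is not the FBGEMM kernel: `pool_bag`, `accumulate_row`, `PackedModeL`, and `rows_in_flight` are hypothetical names, and the two loop shapes are only assumptions standing in for the fused and non-fused paths. The idea is that the `if constexpr` branch remains, since the loop structure differs, but the per-row body is factored into one shared helper so it is not copy-pasted.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical shared helper: the per-row work that both modes need.
// Adds one embedding row (row-major, `dim` floats per row) into `acc`.
inline void accumulate_row(const std::vector<float>& weights, std::size_t dim,
                           std::size_t row, std::vector<float>& acc) {
    for (std::size_t d = 0; d < dim; ++d) {
        acc[d] += weights[row * dim + d];
    }
}

// Toy sum-pooling of one bag. Only the loop structure depends on the
// compile-time flag; the loop body is shared through accumulate_row.
template <bool PackedModeL>
std::vector<float> pool_bag(const std::vector<float>& weights, std::size_t dim,
                            const std::vector<std::size_t>& indices,
                            std::size_t rows_in_flight) {
    std::vector<float> acc(dim, 0.0f);
    if constexpr (PackedModeL) {
        // Assumed "fused" shape: a single loop over every index in the bag.
        for (std::size_t i = 0; i < indices.size(); ++i) {
            accumulate_row(weights, dim, indices[i], acc);
        }
    } else {
        // Assumed "tiled" shape: chunks of rows_in_flight, then rows in a chunk.
        // std::max guards against a zero step, which would never terminate.
        const std::size_t step = std::max<std::size_t>(rows_in_flight, 1);
        for (std::size_t base = 0; base < indices.size(); base += step) {
            const std::size_t end = std::min(base + step, indices.size());
            for (std::size_t i = base; i < end; ++i) {
                accumulate_row(weights, dim, indices[i], acc);
            }
        }
    }
    return acc;
}
```

Calling `pool_bag<true>(...)` and `pool_bag<false>(...)` with the same inputs should give the same pooled vector, which also makes the two paths easy to test against each other. Whether the real kernel's loop body can be isolated this cleanly depends on how tightly the loops are fused.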