netkvm: Add a case for packet loss when slowly reusing memory buffers #4247
base: master
Conversation
depends on avocado-framework/avocado-vt#4045 |
d1b7c9e to 0b9815b
@leidwang, could you please review this patch again? Thanks in advance. Test result:
Hi @heywji Please remember to sign the commits, thanks. |
d410739 to b2fea54
The commits are signed now. Please help review it again when you are free. @leidwang |
Hi @heywji, I will review this MR once avocado-framework/avocado-vt#4045 is done, thanks.
b2fea54 to e92e461
Hi @leidwang, I have removed the code related to avocado-framework/avocado-vt#4045. Could you help review it again, since this patch passes once Houqi's patch is applied first? Thanks.
start_vm_vm2 = no | ||
smp = 2 | ||
queues = ${smp} | ||
vectors = 1024 |
Is it a must for this test?
Yes. This comes from our product bug "RHEL-24992 BSOD occurs when the vector=1024 is set", so we need to set vectors = 1024.
Okay, thanks.
qemu/tests/netkvm_buffer_shortage.py
Outdated
:param env: Dictionary of test environment details. | ||
""" | ||
|
||
def analyze_ping_results(session, count, timeout): |
Suggested change:
- def analyze_ping_results(session, count, timeout):
+ def analyze_ping_results(dest, session, count, timeout):
qemu/tests/netkvm_buffer_shortage.py
Outdated
""" | ||
|
||
status, output = utils_net.ping( | ||
s_vm_ip, session=c_session, count=count, timeout=timeout |
Suggested change:
- s_vm_ip, session=c_session, count=count, timeout=timeout
+ dest, session=session, count=count, timeout=timeout
qemu/tests/netkvm_buffer_shortage.py
Outdated
pattern = r"(\d+)% loss" | ||
match = re.search(pattern, output) | ||
if match: |
Suggested change:
- pattern = r"(\d+)% loss"
- match = re.search(pattern, output)
- if match:
+ if re.search(r"(\d+)% loss", output):
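One thing to watch with the one-line form: the quoted code further down still calls `match.group(1)`, so the match object has to be kept somewhere. A minimal sketch of one way to do that is below. The function name `parse_packet_loss`, the integer return type, and the sample ping line are assumptions for illustration, not the code in the patch.

```python
import re

# Same pattern as in the snippet under review.
LOSS_PATTERN = re.compile(r"(\d+)% loss")


def parse_packet_loss(output):
    """Return the packet-loss percentage found in ping output, or None.

    Hypothetical helper: the patch returns the matched string, while this
    sketch converts it to an int so callers can compare it numerically.
    """
    match = LOSS_PATTERN.search(output)
    return int(match.group(1)) if match else None


# Assumed sample of a Windows-style ping summary line.
sample = "Packets: Sent = 10, Received = 9, Lost = 1 (10% loss),"
loss = parse_packet_loss(sample)
```

Returning `None` when no summary line is present lets the caller distinguish "no loss" from "ping output could not be parsed".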
qemu/tests/netkvm_buffer_shortage.py
Outdated
param vm: the selected vm | ||
param netkvmco_name: the netkvm driver parameter to modify | ||
param value: the value to set to |
Please remember to keep the docstring format consistent.
Thanks, @leidwang, but what do you mean by the same format?
Got it, thanks.
qemu/tests/netkvm_buffer_shortage.py
Outdated
if match: | ||
return match.group(1) | ||
|
||
def modify_and_analyze_params_result(vm, netkvmco_name, value): |
Suggested change:
- def modify_and_analyze_params_result(vm, netkvmco_name, value):
+ def modify_and_analyze_params_result(vm, param_name, value):
I think param_name describes its meaning more clearly, because it is a netkvm parameter's name rather than a netkvmco name.
qemu/tests/netkvm_buffer_shortage.py
Outdated
utils_net.set_netkvm_param_value(vm, netkvmco_name, value) | ||
cur_value = utils_net.get_netkvm_param_value(vm, netkvmco_name) | ||
if cur_value != value: | ||
test.fail(f"Current value '{cur_value}' was not equires '{value}'") |
Suggested change:
- test.fail(f"Current value '{cur_value}' was not equires '{value}'")
+ test.fail(f"Failed to set '{param_name}' to '{value}'")
Thank you, Leidong. I agree with you, and this explanation is better than mine.
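For reference, the set-then-read-back check discussed in this thread can be sketched in isolation as follows. This is only an illustration under assumptions: `set_value` and `get_value` are hypothetical callables standing in for `utils_net.set_netkvm_param_value` and `utils_net.get_netkvm_param_value`, a plain `AssertionError` stands in for `test.fail`, and the value `"50"` is made up.

```python
def set_and_verify_param(set_value, get_value, param_name, value):
    """Set a driver parameter, read it back, and fail if it did not stick.

    set_value and get_value are hypothetical stand-ins for the real
    helpers so that the sketch runs without a guest.
    """
    set_value(param_name, value)
    cur_value = get_value(param_name)
    if cur_value != value:
        # Mirrors the suggested message and also reports the value read back.
        raise AssertionError(
            f"Failed to set '{param_name}' to '{value}' (current: '{cur_value}')"
        )


# Usage with a dict acting as the fake driver parameter store.
store = {}
set_and_verify_param(store.__setitem__, store.get, "MinRxBufferPercent", "50")
```

Including the value that was read back in the message keeps the information from the original wording while adopting the clearer phrasing.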
qemu/tests/netkvm_buffer_shortage.py
Outdated
param session: session to execute commands on the target machine. | ||
port: the port number to monitor. | ||
script_to_run: the path to the Python script to execute. |
Same as the docstring formatting issue above.
qemu/tests/netkvm_buffer_shortage.py
Outdated
c_pip_copy_cmd = params.get("c_pip_copy_cmd") | ||
c_pip_cmd = params.get("c_pip_cmd") | ||
c_py_copy_cmd = params.get("c_py_copy_cmd") | ||
s_py_copy_cmd = params.get("s_py_copy_cmd") | ||
status, output = session.cmd_status_output(check_live_python, timeout=1200) | ||
if status == 0: | ||
return | ||
session.cmd(dest_location) | ||
if "server" in script_to_run: | ||
error_context.context( | ||
"Run python3 code runs on the server node", test.log.info | ||
) | ||
s_py_copy_cmd = utils_misc.set_winutils_letter(session, s_py_copy_cmd) | ||
session.cmd(s_py_copy_cmd) |
I think we can move the script and whl file copy part into the run function, or write a separate function for it, since it is a test environment preparation step. I do not think we need to copy the files every time we want to run server.py or client.py.
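A minimal sketch of that separation, assuming a one-time preparation step guarded by a flag, might look like this. The class name `GuestEnv`, the `run_cmd` callable (a stand-in for `session.cmd`), and the command strings are all hypothetical placeholders.

```python
class GuestEnv:
    """Runs the one-time guest preparation before the first script launch."""

    def __init__(self, run_cmd, setup_cmds):
        self._run_cmd = run_cmd            # hypothetical stand-in for session.cmd
        self._setup_cmds = list(setup_cmds)
        self._prepared = False

    def prepare(self):
        """Copy scripts and install dependencies, but only once."""
        if self._prepared:
            return
        for cmd in self._setup_cmds:
            self._run_cmd(cmd)
        self._prepared = True

    def run_script(self, script_cmd):
        """Start a script, preparing the environment first if needed."""
        self.prepare()
        self._run_cmd(script_cmd)


# Usage: record commands in a list instead of sending them to a guest.
executed = []
env = GuestEnv(executed.append, ["copy server.py", "copy client.py"])
env.run_script("py server.py")
env.run_script("py client.py")
```

With this shape, restarting server.py or client.py later only issues the launch command, not the copy commands.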
qemu/tests/netkvm_buffer_shortage.py
Outdated
for value in param_values.split(" "): | ||
modify_and_analyze_params_result(vm=c_vm, netkvmco_name=param_name, value=value) |
Would you please clarify why we need to modify the parameter's value before the end? Thanks.
Hi @leidwang,
This step comes from our product bug: "RHEL-24992 BSOD occurs when the vector=1024 is set". So here I want to simulate the client-side operation of opening the NIC properties and modifying the MinRxBufferPercent parameter.
Okay, I am just curious why we modify the parameter's value but do not perform any testing.
Hmm, you have given me an idea about this operation. I need to add a ping check here to verify whether a BSOD happens.
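The idea of pinging after each parameter change could be sketched roughly as below. Everything here is an assumption for illustration: `set_param` and `ping_loss` are hypothetical callables wrapping the real helpers, `ping_loss` is assumed to return a loss percentage or `None` when the guest does not answer, and the values `"0 50 100"` are invented.

```python
def check_values_with_ping(values, set_param, ping_loss, max_loss=0):
    """Apply each value in turn and ping afterwards.

    Returns a list of (value, loss) pairs for which the guest lost more
    than max_loss percent of packets or did not answer at all (loss is None).
    """
    failures = []
    for value in values.split():
        set_param(value)
        loss = ping_loss()
        if loss is None or loss > max_loss:
            failures.append((value, loss))
    return failures


# Usage with a fake guest that drops every packet when the value is "0".
state = {"value": None}


def fake_set(value):
    state["value"] = value


def fake_ping():
    return 100 if state["value"] == "0" else 0


failed = check_values_with_ping("0 50 100", fake_set, fake_ping)
```

Collecting the failures instead of stopping at the first one would show which values are problematic in a single run; failing fast is an equally reasonable choice.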
qemu/tests/netkvm_buffer_shortage.py
Outdated
""" | ||
conduct a ping test to check the packet loss on slow memory buffer reallocation | ||
|
||
:param session: Local executon hint or session to execute the ping command. |
Suggested change:
- :param session: Local executon hint or session to execute the ping command.
+ :param session: Local execution hint or session to execute the ping command.
qemu/tests/netkvm_buffer_shortage.py
Outdated
param vm: the selected vm | ||
param netkvmco_name: the netkvm driver parameter to modify | ||
param value: the value to set to |
f52c96a to 8d9d3db
@leidwang Hi Leidong, Could you help review this patch again? Test result: PASS
Thanks! |
Hi @heywji Please sign the commit, thanks. |
@leidwang Done. Please help review other problems. Thank you! |
qemu/tests/netkvm_buffer_shortage.py
Outdated
if cur_value != value: | ||
test.fail(f"Failed to set '{param_name}' to '{value}'") | ||
|
||
def check_and_restart_port(session, port, script_to_run): |
port is not used in this function.
Suggested change:
- def check_and_restart_port(session, port, script_to_run):
+ def check_and_restart_port(session, script_to_run):
Modified. Thanks, Leidong, for the review.
c_pip_copy_cmd = 'xcopy "WIN_UTILS:\packet_loss_scripts\${psutil_whl}" ${copy_dest}' | ||
c_pip_cmd = "py -m pip install ${psutil_whl}" | ||
s_py_copy_cmd = 'xcopy "WIN_UTILS:\packet_loss_scripts\${s_py}" ${copy_dest}' | ||
s_py_cmd = "start cmd /c py ${s_py} ${port_num}" | ||
c_py_copy_cmd = 'xcopy "WIN_UTILS:\packet_loss_scripts\${c_py}" ${copy_dest}' | ||
c_py_cmd = "start cmd /c py ${c_py} 99999 %s ${port_num}" |
Suggested change:
- c_pip_copy_cmd = 'xcopy "WIN_UTILS:\packet_loss_scripts\${psutil_whl}" ${copy_dest}'
- c_pip_cmd = "py -m pip install ${psutil_whl}"
- s_py_copy_cmd = 'xcopy "WIN_UTILS:\packet_loss_scripts\${s_py}" ${copy_dest}'
- s_py_cmd = "start cmd /c py ${s_py} ${port_num}"
- c_py_copy_cmd = 'xcopy "WIN_UTILS:\packet_loss_scripts\${c_py}" ${copy_dest}'
- c_py_cmd = "start cmd /c py ${c_py} 99999 %s ${port_num}"
+ copy_file_cmd = 'xcopy "WIN_UTILS:\packet_loss_scripts\%s" ${copy_dest}'
+ install_psutil_cmd = "py -m pip install ${psutil_whl}"
+ server_cmd = "start cmd /c py ${s_py} ${port_num}"
+ client_cmd = "start cmd /c py ${c_py} 99999 %s ${port_num}"
We can try to merge some similar code.
It's a bit hard. Let me try a few more approaches.
Maybe we can copy the files together since they are small.
Thank you, Leidong. I am running the tests now and will let you and Qianqian review the results together.
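On the Python side, a single `copy_file_cmd` template with a `%s` placeholder could be expanded for all files in one pass, roughly as sketched here. The destination `C:\packet_loss` is a made-up value standing in for `${copy_dest}` after config expansion, and the helper name `build_copy_cmds` is hypothetical; the file names are taken from this thread and the config above.

```python
def build_copy_cmds(template, filenames):
    """Fill the single %s placeholder of the template once per file name."""
    return [template % name for name in filenames]


# Assumed shape of copy_file_cmd after ${copy_dest} has been expanded.
copy_file_cmd = r'xcopy "WIN_UTILS:\packet_loss_scripts\%s" C:\packet_loss'

files = ["server.py", "client.py", "psutil-6.1.1-cp37-abi3-win_amd64.whl"]
cmds = build_copy_cmds(copy_file_cmd, files)
```

Each resulting command would still need the `WIN_UTILS` drive letter resolved (the patch uses `utils_misc.set_winutils_letter` for that) before being sent to the guest.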
c08ee32 to 5ae0e1e
Under various conditions, when the host-to-device packet rate is high, we lose packets in QEMU due to a lack of guest-allocated buffers. Look also at virtio-win/kvm-guest-drivers-windows#1012 Signed-off-by: wji <[email protected]>
@leidwang @vivianQizhu Could you help me review this patch? Test Result: PASS
|
x86_64: | ||
psutil_whl = "psutil-6.1.1-cp37-abi3-win_amd64.whl" | ||
pip_cmd = "py -m pip install ${psutil_whl}" | ||
dest_location = "pushd ${copy_dest}" |
Can we use the absolute path directly so that we don't have to switch directories again and again?
I tried, but the Python script under winutils reports some errors. This script was provided to us by the upstream reporter, and it will take extra time to modify it.
Okay, if so, that's fine with me.
Under various conditions, when the host-to-device packet rate is high, we lose packets in QEMU due to a lack of guest-allocated buffers. Look also at virtio-win/kvm-guest-drivers-windows#1012
ID: 2405
Signed-off-by: wji [email protected]