Description
I've used trio in some projects lately, and also recently added it to dnspython. As part of some random performance testing, I compared "minimal DNS server" performance between ordinary Python sockets, curio, and trio. The trio program is attached at the bottom of this report. Trio performed significantly worse. E.g. on a particular Linux VM:
Regular python I/O: 9498 QPS
Curio: 8979 QPS
Trio: 5007 QPS
I've seen similar behavior on a Mac:
Regular python I/O: 8359 QPS
Curio: 8061 QPS
Trio: 3425 QPS
I'm using dnsperf to generate load, and it does a good job of keeping UDP input queues full, so the ideal expected behavior if you strace the program is to see a ton of recvfrom() and sendto() system calls, and nothing else. In particular, you don't expect to see any epoll_wait(). Ordinary python I/O and curio behave as expected, but going through the loop in trio looks like:
recvfrom()
clock_gettime()
epoll_wait() (not waiting on any fds if I'm reading strace output correctly)
clock_gettime()
clock_gettime()
epoll_wait()
clock_gettime()
sendto()
clock_gettime()
epoll_wait()
clock_gettime()
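Tallying the sequence above gives 11 system calls per query: three epoll_wait() and six clock_gettime() on top of the one recvfrom() and one sendto(). Below is a minimal sketch of how such a tally could be automated from a saved strace log, assuming each line begins with the syscall name (optionally preceded by a PID column). The helper name `tally_syscalls` and the sample lines are illustrative and not taken from the actual trace; `strace -c` produces a similar per-syscall summary directly.

```python
import re
from collections import Counter

# Syscall name at the start of an strace line, optionally preceded by a
# PID column such as "[pid 1234] " or "1234 ".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")


def tally_syscalls(lines):
    """Count how often each syscall name appears in strace-style lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match:  # skip signal/exit lines and anything unparseable
            counts[match.group(1)] += 1
    return counts


# Illustrative input mirroring one pass through the loop listed above
# (arguments elided; these are not real strace lines).
sample = [
    "recvfrom(...)",
    "clock_gettime(...)",
    "epoll_wait(...)",
    "clock_gettime(...)",
    "clock_gettime(...)",
    "epoll_wait(...)",
    "clock_gettime(...)",
    "sendto(...)",
    "clock_gettime(...)",
    "epoll_wait(...)",
    "clock_gettime(...)",
]

counts = tally_syscalls(sample)
queries = counts["recvfrom"]  # one recvfrom() per query handled
for name, n in counts.most_common():
    print(f"{name}: {n / queries:.1f} per query")
```

For a real log, something like `tally_syscalls(open("trace.txt"))` would work, where `trace.txt` is a hypothetical file written with `strace -o`.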
I don't understand trio's internals enough to debug this further at the moment, but I thought I would make a report, as this seems like excessive epolling.
I haven't run dtruss on the Mac, but Python tracing indicated a lot of time spent in kqueue-related code.
This was with trio 0.15.1, and CPython 3.8.3 and 3.7.7.
import socket

import trio
import trio.socket

import dns.message
import dns.rcode


async def serve():
    with trio.socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        await s.bind(('127.0.0.1', 5354))
        while True:
            (wire, where) = await s.recvfrom(65535)
            q = dns.message.from_wire(wire)
            r = dns.message.make_response(q)
            r.set_rcode(dns.rcode.REFUSED)
            await s.sendto(r.to_wire(), where)


def main():
    trio.run(serve)


if __name__ == '__main__':
    main()
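The "regular Python I/O" baseline is not included in this report. For comparison, here is a rough sketch of what an equivalent blocking-socket loop could look like, assuming it uses the same dnspython calls as the trio program. The names `refuse`, `serve`, `handler`, and `max_queries` are introduced here for illustration, and this is not necessarily the program that produced the numbers above.

```python
import socket


def refuse(wire):
    """Build a REFUSED response for a query, as the trio program does."""
    # Imported lazily so the serving loop itself needs only the stdlib;
    # this function requires dnspython to be installed.
    import dns.message
    import dns.rcode

    q = dns.message.from_wire(wire)
    r = dns.message.make_response(q)
    r.set_rcode(dns.rcode.REFUSED)
    return r.to_wire()


def serve(sock, handler=refuse, max_queries=None):
    """Answer datagrams on a bound UDP socket using blocking calls.

    Each pass issues exactly one recvfrom() and one sendto().
    max_queries is a hypothetical knob for stopping after N queries;
    None means loop forever.
    """
    handled = 0
    while max_queries is None or handled < max_queries:
        wire, where = sock.recvfrom(65535)
        sock.sendto(handler(wire), where)
        handled += 1


def main():
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(('127.0.0.1', 5354))
        serve(s)
```

Calling `main()` starts the server on the same address and port as the trio version, so the same dnsperf invocation can be pointed at either one.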