
time function call overhead, at the cost of a slightly increased program text size [40]. Finally, using workload-related insights, a network programmer may manually annotate the branches of conditional program code that are more likely to execute at runtime, thereby improving the CPU's branch prediction success rate. Using such programmer-provided annotations, the compiler can place the likely execution branch right after the conditional expression and let the CPU automatically fetch the corresponding code into the CPU pipeline; in contrast, on a mispredicted branch the CPU pipeline must be flushed and repopulated with the correct branch code, possibly leading to a significant performance penalty.

2) Hardware-supported Functions in Software: An important class of acceleration techniques is constituted by hardware-assisted functions, whereby the hardware exposes certain functionalities that can be used to speed up the execution of software. We distinguish two categories in this context, depending on whether the assistance is offered by the NIC or by the CPU; both are reported in the rightmost part of Table II.

NIC-assisted acceleration techniques range from virtualization support and direct DMA to NIC-driven parallel packet dispatching. Modern NICs contain a fast packet parser to compute flow-level hashes in hardware, maintain multiple hardware packet queues (a mechanism called Receive Side Scaling, RSS) and expose packet counters in registers. Access to this hardware functionality is typically implemented in the low-level NIC drivers. Use of register-backed packet counters reduces memory overhead, while RSS is instrumental to multi-threading
process, as the packet RSS hash may be used by the NIC to dispatch packets to different CPU cores, in order to leverage flow-level parallelism and avoid packet reordering. RSS hashes ensure that packets of a single transport-level connection (and, depending on the RSS seed, of both directions of the connection) will always be scheduled to the same CPU, which also enforces locality of data structure usage. Furthermore, newer NICs can take advantage of Data Direct I/O (DDIO), which allows packets to be transferred directly into the last-level CPU cache instead of into main memory, preventing a costly cache miss when the CPU starts processing the packet. In addition, single root input/output virtualization (SR-IOV) lets the NIC steer received packets to the correct VNF without the explicit involvement of a hypervisor switch.

CPU-assisted acceleration techniques, on the other hand, leverage the features of modern CPUs to speed up network applications. Most modern CPUs support low-level data parallelism through an advanced Single Instruction Multiple Data (SIMD) instruction set. SIMD operations execute the same instruction on multiple data instances at the same time, which greatly benefits vector-based workloads like batch packet processing. In addition, the latest CPU chipsets include built-in CPU virtualization
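
To make the RSS-based dispatching discussed above more concrete, the following is a minimal software sketch of how a flow-level hash can map both directions of a connection to the same CPU core. It assumes a Toeplitz hash over the IPv4/TCP 4-tuple, which is the hash commonly used for RSS, and a key made of the repeated 16-bit pattern 0x6d5a, which makes the hash insensitive to swapping source and destination. The names `toeplitz_hash`, `flow_hash`, and `pick_core` are hypothetical, and the modulo-based core selection is a simplification of the indirection table a real NIC would use.

```python
import socket
import struct

# Assumed 40-byte RSS key: the 16-bit pattern 0x6d5a repeated. Because the key
# is periodic with a period of 16 bits, swapping the two 32-bit addresses or
# the two 16-bit ports leaves the hash unchanged (illustrative choice).
SYMMETRIC_KEY = bytes([0x6D, 0x5A]) * 20


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Toeplitz hash: XOR together the 32-bit key windows that start at the
    position of every set bit of the input (most significant bit first)."""
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                # Right shift that exposes the key window starting at this bit.
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_hash(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Hash the IPv4/TCP 4-tuple laid out as src IP, dst IP, src port, dst port."""
    data = struct.pack(
        "!4s4sHH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        src_port,
        dst_port,
    )
    return toeplitz_hash(SYMMETRIC_KEY, data)


def pick_core(hash_value: int, num_cores: int) -> int:
    """Simplified dispatch: a real NIC indexes an indirection table instead."""
    return hash_value % num_cores


if __name__ == "__main__":
    forward = flow_hash("10.0.0.1", "10.0.0.2", 12345, 80)
    reverse = flow_hash("10.0.0.2", "10.0.0.1", 80, 12345)
    print(pick_core(forward, 4), pick_core(reverse, 4))
```

With this key, the forward and reverse directions of a connection hash to the same value and therefore land on the same core, so per-connection state can stay local to that core. With a non-periodic key, the two directions would generally hash differently.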