<4.2, 4.3> Now assume that we can use scatter-gather loads and stores (LVI and SVI). Assume that tiPL, tiPR, clL, clR, and clP are arranged consecutively in memory. For example, if seq_length==500, the tiPR array would begin 500 * 4 bytes after the tiPL array. How does this affect the way you can write the VMIPS code for this kernel? Assume that you can initialize vector registers with integers using the following technique, which would, for example, initialize vector register V1 with values (0,0,2000,2000):
    LI R2,0
    SW R2,vec
    SW R2,vec+4
    LI R2,2000
    SW R2,vec+8
    SW R2,vec+12
    LV V1,vec
Assume the maximum vector length is 64. Is there any way performance can be improved using gather-scatter loads? If so, by how much?
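To make the addressing concrete, here is a minimal Python sketch of how an indexed load could pick elements out of the consecutively stored arrays. It is a toy model rather than VMIPS code or a solution to the exercise. The helper `lvi`, the flat word-addressed `memory` list, and the choice to let word k hold the value k are all assumptions made for illustration. Only the 500 * 4 byte offset and the index values (0,0,2000,2000) come from the problem statement.

```python
SEQ_LENGTH = 500   # from the example in the problem statement
WORD_BYTES = 4     # each element is assumed to be one 4-byte word

# Byte offset of tiPR relative to tiPL, as stated in the problem.
tipr_offset = SEQ_LENGTH * WORD_BYTES

def lvi(memory, base, index_vector):
    """Hypothetical model of LVI: element i comes from byte address base + index_vector[i]."""
    return [memory[(base + offset) // WORD_BYTES] for offset in index_vector]

# Toy memory covering tiPL followed by tiPR; word k simply holds the value k.
memory = list(range(2 * SEQ_LENGTH))

# The index vector built by the LI/SW/LV sequence in the problem statement.
index_vector = [0, 0, 2000, 2000]

# With base = address of tiPL (0 here), the gather reads tiPL[0] twice, then tiPR[0] twice.
gathered = lvi(memory, 0, index_vector)
```

In this model a single indexed load touches both tiPL and tiPR, because the byte offsets in the index vector span the boundary between the two arrays.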