5 Replies Latest reply on Sep 17, 2015 10:48 AM by Carlos_A

    Iris Pro 5200 GPU and SDRAM memory controller TDP

    DarkTiger Green Belt


      We have a problem with our application running on an i7-4850HQ processor (47 W TDP). We cut the DC power-supply wire and inserted a current meter into the current path. The whole device draws 35 W without our application running and 72 W with it. Even assuming the DC/DC converter chain is 85% efficient, the application itself accounts for (72 − 35) × 0.85 ≈ 31.5 W delivered to the board. That is too much.

      What our application does: it debayers real-time 4K video data from SDRAM and then sends the result to the eDP port. Another application (gstreamer) puts the data into SDRAM from PCIe for processing; gstreamer consumes only 2 W. The GPU does all the data processing; the CPU is used only as a "data pump".

      So, the question is: which block could be consuming so much power?

      The first candidate is the DDR3L SDRAM, including the SDRAM controller and its buffers. The memory arrays are huge (~16 MB per video frame), and if the data in SDRAM are not properly aligned for reading, we fetch a whole cache line for every 2 bytes we actually need — a 16x or greater overhead. Yes, I know we should add #pragma block_loop to the code to exploit the spatial-locality property of the cache, reading 16 data points per memory access, but we don't have the source code at the moment and can only speculate. I also understand that the proper way would be to read the data from PCIe directly into the eDRAM buffer, bypassing SDRAM entirely, but that is not possible right now.

      The second candidate is the L3 cache. If the data were laid out so badly during the write that we cannot exploit temporal locality during the read, the cache churns through an enormous amount of data without any useful result — and the cache is one of the biggest power consumers in the processor.

      The third candidate is the GPU. I can't draw any conclusions because I have no data on GPU power consumption. I have heard an estimate of "10–15% of TDP", but I'm afraid that was speculation, too.

      Can anyone suggest a document number that describes the power consumption of the Haswell-M building blocks, so I can draw some conclusions about the right direction for power optimization?

      With best regards,