

[News] Know About GPU

2015-07-06 12:47:26
The system-on-a-chip (SoC) is a small chip mounted on the mainboard of a smartphone, and since the GPU actually lives inside this chip, physically finding the GPU while looking at the insides of a phone is near impossible. That said, if you manage to locate the SoC you are pretty much there: deconstruct the chip and you would find the GPU in there somewhere.
The GPU is the "2D/3D Graphics Processor" part of the Tegra 2 SoC above
This is completely different to a desktop or laptop computer, which usually uses a dual-chip solution. For example, in the desktop computer I’m using to write this article, the CPU is attached to the motherboard while the graphics processor (GPU) sits on a separate board that then plugs into the motherboard. The two critical components of my desktop are physically quite far apart.
There is of course a reason why the two chips in a smartphone are so close. First, smartphones and tablets don’t have much internal space to work with, so packaging critical components together allows the device’s mainboard to be small and the battery to be large. Second, packaging the two units as one localizes the heat output and saves power through tight integration. Finally, producing one chip instead of two reduces manufacturing costs.
What does the GPU do?
How the GPU is used depends on two main factors: the structure of the system-on-a-chip and the operating system running on the device. For the former, if the SoC doesn’t happen to have a dedicated media decoding chip, the GPU might be used to handle high-resolution video. Compatible tasks can also be offloaded to the GPU so the more power-intensive CPU cores can clock themselves down.
When it comes to the operating system, things are a lot more complex. First and foremost, the GPU handles all 3D rendering in games and applications. The Cortex processing cores are simply not designed for these sorts of tasks, and in every operating system the GPU takes over from the CPU to handle rendering more efficiently. The CPU still helps out with certain calculations while rendering 3D models on screen (especially for games), but the heavy lifting is done by the graphics chip.
Most graphics cores also support 2D rendering in certain areas: interface animations and image zooming are two good examples. The CPU can usually handle these tasks as well, so whether the GPU is used comes down to the operating system on the device.
Playing Asphalt 6: Adrenaline on this Galaxy Note would be very difficult without a GPU
Windows Phone is very animation-heavy, and with the relatively low-power SoCs used in WP devices it would be impossible to get smooth motion from the CPU alone. As such, the GPU plays a big part in rendering the main interface and other animation-heavy UIs, leaving the user with a very smooth experience.
Android is a whole other story. The original and low-end devices running Android did not have powerful GPUs at all, so it was impractical to offload all 2D rendering tasks to the GPU. Google decided, for compatibility reasons, that it was better to have all rendering done by the CPU (which in early devices wasn’t very good either), and so the signature Android lag was born.
This was finally corrected in Android 4.0: modern SoCs have very capable GPUs, and with old devices almost certainly not getting the update, it was time for Google to let capable devices render interface elements on the GPU. It is still possible to get a smooth interface from CPU rendering alone (as Android 2.3 devices like the Galaxy S II and Motorola Droid Razr show), but the GPU is more efficient, so you’ll likely see it handling these tasks from here on out.
As you might have guessed, iOS on the iPhone and iPod Touch is very smooth because it renders most interface elements using the GPU. Apple only has to support a very small selection of hardware, so it can tightly integrate the OS with the chips actually used, and there were minimal problems getting GPU acceleration to work.
Qualcomm Adreno GPUs
The Adreno graphics processing unit is the proprietary graphics chipset used in Qualcomm SoCs. Adreno GPUs used to be called Imageon and were developed by ATI until Qualcomm bought the division from AMD and renamed the products Adreno. The old Adreno 1xx series was used in old Qualcomm 7xxx SoCs, while the newer Adreno 2xx series is used in the Snapdragon line.
In the current range of Snapdragon SoCs you see three Adreno 2xx series GPUs used: the Adreno 200 (for S1), 205 (for S2) and 220 (for S3). You might have guessed that a larger number and inclusion in a newer series indicates a more powerful GPU, and you would be correct: Qualcomm states that each successive GPU is twice as fast as the last, meaning the Adreno 220 is around 4x faster than the 200.
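Taking Qualcomm’s claim at face value, the speedup compounds by a factor of two each generation; a trivial sketch of that arithmetic (the generation ordering is the only input):

```python
# Qualcomm's claim: each successive Adreno generation is ~2x the last,
# so the speedup over the Adreno 200 compounds as 2**n.
ADRENO_GENERATIONS = ["Adreno 200", "Adreno 205", "Adreno 220"]
for n, name in enumerate(ADRENO_GENERATIONS):
    print(f"{name}: ~{2 ** n}x the Adreno 200")  # 1x, 2x, 4x
```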
Adreno GPUs are used exclusively in Qualcomm Snapdragon SoCs
Adreno GPUs up to the S3 Snapdragons support both OpenGL ES 2.0 and 1.1 along with Direct3D 9.3; Adrenos from the 205 onwards also support hardware-accelerated SVG and Adobe Flash. These are really all the APIs needed to ensure modern mobile games work on a smartphone with an Adreno GPU, as no modern games yet use the newer OpenGL ES 3.0 API or Direct3D 11.
In usual Qualcomm fashion there is virtually no information relating to core layout of the Adreno-series GPUs, fillrate statistics or estimated GFLOPS capabilities of these chipsets. This makes it very hard to compare the chips without resorting to a benchmark.
Heading into the future, Qualcomm has actually decided to release information on upcoming chips such as the Adreno 225, which will first appear in its S4 SoCs built on the new Krait core architecture. Unlike the future Adreno 3xx series it does not improve API support, but it does improve performance: Qualcomm claims it will be 50% faster than the Adreno 220 and roughly on par with the PowerVR SGX543MP2 (found in the Apple A5), capable of 19.2 GFLOPS at 300 MHz.
Imagination Tech PowerVR GPUs
The second major producer of smartphone graphics chipsets is Imagination Technologies, which makes the PowerVR line of mobile GPUs. There have been many series of PowerVR GPUs, though current devices use products from either the PowerVR SGX 5 or 5XT series.
PowerVR GPUs are licensed to other SoC manufacturers and so find their way into a variety of devices. TI OMAP chipsets exclusively use PowerVR GPUs, and you’ll also find them inside some older Samsung Exynos chipsets as well as the Apple A4 and A5. They are also sometimes used alongside Intel x86 processors in low-end notebook computers.
The PowerVR SGX 5 series comprises several GPUs, only a few of which are regularly used. The PowerVR SGX530 is used in the TI OMAP 3 series and so finds its way into a huge array of single-core devices, from the original Motorola Droid to the Nokia N9. Clocked at 200 MHz, the SGX530 is capable of 1.6 GFLOPS. The SGX535 (used in the iPhone 3GS and iPhone 4) is a die shrink of the SGX530 and adds DirectX 9.0c support, which the 530 lacks, but retains the same performance.
This is a look into the architecture of the PowerVR SGX 5XT series
The most popular of the 5 series is the PowerVR SGX540, used both in the original Samsung Exynos chipset (the Hummingbird) for the Galaxy S and in the TI OMAP 4 series. It has support for DirectX 10 and is capable of 3.2 GFLOPS at 200 MHz, twice that of the SGX530. Unlike the SGX530, the SGX540 can be clocked up to 400 MHz, so theoretically the GPU can achieve 6.4 GFLOPS.
Some people may look at implementations of the SGX540 and wonder why it appears in the relatively old single-core Hummingbird SoC in the original Galaxy S but also in the dual-core TI OMAP 4460 used in the Galaxy Nexus. It turns out the clock speeds differ between SoCs: the Hummingbird runs the 540 at 200 MHz (delivering 3.2 GFLOPS), the TI OMAP 4430 in the Droid Razr at 304 MHz (~4.8 GFLOPS) and the TI OMAP 4460 at 384 MHz (~6.1 GFLOPS).
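Since peak GFLOPS scales linearly with clock speed, those figures follow from a one-line formula. A minimal sketch (the function name and the 200 MHz baseline convention are my own framing, not vendor specifications):

```python
def gflops_at(base_gflops_200mhz: float, clock_mhz: float) -> float:
    """Peak throughput scales linearly with clock speed."""
    return base_gflops_200mhz * clock_mhz / 200.0

# The same SGX540 core (3.2 GFLOPS at 200 MHz) at each SoC's clock speed:
for soc, mhz in [("Hummingbird", 200), ("OMAP 4430", 304), ("OMAP 4460", 384)]:
    print(f"{soc}: {gflops_at(3.2, mhz):.2f} GFLOPS")
# Hummingbird: 3.20, OMAP 4430: 4.86, OMAP 4460: 6.14
```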
The newer 5XT series hasn’t really found its way into many devices yet, the only notable inclusions being the Apple A5 chip (used in the iPad 2 and iPhone 4S) and the PlayStation Vita. Where the 5 series only has a single GPU core, the 5XT series supports up to 16 cores, each of which is twice as fast as the SGX540. GPUs in the 5XT series are suffixed with MPx, where x denotes the number of cores: for example, the SGX543MP2 used in the Apple A5 has two cores.
Currently the SGX543 is the only 5XT chip that has found its way into SoCs, with the similar SGX544 scheduled to go into the TI OMAP 5 series. The SGX543 delivers 6.4 GFLOPS per core at 200 MHz, meaning that even at a low 200 MHz the SGX543MP2 in the Apple A5 achieves 12.8 GFLOPS, a considerable improvement over the highest-clocked SGX540. As Apple hasn’t specified what clock speed the GPU in the A5 runs at, my best estimate based on benchmarks is around 250-300 MHz, which means we’re looking at between a whopping 16 and 19 GFLOPS.
I wouldn’t think that too many manufacturers would exceed two cores in the SGX543 as each core added uses more power, but Sony decided that a quad-core SGX543MP4+ is the way to go in the PlayStation Vita. Even if this is clocked at just 200 MHz, the PSVita’s GPU is capable of 25.6 GFLOPS; up that to 300 MHz and you get 38.4 GFLOPS. Like Apple, Sony actually hasn’t specified a GPU clock speed so we can only guess as to how much power the Vita’s GPU actually has.
For interest’s sake, a PowerVR SGX543MP16 (the 16-core variant) clocked at the maximum 400 MHz would be capable of 204.8 GFLOPS. That’s enormous and would certainly use a lot of power, but as far as I can tell no such GPU has found its way, or ever will, into a production device.
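To make the arithmetic of the last few paragraphs explicit: a 5XT part’s peak rate is just per-core throughput times core count, scaled linearly with clock. A small sketch under that assumption (the clock speeds marked as estimates are this article’s guesses, not published figures):

```python
SGX543_PER_CORE = 6.4  # GFLOPS per core at 200 MHz, as quoted above

def sgx543_gflops(cores: int, clock_mhz: float) -> float:
    """Per-core rate x core count, scaled linearly with clock speed."""
    return SGX543_PER_CORE * cores * clock_mhz / 200.0

for cores, mhz, label in [
    (2, 200, "Apple A5, 200 MHz floor"),    # -> 12.8
    (2, 300, "Apple A5, upper estimate"),   # -> 19.2
    (4, 200, "PS Vita, 200 MHz floor"),     # -> 25.6
    (4, 300, "PS Vita, 300 MHz estimate"),  # -> 38.4
    (16, 400, "hypothetical SGX543MP16"),   # -> 204.8
]:
    print(f"{label}: {sgx543_gflops(cores, mhz):.1f} GFLOPS")
```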
ARM Mali GPUs
The section on Mali GPUs is going to be relatively short because the Mali GPU is currently only used in one SoC: the Samsung Exynos 4210 found in the Samsung Galaxy S II, Galaxy Note and Galaxy Tab 7.7. The Mali range is ARM’s own, so it should be an ideal partner for the Cortex processing cores used in the Exynos chipset.
Even though on paper there are several Mali GPUs, the only one that has really been used is the quad-core Mali-400 MP4 in the Exynos 4210. When ARM calls the Mali-400 MP4 “quad-core”, it does not mean four full processing cores like the PowerVR SGX543MP4; it is simply four pixel shader processors alongside a single vertex processor. This is why the Mali-400 MP4 does not have the same graphical capabilities as PowerVR’s true quad-core GPU.
This is what's inside a Mali-400 MP4
To quantify the performance: the Mali-400 MP4 is capable of 7.2 GFLOPS at 200 MHz, making it faster than a single-core PowerVR SGX543. The targeted clock speed in the Exynos 4210 is 275 MHz, at which the GPU delivers 9.9 GFLOPS, making it the fastest GPU available in an Android smartphone at the time of writing.
Roughly speaking, the Mali-400 MP4 in the Galaxy S II is twice as fast as the SGX540 in the Droid Razr and ~60% faster than the same GPU in the Galaxy Nexus. In turn, the iPhone 4S’s SGX543MP2 is around 1.6-2x as fast as the Mali-400 (depending on its actual clock speed), and the PlayStation Vita is faster still.
Samsung will continue to use the Mali GPUs in their future Exynos 5xxx SoCs, although they will be more powerful units than the Mali-400 MP4. Currently Samsung claims the next Exynos chip’s GPU will be “4x faster” than the implementation in the 4210, but I’d take that with a grain of salt until we find out exactly what is in there.
NVIDIA ULP GeForce GPUs
I mentioned briefly in the processor section of this series that for NVIDIA, the desktop graphics card giant, the GPUs in its smartphone SoCs aren’t particularly impressive. In fact, the ULP GeForce in NVIDIA’s Tegra SoCs is the slowest GPU of the first generation of dual-core processors, and I’ll explain why.
The ULP GeForce is used in two main Tegra 2 chipsets: Tegra 250 AP20H and Tegra 250 T20; the former for smartphones and the latter for tablets. The ULP GeForce used here is clocked at 300 MHz (AP20H) or 333 MHz (T20), and is only capable of 3.2 GFLOPS at 200 MHz. This means that the AP20H at 300 MHz sees 4.8 GFLOPS and the T20 at 333 MHz sees 5.33 GFLOPS.
Now at first glance you would notice that the smartphone Tegra 2’s GFLOPS figure is the same as a PowerVR SGX540 clocked at 300 MHz, and that is true. However, the highest SGX540 clock speed seen in an actual device is 384 MHz in the Galaxy Nexus, which delivers 6.1 GFLOPS. That is faster than even the tablet iteration of Tegra 2 at 333 MHz, making the Tegra 2 the least capable GPU of this generation.
A Tegra 3 die image, with the GPU hidden inside somewhere
Of course we’re just talking peak figures here, and many other factors affect the real-world performance of a GPU, such as CPU clock speed and display resolution, but if you’re talking about the most capable GPU the Tegra 2 is definitely not number one.
As we move into the second generation of multi-core processors, NVIDIA was first to market with its quad-core Tegra 3, as I mentioned in my processor article. You would expect the Tegra 3’s ULP GeForce to see a boost, and while it has, the boost might not be as large as you would like.
The Kal-El GeForce is capable of 4.8 GFLOPS at 200 MHz, which you can immediately see is less than the 200 MHz performance of the Mali-400 MP4 and PowerVR SGX543MP2. NVIDIA hasn’t specified what clock speed the GPU runs at in Tegra 3 devices such as the ASUS Transformer Prime, save that it is higher than in Tegra 2. If we estimate 400 MHz, it is still only capable of 9.6 GFLOPS, close to but just short of the Mali-400 MP4’s 9.9.
Comparison of smartphone GPUs
So now that you know about the different ranges of mobile GPUs available it’s time to see which is the fastest. To do so I’ve made this handy chart that lists them from the most powerful to least in terms of GFLOPS.
Please note that this simply indicates the potential performance of each GPU and does not reflect real-world performance. GPUs are placed in a wide range of systems where external factors such as increased processor clock speeds, RAM types and speeds, display resolutions and more can affect the actual graphical performance of a smartphone.
| GPU | SoC Example | Device Example | GFLOPS at 200 MHz | GFLOPS in SoC |
| --- | --- | --- | --- | --- |
| PowerVR SGX543MP4+ | PSVita | PlayStation Vita | 25.6 | 25.6+ |
| PowerVR SGX543MP2 | Apple A5 | Apple iPhone 4S | 12.8 | 16 at 250 MHz* |
| Mali-400 MP4 | Exynos 4210 | Samsung Galaxy S II | 7.2 | 9.9 at 275 MHz |
| "Kal-El" GeForce | Tegra 3 | ASUS Transformer Prime | 4.8 | 9.6 at 400 MHz* |
| PowerVR SGX540 | OMAP4460 | Galaxy Nexus | 3.2 | 6.1 at 384 MHz |
| Adreno 220 | MSM8260 | HTC Sensation | N/A | N/A |
| ULP GeForce | Tegra 2 | Motorola Xoom | 3.2 | 5.3 at 333 MHz |
| PowerVR SGX540 | OMAP4430 | Motorola Droid Razr | 3.2 | 4.8 at 304 MHz |
| ULP GeForce | Tegra 2 | LG Optimus 2X | 3.2 | 4.8 at 300 MHz |
| PowerVR SGX540 | Hummingbird | Samsung Galaxy S | 3.2 | 3.2 at 200 MHz |
| Adreno 205 | MSM8255 | HTC Titan | N/A | N/A |
| PowerVR SGX535 | Apple A4 | iPhone 4 | 1.6 | 1.6 at 200 MHz* |
| PowerVR SGX530 | OMAP3630 | Motorola Droid X | 1.6 | 1.6 at 200 MHz |
| Adreno 200 | QSD8250 | HTC HD7 | N/A | N/A |
*these GFLOPS figures are based on estimated (rather than known) SoC clock speeds
Note: Qualcomm Adreno GPUs are included as placeholders in this chart, but as their positions are determined by benchmarks rather than GFLOPS performance there is no way to fully know where they rank
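As a cross-check, the "GFLOPS in SoC" column follows directly from the 200 MHz baselines and the listed clocks. A quick sketch (the Adreno parts are omitted since no GFLOPS figures exist for them, and the Vita is omitted since its clock is unknown; values match the table to within rounding):

```python
# (GPU, device, GFLOPS at 200 MHz, clock in MHz); * clocks are estimates.
ENTRIES = [
    ("PowerVR SGX543MP2", "Apple iPhone 4S",        12.8, 250),  # *
    ("Mali-400 MP4",      "Samsung Galaxy S II",     7.2, 275),
    ('"Kal-El" GeForce',  "ASUS Transformer Prime",  4.8, 400),  # *
    ("PowerVR SGX540",    "Galaxy Nexus",            3.2, 384),
    ("ULP GeForce",       "Motorola Xoom",           3.2, 333),
    ("PowerVR SGX540",    "Motorola Droid Razr",     3.2, 304),
    ("ULP GeForce",       "LG Optimus 2X",           3.2, 300),
    ("PowerVR SGX540",    "Samsung Galaxy S",        3.2, 200),
    ("PowerVR SGX535",    "iPhone 4",                1.6, 200),  # *
    ("PowerVR SGX530",    "Motorola Droid X",        1.6, 200),
]

for gpu, device, base, mhz in ENTRIES:
    print(f"{gpu} ({device}): {base * mhz / 200:.1f} GFLOPS at {mhz} MHz")
```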
