Sunday, August 24, 2014

ARM Mali-400 more successful than ever, dominating the cost-sensitive GPU segment

Long history of adoption, increasing success


The ARM Mali-400 MP GPU core was introduced many years ago as the world's first OpenGL ES 2.0 conformant multi-core GPU, and was the first ARM-developed GPU core to see widespread adoption, particularly in configurations of multiple Mali-400 MP cores. The longevity of the Mali-400 MP continues to be remarkable. Scalability through the number of cores has allowed it to be targeted at different segments, ranging from low-end to mid-range, and it continues to be used for much of that range. As of 2014, many more Mali-400 MP cores are shipping than at any point in its history.

Although the Mali-400 MP was introduced well before the emergence of current mainstream process technology such as 28nm, its adoption at the 28nm process node has been very strong. Small die size and low power consumption have allowed the use of multi-core configurations clocked much higher than early implementations of the Mali-400, and the ability to increase the size of the L2 cache inside the GPU (although limited to 256KB) has provided further performance flexibility. The GPU has also benefited from increases in memory bandwidth in modern devices.

Features of Mali-400 MP


The Mali-400 MP GPU core uses ARM's first-generation Utgard GPU architecture. Like other mobile GPU architectures, it uses a tile-based rendering architecture, which reduces memory bandwidth requirements and power consumption. It allows good-quality full-scene anti-aliasing (FSAA) without a significant effect on performance. The Mali-400 cuts some corners in shader floating-point precision compared to competing solutions, supporting only the minimum precision required by the OpenGL ES 2.0 standard, although this is unlikely to be highly visible on the relatively small displays used in most mobile devices.

Pixel fill-rate has been the Mali-400 MP's strongest point, while its triangle throughput has historically been lower than that of competing GPUs from Imagination's PowerVR series. The pixel fill-rate scales with the number of cores used, while maximum triangle throughput depends only on the clock frequency of the GPU. The increase in display resolutions in mobile devices, which raises pixel fill-rate requirements, can therefore be addressed by increasing the number of cores. Typical clock frequencies used for the Mali-400 MP include 250MHz for 40nm and 500MHz for 28nm HPM.
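As a rough illustration of this scaling, the following Python sketch estimates peak pixel fill-rate and triangle throughput. It assumes that each core outputs one pixel per clock. The function names, the triangles-per-clock parameter and the overdraw factor are hypothetical and chosen for illustration only, not figures taken from ARM.

```python
def pixel_fill_rate(cores: int, clock_mhz: float, pixels_per_clock: float = 1.0) -> float:
    """Peak fill-rate in megapixels per second.

    Assumes fill-rate scales linearly with core count and clock frequency;
    pixels_per_clock=1.0 per core is an assumption, not a measured value.
    """
    return cores * clock_mhz * pixels_per_clock


def triangle_rate(clock_mhz: float, triangles_per_clock: float) -> float:
    """Peak triangle throughput in millions of triangles per second.

    Independent of the number of pixel cores; triangles_per_clock is a
    hypothetical parameter the caller must supply.
    """
    return clock_mhz * triangles_per_clock


def required_fill_rate(width: int, height: int, fps: float, overdraw: float = 2.5) -> float:
    """Fill-rate in megapixels per second needed to draw a display.

    overdraw is an assumed average number of times each pixel is drawn per frame.
    """
    return width * height * fps * overdraw / 1e6


if __name__ == "__main__":
    # Two cores at 500MHz (28nm HPM) match four cores at 250MHz (40nm) in peak fill-rate.
    print(pixel_fill_rate(2, 500.0), pixel_fill_rate(4, 250.0))
    # Demand of a 1280x720 display at 60 frames per second under the assumed overdraw.
    print(required_fill_rate(1280, 720, 60))
```

Under these assumptions, doubling either the core count or the clock doubles the peak fill-rate, whereas only a higher clock raises the triangle rate.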

Displacement of competitors


MediaTek shifted from mainly PowerVR GPUs to mainly Mali-400 GPUs in the second half of 2013. Its success in the cost-sensitive smartphone market and subsequent penetration of the tablet market have significantly increased the unit market share of ARM's Mali GPUs, at the expense of Imagination's PowerVR. Another company using Mali-400 cores in significant volume for smartphones is Spreadtrum, which targets low-end smartphones.

Apart from smartphones, increasing adoption of the Mali-400 can also be observed in the high-volume tablet market, especially for low-end platforms. This has come mainly at the expense of Imagination's PowerVR, while Vivante's GPU cores have increasingly been marginalized. Most of the highest-volume tablet SoCs of the last few years have been equipped with the Mali-400, such as Allwinner's A1x and the currently ramping Allwinner A33, and Rockchip's RK3066 and RK3188 as well as Rockchip's low-end platforms.

Other Chinese chip companies that are well-funded or have potential for growth, such as HiSilicon and Leadcore Technology, currently also concentrate on Mali-400 series cores.

Mali-450: A faster, more efficient Mali-400


The Mali-450 MP is a more recent GPU core which, while remaining mostly limited to the feature set of the Mali-400 MP (such as no support for OpenGL ES 3.x), is significantly faster than the Mali-400 MP and is likely to be relatively efficient. Vertex processing throughput (triangles) is doubled compared to an identically-clocked Mali-400 MP, and although pixel fill-rate per core is similar to that of the Mali-400 MP, Mali-450 MP cores can generally be clocked higher. It also includes additional architectural optimizations designed to minimize power use and memory bandwidth requirements. Compared to the Mali-400 MP, the Mali-450 MP increases the maximum number of cores from four to eight.

MediaTek has adopted this core for some smartphone and tablet chips. Because mobile graphics applications continue to rely on OpenGL ES 2.0 as the de facto standard, the Mali-450 MP GPUs are likely to remain viable as above-average performance GPUs. They have benefits in terms of power consumption and cost, and avoid the inherent overhead associated with the need to support OpenGL ES 3.x and other APIs in modern GPUs such as the Mali-T6xx, Mali-T7xx and PowerVR Rogue series.

However, ARM's Mali-T7xx series, and particularly the upcoming lower-end Mali-T72x series, are designed to work in tandem with other new IP cores from ARM (including CPU cores, video processing cores and 2D graphics cores). They offer potentially significant power savings and performance improvements through framebuffer data compression techniques, which reduce unnecessary memory accesses associated with unchanged regions of the screen or framebuffer and thereby lower memory bandwidth requirements. This may increase performance in low-end devices without requiring a costly upgrade from the typically used 32-bit DRAM interface with memory clocked at power-friendly frequencies.
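The idea of skipping memory writes for unchanged regions can be sketched in Python as follows. This is a simplified illustration and not ARM's actual implementation: the 16x16 tile size, the use of CRC32 as a per-tile signature and all function names are assumptions made for the example.

```python
import zlib

TILE = 16  # assumed tile edge length in pixels


def tile_signatures(frame, width, height, bytes_per_pixel=4):
    """Return {(tile_x, tile_y): crc32} for a row-major framebuffer."""
    stride = width * bytes_per_pixel
    signatures = {}
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            crc = 0
            # Fold each row segment of the tile into a running CRC32.
            for y in range(ty, min(ty + TILE, height)):
                start = y * stride + tx * bytes_per_pixel
                end = y * stride + min(tx + TILE, width) * bytes_per_pixel
                crc = zlib.crc32(frame[start:end], crc)
            signatures[(tx // TILE, ty // TILE)] = crc
    return signatures


def tiles_to_write(previous, current):
    """Tiles whose signature changed; only these need to go out to memory."""
    return [tile for tile, sig in current.items() if previous.get(tile) != sig]


if __name__ == "__main__":
    width, height = 32, 32
    old_frame = bytes(width * height * 4)
    new_frame = bytearray(old_frame)
    new_frame[(5 * width + 20) * 4] = 255  # change one pixel at x=20, y=5
    changed = tiles_to_write(
        tile_signatures(old_frame, width, height),
        tile_signatures(new_frame, width, height),
    )
    print(changed)  # only the tile containing the changed pixel
```

In this sketch only one of the four tiles is written back, so a mostly static screen such as a user interface would generate a fraction of the framebuffer traffic of a full redraw.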

Limited memory bandwidth is already likely to be a bottleneck for the Mali-450 GPU as implemented in chips such as MediaTek's MT6592 and MT8127, which have a 32-bit memory interface similar to the one used in lower-end chips with a Mali-400 GPU. Chips using next-generation GPUs such as the Mali-T6xx/T7xx or PowerVR Rogue typically have a dual-channel or 64-bit DRAM interface. It would be interesting to see to what extent a Mali-450 implementation could benefit from a similar higher-performance memory interface.
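To put the difference between the two interface widths in numbers, the short Python sketch below computes theoretical peak DRAM bandwidth from the bus width and the data rate. The data rate of 1066 MT/s is a hypothetical value chosen for illustration and is not a specification of any of the chips mentioned; real sustained bandwidth is lower than this peak.

```python
def peak_dram_bandwidth_gbs(bus_width_bits: int, megatransfers_per_sec: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1000 MB).

    Each transfer moves bus_width_bits / 8 bytes.
    """
    return bus_width_bits / 8 * megatransfers_per_sec / 1000


if __name__ == "__main__":
    data_rate = 1066.0  # hypothetical data rate in MT/s
    for width in (32, 64):
        bandwidth = peak_dram_bandwidth_gbs(width, data_rate)
        print(f"{width}-bit interface: {bandwidth:.3f} GB/s peak")
```

At the same data rate, the 64-bit interface offers exactly twice the peak bandwidth of the 32-bit one, which is shared between the GPU, the CPU cores and the display controller.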

Overview of notable SoCs using Mali-400 or Mali-450

 

[Image: overview of notable SoCs using Mali-400 or Mali-450]

Sources: ARM (Mali-400 MP page), ARM (Mali-450 MP page), GPU GFLOPS
