Saturday, October 18, 2014

Samsung's 64-bit Exynos 5433 SoC renamed to Exynos 7 Octa, used in some Galaxy Note 4 models

Recently, Samsung renamed its Exynos 5433 SoC to Exynos 7 Octa. The new Exynos chip is used by Samsung in the new Galaxy Note 4 smartphone, although the actual volume of shipments has been unclear because most regions were first served primarily by Qualcomm Snapdragon 805-based versions of the Galaxy Note 4. However, evidence from the Geekbench result database suggests roughly a quarter of models currently sold are Exynos versions.

Signs of actual adoption of Exynos 7 Octa in high volume becoming apparent

Samsung has in the past frequently announced the use of Exynos SoCs in prominent smartphones, but shipments were often limited to very low volumes for smaller regions such as Korea, with the vast majority of shipments using Snapdragon SoCs. During the last two years, only Samsung's tablets have seen widespread use of Samsung high-performance mobile SoCs. Although Samsung has recently ramped mid-range chips such as Exynos 3470 in presumably high volume for the Galaxy S5 Mini, strong evidence would be required to establish that the situation will be different this time around, with a high-profile Exynos SoC (Exynos 7 Octa) actually being used in high volume in smartphones.

However, searching for Galaxy Note 4 models on the Geekbench Browser provides evidence that at least one quarter of units currently sold contain the new Exynos chip, with the other three quarters or so using Snapdragon 805. Exynos versions are primarily represented by the SM-N910C, SM-N910S and SM-N910K models, while Snapdragon versions are mainly represented by SM-N910A, SM-N910T, SM-N910F and several other models.

Number of Geekbench entries for each Samsung Galaxy Note 4 variant as of 24 October:
  • SM-N9100: Snapdragon 805, 7 entries
  • SM-N9109W: Snapdragon 805, 4 entries
  • SM-N910A: Snapdragon 805, 635 entries
  • SM-N910C: Exynos 5433, 425 entries
  • SM-N910F: Snapdragon 805, 496 entries
  • SM-N910H: Exynos 5433, 24 entries
  • SM-N910K: Exynos 5433, 73 entries
  • SM-N910L: Exynos 5433, 33 entries
  • SM-N910R4: Snapdragon 805, 23 entries
  • SM-N910P: Snapdragon 805, 238 entries
  • SM-N910S: Exynos 5433, 197 entries
  • SM-N910T: Snapdragon 805, 559 entries
  • SM-N910V: Snapdragon 805, 69 entries
  • SM-N910W8: Snapdragon 805, 10 entries

For the listed models, the total count is 752 Exynos and 2041 Snapdragon, representing an Exynos proportion of about 27%.
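The totals and the share quoted above can be reproduced from the per-model counts with a few lines of Python. This is only a minimal sketch: the dictionary layout and the names `entries` and `totals_by_soc` are illustrative choices, and the counts are copied from the list as of 24 October.

```python
# Geekbench entry counts per Galaxy Note 4 variant as of 24 October,
# copied from the list above: model -> (SoC, number of entries).
entries = {
    "SM-N9100": ("Snapdragon 805", 7),
    "SM-N9109W": ("Snapdragon 805", 4),
    "SM-N910A": ("Snapdragon 805", 635),
    "SM-N910C": ("Exynos 5433", 425),
    "SM-N910F": ("Snapdragon 805", 496),
    "SM-N910H": ("Exynos 5433", 24),
    "SM-N910K": ("Exynos 5433", 73),
    "SM-N910L": ("Exynos 5433", 33),
    "SM-N910R4": ("Snapdragon 805", 23),
    "SM-N910P": ("Snapdragon 805", 238),
    "SM-N910S": ("Exynos 5433", 197),
    "SM-N910T": ("Snapdragon 805", 559),
    "SM-N910V": ("Snapdragon 805", 69),
    "SM-N910W8": ("Snapdragon 805", 10),
}


def totals_by_soc(entries):
    """Sum the entry counts per SoC."""
    totals = {}
    for soc, count in entries.values():
        totals[soc] = totals.get(soc, 0) + count
    return totals


totals = totals_by_soc(entries)
exynos = totals["Exynos 5433"]
snapdragon = totals["Snapdragon 805"]
exynos_share = exynos / (exynos + snapdragon)

print(f"Exynos: {exynos}, Snapdragon: {snapdragon}, Exynos share: {exynos_share:.1%}")
```

Note that Geekbench submissions are not a random sample of units sold, so the share is an indication rather than a measurement of the sales mix.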

All things being equal, one would expect Samsung to prefer to use the internally manufactured Exynos chipset if enough supply is available, although with four Cortex-A57 cores the SoC is likely to be relatively expensive to manufacture. On the other hand, there are significant performance differences, with the Exynos platform clearly faster in terms of CPU processing but with a question mark in terms of power efficiency, while Snapdragon 805 can be regarded as mature, stable technology. Qualcomm may also be able to enforce a certain quota of Snapdragon chips based on its leverage of patent royalties and licensing fees (which are considerable for a high-end smartphone).

Some anomalies are evident in the chips used for certain models. For example, a number of the results for the SM-N910S (which officially uses the Exynos 5433) in the Geekbench database show the use of an APQ8064 (Snapdragon 600) SoC clocked at 1.89 GHz, which is significantly slower than Exynos 5433 (or Snapdragon 805). Similarly, starting from October 30 a significant number of results labelled as SM-N910C show the use of the aging Exynos 4412 SoC (also used in old models such as the Galaxy S III) with four Cortex-A9 cores clocked at 2.0 GHz, much slower than Exynos 5433. These anomalies probably represent counterfeit production by Chinese manufacturers (both APQ8064 and Exynos 4412 have been common in the supply chain in the past). For models that officially use Snapdragon 805, no anomalies are evident.

Update as of December 5, 2014

Reassessing the share of Exynos 5433 vs Snapdragon 805 in the Geekbench database after a few months of production should be informative about whether Samsung is really serious about ramping Exynos production for smartphones. The following is apparent:
  • The Exynos-based SM-N910C count has increased from 425 to 4390.
  • The Exynos-based SM-N910S count has increased from 197 to 578.
  • The Exynos-based SM-N910K count has increased from 73 to 212.
  • The Exynos-based SM-N910H count has increased from 24 to 757, while SM-N910L has increased from 33 to 91.
  • The new Exynos-based SM-N910U shows a count of 1062.
  • The Snapdragon 805-based SM-N910A count has increased from 635 to 2258.
  • The Snapdragon 805-based SM-N910T count has increased from 559 to 2089.
  • The Snapdragon 805-based SM-N910F count has increased from 496 to 3857.
  • The Snapdragon 805-based SM-N910P count has increased from 238 to 1162.
  • The Snapdragon 805-based SM-N910R4 has increased from 23 to 61, SM-N9100 from 7 to 58, SM-N9109W from 4 to 20, SM-N910V from 69 to 1685, and SM-N910W8 from 10 to 636.
  • The new Snapdragon 805-based SM-N910G shows a count of 903, SM-N9106W shows 22, SM-N9108V shows 1.

For the listed models, the total count is 7090 Exynos and 12752 Snapdragon, representing an increased share of Exynos-based models in the Geekbench database from about 27% to about 36%, clearly suggesting that the share of Exynos-based models is increasing, and recent production may already have a much greater proportion of Exynos-based models.
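Because the December totals are cumulative, one rough way to look at the more recent mix is to consider only the entries added between the two dates. The sketch below does this using the totals quoted in this article; the variable and function names are illustrative, and the result is only a proxy, since the submission dates of Geekbench entries need not match sale or production dates.

```python
# Cumulative Geekbench entry totals quoted in the article.
oct_24 = {"Exynos": 752, "Snapdragon": 2041}
dec_5 = {"Exynos": 7090, "Snapdragon": 12752}


def exynos_share(counts):
    """Fraction of entries that are Exynos-based."""
    return counts["Exynos"] / (counts["Exynos"] + counts["Snapdragon"])


# Entries added between 24 October and 5 December.
added = {soc: dec_5[soc] - oct_24[soc] for soc in dec_5}

share_oct = exynos_share(oct_24)
share_dec = exynos_share(dec_5)
share_added = exynos_share(added)

print(f"Cumulative share on 24 October: {share_oct:.1%}")
print(f"Cumulative share on 5 December: {share_dec:.1%}")
print(f"Share among entries added in between: {share_added:.1%}")
```

The share among newly added entries cannot be lower than the December cumulative share when the cumulative share has risen, so it gives a slightly more current picture than the cumulative figure alone.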

First 20nm ARMv8 SoC targeting Android

One of the first smartphone SoCs manufactured using a 20nm process, at Samsung's own fabs, the Exynos 7 Octa is the first chip featuring ARM's Cortex-A57 and Cortex-A53 cores in a big.LITTLE configuration to appear on the market. The Cortex-A5x cores support the 64-bit ARMv8 instruction set, although using the 32-bit variant of the ARMv8 instruction set also appears to bring benefits while avoiding the performance degradation (related to increased memory use for pointers and addressing) that is associated with going to full 64-bit.

It is not the first 20nm SoC to support the ARMv8 instruction set, since Apple's A8 chip has already ramped to high-volume production during most of the year at TSMC for use in the iPhone 6 models. And already in 2013, Apple introduced the first ARMv8 chip with the Apple A7. As I have explained in an earlier article, there are reasons to believe the CPU cores in the Apple A7/A8 may have great similarities to ARM's Cortex-A57 CPU core, and in that sense the Exynos 7 Octa technically may not actually be the first SoC with Cortex-A57 cores to hit the market.

Fast, but power efficiency may be a problem

Reviews of Exynos 7 Octa-based devices such as the Galaxy Note 4 are still scarce. Already several months ago, early benchmark results showed Exynos 5433 (as it was known then) providing the highest performance in the mobile space, significantly outscoring Snapdragon 805 in most benchmarks. This is not unexpected given the use of high-performance Cortex-A57 cores at a fairly high clock frequency.

However, there are signs that maintaining power efficiency with higher-clocked Cortex-A57 cores may be a challenge. Some early hands-on previews have suggested relatively high power consumption and mediocre battery life for an Exynos 5433-based Galaxy Note 4. More definite test results should clarify the situation.

Setting maximum clock frequency creates dilemma

Software techniques such as the use of efficient Global Task Switching with preference for the economical Cortex-A53 cores and throttling down of the clock frequency may be vital to maintain acceptable battery life. Analysis of Geekbench results for the Exynos 5433-based SM-N910C shows a multi-core performance scaling factor of about 4.45 for the largely CPU-bound JPEG Compress test, suggesting that Global Task Switching is implemented in such a way that not just the Cortex-A57 cores are utilized but the Cortex-A53 cores as well when high CPU performance is required.
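The reasoning behind this inference can be made explicit with a small calculation. It is a sketch under a simplifying assumption: four identical Cortex-A57 cores can contribute at most a scaling factor of 4.0 relative to one Cortex-A57 core (ignoring clock differences between single-core and multi-core runs), so anything above 4.0 must come from the Cortex-A53 cores. The function name and constants are illustrative.

```python
BIG_CORES = 4            # Cortex-A57 cores in Exynos 5433
LITTLE_CORES = 4         # Cortex-A53 cores in Exynos 5433
OBSERVED_SCALING = 4.45  # multi-core / single-core ratio for JPEG Compress


def min_little_contribution(scaling, big_cores):
    """Lower bound on the LITTLE cores' contribution, expressed in
    'big-core equivalents', assuming the big cores scale perfectly."""
    return max(0.0, scaling - big_cores)


extra = min_little_contribution(OBSERVED_SCALING, BIG_CORES)
per_little_core = extra / LITTLE_CORES

print(f"LITTLE cores contribute at least {extra:.2f} big-core equivalents")
print(f"That is at least {per_little_core:.2f} per Cortex-A53 core")
```

This is a lower bound only: if the Cortex-A57 cores scale less than perfectly, for example because of throttling under multi-core load, the actual contribution of the Cortex-A53 cores is larger.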

High-performance CPU cores such as Cortex-A57 tend to have relatively high power consumption that increases as the clock frequency increases. This creates a dilemma for a manufacturer, because for acceptable power consumption with practical use there is little reason to set the maximum clock speed at the relatively high level that is desirable for marketing purposes; a speed similar to the one used in Apple's Cyclone cores (e.g. 1.4 GHz) provides more than enough speed for most applications while limiting the excessive power consumption (and potential stability problems) associated with higher frequencies. A similar dilemma is often associated with SoCs with Cortex-A15 CPU cores (such as Samsung's Exynos 5430 used in the Galaxy Alpha) that have performance characteristics (high performance, but low performance/Watt) comparable to Cortex-A57, although Cortex-A57 is likely to be more efficient.

Providing superior synthetic benchmark performance can be a matter of high prestige for a company and its marketing department, to the extent that an unbalanced high maximum clock frequency may still be used in actually shipping devices, to the detriment of the user experience. Associated with this dilemma is the attraction of "cheating" on benchmarks by detecting when synthetic benchmarks are run and then switching to higher, sustained clock frequencies with reduced heat throttling, a practice that websites such as AnandTech have demonstrated to be widespread in the past.

Evidence suggests Exynos 5433's Cortex-A57 cores are already clocked at a relatively low but efficient speed of about 1.4 GHz

Exynos 5433 may in practice already be clocked at a relatively low maximum speed to conserve power. Geekbench consistently reports 1.3 GHz as the clock frequency for all Exynos 5433 devices. However, for some devices, including Samsung's big.LITTLE Exynos 5430 with Cortex-A15, Geekbench seems to report the maximum clock speed of the slower LITTLE cores, so the Cortex-A57 cores are probably clocked higher. Even so, for the Cortex-A57 cores in the Exynos 5433, which have dramatically higher performance/cycle than the LITTLE Cortex-A53 cores, a relatively limited maximum speed in the range of 1.3 GHz would by no means be inappropriate for a smartphone platform.

Looking more closely at cross-platform Geekbench results for the Exynos-based Note 4 and the iPhone 5S and iPhone 6, and assuming that Apple's Cyclone and Cortex-A57 are cores with similar performance characteristics at a given clock speed (at least the little available evidence puts metrics like IPC and DMIPS in the same ballpark), gives indications that Exynos 5433 may on average actually be clocked at an effective 1.4 GHz, comparable to the 1.4 GHz of the iPhone 6. However, it cannot be ruled out that in the case of the Exynos 5433 the frequency is the average resulting from thermal speed throttling (variation of the CPU speed based on power consumption and heat production).

Apple's SoC architecture is also different because it is a dual-core design, compared to the big.LITTLE configuration of the Exynos 5433 with four Cortex-A57 cores and four Cortex-A53 cores. Apple's cache memory architecture is also very different, with a large L3 cache and a likely highly optimized but smaller L2 cache, and the Apple device has higher external RAM performance. Additionally, the software model (Apple's 64-bit AArch64 vs the 32-bit ARMv8 AArch32 used with Exynos 5433) complicates things. However, some conclusions may still be drawn by looking at specific benchmarks.

Comparison with Apple A7 and A8 benchmarks provides clues

Performing a detailed comparison of representative results for an SM-N910C and an iPhone 6 with Apple A8 on the Geekbench browser page provides interesting information. At first glance the results are all over the place, with some benchmarks (including single-core ones) being faster on Exynos and others on the Apple A8, while Exynos obviously has an advantage for multi-core tests.

However, one can look for sub-benchmarks that are less likely to be affected by a large L3 cache on the Apple device, specifically benchmarks that do not have a large memory working set and source data or do not constantly perform random read access on a large set but do perform a lot of processing, possibly writing (but not reading) a lot of data. Some stream-type algorithms such as common data and image compression and decompression benchmarks fit the bill, because they generally stream the source data sequentially, perform a relatively high amount of CPU processing based on a relatively limited working set (a small part of the stream/file), and write the resulting data sequentially.

This type of benchmark puts the Exynos 5433 somewhat lower but fairly close to the Apple A8 in single-core CPU performance. Further information can be gained from iPhone 5S (Apple A7) results.

Benchmark results: Galaxy Note 4 (SM-N910C) vs iPhone 5S vs iPhone 6, relative speed advantage of iPhones compared to SM-N910C:
Test name                           SM-N910C  iPhone 5S       iPhone 6
BZip2 Compress:                     1187      1109 ( -6.5%)   1187 ( +8.5%)
BZip2 Decompress:                   1366      1394 ( +2.0%)   1538 (+12.6%)
JPEG Compress:                      1378      1196 (-13.2%)   1372 ( -0.0%)
JPEG Decompress:                    1598      1583 ( -0.9%)   1855 (+16.1%)
PNG Compress:                       1391      1427 ( +2.6%)   1577 (+13.4%)
PNG Decompress:                     1490      1301 (-12.7%)   1498 ( +0.5%)
Sobel (image local edge detection): 1701      1584 ( -6.9%)   1922 (+13.0%)

The Apple A8 chip in the iPhone 6 scores somewhat higher than Exynos 5433 in most tests, while Exynos 5433 is on average faster than the Apple A7 in the iPhone 5S. All of this is consistent with the CPU cores in all of the devices having comparable single-core CPU performance. Under the assumption that Cortex-A57 and Cyclone (which seems to have a lot of architectural similarities with Cortex-A57) have comparable performance per cycle, it is also consistent with a clock frequency for the Exynos 5433 that is similar to the one used in the Apple devices (around 1.3 to 1.4 GHz).
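The relative differences in the table, and the averages behind the statements above, can be recomputed from the listed scores. The following is a minimal sketch with the scores copied exactly as printed in the table; the names `scores` and `relative_diff` are illustrative. Recomputing from the printed scores gives a slightly different percentage than the table for two iPhone 6 cells (BZip2 Compress, where both printed scores are 1187, and JPEG Compress, about -0.4% rather than -0.0%), so either a score or a percentage in those cells may have been mistyped.

```python
# Single-core Geekbench sub-scores as printed in the table:
# test name -> (SM-N910C, iPhone 5S, iPhone 6)
scores = {
    "BZip2 Compress": (1187, 1109, 1187),
    "BZip2 Decompress": (1366, 1394, 1538),
    "JPEG Compress": (1378, 1196, 1372),
    "JPEG Decompress": (1598, 1583, 1855),
    "PNG Compress": (1391, 1427, 1577),
    "PNG Decompress": (1490, 1301, 1498),
    "Sobel": (1701, 1584, 1922),
}


def relative_diff(iphone_score, note4_score):
    """Speed advantage of the iPhone over the SM-N910C, in percent."""
    return (iphone_score - note4_score) / note4_score * 100


diffs_5s = []
diffs_6 = []
for test, (note4, iphone_5s, iphone_6) in scores.items():
    d5 = relative_diff(iphone_5s, note4)
    d6 = relative_diff(iphone_6, note4)
    diffs_5s.append(d5)
    diffs_6.append(d6)
    print(f"{test:<18} iPhone 5S {d5:+6.1f}%   iPhone 6 {d6:+6.1f}%")

mean_5s = sum(diffs_5s) / len(diffs_5s)
mean_6 = sum(diffs_6) / len(diffs_6)
print(f"Mean advantage: iPhone 5S {mean_5s:+.1f}%, iPhone 6 {mean_6:+.1f}%")
```

With the printed scores, the iPhone 5S comes out a few percent slower than the SM-N910C on average and the iPhone 6 several percent faster, in line with the description above.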

The largely CPU-bound JPEG Compress test, which appears to be closely tied to clock speed on other chip platforms with limited dependence on factors outside the CPU core, provides evidence that the isolated single-core CPU performance of Exynos 5433 may be close to that of the Apple A8 in the iPhone 6, consistent with a similar effective clock frequency of about 1.4 GHz. To what extent thermal throttling plays a role for the Exynos is not entirely clear. Most of the Geekbench results for SM-N910C for the JPEG Compress test are very close (a score around 1375), suggesting that at least for this test the maximum clock speed is generally maintained, which would be compatible with this speed being about 1.4 GHz.
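The clock estimate implied by this argument can be written out as a simple proportion. This is a sketch that rests entirely on the assumption made in this article, namely that Cortex-A57 and the Apple A8's CPU core deliver the same JPEG Compress score per clock cycle; the function name is illustrative, and the scores are the ones printed in the table above.

```python
IPHONE6_CLOCK_GHZ = 1.4       # clock frequency of the iPhone 6 cited in the article
IPHONE6_JPEG_COMPRESS = 1372  # iPhone 6 JPEG Compress score from the table
NOTE4_JPEG_COMPRESS = 1378    # SM-N910C JPEG Compress score from the table


def estimate_clock_ghz(score, ref_score, ref_clock_ghz):
    """Estimate an effective clock by scaling a reference clock with the
    score ratio, assuming equal performance per clock cycle."""
    return ref_clock_ghz * score / ref_score


estimated = estimate_clock_ghz(
    NOTE4_JPEG_COMPRESS, IPHONE6_JPEG_COMPRESS, IPHONE6_CLOCK_GHZ
)
print(f"Estimated effective Cortex-A57 clock: {estimated:.2f} GHz")
```

If Cortex-A57 actually achieves lower performance per cycle than Apple's core on this test, the real clock would be correspondingly higher than this estimate, and vice versa.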

PNG Decompress seems to be somewhat of a negative outlier for the Apple A7 and A8, but it is consistent across different iPhone results and is probably related to the high amount of memory writes (decompressed image data) associated with the benchmark, which can be affected by the extra layer in the memory subsystem represented by the L3 cache.

One significant caveat for the comparison above is that the Apple devices run in AArch64 mode, while Exynos 5433 in the Note 4 runs in AArch32 mode (the 32-bit version of the ARMv8 instruction set). AArch64 can take advantage of more instructions, in particular instructions operating on 64-bit registers, while the increased pointer/address storage size can decrease performance somewhat. However, the source code for the Geekbench tests is likely to be identical for both modes (without extensive use of 64-bit integer variables) and unlikely to be specifically optimized for AArch64, with any AArch64 optimizations in the generated code depending on the compiler.

Sources: Samsung (Exynos 7 Octa), Geekbench Browser

Updated (24 October 2014): Update with information about proportion of Exynos models based on Geekbench database, and provide performance comparisons with Apple processors.
Updated (30 October 2014): Language tweaks, improve Geekbench comparison table and fix PNG Decompress score for iPhone 5S.
Updated (2 November 2014): Update discussion about clock speed of Exynos 5433, expand description of use of GTS, make note of counterfeit models in Geekbench database.
Updated (5 December 2014): Update Exynos model share statistics for Galaxy Note 4.

Friday, October 3, 2014

Transition to next-generation FinFET process nodes: Samsung unlikely to be in the lead despite media reports

In the last few months, relatively vague media reports have surfaced a few times about Samsung gaining back chip orders from Apple that it had recently lost to TSMC, as well as new orders from Qualcomm and other players for its next-generation 14nm FinFET technology. These media reports have been widely repeated in popular technology publications, and have often been interpreted as if TSMC would be losing market share in 2015 to the point of having significant excess capacity, or as if Samsung has a considerable technology lead. However, these media reports, as well as sweeping conclusions about a presumed superior market competitiveness of Samsung in comparison with TSMC in 2015, are likely to be highly inaccurate.

TSMC currently dominates advanced node foundry production

TSMC currently dominates the foundry market for leading-edge nodes such as 28 and 20nm for chips such as smartphone SoCs and GPUs with a market share in excess of 80%, and faces significantly more demand than it is able to supply, despite unprecedented investment in new production capacity. Samsung's 28nm logic fabs are currently largely empty, and a similar situation is occurring at GlobalFoundries as it has been struggling to gain significant customers apart from AMD. Within this context, it is apparent that TSMC has been doing something right, while Samsung and GlobalFoundries must have had some significant set-backs, otherwise this market share distribution would not be happening. Given this track record, one can wonder how realistic it is to expect that the level of competitiveness of Samsung and GlobalFoundries would recover or even be reversed for next-generation processes as early as 2015.

Chip design companies motivated to seek additional sources of supply, but challenges apparent

Clearly, because TSMC currently has a virtual monopoly and is not able to fulfill demand, there is a pressing motivation for chip companies such as Qualcomm and others to seek additional sources of supply. Therefore there is no reason to doubt that major efforts are being made in this area, especially starting from about Q2 2014 when the capacity shortage at TSMC became very evident. However, successful completion within any reasonable time-frame of such a move (especially when the effort has only recently become more intensive) involves substantial technological challenges and risks, which make it unlikely that it will actually happen in any way close to the time-frame and volume that has been suggested by some reports.

The fact that TSMC's 16nm FinFET process is an evolutionary extension of its already highly successful 20nm process to incorporate FinFET technology, rather than the radical technology changes involved in Samsung's 14nm FinFET process, also makes it likely that chip design companies will continue to concentrate on TSMC process technology in the near term out of necessity, with any efforts with Samsung likely to only result in significant production at a much later stage.

Optimistic projections from sources within Samsung widely reported as fact

In an article on October 1, ZDNet (based on an article from its Korean website) quotes a manager from Samsung's LSI division saying that Samsung is likely to improve profits once it achieves volume production for next-generation products for Apple. The source declined to comment about when Samsung would start mass producing such chips for clients. Combining earlier media speculation, the article goes on to state that 14nm production for clients such as Apple, Qualcomm and AMD would start as early as the end of this year. The article also quotes undisclosed sources that Samsung is producing 30% of Apple's A8 processors, with the rest being manufactured by TSMC. The article has been widely quoted in popular news media.

However, there are several reasons to believe that these reports are relatively inaccurate and misleading. First of all, unofficial remarks from sources within Samsung seem to be the only source of information for the article. As mentioned in the article, Samsung is currently incurring very significant losses from its logic (LSI) fabs because of underutilization after losing Apple SoC orders to TSMC. It is not at all surprising that sources within Samsung (including managers who in fact may hold primary responsibility within Samsung for achieving profitability of the LSI division) would be inclined to paint an over-optimistic picture that may not accurately reflect the current and future market status for production of advanced next-generation designs.

Apple has explored multiple sources for production of Apple A9

Already in July 2013, an article published by EE Times reported that Apple signed a deal with Samsung to produce the Apple A9 in 2015. This article also illustrates that knowledge of TSMC 20nm production for the Apple A8 in 2014 (as mentioned in the article) was already widespread at this time. However, in June 2013, it was already reported that Apple signed a three-year deal with TSMC not only involving 20nm, but also TSMC's next-generation 16nm FinFET and later 10nm FinFET technologies, with Apple A9 being mentioned. Recently, in August 2014, DigiTimes reported that TSMC had gained production of the Apple A9 using its 16nm FinFET process with significant volume as early as Q1 2015. More recent reports suggest Apple A9 will be manufactured at TSMC but using the same 20nm process as Apple A8.

Based on TSMC's track record and in particular its successful high volume ramp of the Apple A8 using its 20nm process, I believe it is very likely that Apple will focus Apple A9 production, at least for the most significant earlier part of its production cycle, at TSMC. Apple will be able to move to FinFET earlier at TSMC if it chooses to. Because TSMC's 16nm FinFET is to a large extent an evolutionary extension of its 20nm process incorporating FinFET technology, rather than the radical technology change involved in Samsung's 14nm FinFET process, achieving maturity for high volume production is much less of a challenge for TSMC, which makes it unlikely that Samsung will be able to achieve a similar level of maturity in a time-frame that is competitive with TSMC. The fact that qualifying and bringing a similar chip to stable production at Samsung involves substantial additional investment in chip design and testing, along with associated risks including the timing of such production, will probably even make it attractive for Apple to keep material Apple A9 production at TSMC for its entire life cycle.

Achieving significant production of Apple A8 will be very challenging for Samsung

In addition, the accuracy of the claim that 30% of the production of the Apple A8 is already manufactured by Samsung is highly questionable. Samsung's 20nm process is fundamentally different from that of TSMC in several details, and Apple would have to repeat most of the design/validation cycle that it has already completed for the TSMC version of Apple A8 in order to be able to produce at Samsung's fabs, resulting in very high additional cost, numerous risks, and substantial delays. Moreover, it is doubtful that the production capacity of Samsung at 20nm (which it already uses for certain Exynos chips such as Exynos 5430 and 5433, and even those do not appear to have already ramped in really high volumes) is ramping fast enough to quickly gain material shipments to Apple, especially when Samsung is supposed to be rapidly transitioning to 14nm FinFET.

While it is not unlikely that Samsung has been aggressively seeking to provide capacity for the Apple A8, working with Apple, it is debatable whether it would be able to achieve material amounts of production before the latter stages of the life cycle of the Apple A8 in 2015, when production levels will already have decreased. From Apple's viewpoint, it appears that its relationship with TSMC involves TSMC giving it any level of capacity it needs (to the detriment of competitors who are facing wafer shortages), which makes the apparent benefit for Apple of quickly moving part of the Apple A8 production to Samsung relatively limited. Samsung may offer lower prices for 20nm manufacturing capacity, but as explained earlier, the complexity, cost, time and risk involved in moving Apple A8 production to Samsung make it unlikely that Samsung will be able to gain a significant share of production within a reasonable time-frame.

Comparison of FinFET technologies at Intel, TSMC and Samsung

Recently, ZDNet also published a much more technical and reliable article discussing the status of FinFET technologies of the major fab players, including Intel, TSMC, Samsung and GlobalFoundries.

Intel started production of processors using FinFET technology at 22nm as early as 2011 and has already shipped 500 million such chips, mostly targeted at PCs but also gaining shipments for tablet applications this year. It also offers the technology to other customers as a foundry. Intel has started volume production of its next-generation 14nm FinFET process, which is a "true shrink" with significantly increased transistor density and delivers a combined 1.6x improvement in performance/Watt across applications ranging from smartphones to servers, and will continue to ramp production into 2015.

TSMC's 16nm FinFET development is at an advanced stage

TSMC's first generation 16nm FinFET process, 16FF, was qualified in November 2013 and already saw product tape-outs as early as April 2014. This suggests TSMC's 16nm FinFET process is already close to high volume production. TSMC's 16FF process will be followed up by its 16FF+ process with tape-outs expected in early 2015. While the performance benefits of 16FF are limited due to its similarities (the same back-end metal layers) with TSMC's 20nm process, the 16FF+ process involves a reduction in feature size that makes it competitive with the theoretical performance of 14nm FinFET processes from competitors. TSMC is already in a stage called "risk production" for 15 16nm FinFET products this year and another 45 products next year for a variety of applications. Yields are reported to have already reached levels comparable to TSMC's 20nm process. This is not surprising, as TSMC has reported that 95% of the tools used for 20nm can be reused for 16FF, which also brings massive advantages in the required level of investment to ramp capacity and greatly facilitates time-to-market.

TSMC quotes its 16FF+ process as having 15% greater performance when compared to 16FF (40% compared to 20nm) and 30% less power consumption when compared to 16FF. TSMC is already working on 10nm FinFET process technology which involves a more substantial 2.2x increase in transistor density.
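The two quoted performance figures also imply a gain for 16FF over planar 20nm, which the short sketch below derives. It assumes that the percentages compound multiplicatively and were measured under comparable conditions, which the quoted figures do not state explicitly; the constant names are illustrative.

```python
# Quoted performance gains for TSMC's 16FF+ process.
GAIN_16FFP_VS_16FF = 0.15  # 16FF+ vs 16FF
GAIN_16FFP_VS_20NM = 0.40  # 16FF+ vs planar 20nm

# If gains compound multiplicatively:
#   (1 + gain_16ff_vs_20nm) * (1 + GAIN_16FFP_VS_16FF) = 1 + GAIN_16FFP_VS_20NM
implied_16ff_vs_20nm = (1 + GAIN_16FFP_VS_20NM) / (1 + GAIN_16FFP_VS_16FF) - 1

print(f"Implied 16FF gain over 20nm: {implied_16ff_vs_20nm:.1%}")
```

Under these assumptions, the first-generation 16FF process would account for a performance gain of roughly one fifth over 20nm, with 16FF+ adding the remaining 15% on top.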

SoCs using Cortex-A57 and Cortex-A53 CPU cores already implement TSMC's 16nm FinFET processes

Although 16FF is seen as a stepping stone to FinFET technology, it does provide performance benefits over planar 20nm. TSMC and ARM have announced that a 16nm test chip using Cortex-A57 and Cortex-A53 cores in a big.LITTLE configuration achieved a sustained 2.3GHz clock rate for the Cortex-A57 core with minimal power consumption of 75 milliwatts achieved for the Cortex-A53 core for common workloads. This demonstration involving a currently relevant SoC design illustrates the relative maturity of TSMC's 16nm technology.

For Cortex-A57, 16FF+ is expected to result in an 11% performance improvement relative to 16FF at the same level of power, while power consumption of the Cortex-A53 for low-intensity applications is reduced by 35%. ARM POP IP core hardening (tweaking cores for either performance or low power consumption) is utilized for early 16FF+ SoC designs. Although TSMC does not specifically address the use of Cortex-A53 at higher clock rates for high performance applications instead of Cortex-A57, the quoted numbers are consistent with the better scaling of Cortex-A53 on new processes when compared to performance-oriented "big-core" Cortex-A57 and cores with a similar architecture.

For example, one can speculate that the significant power reduction for the Cortex-A53 will further significantly increase the maximum clock rate and performance of Cortex-A53 CPU cores, more than the 11% quoted for Cortex-A57, making Cortex-A53-only designs more attractive for high-end applications. Already, early reports about MediaTek's MT6795 octa-core SoC running at about 2.2GHz, the first Cortex-A53-based SoC targeting high performance applications, suggest that it will provide premium-level performance at half the price of current premium-performance SoCs. The chip achieves this despite still using 28nm technology, indicating that Cortex-A53-based high-performance designs using more advanced nodes such as 20nm and 16nm FinFET will be even more revolutionary in terms of performance efficiency.

Samsung development of 14nm FinFET well underway, but maturity for high volume production unclear

Production of the first test chip (using a Cortex-A7 CPU core) on Samsung's first generation 14nm FinFET process, 14LPE, already occurred in December 2013. According to the marketing manager for Samsung’s foundry business, the foundry has completed tape-outs of multiple products and has already started early commercial production for some customers. The 14LPE process is claimed to provide either a 20% boost in performance or a 35% reduction in power consumption when compared to a planar 20nm process. The process is said to result in 15% smaller chips when compared to a 20nm planar process.

Given the considerable technological changes in Samsung's FinFET process (especially when compared to TSMC's more evolutionary first-generation 16FF process, which is closely aligned with the already almost mature 20nm), the claimed performance and density gains are relatively minor in the context of the high costs and learning curve involved in bringing chips to mature volume production. High theoretical performance of a new process has little value when it involves very high investment in chip design, relatively high manufacturing cost, and when mature volume production is not achieved in a timely manner. A higher performance version of Samsung's 14nm FinFET process, 14LPP, is expected to be qualified in a couple of months' time.

Meanwhile, GlobalFoundries has given up on its own 14XM FinFET process and has aligned with Samsung's 14LPE and 14LPP processes. This decision probably means that it will take considerable time before GlobalFoundries will be competitive for volume production using FinFET, providing evidence that its market position will continue to be precarious for some time.


In summary, indications are that TSMC, helped by its more evolutionary transition to FinFET and dominant position in current leading-edge processes, is much closer to stable high volume production of next-generation FinFET processes than Samsung, and that it will continue to dominate leading-edge foundry production in the near term even as chip designers seek additional sources of supply given the very tight capacity environment at TSMC.

While Intel is also well advanced in its FinFET process development and uses it on a large scale for PC processors, it has not yet seen widespread success either as a foundry partner for third parties or as a provider of large numbers of low-power SoCs for applications such as smartphones. This is also illustrated by the fact that early Intel mobile SoCs such as SoFIA that integrate cellular baseband and other components will in fact first be produced at TSMC and not in Intel's own fabs.

Sources: ZDNet (Technical article on FinFET technology development), ZDNet (Samsung LSI article), EE Times

Updated October 5, 2014 (Spelling, grammar).
Updated October 30, 2014 (Grammar, small corrections).
Updated December 26, 2014 (Minor grammatical corrections).