Those of you rooting for lots of alternatives to the x64 architecture for the server, networking, and embedded systems rackets will be delighted to hear that Tilera, a maker of innovative system-on-chip (SoC) processors, has lined up $45m in funding to help it ramp up development and sell more product.
Tilera is billing its forthcoming TileGx processors – which start shipping this year and which will pack from 16 to 100 cores on a single chip – as the “cloud computer.”
While the company has never confirmed this, Tilera is widely believed to have a modified MIPS core at the heart of its TilePro and TileGx mesh processors, which support a home-grown variant of Linux and which have everything you need for a system, apart from the memory sticks and I/O ports, integrated onto the processor. The secret sauce for the Tilera chips is that they have multiple mesh networks that link all those cores and their cache memories together so that applications can scale efficiently across the cores. Instead of adding sockets to a system board, you add a larger Tilera chip if you need to do more work.
Back in March 2010, Tilera bagged its third round of financing, which had $25m stuffed into it including $10m from Taiwanese PC maker and server wannabe Quanta Computer. Chip-maker Broadcom and the financing arm of Japanese telco NTT also put some money in the bag. Walden International, Bessemer Venture Partners, Columbia Capital, and VentureTech Alliance are among the early investors in Tilera, who have collectively paid in $64m in three rounds. Artis Capital Management, a late-stage private equity investor, led this fourth round, with Cisco Systems and Samsung Venture Investment – the venture arm of Samsung Electronics – investing for the first time in Tilera. WestSummit Capital Management and Comerica Bank are also first-timers, and Walden, Bessemer, and Columbia kicked in more dough this time around, too.
Troy Bailey, vice-president of marketing at Tilera, says that the company expects this to be its final round of funding. Early last year, Tilera was getting enough design wins to predict that it would generate revenues from cloud computing projects in 2010 and hit break-even by 2011. The situation is a little better than that as 2011 gets under way. Quanta started shipping its SQ2 server, based on the TilePro64 chip, back in June 2010. That was the same time that Tilera announced its future “Stratton” chip, which will crank the clocks and cram more than 200 cores onto a single die by 2013. Last fall, Silicon Graphics said it would be deploying Tilera co-processors in its Project Mojo Prism XL blade stick servers for HPC workloads where integer performance (genomics, search and semantics processing, and so forth) is more important than floating-point oomph.
Tilera was founded in October 2004, and came out of stealth mode with its first product roadmaps in August 2007. The company’s initial Tile64, TilePro32, and TilePro64 chips were based on a 32-bit RISC processor. The latter chip is the important one, and is used in Quanta’s SQ2 server. The TilePro64 has 64 cores on a single die (in an 8×8 grid) along with 16 KB of L1 cache per core, and 5.6 MB of cache across the whole chip. The L2 caches are made coherent by the iMesh mesh interconnect and function like an L3 as well as segmented L2 caches for each core. Wrapped around the Tile cores are four DDR2 main memory controllers, two Gigabit Ethernet ports, two PCI Express controllers, two 10 Gb/sec XAUI interfaces, and two flexible I/O interfaces to support peripherals such as compact flash memory or disk drives. The iMesh network on the chip is actually five separate networks to handle memory access, streaming packet transfers, user data network, cache misses, and interprocess communications. That iMesh also allows for a Linux instance to span multiple cores, SMP-style, to scale up performance as needed for a single Linux workload.
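To get a feel for what a mesh of this size means for core-to-core traffic, here is a rough sketch that counts hops between tiles in a square grid. It assumes shortest-path routing, where a message moves one tile at a time horizontally or vertically; Tilera has not described how iMesh routes traffic in anything quoted here, so the routing model and the function names are illustrative assumptions, not the chip's actual behavior.

```python
from itertools import product


def manhattan_hops(a, b):
    """Hops between two tiles, assuming one hop per step along a row or column."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mesh_hop_stats(side):
    """Return (worst-case hops, mean hops) over all tile pairs in a side x side grid.

    Assumption: shortest-path routing on a plain 2D mesh with no wraparound links.
    Pairs where source and destination are the same tile count as zero hops.
    """
    tiles = list(product(range(side), repeat=2))
    hops = [manhattan_hops(a, b) for a in tiles for b in tiles]
    return max(hops), sum(hops) / len(hops)


# 8x8 is the TilePro64 grid; 10x10 is the grid quoted for the 100-core part.
for side in (8, 10):
    worst, mean = mesh_hop_stats(side)
    print(f"{side}x{side} grid: worst case {worst} hops, mean {mean:.2f} hops")
```

Under these assumptions the worst case grows with the side of the grid rather than with the core count, which is one reason a mesh is attractive when piling more cores onto a die.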
Tilera has not said how far this iMesh SMP capability scales or how dynamic it is, but think about how this might be useful for cloud computing. An instance could scale cores and memory up and down as needed, leaving capacity available for other users sharing the system. (This is what server virtualization hypervisors do on x64 and RISC iron, and it imposes some performance penalties.)
As of early 2010, Tilera had over 50 design wins for the use of its SoCs in future networking and security appliances, which was followed up by two server wins with Quanta and SGI. The company has had a dozen more design wins since then and now claims to have over 150 customers who have bought prototypes for testing or chips to put into products.
With the TileGx chips that start rolling out this year, things will start getting interesting as Tilera shifts up to 64-bit processing and memory addressing, ramps up the core count, cranks the clock speeds up as high as 1.5 GHz, and delivers chips that run at between 10 and 55 watts. The 36-core TileGx-36 (a 6×6 grid) samples this quarter, with early customer shipments in the second quarter and general availability in the second half of 2011. The TileGx-100, which has 100 cores (a 10×10 grid) and four DDR3 main memory controllers, will be able to address up to 1 TB of 2.13 GHz main memory from its single socket. The TileGx-100 will come to market about six months behind the TileGx-36, with volume shipments expected in the first quarter of 2012. A 16-core variant, the TileGx-16, was supposed to ship late last year, but has been pushed out into the first half of this year, and a 64-core variant will come to market after the 100-core version. Tilera has set the price of the TileGx-36 at $400, and it is reasonable to expect that the 100-core chip will cost around $1,000.
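A quick back-of-the-envelope check on that pricing, as a sketch only: the $400 figure for the TileGx-36 is the stated price, while the $1,000 figure for the TileGx-100 is the guess made above, not a Tilera list price. The dictionary and function names are made up for illustration.

```python
# (cores, price in US dollars); the TileGx-100 price is an estimate, not a quote.
CHIPS = {
    "TileGx-36": (36, 400),
    "TileGx-100": (100, 1000),
}


def price_per_core(cores, price_usd):
    """Dollars per core for a chip with the given core count and price."""
    return price_usd / cores


for name, (cores, price) in CHIPS.items():
    print(f"{name}: ${price_per_core(cores, price):.2f} per core")
```

If the $1,000 guess holds, the bigger chip works out slightly cheaper per core than the 36-core part.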
Taking a peek under the hood
Each core on the TileGx chips has 32 KB of data and instruction cache and 256 KB of L2 cache; those L2 caches are turned into a 26 MB virtual L3 cache. The TileGx chips also have additional SIMD instructions that make use of a unit capable of four multiply-accumulate (MAC) operations per cycle, for up to 600 billion MACs per second, which Tilera says is 12 times what the fastest digital signal processor on the market today can do. The chips also sport two MiCA engines (short for Multistream iMesh Crypto Accelerator), which are able to deliver 40 Gb/sec of bandwidth on cryptographic work and 20 Gb/sec on compression and decompression. The chip also includes a packet-processing accelerator, which sits between the cores and the on-chip network interfaces, called mPIPE (multicore programmable intelligent packet engine), which does load-balancing between the cores and the network interfaces.
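The headline cache and MAC figures can be roughly reproduced from the per-core numbers. The sketch below assumes the top-end 100-core part running at the 1.5 GHz peak clock, with every core retiring four MACs each cycle; Tilera has not said that this is how it arrived at its figures, so treat the derivation as a plausible reading rather than the official math.

```python
CORES = 100                # TileGx-100 core count
L2_PER_CORE_KB = 256       # L2 cache per core
MACS_PER_CYCLE = 4         # MAC operations per core per cycle
CLOCK_HZ = 1_500_000_000   # assumed: the 1.5 GHz peak clock speed


def aggregate_l2_mb(cores, l2_kb):
    """Combined L2 across all cores, in decimal megabytes (1 MB = 1,000 KB)."""
    return cores * l2_kb / 1000


def peak_macs_per_second(cores, macs_per_cycle, clock_hz):
    """Theoretical peak, assuming every core issues its full MAC width every cycle."""
    return cores * macs_per_cycle * clock_hz


print(f"Aggregate L2: {aggregate_l2_mb(CORES, L2_PER_CORE_KB)} MB")
print(f"Peak MACs/sec: {peak_macs_per_second(CORES, MACS_PER_CYCLE, CLOCK_HZ):,}")
```

The aggregate comes to 25.6 MB, which rounds to the 26 MB virtual L3 quoted above, and the peak MAC rate comes to 600 billion per second under these assumptions.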
Tilera’s TileGx processors run Linux and the same Apache-MySQL-PHP-Python stack that a lot of cloud providers rely on. More importantly, Tilera has created a tool called the Multicore Development Environment that includes a standard Linux 2.6 SMP implementation for the cores plus C and C++ compilers, a hypervisor for hardware abstraction and virtualization, a graphical tool for debugging multicore applications, and an Eclipse plug-in.
Tilera has 85 people now, and is of course fabless – as any sane young chip company has to be these days. Bailey says that Tilera will be using the money from the fourth funding round to beef up its sales operations as well as to speed up its development. While the plan has not been hammered out yet, one option is to bring on a second design team that can do the testing and qualification work on the current TileGx chips, so that the current team can work on the future Stratton chips. Another option is to add people to create some custom derivatives of the TileGx chips that might be of interest to current and potential customers.
Incidentally, Tilera was only seeking about half as much money as it received in its fourth round, according to Bailey. So it boosted the number and that was quickly oversubscribed. It looks like there is a bit of enthusiasm over the possibility of mobile computing and cloud computing giving different chip architectures like ARM and Tilera a chance to compete against Intel and Advanced Micro Devices. And why not? Chewing up from the bottom is how the x64 architecture killed off myriad processors used in proprietary and RISC systems, after all.
One last thing: Bailey confirmed that the top-end “Stratton” part could be a 15×15 grid with 225 cores. Prior to today, the company was saying it would have 200 cores, which obviously cannot make a square. (You could do 14×14 to get 196 cores.)
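The grid arithmetic is easy to check. This small sketch assumes the cores must sit in a square grid, as they do in the 8×8, 6×6, and 10×10 parts mentioned earlier, and finds the square grids on either side of a target core count; the function name is invented for illustration.

```python
import math


def nearest_square_grids(target_cores):
    """Return (is_perfect_square, (side, cores) at or below, (side, cores) above)."""
    side = math.isqrt(target_cores)          # largest side with side*side <= target
    below = (side, side * side)
    above = (side + 1, (side + 1) ** 2)
    return below[1] == target_cores, below, above


is_square, below, above = nearest_square_grids(200)
print(f"200 cores fits a square grid: {is_square}")
print(f"Next smaller: {below[0]}x{below[0]} = {below[1]} cores")
print(f"Next larger: {above[0]}x{above[0]} = {above[1]} cores")
```

For a target of 200 this gives 14×14 (196 cores) below and 15×15 (225 cores) above, matching the figures in the text.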
So is the Tilera IPO scheduled to coincide with the Stratton chips? Stay tuned… ®
This article originally appeared in The Register.