Monday, 22 November 2010

10 Gigabit Ethernet is ready for the cluster

Say "cluster" and try to keep your mind from images of massive, government-funded scientific applications or herds of caffeine-fueled grad students. Rather difficult. In reality, the vast majority of High Performance Computing (HPC) clusters are nowhere near large enough to qualify as massive, are used in commercial environments, and run on Gigabit Ethernet interconnects. Even within the TOP500 Supercomputer Sites® list, the number of clusters running Gigabit Ethernet is more than double the number running InfiniBand. Certainly, higher speed and lower latency would be nice for every installation. But the performance requirements of most applications simply do not justify the high cost and labor-intensive maintenance of InfiniBand.

What most Gigabit Ethernet HPC sites could really use is a move to 10 Gigabit Ethernet (10GE), if it could be done cost-effectively and reliably. Until now, that idea would have generated skepticism and doubt among knowledgeable decision-makers. But with Gigabit Ethernet already entrenched in the HPC market and offering a number of advantages, only a few obstacles have held back the growth of 10GE. Those obstacles are quickly evaporating. With recent technology advances, pricing improvements, and proven vendors entering the market, 10GE is a very attractive choice for HPC clusters.

Understanding 10GE
Understanding the context for 10GE merits a little history. Although Ethernet has been around for about three decades, the technology remains viable because it has evolved over time to meet the changing needs of the industry. Widespread adoption of Ethernet began when the IEEE established the 10 Mbps Ethernet standard in 1983. That standard evolved into Fast Ethernet (100 Mbps), Gigabit Ethernet (1000 Mbps), and 10 Gigabit Ethernet, with 40 and 100 Gigabit standards expected in the near future. In fact, discussions have begun about Terabit Ethernet, a million Mbps, a rate that was hard to imagine only a few years ago.

Despite these developments, the Ethernet frame format and the basic operating principles have remained virtually unchanged. As a result, networks of mixed speeds (10/100/1000 Mbps) operate together without the need for expensive or complex gateways. When Ethernet was first deployed, it could easily be confused with real plumbing: it ran over stiff coaxial cable that needed special tools just to bend it. As it evolved, Ethernet absorbed advances in cabling and optics, changed from shared to switched media, introduced the concept of virtualization through VLANs, and incorporated jumbo frames and other improvements. Ethernet continues to evolve today with far-reaching changes such as support for block-level storage (Fibre Channel over Ethernet).

Ratified in 2002 as IEEE 802.3ae, today's 10GE supports transmission at 10 gigabits per second over distances of up to 80 km. In almost all respects, 10GE is fully compatible with earlier versions of Ethernet. It uses the same frame format, Media Access Control (MAC) protocol, and frame size, and network managers can use familiar management tools and operational procedures.

Ethernet Benefits for HPC
The fact that more than half of the TOP500 Supercomputer Sites, and almost all smaller clusters, run Ethernet is no surprise when you look at the benefits this technology offers:
o High comfort level: As a widely used standard, Ethernet is a familiar environment for IT managers, network administrators, server vendors, and managed service providers around the world. They have the tools to manage it and the knowledge to keep it running. Broad vendor support is also a plus: almost all vendors support Ethernet.

o Best practices: High availability, failover, management, security, backup networks, and other best practices are well established in Ethernet, and their implementation is widely understood. This is another example of the broad acceptance of, and vendor support for, Ethernet. (Good luck finding an InfiniBand firewall, for example!)

o Single infrastructure: Ethernet gives HPC administrators the advantage of a single infrastructure that supports the four most important connectivity requirements: user access, server management, storage connectivity, and cluster interconnect. A single infrastructure is easier to manage and less expensive to buy, power, and maintain than using a separate technology for storage or for the processor interconnect.

o Lower power: Power is one of the biggest expenses facing data center managers today. New environmental mandates, combined with rising energy costs and demand, are forcing administrators to focus on green initiatives. Ethernet is an efficient option for power and cooling, especially in designs that reduce energy consumption.

o Lower cost: With new servers shipping with 10G ports on the motherboard and 10G switch ports now priced under $500, 10GE has a price/performance advantage over niche technologies such as InfiniBand.

o Growth path: Higher-speed Ethernet will capitalize on the large installed base of Gigabit Ethernet. New 40GE and 100GE products will be available shortly and will be supported by many silicon and equipment vendors.

For those applications that could benefit from higher speed, 10GE offers even more advantages.
o More efficient power use: 10GE requires less power per gigabit than Gigabit Ethernet, so you get ten times the bandwidth without ten times the power.

o Practical performance: 10GE can obviously move data 10 times faster than Gigabit Ethernet, but thanks to the new generation of 10GE NICs it can also reduce latency between servers by about 8 times.

This gain in bandwidth and latency translates into higher application performance than you might think. For molecular dynamics (VASP on a 64-core cluster), the application ran more than six times faster than on Gigabit Ethernet and performed almost identically to DDR InfiniBand. In a mechanical simulation benchmark (PAM-CRASH on a 64-core cluster), 10GE completed tasks in about 70 percent less time than Gigabit Ethernet and matched DDR InfiniBand. Similar results have been observed with common clustered HPC applications such as FLUENT and RADIOSS, and more tests are reporting similar results.
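
The two benchmark claims above use different yardsticks: "six times faster" is a speedup factor, while "70 percent less time" is a reduction in run time. The short Python sketch below converts between the two so the figures can be compared on one scale. The function names are illustrative, and the 6.0 used for VASP is only the lower bound implied by "more than six times"; no measured run times are given in the article.

```python
def speedup_from_time_reduction(percent_less_time: float) -> float:
    """Speedup factor for a job that finishes in `percent_less_time` percent less time."""
    if not 0 <= percent_less_time < 100:
        raise ValueError("percent_less_time must be in the range [0, 100)")
    remaining_fraction = 1.0 - percent_less_time / 100.0
    return 1.0 / remaining_fraction


def time_reduction_from_speedup(speedup: float) -> float:
    """Percent reduction in run time for a job that runs `speedup` times faster."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) * 100.0


# Figures quoted in the article (10GE relative to Gigabit Ethernet).
pam_crash_speedup = speedup_from_time_reduction(70.0)  # "about 70 percent less time"
vasp_time_saved = time_reduction_from_speedup(6.0)     # "more than six times faster" (lower bound)

print(f"PAM-CRASH: about {pam_crash_speedup:.2f}x faster")
print(f"VASP: at least {vasp_time_saved:.1f} percent less time")
```

Read this way, a 70 percent cut in run time is roughly a 3.3x speedup, and a sixfold speedup means the job finishes in roughly 83 percent less time.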

These benchmarks are impressive. Vendors love to talk about microseconds and gigabits per second. But the real advantage for commercial applications is the increase in user productivity, and that is measured by the clock on the wall. If calculations run 70 percent faster, users can be 70 percent more productive.

Given these benefits, many cluster architects are practically drooling at the prospect of upgrading to 10GE, and industry experts have been predicting rapid growth in the 10GE market for years. That has not happened yet.

Barriers removed
Until recently, 10GE was stuck in the starting gate because of a few, but significant, problems with pricing, stability, and standards. Those problems have now been overcome, and 10GE has taken off. Here's what happened.

o Network interface cards (NICs): Some early adopters of 10GE were discouraged by problems with the NICs, starting with the price. Until recently, the only NICs available for 10GE applications cost more than $800, and many users prefer to use two of them per server. Now server manufacturers are starting to add an Ethernet chip to the motherboard, known as LAN-on-Motherboard (LOM), instead of using a separate board. This advance brings the cost to less than $100 and removes the NIC price barrier to 10GE. Standalone NIC prices now start at $500 and are expected to fall further as LOM technology lets NIC manufacturers reach the high volumes they need to keep costs low.

Another NIC-related obstacle was the questionable reliability of some early products. A few of these created a bad first impression of 10GE, with immature driver software that tended to underperform or even crash. The industry has now grown past these problems, and strong players such as Chelsio, Intel, and Broadcom are providing stable, reliable products.

o Switch prices: As with NICs, initial 10GE switch prices inhibited early adoption of the technology. The original 10GE switches cost up to $20,000 per port, which was more than the price of a server. Now the list price for 10GE switches is less than $500 per port, and street prices are even lower. And that pricing applies to both embedded blade switches and top-of-rack products.

o Switch scaling: A market inhibitor for large clusters was how to hook switches together to create a non-blocking fabric. Most clusters are small enough that this is not a problem. For larger clusters, Clos topologies for scaling Ethernet switches provide a solution, and they are beginning to penetrate the market.

o PHY confusion: The rapid evolution of different fiber optic transceiver standards was a show-stopper for customers. The standards defining the plug-in transceiver changed quickly from XENPAK to X2 to XFP to SFP+, each bringing smaller size and lower cost. But because each type of transceiver differs in size and shape, a switch or network adapter is compatible with only one option. Using several types of optics increases complexity in the data center and adds costs such as stocking additional spare parts. With visions of Blu-ray versus HD-DVD, VHS versus Betamax, and MS-DOS versus CP/M, users were unwilling to bet on a survivor and avoided the technology while they waited to see which way the market would move.

Eventually, the evolution culminated in SFP+. This technology is specified by the ANSI T11 group for 8.5- and 10-Gbps Fibre Channel, as well as for 10GE. The SFP+ module is small enough to fit 48 in a single rack-unit switch, just like the RJ-45 connectors used in previous Ethernet generations. It also contains fewer electronics, reducing the power and cost per port. SFP+ has been a boon for the 10GE industry, allowing switch vendors to pack more ports into a smaller form factor and lowering system costs through better integration of IC functions at the host card level. As a result, fewer sparks are flying in the format war, and the industry is converging very rapidly on SFP+.

o Cabling: Many people have been holding out for 10GBASE-T because it uses a common RJ-45 connector and may give the market what it is waiting for: simple, inexpensive 10GE. But the physics are different at 10GE. With today's technology, the chips are expensive and power hungry, and they require new cabling (Cat 6a or Cat 7). 10GBASE-T components also add 2.6 microseconds of latency across each cable, which is exactly what you do not want in a cluster interconnect. And while we wait for 10GBASE-T, less expensive and less power-hungry technologies are available. 10GBASE-CX4 offers reliability and low latency, and it is a proven solution that has become a mainstay of 10GE technology.

Making the wait easier are the new SFP+ Copper (Twinax) Direct Attach cables, which are thin, passive cables with SFP+ ends. With support for distances of up to 10 meters, they are ideal for wiring within a rack or between servers and switches that are close together. At an initial price of $40 to $50, and with much lower prices forecast, Twinax offers a simpler and less expensive alternative to optical cables. With advances like these, clarity is replacing confusion in the market. With SFP+ Direct Attach cables for short distances, familiar optical transceivers for longer runs, and 10GBASE-CX4 for the lowest latency, there are great options today for wiring clusters.
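
To put the price changes described above side by side, the following Python sketch adds up the cost of one server-to-switch 10GE link in three scenarios. It is only a rough illustration: the dollar amounts are the ceilings or starting prices quoted in this article ($800 NICs and $20,000 switch ports early on; a $100 LOM port, a $500 standalone NIC, a $500 switch port, and a $50 Twinax cable now), the early scenario omits cabling because no figure is given for it, and the 64-server cluster size and all names in the code are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkCost:
    """Approximate cost in US dollars of one server-to-switch 10GE link."""
    nic: float          # adapter or LAN-on-Motherboard port on the server
    switch_port: float  # one port on the switch
    cable: float = 0.0  # left at 0.0 where the article quotes no cable price

    def per_link(self) -> float:
        return self.nic + self.switch_port + self.cable


def cluster_cost(link: LinkCost, servers: int, links_per_server: int = 1) -> float:
    """Total interconnect cost for `servers` machines with `links_per_server` links each."""
    return link.per_link() * servers * links_per_server


# Prices taken from the article; treat them as rough bounds, not quotes.
SCENARIOS = {
    "early 10GE": LinkCost(nic=800, switch_port=20_000),
    "standalone NIC + Twinax": LinkCost(nic=500, switch_port=500, cable=50),
    "LOM + Twinax": LinkCost(nic=100, switch_port=500, cable=50),
}

SERVERS = 64  # hypothetical cluster size chosen for illustration

for name, link in SCENARIOS.items():
    total = cluster_cost(link, SERVERS)
    print(f"{name:>24}: ${link.per_link():>9,.0f} per link, "
          f"${total:>12,.0f} for {SERVERS} servers")
```

Passing `links_per_server=2` models the dual-NIC configuration that many users prefer. In every scenario the switch port is the largest single item, which is why the drop in per-port switch pricing matters so much.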

When the Cluster Gets Larger
Up to this point we have discussed how the barriers to 10GE adoption have been overcome for the many HPC clusters that use Gigabit Ethernet. Now let's look at the possibility of bringing the benefits of 10GE to much larger clusters with more demanding requirements. These implementations require an interconnect that delivers adequate application performance, and a system environment that can handle the stringent hardware challenges of multiple processors, such as heat dissipation and cost-effective power use.

Examining the performance question shows that some HPC applications, those that are loosely coupled or do not have an excessive demand for low latency, can run very well over 10GE. Many TCP/IP-based applications fall into this category, and many others can be supported by adapters that offload TCP/IP processing. In fact, some TCP/IP applications actually run faster and with lower latency over 10GE than over InfiniBand.

For the most performance-hungry and latency-sensitive applications, the performance of 10GE is comparable to current developments in InfiniBand technology. InfiniBand vendors are starting to ship 40 Gig InfiniBand (QDR), but let's look at what that really delivers. Since all InfiniBand uses 8b/10b encoding, take 20 percent off the advertised bandwidth right away: 40 Gig InfiniBand is really 32 Gig, and 20 Gig InfiniBand is really only capable of 16 Gig speeds. But the real limitation is the PCIe bus inside the server, which can usually handle only 13 Gigs in the majority of servers shipped in 2008. More recent servers use "PCIe Gen 2" to reach up to 26 Gigs, but soon we will begin to see 40 Gigabit Ethernet NICs on faster internal buses, and then volumes will increase and prices will fall. We have seen this movie before: niche technologies are made obsolete by the momentum and mass vendor adoption of Ethernet.
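
The arithmetic in the paragraph above can be written out as a short Python sketch. With 8b/10b encoding, every 8 data bits travel as 10 bits on the wire, so the usable data rate is 8/10 of the advertised signaling rate, and the server's PCIe bus then caps whatever the link delivers. The PCIe ceilings of 13 and 26 Gig are simply the figures quoted in this article, not values derived here, and the function and constant names are illustrative.

```python
def data_rate_after_8b10b(signaling_gbps: float) -> float:
    """Usable data rate when each 8 data bits are sent as 10 bits on the wire."""
    return signaling_gbps * 8 / 10


def usable_rate(link_data_gbps: float, host_bus_gbps: float) -> float:
    """A server cannot move data faster than the slower of its link and its bus."""
    return min(link_data_gbps, host_bus_gbps)


# Host bus ceilings as quoted in the article (assumed, not derived).
PCIE_2008_GBPS = 13.0  # "the majority of servers shipped in 2008"
PCIE_GEN2_GBPS = 26.0  # "PCIe Gen 2"

LINKS = {
    "InfiniBand DDR (20 Gig advertised)": data_rate_after_8b10b(20),
    "InfiniBand QDR (40 Gig advertised)": data_rate_after_8b10b(40),
    "10GE": 10.0,  # 10GE is quoted by data rate, so no deduction is applied
}

for name, data_rate in LINKS.items():
    older = usable_rate(data_rate, PCIE_2008_GBPS)
    gen2 = usable_rate(data_rate, PCIE_GEN2_GBPS)
    print(f"{name}: {data_rate:g} Gig data rate, "
          f"{older:g} Gig behind a 2008-era bus, {gen2:g} Gig behind PCIe Gen 2")
```

On these numbers the host bus, not the advertised link speed, sets the ceiling for QDR InfiniBand in both server generations.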

Moreover, just as Fast Ethernet switches have Gigabit uplinks and Gigabit switches have 10GE uplinks, it will not be long before 10 Gigabit switches have 40 and 100 Gigabit links to upstream switches and routers. And you will not need a complex, performance-limiting gateway to connect to resources on the LAN or WAN. At some point, 10, 40, and 100 Gigabit Ethernet will be the right choice for even the largest clusters.

What Matters: Application Performance
A Reuters Market Data System (RMDS) benchmark (stacresearch.com) that compared InfiniBand with a 10GE solution from BLADE Network Technologies showed that 10GE outperformed InfiniBand, with significantly higher updates per second and 31 percent lower latency (see Figure 1 and Figure 2). These figures demonstrate the practical benefits of 10GE far more decisively than micro-benchmarks of individual components.

Practical Considerations
Switches come in many sizes and shapes, and new, more efficient form factors are emerging. Blade servers can be used to build an efficient and powerful solution for clusters of any size, with the first level of interconnect switching entirely within the blade server chassis. Connecting server blades internally at either 1 or 10 Gigabits greatly reduces cabling requirements and brings corresponding improvements in reliability, cost, and power. Since blade servers appeared on the scene a few years ago, they have been used to create some of the largest clusters. Blade servers are also often used to create compact departmental clusters, often dedicated to running a single critical application.

One solution designed specifically to support the power and cooling requirements of large clusters is the IBM® System x™ iDataPlex™. This new system design is based on industry-standard components that support open-source software such as Linux®. IBM developed this system to extend its proven portfolio of modular and cluster systems products for the HPC and Web 2.0 community.

The system is designed specifically for power-dense computing applications where cooling is critical. An iDataPlex rack has the same footprint as a standard rack, but it has much higher cooling efficiency because of its reduced fan air depth. An optional liquid-cooled wall on the back of the system eliminates the need for special air conditioning. 10GE switches from BLADE Network Technologies match iDataPlex's specialized airflow, which in turn matches the data center's hot and cold aisles, creating an integrated solution that can support large clusters.

Blade servers and scale-out solutions like iDataPlex are just two of the new trends in data center switching that can make cluster architectures more efficient.

A clear path
The last obstacles to 10GE for HPC have been removed:
o NIC technology is stable and prices continue to fall, while latency and throughput continue to improve, thanks to improved silicon and LAN-on-Motherboard (LOM) technology.

o 10GE switches are now cost-effective at less than $500 per port.

o The combination of SFP+ Direct Attach cabling, SFP+ optics, and 10GBASE-CX4 offers practical and cost-effective wiring solutions.

o New platforms are being introduced with advances in power and cooling efficiency that meet stringent HPC requirements, even for large clusters.

o Benchmarks show that 10GE can offer real business benefits through faster job execution, while maintaining the simplicity of Ethernet.

o Blade server technology can support 10GE while meeting the demanding physical requirements of large clusters.

With Gigabit Ethernet the de facto standard for all but the largest cluster applications, and with the final hurdles to 10GE for HPC cleared, it is time to re-create the image of the HPC network: standards-based components, widely available expertise, compatibility, high reliability, and cost-effective technology.