<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://port25.technet.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Port 25: The Open Source Community at Microsoft : Kishi Malhotra, HPC</title><link>http://port25.technet.com/archive/tags/Kishi+Malhotra/HPC/default.aspx</link><description>Tags: Kishi Malhotra, HPC</description><dc:language>en</dc:language><generator>CommunityServer 2007.1 (Build: 40109.1145)</generator><item><title>What Lies Beneath: Setting up underlying HPC tools</title><link>http://port25.technet.com/archive/2006/12/21/what-lies-beneath-setting-up-underlying-hpc-tools.aspx</link><pubDate>Thu, 21 Dec 2006 22:34:00 GMT</pubDate><guid isPermaLink="false">af7480c4-26b7-468d-87b0-2acebabb473d:3387</guid><dc:creator>kishi</dc:creator><slash:comments>3</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://port25.technet.com/rsscomments.aspx?PostID=3387</wfw:commentRss><comments>http://port25.technet.com/archive/2006/12/21/what-lies-beneath-setting-up-underlying-hpc-tools.aspx#comments</comments><description>&lt;p&gt;&lt;strong&gt;This blog continues what I started writing about in &lt;a href="http://port25.technet.com/archive/2006/12/01/thinking-about-hpc-infrastructure.aspx"&gt;Thinking About HPC Infrastructure&lt;/a&gt;&amp;nbsp;and what Frank wrote about in &lt;a href="http://port25.technet.com/archive/2006/10/20/Overloading-_2700_Clusters_2700_.aspx"&gt;Overloading Clusters&lt;/a&gt;. &lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;After reading through the previous blogs on HPC, someone might ask &amp;ldquo;What are some of the core components of HPC?&amp;rdquo;. After all, once you&amp;rsquo;ve seen the outside of a Maserati or a De Tomaso Pantera, you&amp;rsquo;re not going to be satisfied just by ogling it. 
Even after a test drive, the engineer in you will want to pop the hood and see what&amp;rsquo;s inside. Taking a similar approach, let&amp;rsquo;s uncover some underlying HPC technologies by looking at any basic HPC setup. Once all the provisioning has been completed, the HPC system will be physically deployed with an OS and relevant drivers, utilities, etc. Yet, before the actual HPC application can be installed across the nodes, there remains a critical step in the process, i.e. configuration of the cluster and file system along with any tools and interfaces such as MPI (Message Passing Interface). After peeling through the HPC application layer, it&amp;rsquo;s worthwhile to do a &amp;ldquo;deep-dive&amp;rdquo; into what really runs the HPC clusters. The broad categories of these tools are:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Cluster Management tools e.g. CSM&lt;/li&gt;&lt;li&gt;Job Scheduling tools e.g. SCALI, Maui&lt;/li&gt;&lt;li&gt;Resource Management tools e.g. Torque&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;If you&amp;rsquo;re trying to understand the &amp;ldquo;WHY&amp;rdquo; behind the existence of these tools and their importance, take a look at Cluster Management for example. Cluster configuration, installation and management can be difficult and require intimate familiarity with the HPC hardware, OS, underlying architecture, etc. Without specific tools that attend to and manage specific underlying HPC sub-components, HPC just won&amp;rsquo;t be what it is. So, it is worthwhile to look at the unique installation experience of tools such as the ones listed above to understand the complexity of HPC systems. Ready &amp;ndash; let&amp;rsquo;s dive into the installation and function of these tools:&lt;/p&gt;&lt;p&gt;1. 
&lt;strong&gt;SCALI&lt;/strong&gt;: The &lt;a href="http://www.scali.com/"&gt;SCALI&lt;/a&gt; management and MPI software packages provide deployment, monitoring and job scheduling services for a cluster.&amp;nbsp; After you deploy this software, you will be able to see all the compute nodes that may have been preconfigured or are configured on your system. SCALI will enable you to monitor the systems and run jobs using the SCALI graphical interface.&amp;nbsp; In order to license the SCALI software, you must use the scainstall command to produce a &lt;em&gt;license request file.&lt;/em&gt;&amp;nbsp; This file can then be sent to SCALI to receive a permanent key. For those who need some hand-holding through this, luckily SCALI provides very comprehensive documentation on their website.&amp;nbsp; A large portion of the SCALI Manage User&amp;rsquo;s Guide is dedicated to pre-setup planning and configuration of the cluster and the network.&amp;nbsp; The documentation provides detailed recommendations about how you can set up your Ethernet-based network environment and out-of-band management network.&amp;nbsp; The documentation also provides a general overview of how to install and configure higher performance interconnects, including bonded Ethernet, InfiniBand, Myrinet and SCI. The SCALI Manage interface provides simple tools to assist in configuring and testing DET, InfiniBand, and Myrinet devices for use with the SCALI MPI implementation.&amp;nbsp; The SCALI MPI software supports multiple InfiniBand stacks including Mellanox, Topspin, Voltaire and Infinicon.&lt;/p&gt;&lt;p&gt;2. &lt;strong&gt;HP-MPI&lt;/strong&gt;: &lt;a href="http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,1238,00.html"&gt;HP-MPI&lt;/a&gt; is Hewlett-Packard&amp;rsquo;s Linux-based implementation of the Message Passing Interface (MPI).&amp;nbsp; Many of the utilities distributed with HP-MPI are similar to those in other common MPI implementations such as MPICH, e.g. 
mpicc, mpirun, etc. To use the HP-MPI software, a license is required for each CPU core in the cluster.&amp;nbsp; To obtain a license file you need to collect the MAC address of each node (typically for eth0) and enter that information into a form at licensing.hp.com.&amp;nbsp; The resulting file can then be copied to the compute node. The HP-MPI software is non-functional until licensing files are generated for the nodes.&lt;/p&gt;&lt;p&gt;3. &lt;strong&gt;CSM&lt;/strong&gt; (&lt;strong&gt;Cluster Systems Management&lt;/strong&gt;): The &lt;a href="http://www-03.ibm.com/servers/eserver/clusters/software/csm.html"&gt;CSM&lt;/a&gt; software suite is designed to automate the deployment and management of cluster nodes.&amp;nbsp; Nodes can be remotely installed with an operating system as well as the CSM software for later monitoring.&amp;nbsp; The CSM software supports Red Hat and Novell on multiple platforms.&amp;nbsp; To obtain and install the CSM software one must register with IBM&amp;rsquo;s website and download the required RPMs. Once configured, CSM can remotely install the operating system and/or the CSM software on the compute nodes.&amp;nbsp; Much like Platform Rocks, CSM makes use of PXE functionality and Red Hat&amp;rsquo;s Kickstart or the AutoYaST software to remotely install the operating system. The CSM software provides multiple methods for defining the nodes that should be deployed and managed:&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;a. The first method involves creating a hostname mapping (hostmap) file, which is a colon-delimited file that defines a number of attributes of each node.&lt;br /&gt;b. 
The second method involves manually creating and editing a &amp;ldquo;node definition&amp;rdquo; (nodedef) file.&amp;nbsp; This is the method suggested by the documentation for use with small clusters.&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;Proper remote power and remote console capabilities greatly ease the administration and deployment of the compute nodes; however, according to the &lt;em&gt;&lt;u&gt;CSM FAQ&lt;/u&gt;&lt;/em&gt;, remote power management is not absolutely required. All the compute nodes must be rebooted (remotely or manually).&amp;nbsp; They are then PXE booted and installed with RHEL4 using the Kickstart installation system.&lt;/p&gt;&lt;p&gt;4. &lt;strong&gt;Maui and Torque&lt;/strong&gt;: Both Torque and Maui are free software that must be compiled from the source distribution on the head node.&amp;nbsp; Maui is an open-source job scheduler for compute clusters.&amp;nbsp; It supports a number of task management features not found in other parallel batch processing software, including policy-based scheduling and prioritization of tasks. Torque is an open-source resource manager for managing compute nodes and scheduled jobs.&amp;nbsp; It can integrate with Maui to provide additional features for scheduling and managing scheduled tasks.&amp;nbsp; Torque can be installed by following the guidance in the &lt;a href="http://www.clusterresources.com/torquedocs20/1.1installation.shtml"&gt;&lt;em&gt;Torque 2.0 Admin Manual&lt;/em&gt;&lt;/a&gt;.&amp;nbsp; &lt;/p&gt;&lt;p&gt;5. 
&lt;strong&gt;Platform Rocks&lt;/strong&gt;: &lt;a href="http://www.platform.com/Products/Platform.OCS/"&gt;Platform Rocks&lt;/a&gt; is cluster deployment software that facilitates the deployment of various software stacks (&amp;ldquo;rolls&amp;rdquo;) onto the compute nodes.&amp;nbsp; The software is capable of deploying the base operating system and utilities required for cluster administration, management and scheduling.&amp;nbsp; The software can also manage configuration and updates to ensure consistency throughout the cluster. &lt;em&gt;Platform Rocks&lt;/em&gt; is a suite of utilities that are packaged together as separate installable rolls.&amp;nbsp; One of the main goals of the software is to allow for easy installation and integration of third-party rolls and applications.&amp;nbsp; One unique aspect of the Platform Rocks installation approach is that the software installs an operating system on the head node and installs all the required rolls at the same time.&amp;nbsp; The software can also automatically set up the subsystem required to install an operating system and other packages (such as management agents) on the compute nodes. &lt;/p&gt;&lt;p&gt;That about does it for a quick &amp;ldquo;deep-dive&amp;rdquo;. Let me insert a gentle reminder that these are not the only cluster or resource management technologies out there in the HPC space but rather the ones most prevalent. If you have worked with additional tools, we&amp;rsquo;d like to hear from you. Thank you for tuning in to Port 25. 
&lt;strong&gt;HAPPY HOLIDAYS!&lt;/strong&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;font face="Verdana" size="2"&gt;&lt;/font&gt;&lt;/p&gt;&lt;img src="http://port25.technet.com/aggbug.aspx?PostID=3387" width="1" height="1"&gt;</description><category domain="http://port25.technet.com/archive/tags/Kishi+Malhotra/default.aspx">Kishi Malhotra</category><category domain="http://port25.technet.com/archive/tags/Technical+Analysis/default.aspx">Technical Analysis</category><category domain="http://port25.technet.com/archive/tags/HPC/default.aspx">HPC</category><category domain="http://port25.technet.com/archive/tags/Windows+Server/default.aspx">Windows Server</category><category domain="http://port25.technet.com/archive/tags/Server+Center/default.aspx">Server Center</category></item><item><title>Thinking about HPC Infrastructure</title><link>http://port25.technet.com/archive/2006/12/01/thinking-about-hpc-infrastructure.aspx</link><pubDate>Fri, 01 Dec 2006 19:21:00 GMT</pubDate><guid isPermaLink="false">af7480c4-26b7-468d-87b0-2acebabb473d:3317</guid><dc:creator>kishi</dc:creator><slash:comments>4</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://port25.technet.com/rsscomments.aspx?PostID=3317</wfw:commentRss><comments>http://port25.technet.com/archive/2006/12/01/thinking-about-hpc-infrastructure.aspx#comments</comments><description>&lt;p&gt;&lt;font face="Verdana" size="2"&gt;I started the first HPC blog (See &amp;ldquo;&lt;a href="http://port25.technet.com/archive/2006/11/01/HPC-_2D00_-The-way-all-computing-will-look_2E002E002E00_.aspx" style="color: blue; text-decoration: underline; text-underline: single"&gt;previous blog&lt;/a&gt;&amp;ldquo;) with an understanding that HPC is an area where there has been a surge of activity from a development/investment standpoint. 
This segment of Information Technology has experienced a heightened level of engagement from OEMs and partners, all trying to meet the growing computing needs of their customers. So after getting a basic understanding of why HPC matters, the next logical step that needed uncovering was &amp;ldquo;How to think&amp;rdquo; about HPC Infrastructure and tap into the &amp;ldquo;wisdom&amp;rdquo; behind managing it. You might ask why this is relevant. For starters, setting up HPC Infrastructure is an experience that, just like any other infrastructure, be it Network or Storage, requires intricate planning and intimate familiarity with its individual contributing components. In the case of HPC, let&amp;rsquo;s just say you really need to know your nodes :-). Let&amp;rsquo;s talk more about what&amp;rsquo;s involved in setting up an HPC Infrastructure and how to think about it as a whole:&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;&lt;strong&gt;1.&amp;nbsp;&amp;nbsp;&amp;nbsp; Investment Impetus:&lt;/strong&gt; To successfully plan and design an HPC Infrastructure, the first and foremost step should be to &amp;ldquo;look beneath the surface&amp;rdquo;. This simply means understanding the primary reason for investing in HPC. The demand for HPC equipment, linked to a set of business objectives, should have a clear purpose around the outcome and expectation. This is truer today than at any other time because the consumption of HPC cycles, specifically in research and development across all verticals, has seen a steady 70% growth over the past four years (Source: &lt;a href="http://www.hoise.com/primeur/06/articles/monthly/AE-PR-05-06-21.html" style="color: blue; text-decoration: underline; text-underline: single"&gt;primeur&lt;/a&gt;). Despite this tremendous growth in the proliferation of HPC technology, the growth pattern itself is sporadic. 
One of the reasons for it may be the complexity, not only in terms of design but also in terms of consumption. &amp;nbsp;Take the case of &lt;a href="http://www.c3.ca/ce/archives/uploadedFiles/LRP_english.pdf" style="color: blue; text-decoration: underline; text-underline: single"&gt;SHARCNET&lt;/a&gt; in Southern Ontario, which developed a long-range plan around adoption and implementation of HPC technology. According to the report, some of the elementary challenges around planning for HPC emerge from the fact that &amp;ldquo;it is an enabling technology for an extremely diverse set of researchers&amp;rdquo;. This embodies the essence of the sentiment behind the complexity and diversity predominant in the HPC space. &lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;&lt;strong&gt;2.&amp;nbsp;&amp;nbsp;&amp;nbsp; Planning and Designing Hardware:&lt;/strong&gt; While thinking about planning and designing an HPC infrastructure implementation, I spoke to several folks in this area, drew from a decade and a half of my experience as an Infrastructure Architect and thought of some key areas that I would consider. These include:&lt;/font&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;a.&amp;nbsp; &lt;em&gt;&lt;strong&gt;Facility considerations&lt;/strong&gt;&lt;/em&gt; (Rackspace, Power and Cooling): Ask any enterprise-level datacenter manager what his/her top 10 pain points are and you are bound to hear the words &amp;ldquo;rackspace, power and cooling&amp;rdquo; in what follows. Dig deeper and you&amp;rsquo;ll realize that in any datacenter, there&amp;rsquo;s a fixed number of colos (&lt;a href="http://en.wikipedia.org/wiki/Colocation" style="color: blue; text-decoration: underline; text-underline: single"&gt;Colocation&lt;/a&gt;) you can populate based on the HVAC designs. 
This means that rackspace is what&amp;rsquo;s at a premium in each of these colos, with every &amp;ldquo;U&amp;rdquo; accounted for. Packing dense chipsets into small form-factor servers adds to existing power and cooling challenges.&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;&lt;em&gt;Translation&lt;/em&gt; &amp;ndash; you need more outlets and more airflow per rack than you did a decade ago with a handful of 4U and 5U servers taking up the entire rack.&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;b.&amp;nbsp; &lt;em&gt;&lt;strong&gt;Physical Plant planning:&lt;/strong&gt;&lt;/em&gt; To quote the resident HPC Guru &lt;a href="http://port25.technet.com/archive/2006/10/18/Introducing-Frank-Chism_3A00_--High-Performace-Computing-Blogger-on-Port-25.aspx" style="color: blue; text-decoration: underline; text-underline: single"&gt;Frank Chism&lt;/a&gt;: &amp;ldquo;I cannot over emphasize the importance to planning for physical plant in HPC deployments. Things like room and raceways for well managed and planned cabling. HPC uses more cable than anything except maybe SAN. Also, pay attention to floor loads, air flow, clean and redundant power. Finally, never never forget out-of-band management. Deep subfloor really helps with all that cabling&amp;rdquo;. &lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;&lt;em&gt;Translation&lt;/em&gt; &amp;ndash; Effective HPC performance calls for an effective HPC design, which includes tweaking hard as well as soft components. These components can be as covert as chip design or as overt as subfloor depth.&amp;nbsp;&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;c.&amp;nbsp; &lt;em&gt;&lt;strong&gt;Hardware and Processing Power:&lt;/strong&gt;&lt;/em&gt; Pushing the envelope on hardware and processor architectures today translates to increased performance (the heart and soul of HPC). 
Adding energy-efficient hardware on top of the architecture amounts to greater investment in raw computing power, which in turn translates to building a sound HPC infrastructure. The key advantages one needs to look for in this scenario are faster data access and increased instruction throughput. The word &amp;ldquo;performance&amp;rdquo; is repeated throughout this topic because it IS what HPC is all about: the ability to reduce the number of cycles needed to process data. Addressing the hardware and processing specs as part of core requirements ensures a smoother build-out.&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;&lt;strong&gt;3.&amp;nbsp;&amp;nbsp;&amp;nbsp; Implementing HPC Tools and Software:&lt;/strong&gt; Like any other piece of hardware, an HPC cluster is just hardware until software and tools exploit the underlying architecture to drive results and performance and let it do what it does best &amp;ndash; compute. When thinking of some core elements of HPC tools and software, here&amp;rsquo;s how I thought to break them up:&lt;/font&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;a.&amp;nbsp; &lt;em&gt;&lt;strong&gt;Setup and deployment systems:&lt;/strong&gt;&lt;/em&gt; Setting up HPC clusters goes back to what I said earlier in Section 1 &amp;ndash; what do you want to do with it? Although there are various ways and methods that allow you to drive the software and installation experience of an HPC system, the bottom line is that this depends to a great extent on what components make up the genetic composition of the HPC cluster you ordered. 
Among the HPC software setup and deployment tools out there, a few mainstream ones are &lt;a href="http://www.scali.com/" style="color: blue; text-decoration: underline; text-underline: single"&gt;SCALI&lt;/a&gt; and HP-MPI (&lt;a href="http://h21007.www2.hp.com/dspp/tech/tech_TechDocumentDetailPage_IDX/1,1701,1238,00.html" style="color: blue; text-decoration: underline; text-underline: single"&gt;HP&amp;rsquo;s message passing interface&lt;/a&gt;). These packages provide deployment, monitoring and job scheduling services for managing and administering an HPC cluster, just like IBM&amp;rsquo;s CSM (&lt;a href="http://www-03.ibm.com/servers/eserver/clusters/software/csm.html" style="color: blue; text-decoration: underline; text-underline: single"&gt;Cluster Systems Management&lt;/a&gt;). In the Open Source space, there are &lt;a href="http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php" style="color: blue; text-decoration: underline; text-underline: single"&gt;Maui&lt;/a&gt; and &lt;a href="http://www.clusterresources.com/pages/products/torque-resource-manager.php" style="color: blue; text-decoration: underline; text-underline: single"&gt;Torque&lt;/a&gt;, which work as a job scheduler and a resource manager, respectively, for managing compute nodes and clusters. &lt;a href="http://www.platform.com/products/Rocks" style="color: blue; text-decoration: underline; text-underline: single"&gt;Platform Rocks&lt;/a&gt; is another suite of utilities that allows installation and integration of third-party apps.&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;b.&amp;nbsp; &lt;em&gt;&lt;strong&gt;Parallel FS:&lt;/strong&gt;&lt;/em&gt; This is what I think is going to be the frontier for some intense activity over the next few years. 
Using &lt;a href="http://wikipedia.com/" style="color: blue; text-decoration: underline; text-underline: single"&gt;Wikipedia&amp;rsquo;s&lt;/a&gt; description, &amp;ldquo;&lt;span lang="EN"&gt;Distributed &lt;a href="http://en.wikipedia.org/wiki/Parallel" style="color: blue; text-decoration: underline; text-underline: single" title="Parallel"&gt;parallel&lt;/a&gt; file systems stripe data over multiple servers for high performance. Some of the distributed parallel file systems use &lt;a href="http://en.wikipedia.org/wiki/Object_storage_device" style="color: blue; text-decoration: underline; text-underline: single" title="Object storage device"&gt;object storage device&lt;/a&gt; (OSD) (In Lustre called OST) for chunks of data together with centralized &lt;a href="http://en.wikipedia.org/wiki/Metadata" style="color: blue; text-decoration: underline; text-underline: single" title="Metadata"&gt;metadata&lt;/a&gt; servers such as &lt;a href="http://en.wikipedia.org/wiki/Ceph_file_system" style="color: blue; text-decoration: underline; text-underline: single" title="Ceph file system"&gt;Ceph Scalable, Distributed File System&lt;/a&gt; from &lt;a href="http://en.wikipedia.org/wiki/University_of_California,_Santa_Cruz" style="color: blue; text-decoration: underline; text-underline: single" title="University of California, Santa Cruz"&gt;University of California, Santa Cruz&lt;/a&gt;. (Fault-tolerance in their roadmap.), &lt;a href="http://en.wikipedia.org/wiki/Lustre_(file_system%2529" style="color: blue; text-decoration: underline; text-underline: single" title="Lustre (file system)"&gt;Lustre&lt;/a&gt; from &lt;a href="http://en.wikipedia.org/wiki/Cluster_File_Systems" style="color: blue; text-decoration: underline; text-underline: single" title="Cluster File Systems"&gt;Cluster File Systems&lt;/a&gt;. (Lustre has failover, but multi-server RAID1 or RAID5 is still in their roadmap for future versions.) 
and &lt;a href="http://en.wikipedia.org/wiki/Pvfs" style="color: blue; text-decoration: underline; text-underline: single" title="Pvfs"&gt;Parallel Virtual File System&lt;/a&gt; (PVFS, PVFS2)&amp;rdquo;. &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;&lt;em&gt;Deep-Dive:&lt;/em&gt; At base, parallel file systems are global namespaces for files that achieve high bandwidth via parallelism. That bandwidth comes in three dimensions: high aggregate bandwidth, high single-stream bandwidth, and high metadata operations per second. No one seems to have achieved high performance in all of these dimensions. Don&amp;rsquo;t forget that the volumes of data are so large that backup is a major undertaking, and thus reliability is required as well. Further, nobody seems to be able to make a parallel file system that performs well on high-speed data for short I/Os, such as the ones you issue when compiling a major application.&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;c.&amp;nbsp; &lt;em&gt;&lt;strong&gt;Multiple Networks:&lt;/strong&gt;&lt;/em&gt; A final comment on implementation of HPC is that HPC often has multiple networks. 
For example, it does little good to have a parallel file system that delivers gigabytes per second of data to single nodes if the network can&amp;rsquo;t handle that much bandwidth!&lt;/font&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;So in conclusion, here&amp;rsquo;s a recap of the lessons behind setting up HPC Infrastructure:&amp;nbsp;&lt;/font&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;font face="Verdana" size="2"&gt;Comprehensive understanding of WHY you&amp;rsquo;re investing in HPC and what you expect as an outcome&lt;/font&gt;&lt;/li&gt;&lt;li&gt;&lt;font face="Verdana" size="2"&gt;Deep familiarity with the core HPC Hardware and design components&lt;/font&gt;&lt;/li&gt;&lt;li&gt;&lt;font face="Verdana" size="2"&gt;Facility and Physical plant considerations to ensure adequate cabling and subfloor space&lt;/font&gt;&lt;/li&gt;&lt;li&gt;&lt;font face="Verdana" size="2"&gt;Visibility into prominent HPC based software and toolsets&lt;/font&gt;&lt;/li&gt;&lt;li&gt;&lt;font face="Verdana" size="2"&gt;Understanding the three dimensions of bandwidth&lt;/font&gt;&lt;/li&gt;&lt;li&gt;&lt;font face="Verdana" size="2"&gt;And finally building the concept of &amp;ldquo;Multiple Networks&amp;rdquo; into node design to provide the required bandwidth&lt;/font&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p class="MsoNormal"&gt;&lt;font face="Verdana" size="2"&gt;I look forward to getting back to you with more on HPC over the next few weeks. 
Until then &amp;ldquo;Happy Computing&amp;rdquo;!!&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;font face="Verdana" size="2"&gt;&lt;/font&gt;&lt;/p&gt;&lt;img src="http://port25.technet.com/aggbug.aspx?PostID=3317" width="1" height="1"&gt;</description><category domain="http://port25.technet.com/archive/tags/Kishi+Malhotra/default.aspx">Kishi Malhotra</category><category domain="http://port25.technet.com/archive/tags/HPC/default.aspx">HPC</category><category domain="http://port25.technet.com/archive/tags/Community/default.aspx">Community</category><category domain="http://port25.technet.com/archive/tags/Server+Center/default.aspx">Server Center</category></item><item><title>HPC - The way all computing will look...</title><link>http://port25.technet.com/archive/2006/11/01/HPC-_2D00_-The-way-all-computing-will-look_2E002E002E00_.aspx</link><pubDate>Wed, 01 Nov 2006 22:37:00 GMT</pubDate><guid isPermaLink="false">af7480c4-26b7-468d-87b0-2acebabb473d:3249</guid><dc:creator>MichaelF</dc:creator><slash:comments>7</slash:comments><wfw:commentRss xmlns:wfw="http://wellformedweb.org/CommentAPI/">http://port25.technet.com/rsscomments.aspx?PostID=3249</wfw:commentRss><comments>http://port25.technet.com/archive/2006/11/01/HPC-_2D00_-The-way-all-computing-will-look_2E002E002E00_.aspx#comments</comments><description>&lt;p&gt;&lt;font face="Verdana" size="2"&gt;I have been itching to write on this subject ever since I first met w/ Doug Lora and &lt;a href="http://port25.technet.com/archive/2006/10/18/Introducing-Frank-Chism_3A00_--High-Performace-Computing-Blogger-on-Port-25.aspx" style="color:blue;text-decoration:underline;text-underline:single;"&gt;Frank Chism&lt;/a&gt;. High-Performance computing &amp;ndash; Wow! 
The first time someone explained the concept to me, I couldn&amp;rsquo;t help but visualize a scene from the movie &amp;ldquo;&lt;a href="http://imdb.com/title/tt0083658/" style="color:blue;text-decoration:underline;text-underline:single;"&gt;Blade Runner&lt;/a&gt;&amp;rdquo; and the futuristic feel of how Supercomputers actually work. My interest took me deeper into the heart of HPC to try to get my head around what HPC really is all about. High-performance computing systems, also referred to sometimes as &amp;ldquo;Supercomputers&amp;rdquo; are more prevalent today across prominent verticals such as Oil and Gas, Bioinformatics, Finance and Entertainment than ever before. &lt;a href="http://en.wikipedia.org/wiki/High-performance_computing" style="color:blue;text-decoration:underline;text-underline:single;" title="http://en.wikipedia.org/wiki/High-performance_computing" target="_blank"&gt;Wikipedia&lt;/a&gt; described HPC as &amp;ldquo; &lt;span&gt;Supercomputers and Computer Clusters i.e. computing systems comprised of multiple (usually &lt;a href="http://en.wikipedia.org/wiki/Mass_production" style="color:blue;text-decoration:underline;text-underline:single;" title="http://en.wikipedia.org/wiki/Mass_production
Mass production" target="_blank"&gt;mass-produced&lt;/a&gt;) processors linked together in a single system with commercially available &lt;a href="http://en.wikipedia.org/wiki/Connectivity_(computer_science%2529" style="color:blue;text-decoration:underline;text-underline:single;" title="http://en.wikipedia.org/wiki/Connectivity_(computer_science)
Connectivity (computer science)" target="_blank"&gt;interconnects&lt;/a&gt;. Usually, computer systems in or above the &lt;a href="http://en.wikipedia.org/wiki/Teraflop" style="color:blue;text-decoration:underline;text-underline:single;" title="http://en.wikipedia.org/wiki/Teraflop
Teraflop" target="_blank"&gt;teraflop&lt;/a&gt;-region are counted as HPC-computers&amp;rdquo; . An HPC Cluster is usually implemented to provide increased performance by splitting a computational task across many different &lt;a href="http://en.wikipedia.org/wiki/Node_(networking%2529" style="color:blue;text-decoration:underline;text-underline:single;" title="http://en.wikipedia.org/wiki/Node_(networking)
Node (networking)" target="_blank"&gt;nodes&lt;/a&gt; in the cluster. &lt;/span&gt;&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;Applications that run on HPC systems are prevalent in heavy-duty research and experimentation to engineering scenarios including &lt;a href="http://en.wikipedia.org/wiki/Transaction_processing" style="color:blue;text-decoration:underline;text-underline:single;"&gt;transactional processing&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Data_warehouse" style="color:blue;text-decoration:underline;text-underline:single;"&gt;data warehousing&lt;/a&gt;, &lt;a href="http://en.wikipedia.org/wiki/Computational_fluid_dynamics" style="color:blue;text-decoration:underline;text-underline:single;"&gt;computational fluid dynamics&lt;/a&gt;, virtual prototype testing etc. The evolution behind clustering technology has a lot to do with the growing adoption of this technology as well. The additional element that has made this technology very attractive is price. An HPC cluster can be implemented at a fraction of&amp;nbsp;the cost today&amp;nbsp;as compared to&amp;nbsp;10-15 years ago. Take a Cray Y-MP c916 supercomputer that cost close to $40 million 15 years ago. Today, you can get computing power very close to that for almost $4,000. The proof of this adoption is in the fact that every industry vertical is deploying HPC. From a &amp;ldquo;mainframe&amp;rdquo; approach that existed decades ago, the implementation trend of this technology is gravitating towards decentralized grids and clusters.&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;So why do I say that this is the way all computing will look - HPC is already a $9 billion growing market (source &lt;a href="http://www.hpcwire.com/hpc/612853.html" style="color:blue;text-decoration:underline;text-underline:single;" title="http://www.hpcwire.com/hpc/612853.html" target="_blank"&gt;HPCwire&lt;/a&gt;). 
Evolution in this sphere is occurring at a blazing speed, and the demand for HPC systems across various verticals is expected to multiply. Bottom line: HPC will play a key role in how computing power is used, stacked and scaled. Not only that, the development of HPC-propagated file systems will be an area that we should all watch very closely over the next few years. Let&amp;rsquo;s also fully realize the &amp;ldquo;impact&amp;rdquo; of this technology in the business space. If done right, HPC clusters hold the key to superior systems performance while maintaining reasonable economies of scale. Delving into the benefits of these clusters, which until recently were the domain of the scientific community, is like lighting the fuse on an explosive. I say this with conviction because we have yet to recognize all possible uses of HPC clusters and their underlying potential. According to some researchers at &lt;a href="http://cat.inist.fr/?aModele=afficheN&amp;amp;cpsidt=14477686" style="color:blue;text-decoration:underline;text-underline:single;" title="http://cat.inist.fr/?aModele=afficheN&amp;amp;cpsidt=14477686" target="_blank"&gt;INIST-CNRS&lt;/a&gt;, &lt;em&gt;&amp;ldquo;Analytic methods, statistical modeling, and pattern searching algorithms that are common in scientific computing can now be applied to the vast amounts of operational and historical data generated by business transactions to extract knowledge that can be used for competitive advantage&amp;rdquo;.&lt;/em&gt; &lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;A nagging question still remained in my head: WHY is HPC so important? I kept looking for the single reason behind the heavy investment in this area and why it&amp;rsquo;s such a critical component of the highly complex computational analysis being done. 
The biggest theme that emerged wherever I looked was that HPC is one of the few tangible technologies out there whose sheer computing power helps solve highly complex computational workloads and problems. That is not to mention the single biggest advantage of using this technology &amp;ndash; time. The time it takes to resolve highly complex workloads is greatly reduced. And anyone reading this blog knows the value of time and how it is THE most valuable element of all. And as everyone knows, no matter how evolved and fast hardware gets, there will always be bleeding-edge problems that demand processing power beyond what the best clusters can provide.&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana" size="2"&gt;My endeavor here at OSSL is to understand this topic from the ground up, have an open discussion on the subject matter and educate the audience along the way. And how do I plan to do that? Well, we have started venturing into doing more with HPC and understanding the various HPC platforms and technologies out there. Over the course of the next few months I&amp;rsquo;ll be sharing more on this subject with all of you, including market trends, the evolution of HPC, grid computing scenarios and &amp;ldquo;chip&amp;rdquo; supercomputing. &lt;/font&gt;&lt;font face="Verdana"&gt;&amp;nbsp;&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Verdana"&gt;-Kishi&amp;nbsp;&lt;/font&gt;&lt;/p&gt;</description><category domain="http://port25.technet.com/archive/tags/Kishi+Malhotra/default.aspx">Kishi Malhotra</category><category domain="http://port25.technet.com/archive/tags/HPC/default.aspx">HPC</category><category domain="http://port25.technet.com/archive/tags/Community/default.aspx">Community</category></item></channel></rss>