 |
System Software Environment
|

Q1. What open source components are integrated in XC?
A1. The key open source components, in addition to the Linux distribution, include: SystemImager, Linux Virtual Server, SLURM, Nagios, RRDtool, Supermon, Parallel Distributed Shell (pdsh), Modules. A more comprehensive listing is in the XC QuickSpecs. XC integrates these with commercial components, in a specific design architected by HP, and pre-configured to allow rapid and simple deployment.
|

Q2. What commercial or proprietary components are integrated in XC?
A2. The two major components are Platform Computing’s LSF and HP-MPI, both of which are also available as separate commercial products. Also integrated and tested are the system drivers for the servers and interconnects needed to support the XC cluster hardware components. In addition, XC includes HP-engineered performance and diagnostic tools.
|

Q3. What do I get with XC that I can’t build on my own?
A3. The XC components have been configured into a specific implementation designed by HP, based on our experience with these technologies over many years. The components are integrated and tested to work together in a common environment. A user can replicate the basic XC environment, although doing so will take time and experimentation, and the result will lack some of the deployment, performance, and diagnostic tools. The user will also need to monitor the different component sources for updates and patches in order to maintain the environment. In addition, the XC implementation is fully documented and supported as a single product.
|

Q4. Can XC be run on top of Red Hat Enterprise Linux acquired separately?
A4. The XC software includes a full Linux distribution, compatible with Red Hat Enterprise Linux 4. However, XC Version 3.1 does permit, at initial start-up, the installation of Red Hat Enterprise Linux Version 4 (Update 3 or later) purchased separately.
|

Q5. We use LDAP for user accounts on our clusters. Can I use that with XC?
A5. User accounts and access privileges can be set up on an XC system using the Linux administrative procedures incorporated in the standard XC operating environment. NIS and LDAP are also commonly used in various standard configurations.
|

Q6. What are “distributed services”?
A6. Distributed services refers to the XC software capability to take key management functions (services) that typically reside on a single cluster head node and allocate them across multiple ‘service’ nodes. Typical services include log-in, NAT/NIS services, job management, I/O services, and monitoring. Distributing these services eliminates bottlenecks and improves system availability. At initial cluster set-up, XC software provides recommendations on service allocation based on the topology, but the system manager controls which services are distributed.
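As a rough illustration only, the idea of spreading services across several service nodes can be sketched in Python. The node names, the service names, and the round-robin policy are all assumptions made for this example; this is not XC's actual allocation logic, which bases its recommendations on the cluster topology.

```python
from itertools import cycle


def distribute_services(services, service_nodes):
    """Assign each service to a service node in round-robin order.

    Round-robin is an illustrative policy only, not the XC algorithm.
    """
    if not service_nodes:
        raise ValueError("at least one service node is required")
    placement = {node: [] for node in service_nodes}
    # cycle() repeats the node list so every service gets a node.
    for service, node in zip(services, cycle(service_nodes)):
        placement[node].append(service)
    return placement


# Hypothetical service and node names.
services = ["login", "nat", "nis", "job_management", "io", "monitoring"]
placement = distribute_services(services, ["svc1", "svc2", "svc3"])
for node, assigned in placement.items():
    print(node, assigned)
```

With a single service node, every service lands on that node, which corresponds to the traditional head-node arrangement described above.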
|

Q7. What provisions are made for system failure and high availability?
A7. An availability infrastructure provides a robust, generic system interface to failover tools such as HP Serviceguard or Heartbeat. The following services are failover-enabled at this time: the Nagios master, MySQL, NAT, and LVS. In addition, an XC system can be configured to increase redundancy and availability by distributing services for system functions (see Q6 above). Should a node running one of these services fail, the service can be manually resumed on an alternate node. The global file system can be deployed on a highly available file server such as SFS.
In addition, some key job-management features help to limit single points of failure. The SLURM and LSF job launch mechanisms, as well as scheduling and resource management, provide application-level failover support between the two administration nodes.
|


Q8. What resource/job management options are supported by XC software?
A8. XC system customers can choose to run standard LSF (with or without LSF's HPC extensions), or they can elect to run LSF-HPC integrated with SLURM which has been designed by Platform and HP specifically for XC systems. Alternatively, an XC system can be configured to run SLURM standalone or with the Maui scheduler, or it can be configured to run the Altair PBS Professional batch processing system (see PBS Pro How to).
|

Q9. How do SLURM and LSF interact within an XC job environment?
A9. SLURM is responsible for the underlying allocation of resources to jobs. The resources are provided by nodes that are designated as compute resources. The LSF scheduler obtains all compute resource information from SLURM, and then uses that information to dispatch appropriately scheduled jobs to SLURM. SLURM runs across the XC system, while LSF's scheduling operations and overhead for the XC system are confined to one node. In this integration, LSF regards the entire XC system as a single large "SLURM-based multi-processor machine". This makes it easy to add an XC system to an existing LSF cluster as a new compute resource.
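The division of labor described in this answer can be pictured with a small toy model in Python. The class names, node names, and first-fit allocation policy below are assumptions made for illustration; they do not reflect the real SLURM or LSF interfaces or scheduling policies.

```python
from dataclasses import dataclass, field


@dataclass
class SlurmModel:
    """Toy stand-in for SLURM: tracks free CPUs on each compute node."""
    free_cpus: dict  # node name -> free CPU count

    def total_free(self):
        return sum(self.free_cpus.values())

    def allocate(self, ncpus):
        """Take CPUs node by node; return {node: cpus}, or None if too few are free."""
        if ncpus > self.total_free():
            return None
        allocation = {}
        remaining = ncpus
        for node, free in self.free_cpus.items():
            take = min(free, remaining)
            if take:
                allocation[node] = take
                self.free_cpus[node] -= take
                remaining -= take
        return allocation


@dataclass
class LsfModel:
    """Toy stand-in for LSF: sees the whole system as one host with N slots."""
    slurm: SlurmModel
    pending: list = field(default_factory=list)  # (job name, ncpus) in submit order

    def visible_slots(self):
        # LSF gets its resource information from SLURM, not from the nodes.
        return self.slurm.total_free()

    def submit(self, name, ncpus):
        self.pending.append((name, ncpus))

    def dispatch(self):
        """Try each pending job in submit order; jobs that do not fit stay pending."""
        started, still_pending = {}, []
        for name, ncpus in self.pending:
            allocation = self.slurm.allocate(ncpus)
            if allocation is None:
                still_pending.append((name, ncpus))
            else:
                started[name] = allocation
        self.pending = still_pending
        return started


# Hypothetical two-node system with four CPUs per node.
lsf = LsfModel(SlurmModel({"n1": 4, "n2": 4}))
lsf.submit("jobA", 6)
lsf.submit("jobB", 4)
lsf.submit("jobC", 2)
started = lsf.dispatch()
print(started)
print(lsf.pending)
```

In this sketch the LSF model never looks at individual nodes: it only asks the SLURM model how many slots are free and hands over jobs, which mirrors the "single large machine" view described above.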
|

Q10. What tools are there for integrating XC into a grid environment?
A10. HP collaborates closely with independent software vendors so that third-party grid software runs well on XC. HP’s grid partners include Altair Engineering, Cluster Resources, DataSynapse, Platform Computing, and United Devices. HP tests the Globus Toolkit, the popular open source middleware for grids, on XC clusters to ensure that Globus runs well on these systems. See Tips for installing Globus on XC.
|

Q11. The parallel application I have uses MPI, but does not have support for HP-MPI. What are my options?
A11. XC is, at its core, a Linux cluster. Applications, libraries, and tools that run on systems running Red Hat Enterprise Linux 4, including third-party MPI libraries such as MPICH, should run on XC system software.
However, if you do have access to the application source code, you may want to recompile it to use the HP-MPI library. HP-MPI is a high-performance, robust, native implementation that supports the MPI-1.2 and MPI-2 standards and provides transparent support for leading interconnects, including InfiniBand, Myrinet, and Quadrics. For commercial codes, dozens of ISVs support HP-MPI and redistribute it in their Linux parallel implementations. Contact HP to request HP-MPI support for your key applications.
|

Q12. How is HP-MPI integrated into the XC job environment?
A12. HP-MPI jobs are launched using the SLURM launch mechanism, which is responsible for the startup and coordination of the MPI processes, and responsible for process-cleanup after normal or abnormal termination of the MPI job.
|


Q13. What configurations are supported by XC software?
A13. XC software is supported on HP Cluster Platforms, which offer a choice of rack-optimized or blade-based ProLiant systems with AMD Opteron or Intel Xeon processors, or Integrity systems with Intel Itanium 2 processors, as well as HP workstations for visualization. HP Cluster Platforms feature the leading high-performance interconnects, including InfiniBand, Myrinet, Quadrics, and Gigabit Ethernet. (Note: check the QuickSpecs for specific supported node/network combinations.) XC software has also been deployed in non-HP Cluster Platform configurations as a custom deployment.
|

Q14. Can XC be used to manage a cluster with a mix of hardware servers?
A14. XC can be used in clusters having a mix of supported ProLiant servers including BladeSystems based on Xeon or Opteron processors; or in clusters having a mix of Integrity servers.
|


Q15. How do I obtain XC for a new system?
A15. You can purchase XC standalone or as part of a cluster order. XC System Software is a featured option for HP Cluster Platforms and, in the Americas and Asia/Pacific, an option on Cluster Platform Express. XC can be factory installed, as an option, when purchasing an integrated cluster. XC software licensing is on a per-processor (i.e., ‘socket’) basis.
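As a worked example of per-socket licensing, the short Python sketch below adds up sockets across groups of nodes. The cluster layout is hypothetical, and counting every node in the cluster is an assumption made for this example; consult the QuickSpecs for the actual licensing terms.

```python
def xc_licenses_needed(node_groups):
    """Sum sockets over (node_count, sockets_per_node) groups: one license per socket."""
    return sum(count * sockets for count, sockets in node_groups)


# Hypothetical cluster: 64 two-socket nodes plus 2 four-socket nodes.
licenses = xc_licenses_needed([(64, 2), (2, 4)])
print(licenses)  # 64*2 + 2*4 = 136
```

Because the count is per socket, the number of cores in each processor does not enter the calculation.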
|

Q16. Can I purchase XC as a standalone product, and install it on my existing HP Cluster?
A16. Yes, you can install XC software on an existing HPC cluster as long as you conform to the system configurations in the XC QuickSpecs and deploy on supported hardware. Ordering information is provided in the same XC QuickSpecs.
|

Support and Other Services
|

Q17. How do I get support for XC software?
A17. HP provides warranty support for HP XC System Software for the first 90 days after installation. Customers in North America can call 1-800-HP Invent to receive support. Customers in other countries can use the support phone numbers listed for all other countries.
After the first 90 days, both fixed and flexible care pack services are available. Standard technical support and update service is offered with 9 by 5 or 24 by 7 coverage, for durations of one through five years. (See the QuickSpecs for the most commonly selected part numbers.) For other coverage options, consult your HP Services Representative or reseller.
|

Q18. What training and consulting services are available for XC Software?
A18. HP Services and our C&I organization have a broad range of offerings to assist with the deployment and management of HPC clusters, including XC-based solutions. Options include on-site assistance with quickstart, knowledge transfer, and training. Contact your HP Services Representative or contact HP via our HPC Web Form.
|
|