Answers to Commonly Asked Questions

When should someone consider using a cluster for their work?
Clusters are very specialized resources for performing certain types of computationally demanding work. More often than not, a cluster is the wrong solution for data analysis, and not all types of data analysis are even amenable to running on a cluster.

Tasks that require a lot of manual intervention, that take modest amounts of time on traditional computers (a few hours, or perhaps a day), or that can't be run in parallel won't benefit from running on a cluster.

However, if a task takes a few hours and you have hundreds or thousands of tasks of that kind to do, a cluster may be the best option.
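As a rough illustration of why many independent tasks change the picture, the sketch below estimates idealized wall-clock time. The task count, task length, and number of simultaneous workers are made-up figures, not Gemini's actual capacity, and the estimate ignores queueing and start-up overhead.

```python
import math


def estimate_wall_hours(n_tasks: int, hours_per_task: float, n_workers: int) -> float:
    """Idealized wall-clock hours when tasks run in waves of n_workers at a time."""
    if n_tasks < 0 or hours_per_task < 0 or n_workers < 1:
        raise ValueError("need n_tasks >= 0, hours_per_task >= 0, n_workers >= 1")
    waves = math.ceil(n_tasks / n_workers)  # number of rounds needed
    return waves * hours_per_task


# Hypothetical workload: 1000 tasks of 3 hours each.
serial_hours = estimate_wall_hours(1000, 3, 1)      # one machine, one task at a time
parallel_hours = estimate_wall_hours(1000, 3, 100)  # assumed 100 simultaneous workers

print(serial_hours, parallel_hours)
```

Under these assumptions the serial run takes 3000 hours (about 125 days), while the parallel run takes 30 hours.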

If you still aren't certain, just ask!

What kind of computations can Gemini be used for?
Gemini excels at certain kinds of computations, performs very well on others, and performs poorly on still others.

Numerically intensive calculations and calculations that scale to a large number of processors or that require very tightly coupled inter-process communication are ideal. Examples include: molecular dynamics simulations, computational fluid dynamics, quantum mechanics/computational chemistry calculations and image processing.

Gemini will also handle batch operations, that is, tasks that can be easily split up and processed in chunks. Examples include certain types of data mining and in-silico small-molecule compound docking.
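A minimal sketch of what "split up and processed in chunks" can look like in practice. The compound IDs and the process_chunk function are hypothetical placeholders for a real workload; on a cluster, each chunk would typically be handed to a separate job rather than processed in a loop.

```python
from typing import Dict, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


def process_chunk(chunk: List[str]) -> Dict[str, int]:
    """Hypothetical stand-in for real work (e.g. docking one batch of compounds)."""
    return {compound_id: len(compound_id) for compound_id in chunk}


# Hypothetical input: ten compound IDs split into chunks of four.
compound_ids = [f"cmpd{i:03d}" for i in range(10)]

results: Dict[str, int] = {}
for chunk in chunked(compound_ids, 4):
    # Each chunk is independent, so chunks could run in any order or in parallel.
    results.update(process_chunk(chunk))
```

Because no chunk depends on another, the per-chunk results can simply be merged at the end.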

If your work is I/O intensive, with a lot of reading from and writing to disk, Gemini is not the right resource.

Who can use it?
Gemini is available to anyone at SLU with a demonstrable need for its level of computing capacity. Ideal projects require large, continuous amounts of computing power for defined periods of time.

Gemini can't accommodate projects that only need a couple of CPUs or tasks that can be refactored using other means (e.g. converting scripts into executable applications).

Can a portion of the cluster be set aside for exclusive use by a specific user?
In general, no. Projects running on Gemini typically need all of the resources for defined periods of time (get on, calculate, get off). In some cases where nodes need to be reconfigured for scaling or performance purposes, a subset of the nodes may be temporarily re-provisioned.

Would what I'm working on benefit from using it?
Without knowing the specifics, it's hard to say. Check out the general requirements below, and if you still aren't certain after reading this FAQ, please ask.

How much does it cost to use?
Gemini is considered a collaborative tool. There is no charge for using it. If you require a fair amount of assistance working out an analysis, need a program written, or need some other kind of specialized help, some form of cost recovery for time may be required. More often than not, "payment" comes in the form of co-authorship or a small amount of salary recovery in grant applications.

What are the requirements for an application to run on Gemini?
In general, the application or scripts must:
  • Run on the Linux operating system.
  • Be capable of running via the cluster's job scheduler.
  • Not require access to external resources or significant database access.
  • Run primarily without human intervention (that is, be non-interactive).
  • Not require commercial software licenses that aren't currently available or whose cost can't be covered as part of a given project.
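
To illustrate the non-interactive requirement, here is a minimal sketch of a script that takes everything it needs from command-line arguments instead of prompting the user. The --input and --output options and the line-counting "analysis" are hypothetical placeholders, not part of any Gemini tooling.

```python
import argparse
from pathlib import Path
from typing import List


def parse_args(argv: List[str]) -> argparse.Namespace:
    """Read all settings from the command line so no prompts are needed."""
    parser = argparse.ArgumentParser(description="Hypothetical non-interactive analysis")
    parser.add_argument("--input", type=Path, required=True, help="file to analyze")
    parser.add_argument("--output", type=Path, required=True, help="where to write results")
    return parser.parse_args(argv)


def summarize(text: str) -> str:
    """Placeholder analysis: report the number of lines in the input."""
    return f"lines={len(text.splitlines())}\n"


def main(argv: List[str]) -> int:
    args = parse_args(argv)
    # Results go to a file, not the screen, so the job can run unattended.
    args.output.write_text(summarize(args.input.read_text()))
    return 0
```

A real script would end by calling main(sys.argv[1:]); a job scheduler could then launch it with fixed arguments and collect the output file when the job finishes.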

Where can I get help getting an analysis running?
If you need help, or have questions, just ask!