Grid computing has become an important technology for
organizations that:
- have long-running applications that can benefit from parallel execution
- want to leverage existing IT infrastructure to optimize computing
  resources and manage data and computing workloads
The function of a grid is to distribute tasks. Each task that is
distributed across the grid must have access to all of its required
input data. Computing
tasks that require substantial data movement generally do not perform
well in a grid. To achieve the highest efficiency, the nodes should
spend the majority of the time computing rather than communicating.
With grid computing using SAS Grid Manager, the speed at which the
grid operates is related more to the storage of the input data than
to the size of the data.
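
The relationship between computing time and data access time can be illustrated with a rough back-of-the-envelope model. The sketch below is not part of SAS Grid Manager; the function names and all figures (input size, storage throughput, compute time) are hypothetical and chosen only to show how faster storage raises the share of time a node spends computing on the same input.

```python
def transfer_seconds(data_mb: float, throughput_mb_per_s: float) -> float:
    """Time to read the input data at a given storage throughput (hypothetical model)."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return data_mb / throughput_mb_per_s


def compute_fraction(compute_s: float, transfer_s: float) -> float:
    """Fraction of a task's wall-clock time spent computing rather than moving data."""
    total = compute_s + transfer_s
    if total <= 0:
        raise ValueError("total task time must be positive")
    return compute_s / total


if __name__ == "__main__":
    data_mb = 20_000      # assumed input size per task
    compute_s = 600.0     # assumed pure compute time per task

    # Same data size, two assumed storage throughputs (MB/s).
    for throughput in (100.0, 1000.0):
        t = transfer_seconds(data_mb, throughput)
        share = compute_fraction(compute_s, t)
        print(f"{throughput:>6.0f} MB/s: transfer {t:.0f} s, computing {share:.0%} of the time")
```

In this simplified model the input size is identical in both cases; only the assumed storage throughput changes, which is why the storage configuration tends to matter more than the data volume itself.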
Data must either be
distributed to the nodes before running the application or—
much more commonly—made available through shared network libraries.
Storage on local nodes is discouraged. The data storage must scale
to maintain high performance while serving concurrent data requests.
The parallel data load
is monitored throughout.