3.5.1 Managing memory and disk use in ABAQUS

Products: ABAQUS/Standard  ABAQUS/Explicit  

References

  • Execution procedure for ABAQUS/Standard and ABAQUS/Explicit, Section 3.2.2

  • Using the ABAQUS environment settings, Section 3.4.1

  • Output, Section 4.1.1

  • Natural frequency extraction, Section 6.3.5

Overview

For small analyses, management of computer resources is generally of secondary concern; but with large models, intelligent use of disk and memory resources is a critical part of the analysis process. Each of the resource management parameters provided by ABAQUS has a default setting that is typically appropriate, but for large analyses you may find it necessary to modify these settings.

Understanding resource use

For ABAQUS disk and memory are effectively two similar means of storing data. Data that will be required after an analysis completes must eventually be written to disk; but during an analysis, disk and memory provide functionally equivalent storage mechanisms. Typically disk is a more abundant resource, while memory provides faster access to stored data. Management of ABAQUS resources hinges on this simple trade-off.

ABAQUS data

There are effectively two types of data generated by an ABAQUS analysis. The first is “output” data that must persist after an analysis is complete. Output data are typically either results that you require for postprocessing or data that are necessary to restart an analysis. As mentioned above, these data must be stored on disk before an analysis completes.

In addition, an analysis generates a considerable amount of “scratch” or temporary data. These are data that are needed only while an analysis is running. The scratch data can be subdivided into two groups: performance-critical data and generic data. The performance-critical data are always stored in memory, while the generic data can be stored either in memory or on disk.

Requirements and considerations

To run an analysis, the following requirements must be satisfied:

  • There must be sufficient disk space available to hold the requested output data.

  • There must be sufficient memory available to hold all performance-critical data.

  • There must be sufficient disk space available to hold all generic scratch data.

If the above requirements are satisfied, an analysis can be completed; however, for ABAQUS/Standard allowing ABAQUS to use additional memory will often improve performance. No scratch data are written to disk during the ABAQUS/Explicit analysis phase, since effectively all of its scratch data are performance-critical and are therefore held in memory.

Resource management parameters

ABAQUS resource management parameters fall into two classes: memory management and disk management. A basic listing of the environment file parameters is given below followed by a description of how to best make use of these parameters. For information about the environment file, see Using the ABAQUS environment settings, Section 3.4.1.

Memory management parameters

The pre_memory parameter specifies an upper limit for the memory that can be used by the analysis input file processor, which is invoked for both ABAQUS/Standard and ABAQUS/Explicit analyses.

There are two additional environment file memory parameters for the ABAQUS/Standard analysis phase: the standard_memory parameter specifies an upper limit for the memory use in ABAQUS/Standard, and the standard_memory_policy parameter allows you to guide the way in which memory is used by ABAQUS/Standard.

There are no memory management parameters for the ABAQUS/Explicit analysis phase, since no scratch data are written to disk during this phase.

Environment file parameters can be set for a host, for a user, or for a particular job (see Using the ABAQUS environment settings, Section 3.4.1, for further discussion). Because a default memory setting that works well for one machine with a large amount of memory may not be ideal for another machine that has less memory, it may be useful to vary the default memory settings by machine.

The values specified for pre_memory and standard_memory must be reasonable for the machine being used. ABAQUS will not check the virtual or physical memory on your machine to make sure that the memory management parameters specified are valid for the machine. If you do specify a value for either pre_memory or standard_memory that is greater than the memory that is actually available, an ABAQUS analysis may run for some time before the problem is detected and the analysis ends with an error.
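For example, a host-level environment file might raise both limits above their 256 megabyte defaults (the values below are illustrative and must be chosen to fit within the memory actually available on the machine):

pre_memory="512 mb"
standard_memory="1024 mb"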

Disk management parameters

Management of output data is discussed in detail in Output, Section 4.1.1. Output data are written to files in the directory from which you launched the job.

ABAQUS/Standard scratch files are written to a separate scratch directory. You can control the directory used to hold the scratch files with the scratch environment file parameter. In addition, you can split certain scratch files across multiple physical disks using the split_xxx and spill_list_xxx parameters. For frequency analyses using the parallel Lanczos solver, you can have ABAQUS/Standard write the scratch files for the different Lanczos intervals to different scratch directories using the lanczos_scratch parameter, which can improve performance considerably if the machine has access to multiple independent file systems.
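For example, the following entries (the directory paths are illustrative; the directories must exist and be writable) place the scratch files on a dedicated file system and give the parallel Lanczos intervals additional directories:

scratch="/scr/abaqus"
lanczos_scratch=["/scr1/abaqus","/scr2/abaqus"]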

As explained above, no scratch data are written to disk for ABAQUS/Explicit, so you have to be concerned only with proper management of output data.

Units

Memory and disk sizing parameters can be specified in 64-bit words, bytes, kilobytes, or megabytes. The choice of units is indicated by following the memory size with w for words, b for bytes, kb for kilobytes, or mb for megabytes; if no unit is specified, words are assumed. (In most cases ABAQUS reports memory use in bytes, kilobytes, and megabytes.) For example,

pre_memory="100 mb"
standard_memory=20000000
will use up to 100 megabytes of memory during input file processing and 20 million words (about 152.6 megabytes) of memory for the ABAQUS/Standard analysis. If a unit is used, the memory setting value must be surrounded by quotes.

To be consistent with operating system memory measurement tools, a megabyte is defined by ABAQUS to be 1,048,576 bytes, not 1,000,000 bytes.
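For reference, the word-to-megabyte conversion used in the example above can be checked directly; the following standalone Python sketch assumes only that a 64-bit word occupies 8 bytes:

nwords = 20000000               # the standard_memory setting above, in words
nbytes = nwords * 8             # a 64-bit word occupies 8 bytes
mbytes = nbytes / 1048576.0     # an ABAQUS megabyte is 1,048,576 bytes
print(mbytes)                   # prints 152.587..., i.e., about 152.6 megabytes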

Input file processing

In general, the amount of disk space used during input file processing is not large. The amount of disk space needed for the analysis phase of a job is more likely to be a concern.

Memory management during input file processing may require some attention from you. The pre_memory parameter specifies the maximum amount of memory that can be used by the analysis input file processor. If the memory specified is insufficient to hold all the performance-critical scratch data needed during input file processing, an error message will be issued indicating that pre_memory must be increased. When this error message is issued, ABAQUS cannot accurately estimate the amount of memory that would be required to complete input file processing. General guidelines for setting the pre_memory parameter are given below.

Guidelines for memory settings

The default value for pre_memory is 256 megabytes. This setting is sufficient for small jobs, but for larger jobs you will need to increase pre_memory. Table 3.5.1–1 lists some typical memory settings for problems of various sizes. The actual value required for pre_memory may vary considerably from problem to problem depending on the features used in a model.

Table 3.5.1–1 Typical pre_memory settings.

Degrees of freedom    pre_memory
250,000               250 megabytes
1 million             750 megabytes
2.5 million           1200 megabytes
5 million             2000 megabytes
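For example, for a model with roughly one million degrees of freedom the table suggests an environment file entry such as:

pre_memory="750 mb"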

Setting pre_memory on single-user machines

A common case is a user with a personal desktop workstation who is free to use all the memory on the machine. Such a user is advised to set pre_memory to a relatively large value (50% of a machine's physical memory may be reasonable).

If you are planning on running multiple ABAQUS jobs (or one ABAQUS job and another application that will use significant memory) simultaneously, the pre_memory setting should be decreased in proportion to the number of jobs or applications being run.
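As an illustration (the machine size is hypothetical), on a workstation with 2 gigabytes of physical memory the setting might be chosen as follows:

# a single ABAQUS job: about 50% of the 2 gigabytes of physical memory
pre_memory="1024 mb"
# if two memory-intensive jobs will run simultaneously, halve the share instead:
# pre_memory="512 mb"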

If you are running an unusually large job, setting pre_memory to 50% of the machine memory may not be sufficient. In such a case the analysis input file processor will issue an error message indicating that pre_memory is not sufficient. You should increase pre_memory to 90% of the physical memory on the machine and run only the large job (no other jobs or applications) on the machine. If the job still fails as a result of insufficient memory, you will need to find a machine with more resources to run the job (you should keep in mind that the analysis phase of the run will require more memory than the analysis input file processor).

Setting pre_memory on multi-user machines

A reasonable way to work on a multiple-user machine is to determine roughly how much memory each user can expect to have available. The pre_memory parameter can be set to 50% of each user's “available” memory.

Setting pre_memory when using queues

Users generally wish to submit their jobs to the queue with the smallest memory limit possible. Hence, setting pre_memory to some fraction of a total memory pool is a poor solution. In such a case you will have to gain some experience with the types of models in common use at your site. Memory use by the analysis input file processor varies significantly for a given number of elements depending on the ABAQUS features used. Some amount of testing with typical models is required to develop rules for memory use that can be used to select a queue. You may wish to perform a number of iterations, increasing pre_memory by a fixed amount each time, to find the minimum possible pre_memory setting for a particular problem.
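The iteration can be automated. The following is a minimal Python sketch, assuming that the abaqus driver is on the path, that a job-local environment file overrides the host settings (see Using the ABAQUS environment settings, Section 3.4.1), and that the driver returns a nonzero exit status when input file processing fails for lack of memory; the job name is hypothetical:

import subprocess

# try pre_memory = 256 mb, 512 mb, ..., 2048 mb until input file processing succeeds
for mb in range(256, 2049, 256):
    with open("abaqus_v6.env", "w") as env:      # job-local environment file
        env.write('pre_memory="%d mb"\n' % mb)
    rc = subprocess.call(["abaqus", "job=bigmodel", "datacheck", "interactive"])
    if rc == 0:
        print("pre_memory of %d mb is sufficient" % mb)
        break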

ABAQUS/Standard analysis

An ABAQUS/Standard analysis consists of two phases with respect to memory use: an initialization phase and a computational phase. In the second phase the bulk of the computation required for an analysis is performed; thus, the computational phase requires significantly more memory than the initialization phase. The initialization phase ends with the calculation of estimates for the memory and disk space that will be required by the computational phase. These estimates are written to the printed output (.dat) file under the heading “MEMORY AND DISK ESTIMATE.” A data check analysis (see Execution procedure for ABAQUS/Standard and ABAQUS/Explicit, Section 3.2.2) is sufficient to obtain these estimates. Users running analyses of large models are advised to check these estimates to assess the resource requirements for a job.

The estimates used in calculating the memory required by the ABAQUS/Standard analysis do not include the memory required for writing results to the ABAQUS output database (.odb) file. In most cases the memory requirements for writing results should be negligible relative to the memory requirements for the analysis phases. Memory requirements for writing results will be most apparent when attempting to analyze small models in which a large number of result variables are requested over a significant fraction of the model.

Memory management

The ABAQUS/Standard memory parameters provide two levels of control over memory use. The standard_memory parameter specifies an upper limit on the memory that can be used by a single analysis on a machine, and the standard_memory_policy parameter controls how much of that allowable memory a job will actually use. You can reset standard_memory or standard_memory_policy after a data check run without rerunning the analysis input file processor.

The standard_memory parameter

The standard_memory parameter specifies the maximum amount of memory that can be allocated by an ABAQUS/Standard analysis. The default value is 256 megabytes. For an analysis to run, standard_memory must be larger than the amount of performance-critical data held in memory for the analysis. The estimates written to the .dat file during the initialization phase can be used to determine the appropriate setting for the standard_memory parameter. The first column that is relevant to memory use is labeled “MINIMUM MEMORY REQUIRED” and specifies the standard_memory setting that is needed to hold critical scratch data in memory. An attempt to run the analysis with standard_memory set below this value will result in an error. A second relevant entry in the estimates is labeled “MEMORY TO MINIMIZE I/O” and specifies the standard_memory setting that is required to hold all scratch data, both critical and generic, in memory.

If the memory specified by standard_memory is larger than the amount of performance-critical data for the analysis, the additional memory can be used to improve speed of access to generic scratch data that would otherwise be written to disk. The amount of additional memory used for this purpose is determined by the value of the standard_memory_policy parameter.
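For example, if the estimates for a job list a minimum memory of 1500 megabytes and 3000 megabytes to minimize I/O (numbers illustrative), a setting between the two values allows ABAQUS/Standard to keep part of the generic scratch data in memory:

standard_memory="2048 mb"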

The standard_memory_policy parameter

The standard_memory_policy parameter has three possible values: MINIMUM, MODERATE, and MAXIMUM.

  • Setting standard_memory_policy to MINIMUM causes an analysis to run using the minimum possible memory, which corresponds to the memory listed under “MINIMUM MEMORY REQUIRED” in the printed output file.

  • Setting standard_memory_policy to MAXIMUM allows an analysis to run using as much memory as it needs to hold all the generic scratch data in memory. The amount of memory used in this case will be the minimum of the memory listed under “MEMORY TO MINIMIZE I/O” and the value of the standard_memory parameter.

  • Setting standard_memory_policy to MODERATE (the default) causes the analysis to run holding the most performance-sensitive subset of the generic scratch data in memory. This setting typically provides good performance at a reasonable memory cost.

The memory allocated for an analysis will never exceed standard_memory, regardless of the standard_memory_policy setting.
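In the environment file the two parameters are typically set together. A minimal sketch follows; the policy value is written as a quoted string here, consistent with the other string-valued settings in this section, but check the environment files at your site for the exact convention:

standard_memory="1024 mb"
standard_memory_policy="MODERATE"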

Scratch files are always written to disk, regardless of the standard_memory_policy setting. This allows ABAQUS/Standard to restart from the last completed increment in the event that the job terminates prematurely. Writing the scratch files to disk does not affect performance significantly since this operation is typically handled efficiently by the operating systems under which ABAQUS/Standard runs. Reading from disk is generally more detrimental to performance.

Guidelines for memory settings

The standard_memory parameter allows you to configure ABAQUS/Standard to provide information as early as possible if an analysis is going to require more memory than is available. You can specify the amount of memory that should generally be available to ABAQUS/Standard on a particular machine in the host environment file. Settings can be modified as necessary for individual jobs in job-specific environment files. Reasonable settings for a particular machine depend on the size of the problems being run and how the machine is being used. Once a value for standard_memory is chosen for a machine, the standard_memory_policy parameter can be used to further tune memory use.

You should be aware of the difference between physical and virtual memory. When virtual memory is used, a machine's operating system simply uses disk for additional memory. While this can be useful, memory access may require I/O operations that add a considerable performance penalty. Therefore, the following guidelines for managing memory in ABAQUS/Standard are always given relative to the physical memory on a machine. Virtual memory should be used only when necessary and with awareness of the associated performance penalty.

Setting standard_memory on single-user machines

For a single-user machine that is dedicated to running ABAQUS/Standard, setting standard_memory to 90% of the machine's physical memory is sensible. If the estimates calculated after the initialization phase indicate that the job requires more than the allocated memory, the job is too large to run efficiently on this machine. At this point you can decide to move the analysis to another machine or to run the analysis using the machine's virtual memory, accepting the performance penalty for doing so.

For a single-user machine that is used to run both ABAQUS/Standard and other applications simultaneously, setting standard_memory to a lower percentage of the machine's physical memory makes sense. If an analysis requires more than the allocated memory, you can decide to increase standard_memory and continue the job. However, ABAQUS/Standard will have to contend with the other applications for memory, which will impair the efficiency of both ABAQUS/Standard and the other applications. If the other applications are interactive, the performance degradation could be problematic. In such a case you might decide to delay continuing the analysis until the machine can be dedicated to running ABAQUS/Standard alone.

Setting standard_memory on multi-user machines

The guidelines for setting standard_memory on a multi-user machine are very similar to those for single-user machines, except that a judgement must be made as to the amount of memory that each user on the machine can expect to have for a single analysis. A reasonable approach might be to divide the machine's physical memory by the number of expected simultaneous jobs. As with the single-user machine, if you do attempt to run a job that requires standard_memory to be larger than the machine default, ABAQUS/Standard will issue an error message after calculating the memory estimates. You can wait until machine use is less than normal before increasing standard_memory and continuing the job.

Setting standard_memory when using queues

Often queues have an associated memory limit, and determining the appropriate queue for a job requires some judgement. You are advised to run a data check analysis and select a queue based on the estimates provided in the printed output file. However, for large analyses even a data check analysis can require a large amount of standard_memory. Choosing an appropriate queue for a data check analysis requires some experience with particular classes of problems. You may want to submit data check runs initially to queues with very large memory limits to get the necessary estimates. An appropriate queue can then be chosen to actually run the job. In general, it makes sense to set standard_memory to about 90% of the memory limit for the queue.
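For example, for a queue with a 4 gigabyte (4096 megabyte) memory limit:

standard_memory="3686 mb"   # about 90% of the 4096 megabyte queue limit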

Setting standard_memory_policy

In previous versions of ABAQUS/Standard any memory available under standard_memory was used to hold all possible generic scratch data in memory. However, the resulting improvement in performance is often not significant enough to justify the considerable increase in memory use; holding a subset of the generic scratch data in memory can often provide good performance at a considerably lower memory cost. Therefore, the default value for the standard_memory_policy parameter is MODERATE. The MAXIMUM setting should be used only for analyses in which I/O to the scratch files is critical to performance, such as eigenvalue analyses and analyses with perturbation steps that do not require matrix decomposition. Even for such analyses the MAXIMUM setting should be used only when you know there is sufficient physical memory to hold all scratch files in memory. The MINIMUM setting for standard_memory_policy is provided for cases in which the memory on a machine is only just sufficient to allow an analysis to run.

Examples

The following examples illustrate the effects of the ABAQUS/Standard memory management parameters for some typical usage cases:

Single-user machine with 512 megabytes of physical memory and 1 gigabyte of virtual memory

The host environment file is used to set standard_memory to 450 megabytes. The standard_memory_policy parameter is set to MODERATE. An analysis that requires a minimum of 300 megabytes of memory is run; 1.5 gigabytes of memory is required to hold all scratch files in core. The job runs using 350 megabytes of memory since ABAQUS/Standard determines that 50 megabytes above the minimum will provide good performance. You are not required to change any memory settings.
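For this scenario the host environment file entries might read:

standard_memory="450 mb"
standard_memory_policy="MODERATE"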

For a problem that requires a minimum of 750 megabytes of memory, ABAQUS/Standard issues an error message after the estimates are calculated indicating that standard_memory is insufficient to run the job. You increase standard_memory to 800 megabytes and continue the analysis. The job completes successfully using 800 megabytes of memory, but the performance is quite poor because the extensive use of virtual memory requires considerable I/O.

Again a problem requiring a minimum of 300 megabytes of memory (and 1.5 gigabytes to hold all scratch files in memory) is run. standard_memory is set to 750 megabytes, and standard_memory_policy is set to MODERATE. The job again runs using 350 megabytes of memory. The standard_memory_policy parameter is then changed to MAXIMUM, and the job is rerun. This time it uses 750 megabytes of memory, and the elapsed time is greater than for the initial run since ABAQUS/Standard and the operating system must contend for memory.

Multi-user machine with 4 processors, 8 gigabytes of memory, and 12 gigabytes of virtual memory

The standard_memory parameter is set to 2 gigabytes in the host environment file. A job that requires a minimum of 4.5 gigabytes of memory is run. After calculating the estimates, ABAQUS/Standard issues an error message indicating that the current setting for standard_memory is insufficient. You check machine use and find that three other users are running jobs that require 2 gigabytes of memory each. You increase standard_memory to 6 gigabytes, set standard_memory_policy to MINIMUM, and continue the analysis. The job runs using 4.5 gigabytes of memory. However, since the total memory use on the machine is 10.5 gigabytes, there is considerable competition for physical memory. There is sufficient virtual memory on the machine to allow all four jobs to run, but the performance of all four jobs is quite poor. After these jobs complete, no other jobs are running on the machine. You then resubmit the job with standard_memory_policy set to MODERATE and find that the job runs using 5.3 gigabytes of memory. The performance is much better than in the initial run since only physical memory is used.

Scratch file management

In some cases the size of a particular ABAQUS/Standard scratch file may exceed the space available on an individual disk, or it may exceed a file size or file system limit for a given machine. Therefore, the following parameters can be used to split the more substantial ABAQUS/Standard scratch files into separate files either on the same disk or on different disks. For large ABAQUS/Standard jobs the factor (.fct) file will usually be the largest file and can be split using the split_fct and spill_list_fct parameters. This functionality has also been extended to allow the operator (.opr), solution (.sol), Lanczos vector (.lnz), Lanczos eigenvector (.eig), and Lanczos scratch (.scr) files to be split. You can always split the Lanczos vector, eigenvector, and scratch files in this manner. However, if the parallel Lanczos solver is used, it is more effective to specify the directories using the lanczos_scratch environment parameter as described below.

General scratch file management parameters

split_xxx

List of file sizes to be used in splitting a sparse solver file specified by xxx, which can be set to fct, opr, sol, lnz, eig, or scr. The units for the size of a given file can be specified following the file size. The available units are b (bytes), mb (megabytes), and w (words). If the units are not specified for a file size, the size is assumed to be in words. If no sizes are specified with split_xxx, the maximum file size allowed on the platform in question will be used.

spill_list_xxx

List of directories to be used for pieces of a sparse solver file specified by xxx, which can be set to fct, opr, sol, lnz, eig, or scr. If no directories are specified with spill_list_xxx, the scratch directory will be used for all the sparse solver files. This parameter is used in conjunction with split_xxx. Please refer to the example below for further explanation.

You must create the directories specified by the spill_list_xxx parameter and give them write permission.

Determining when to split files

An estimate of the size of the scratch files used in each step is printed under the heading “SIZE ESTIMATES FOR CURRENT STEP” in the printed output (.dat) file. If you are running a large job in which scratch files may exceed either the available disk space on a single disk or a file system limit, you should check the file size estimates after running a data check analysis. At this point you can split files as necessary before continuing the analysis. In most cases the only files that might need to be split are the .fct and .lnz files. Specifying spill_list_xxx for a particular file also makes it possible to move that file to a different file system without actually splitting it, which can be useful when the file simply needs to reside on a larger file system.
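For example, to relocate the entire factor file to a larger file system without splitting it, a single directory can be listed (the path is illustrative):

spill_list_fct=["/bigfs/abaqus_scratch"]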

Example: Splitting the factor (.fct) file

Consider the parameters used to split the factor (.fct) file. The syntax is as follows:

split_fct=["size_1 unit_1","size_2 unit_2","size_3 unit_3","size_4 unit_4"]
spill_list_fct=["dir_1","dir_2","dir_3","dir_4"]
The units listed for the file sizes in split_fct are optional. The first piece of the factor file (of size size_1) will be written to the file id_metsp.fct in the directory dir_1, the second piece (of size size_2) to id_metsp.fct_1 in the directory dir_2, the third piece (of size size_3) to id_metsp.fct_2 in the directory dir_3, and so on.

There are three cases that must be considered:

  1. The number of file sizes specified is equal to the number of directories specified (n = m, where n is the number of sizes given by split_fct and m is the number of directories given by spill_list_fct). In this case a file of each specified size is written to the corresponding directory. If the nth file requires a size that is greater than size_n, a new file id_metsp.fct_n+1 of maximum size size_n will be created in the directory dir_n. This process will be repeated as necessary.

  2. The number of file sizes specified is greater than the number of directories specified (n > m). In this case the first m files will be written to the corresponding directories. All subsequent files will be written to the last directory, dir_m. If more than n files are required, additional files will be created in the same fashion as in Case 1.

  3. The number of file sizes specified is less than the number of directories specified (n < m). In this case the first n files will be written to the corresponding directories. If the nth file requires a size that is greater than size_n, the file id_metsp.fct_n+1 will be written to the directory dir_n+1. This process will continue as necessary until the directory list is exhausted. All subsequent files will be written to the last directory, dir_m.
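As a concrete illustration of Case 2, consider the following settings (the sizes and directory paths are illustrative):

split_fct=["2000 mb","2000 mb","2000 mb"]
spill_list_fct=["/disk1/scr","/disk2/scr"]

Here n = 3 and m = 2: the first piece (id_metsp.fct, up to 2000 megabytes) is written to /disk1/scr, the second piece (id_metsp.fct_1) to /disk2/scr, and the third and any subsequent pieces also to /disk2/scr.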

Parallel Lanczos scratch file management

Analyses using the parallel Lanczos eigensolver (see Natural frequency extraction, Section 6.3.5) generate sets of scratch files (described above) for each of the Lanczos intervals; i.e., each Lanczos interval will have its own Lanczos vector (.lnz), Lanczos eigenvector (.eig), Lanczos scratch (.scr), and solver factor (.fct) files. These files are distinguished internally by appending the interval number to the job name; they essentially increase the disk space requirements for storing the scratch files by a factor equal to the number of Lanczos intervals.

In addition to the increase in disk space requirements, the parallel Lanczos solver performs factorizations and forward-backward passes simultaneously (and independently) on each Lanczos interval as the analysis proceeds, which will considerably increase the I/O cost as the number of intervals increases since the intervals contend for the same I/O resources. To eliminate this I/O contention, the lanczos_scratch environment parameter can be used to specify additional file systems (directories) that ABAQUS/Standard can use for scratch files; these directories must exist and have write permission prior to executing the job. Since all scratch files for a given Lanczos interval are written to the same directory, there is no need to split the individual files as described in the previous sections (and it is, therefore, not allowed). ABAQUS/Standard will use the scratch directory in addition to all the directories specified by the lanczos_scratch parameter for these scratch files. If the number of Lanczos intervals exceeds the number of directories given by scratch and lanczos_scratch combined, the files for the additional intervals will be written to scratch; conversely, if more directories are given than there are intervals, the extra directories are ignored.

Consider the following line in the environment file:

lanczos_scratch=["dir_1","dir_2","dir_3","dir_4"]
If a frequency extraction is then run with the parallel Lanczos eigensolver, the scratch files from each of the frequency intervals (except the first, or zero, interval, whose files are written to scratch) will be written to a different directory in the list. If the number of frequency intervals is greater than five (the four listed directories plus the scratch directory), the files from the additional frequency intervals will be written to the directory specified by scratch.