Project Resources

Imagix 4D uses a series of databases to load the results of data collection and to perform its analysis functions. For large projects, the amount of data collected and the resulting memory and processing requirements can become significant. Through options for controlling project contents and database storage, larger sets of source code can be supported and tool responsiveness can be improved.

Core Data

Core entity and relationship data is always kept in memory. This data forms the basis of most of the Imagix 4D analysis and resulting displays, including file, class, and function graphs, cross reference information, and source browsing. Memory requirements for the core data are dependent on the number and uses of symbols (entities) in your code.

Typically, source code containing up to 4-10 million statements can be handled in a single project by the 32-bit version of Imagix 4D. Larger projects can be supported by running the 64-bit version of Imagix 4D, available for Windows and Linux. The 64-bit version uses twice as much memory for the same project but, because it avoids the memory limit of a 32-bit application, it is able to support larger projects. For example, to do basic analysis of the 4-10 million statement project mentioned above, the 64-bit version of Imagix 4D would require 4GB of memory, twice what's available on a 32-bit system. Increasing the system memory to 8GB would enable projects containing nearly 8-20 million statements. However, with the larger projects, performance in loading, analyzing and viewing data would start to suffer.
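The arithmetic behind these figures can be sketched as follows. This is only a rough estimate that assumes capacity scales linearly with memory, which the "nearly 8-20 million" figure above suggests is a slight overestimate. The function name and parameters are hypothetical and are not part of Imagix 4D.

```python
def scaled_capacity(base_range_millions, base_memory_gb, target_memory_gb):
    """Scale a statement-count range linearly with available memory.

    Assumption: capacity grows in proportion to memory. Treat the result
    as an upper bound rather than a guaranteed figure.
    """
    factor = target_memory_gb / base_memory_gb
    low, high = base_range_millions
    return (low * factor, high * factor)


# The 64-bit version needs about 4GB for a 4-10 million statement project.
# Estimate the range for a system with 8GB.
print(scaled_capacity((4, 10), 4, 8))  # (8.0, 20.0)
```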

Memory requirements can be reduced by managing the contents of a project. One approach is to split your source code, creating separate projects that can be independently analyzed. If you’re able to create smaller projects that are meaningfully scoped, this has the additional benefit of eliminating extraneous data, enabling you to focus your analysis efforts within a given project. Comboprojects can be used to analyze the union of such projects. The memory consumed by comboprojects is the same as if the two (or more) projects had been loaded as a single, unified project.

Another approach for reducing the size of the core data is to turn off the collection of specific information. The first candidate is data about local variables, which can be omitted through an option in the specification of project data sources. Doing so will restrict some of the analysis that can be performed; in particular, the Data Flows and several of the flow check reports require local variable data. However, because the relationships of local variables are so limited in scope, most of the analysis of Imagix 4D is unaffected by omitting this data.

More comprehensive data reduction is achievable by setting up a new project as either a Light or Minimal project. By default, any new project is created as a Regular project, and a full, extensive set of data is collected about the source code. Through a setting in the Options dialog (File > Options... > Data Collection > Project Location and Resources), a step can be added during project creation to specify one of the alternative project types. A Light project eliminates collection of data about types and macros as well as local variables. A Minimal project reduces project contents further, omitting all variables; this extends the supported project size by about 4x (15-40 million statements in a 32-bit environment). In the resulting projects, graph views, metrics and reports are limited to those that are supported by the available data.
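The relationship between project type and available analysis can be pictured as a simple lookup. The sketch below is illustrative only: the category labels and the function are hypothetical names chosen for this example, not Imagix 4D identifiers, and it assumes that a Minimal project omits everything a Light project omits plus the remaining variables.

```python
# Data categories omitted by each project type, as described above.
# The category labels are illustrative names, not Imagix 4D identifiers.
OMITTED_DATA = {
    "Regular": set(),
    "Light": {"types", "macros", "local variables"},
    "Minimal": {"types", "macros", "local variables", "non-local variables"},
}


def is_supported(project_type, required_data):
    """Return True if none of the required data categories are omitted."""
    return OMITTED_DATA[project_type].isdisjoint(required_data)


# The Data Flows require local variable data.
for project_type in OMITTED_DATA:
    print(project_type, is_supported(project_type, {"local variables"}))
```

Under these assumptions, only a Regular project reports True for an analysis that needs local variable data.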

Source Check and Metric Data

Software metrics and source check information is kept in a second database. Unlike core data, metrics data is loaded on demand. By default, the resulting metrics database is loaded into memory. Being demand driven, the data size depends on what metrics are requested; as a general approximation, the data size can be assumed to be 1-2 times the size of the core data. This means that when metrics or source checks are loaded, the amount of source code that can be handled in a single project by the 32-bit version of Imagix 4D is reduced to 1-5 million statements.
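As a rough check of that figure, the sketch below divides the core-only capacity by the total memory footprint (core data plus metrics data at 1-2 times its size). It assumes that memory use is proportional to statement count; the function name and parameters are hypothetical.

```python
def capacity_with_extra_data(core_only_range_millions, extra_ratio_range):
    """Estimate capacity when extra data shares memory with the core data.

    extra_ratio_range gives the size of the extra data as a multiple of
    the core data. The largest ratio yields the lowest capacity.
    """
    low, high = core_only_range_millions
    min_ratio, max_ratio = extra_ratio_range
    return (low / (1 + max_ratio), high / (1 + min_ratio))


# Core-only capacity of 4-10 million statements, metrics data at 1-2x core.
low, high = capacity_with_extra_data((4, 10), (1, 2))
print(f"{low:.1f}-{high:.1f} million statements")  # 1.3-5.0 million statements
```

The result is in line with the 1-5 million statement range quoted above.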

These memory requirements can be reduced through the Project Resource settings (File > Options... > Data Collection > Project Location and Resources). The Metrics and Source Checks Data setting applies across all projects in an installation. With a setting of 'Load into disk database' or 'Store values between sessions', projects containing up to 3-8 million statements are possible in the 32-bit version. This reduction is achieved by using a disk database. Using disk rather than memory results in a slowdown of 2-10 times in the initial calculation of metrics data. With the 'Store values between sessions' setting, this is a one-time slowdown for a given project; the second time metric information is loaded for a project, the load time is actually faster than from memory, because the metrics have already been calculated from the raw data.

Dataflow Data

Dataflow data is kept in a third database. Like metrics data, dataflow data is loaded and calculated on demand. It is used for the Data Flows and for many of the flow check reports.

The underlying global dataflow analysis is very complex, and requires considerable time and memory. The size of the dataflow data ranges between 5 and 20 times that of core data, depending on such factors as number of non-local variables, number of functions and recursions, and cyclomatic complexity. This means that, when dataflow data is loaded, projects can handle source code containing only about 0.2-0.3 million statements in the 32-bit version of Imagix 4D. Projects of those sizes can take hours to analyze. In the 64-bit version with 16 to 32GB of memory, source code containing 2.0-3.0 million statements can be analyzed, but it might take days to complete the analysis.