ADABAS Performance With Cache

By Dieter W. Storr

Based on a Presentation at ADABAS

Symposium; New Orleans, Louisiana; 1992

Last update: May 13, 1998

Contents

General

External memory functions

Internal memory functions

Cache possibilities with ADABAS

Comparison

General

During performance analyses in Europe and America I have found, again and again, that poor on-line response times and long run times for batch applications are caused by high numbers of ADABAS calls and high rates of disk I/O. I have often counted between 10 and 20 million ADABAS calls per day; sometimes more.

Within a database/data-communication transaction I have discovered many table look ups, some of which were redundant. I have discovered awkward programming. I have discovered inefficient physical data modeling (denormalizing).

There are many ways of improving performance: for some users, the large number of commands could be reduced by redesigning the programs to improve database accesses or by installing PREFETCH. One of the best ways is to reduce the physical I/Os per ADABAS command, per DC transaction, or for the entire application. If it is not possible to minimize the I/Os, you should use logical I/Os instead of physical ones, for example by using internal storage or the ADABAS buffer pool.

Hardware manufacturers offer other possibilities: Cache controllers, solid state devices, and main storage extensions. Let me expand a little on the individual components of internal and external storage structures.

External memory functions

In the mainframe world of IBM (MVS / VSE) and SNI (BS2000), manufacturers have tried in the past to optimize the very slow mechanical disk access.

Cache controller with read cache

IBM cache controllers have existed since the introduction of the 3880-3 and 3880-23, and of compatible PCM controllers. The cache is controlled by microcode and improves read operations. With the newer model 3990, the cache improves the average access time from 10-30 ms to 3-4 ms.

In the BS2000 area (SNI), the 3860-42 control units with 3490 disks are not yet available with cache. The average access time is between 16.6 ms and 19.6 ms (H120), or 24.3 ms (H60).

When a read command is issued, the controller first looks for the requested block in the cache. If a copy of the data is in the cache (read hit), it is sent directly to the channel; if not (read miss), it is read from DASD, and the rest of the track (3990 model 3) is written into the cache for subsequent reads. This is called staging. In this way, read hits improve performance by saving I/Os. Read misses take as long as normal read I/Os, but can have a greater impact on the entire I/O subsystem, because they keep the disks and the controller busy with staging.
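
The following minimal Python sketch illustrates the read-hit/read-miss and staging behaviour described above. It is a conceptual model only, not controller microcode; the block and track sizes, the LRU replacement policy, and all names are illustrative assumptions.

    from collections import OrderedDict

    class ReadCacheController:
        def __init__(self, capacity_blocks, blocks_per_track=12):
            self.cache = OrderedDict()          # block number -> data, in LRU order
            self.capacity = capacity_blocks
            self.blocks_per_track = blocks_per_track
            self.hits = self.misses = 0

        def read(self, block, dasd):
            if block in self.cache:             # read hit: satisfied from the cache
                self.hits += 1
                self.cache.move_to_end(block)
                return self.cache[block]
            self.misses += 1                    # read miss: go to DASD ...
            data = dasd[block]
            track_start = block - block % self.blocks_per_track
            for b in range(track_start, track_start + self.blocks_per_track):
                if b in dasd:                   # ... and stage the rest of the track
                    self._put(b, dasd[b])       #     for subsequent reads
            return data

        def _put(self, block, data):
            self.cache[block] = data
            self.cache.move_to_end(block)
            while len(self.cache) > self.capacity:
                self.cache.popitem(last=False)  # evict the least recently used block

    dasd = {n: f"block-{n}" for n in range(120)}    # stand-in for a DASD volume
    ctl = ReadCacheController(capacity_blocks=48)
    for n in (3, 4, 5, 3, 40, 41, 3):               # re-reads of a staged track are hits
        ctl.read(n, dasd)
    print(f"hits={ctl.hits} misses={ctl.misses}")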

Cache controller with write cache

Controllers with write cache have been supported since the newer IBM models (3990-3) came onto the market. These controllers have the same basic caching functionality as mentioned above, extended to write operations. If a copy of the data is in the cache (write hit), it is updated and also written to DASD (branching transfer); if it is not (write miss), it is written to DASD, but not to the cache.

The new cache controller 3990 model 3 also provides

DASD Fast Write

One of the most important facilities is DASD fast write. With this capability, data for write hits is stored simultaneously, at channel speed, in the cache and in nonvolatile storage (NVS), and the controller signals successful completion of the I/O to the CPU. Only when the cache or the NVS needs space is the data written asynchronously to DASD.
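
To make the write path concrete, here is a comparable Python sketch of the DASD fast write idea: an update is kept in cache and in NVS, completion is reported immediately, and DASD is only written later. The NVS size, the destage trigger, and the recovery step (which anticipates the power-failure discussion below) are illustrative assumptions, not the actual 3990 behaviour.

    class FastWriteController:
        def __init__(self, nvs_capacity):
            self.cache = {}          # volatile cache: block -> data
            self.nvs = {}            # nonvolatile storage: block -> data
            self.dasd = {}           # the disk itself
            self.nvs_capacity = nvs_capacity

        def write(self, block, data):
            # the write is considered complete once cache and NVS hold the data;
            # DASD is not touched yet
            self.cache[block] = data
            self.nvs[block] = data
            if len(self.nvs) >= self.nvs_capacity:
                self.destage()       # write behind: DASD is updated when NVS fills up

        def destage(self):
            for block, data in list(self.nvs.items()):
                self.dasd[block] = data          # copy the pending update to DASD
                del self.nvs[block]              # free the NVS slot

        def power_fail_recovery(self):
            # NVS survives the outage; pending updates are written to DASD at restart
            self.destage()

    ctl = FastWriteController(nvs_capacity=4)
    for blk in range(3):
        ctl.write(blk, f"update-{blk}")
    print("on DASD before destage:", sorted(ctl.dasd))   # []
    ctl.power_fail_recovery()
    print("on DASD after recovery:", sorted(ctl.dasd))   # [0, 1, 2]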

But what happens if there is a power failure?

What effects does this have on ADABAS recovery?

When DASD fast write is active, the cache controller tries to guarantee data integrity during a power failure by keeping the data in NVS until the system is brought up again. The system then automatically writes all data changes to disk. However, the responsibility for data integrity lies with the user and the hardware supplier. Software AG does not give any guarantee in this case.

Electronic storage - solid state device (SSD)

Solid state devices are becoming more and more common and are employed for very fast accesses. They are characterized by the absence of seek and queue times and by a very fast channel-to-cache transfer rate (at channel speed). They have their own hard disk and power supply, and in the case of a power failure they can automatically write the data from internal storage to the disk within 8 minutes.

Candidates for SSDs in the MVS environment are page datasets, swap datasets, control files (RACF, JES), log files, catalogs, the TSO broadcast dataset, ISPF datasets, CLIST datasets, VSAM indexes, or frequently used medium-sized datasets.

For example, the electronic storage device Amdahl 6680, an Electronic Direct Access Storage (EDAS) product for the IBM-MVS world, was installed at a German insurance company. EDAS provides

  • 4.5 Mbyte/sec data transfer rate
  • 32-512 Mbyte cache
  • Split into 16 logical volumes, each up to 128 Mbyte
  • Up to 4 simultaneous accesses per volume from 4 CPUs

    The SSD is approx. 20 times faster than normal disks. The data can be stored in a common format (e.g. 3380, but not yet 3390).

    In addition, Memorex offers the 6898 solid state device with a 6890 controller, with comparable specifications. With this device, a large company in northern Germany has improved its performance significantly.

    In the VSE environment, a user in Austria produced good results with a CPX 6580-1 SSD system. Combinations of 3-8 logical units (3380) with 64-256 Mbyte cache are possible. The load time for the hard disk at power on is 4 minutes, the unload time for the hard disk at power off is 8 minutes.

    Also for the SNI/BS2000 environment there is an external 3410 SSD with a maximum of 202 Mbyte of storage for the H120 and an access time of only 0.3 ms. This SSD has a hard disk and a battery and can write the data automatically from its internal storage to the disk within 8 minutes.

    Internal memory functions

    Main storage with cache

    At the beginning I pointed out that the best way to improve performance is to satisfy disk accesses without physical I/Os. With cache controllers and solid state devices (SSDs), an improvement in access times can already be achieved. However, you can achieve a 25-fold improvement in access times if the data are stored and held in main storage. Disk accesses are measured in milliseconds (thousandths of a second), accesses to main storage in nanoseconds (billionths of a second). The maximum of 16 Mbyte of address space available must be shared by the programs and the buffers.
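
    A back-of-the-envelope calculation makes the magnitude of this difference visible. The access times below are rough assumptions within the ranges mentioned in this text, not measurements:

        # Rough comparison of physical vs. in-memory access for one transaction.
        DISK_ACCESS_S   = 0.020      # ~20 ms per physical disk I/O (assumed)
        MEMORY_ACCESS_S = 100e-9     # ~100 ns per main-storage access (assumed)

        calls_per_transaction = 200  # illustrative number of block accesses
        print("all from disk:   %.3f s" % (calls_per_transaction * DISK_ACCESS_S))
        print("all from memory: %.6f s" % (calls_per_transaction * MEMORY_ACCESS_S))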

    In the SNI/BS2000 environment, a disk access buffer (DAB) is used as a cache for prereading records. Since ADABAS Version 5.1 was released this DAB has not been able to improve ADABAS performance any further.

    With the introduction of MVS-XA, BS2000-XS, MVS-ESA, and VSE-ESA some changes and enhancements can be detected in the storage hierarchy.

    Extended Storage

    Under MVS-XA, BS2000-XS and MVS-ESA the buffer pools and programs may be placed in extended storage above the 16 Mbyte line. The limit of this extended storage is 2 gigabytes and this storage is part of the real memory.

    Expanded Storage

    Under MVS-ESA with the 3090 architecture, additional storage extensions have been created. The data-only spaces that are available for your programs are called data spaces and hiperspaces. These spaces are similar in that both are areas of virtual storage that your program can ask the system to create. Their size can be anything between 4 kilobytes and 2 gigabytes, as your program requests. Unlike an address space, a data space or hiperspace only contains user data; it does not contain system control blocks or common areas. Program code cannot run in a data space or hiperspace.

    To be able to compare data manipulation performance in data spaces and hiperspaces, you should understand how the system "backs" these two virtual storage areas. It uses the same resources to back data space virtual storage as it uses to back address space virtual storage: a combination of central storage and expanded storage (if available) frames, and auxiliary storage slots. The system can move infrequently used pages of data space storage to auxiliary storage and bring them in again when your program references them. The paging activity for data spaces includes I/Os between auxiliary storage paging devices and central storage. A program can reference data in a data space directly. It addresses the data by the byte, manipulating, comparing, and performing arithmetic operations.

    The system backs hiperspace virtual storage either with expanded storage only, or with a combination of expanded and auxiliary storage, depending on your choice. When you create a hiperspace, the system gives you storage that will not be the target of ESA/370 instructions and will not need to be backed by real storage frames. Therefore, when the system moves data from hiperspace to address space, it can make the best use of the available resources.

    In contrast to a data space, a program does not directly access the data in a hiperspace. Instead, MVS provides a system service, the HSPSERV macro, to transfer the data between an address space and a hiperspace in 4 Kbyte blocks. The read operation transfers the blocks of data from the hiperspace into the address space buffer, where the program can manipulate the data.
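
    The following Python fragment is a conceptual model of this block-transfer style of access; it only mimics the idea of moving 4 Kbyte blocks between a hiperspace and an address-space buffer. The method names and the dictionary-backed storage are illustrative assumptions, not the real HSPSERV macro interface.

        BLOCK = 4096

        class Hiperspace:
            """Data is kept in 4K blocks and is not byte-addressable by the program."""
            def __init__(self):
                self._blocks = {}

            def swrite(self, first_block, buffer: bytes):
                # transfer whole 4K blocks from the address-space buffer to the hiperspace
                for i in range(0, len(buffer), BLOCK):
                    self._blocks[first_block + i // BLOCK] = buffer[i:i + BLOCK]

            def sread(self, first_block, count) -> bytes:
                # transfer whole 4K blocks back into an address-space buffer
                return b"".join(self._blocks[first_block + n] for n in range(count))

        hs = Hiperspace()
        record = b"customer 4711".ljust(BLOCK, b"\x00")   # pad to a full block
        hs.swrite(0, record)                              # write one 4K block
        buffer = hs.sread(0, 1)                           # read it back into a buffer
        print(buffer[:13])                                # manipulate it in the buffer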

    You have a choice of creating a standard hiperspace or an ESO (expanded storage only) hiperspace. The standard hiperspace is backed with expanded storage and auxiliary storage. The ESO hiperspace is backed with expanded storage only. To back this storage, the system does not use auxiliary storage slots; data movement does not include paging I/O operations. However, during peak use:

  • The system may not be able to back the data you are writing to the hiperspace
  • The system may take away the expanded storage that backs the hiperspace.

    These actions mean that data in an ESO hiperspace are volatile. The program must be prepared to read data from a permanent backup copy on DASD or recreate the data that was in the hiperspace. When the system swaps an address space out, it discards the data in any hiperspace that is owned by TCBs that are running in the address space. For this reason, you might consider making such an address space non-swappable.

    Cache possibilities with ADABAS

    ADABAS buffer pool

    ADABAS is able to read ASSO/DATA blocks from DASD into an internal storage buffer (the buffer pool), to change the data there, and to hold them for a long time, in order to reduce the physical I/Os should the data need to be reused. The buffer size is determined with the ADARUN parameter LBP. The session statistics report the ratio of logical to physical I/Os (buffer efficiency).
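
    A small worked example of the buffer efficiency figure, with invented numbers:

        # Buffer efficiency as used here: logical reads satisfied per physical read.
        logical_reads  = 12_500_000   # blocks requested by ADABAS commands (invented)
        physical_reads =    250_000   # blocks actually read from ASSO/DATA on DASD (invented)
        print(f"buffer efficiency = {logical_reads / physical_reads:.0f}")   # 50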

    Under ADABAS Version 4, values higher than 2.5 Mbyte did not improve performance. With Version 5.1 in combination with MVS-XA or BS2000-XS, the buffer pool may be placed in extended storage (above the 16 Mbyte line). ADABAS Version 5.2 has an improved buffer algorithm and a new upper-upper-header in the buffer pool for rapid retrieval in large buffers. However, you should keep in mind that big ADABAS buffers may cause an increase in the paging rate.

    The ADABAS buffer pool can be placed above the 16 Mbyte line by simply linking the ADARUN module with AMODE(31) and RMODE(24).

    Some users have already used buffers of approx. 50 Mbyte and have obtained very good performance values under ADABAS Version 5.1. With ADABAS 5.2, a German bank has achieved a surprisingly good buffer efficiency of up to 64 with a buffer size (LBP) of 128 Mbyte, despite 50-60 million calls per day and approx. 7000 terminals.

    ADABAS FASTPATH (AFP)

    Applications developed according to the entity-relationship model very often lead to normalized data designs with multiple tables. Some attributes of good designs are the independence of the programs from the physical data model and a close correlation between the data model and the business model. Normalized designs should produce a logical data model which gives flexibility and is easy to understand. However, such normalized designs result in an inefficient physical data model with potentially high data access rates. Where denormalization is not performed, especially with ADABAS, a high number of I/Os and high CPU consumption will result.

    What is the benefit of ADABAS FASTPATH (AFP) and how does it work?

    A local buffer (AFP buffer) is installed in the ADABAS link module in the user address space. When the application program is started, the whole of the relevant file will be transferred into this cache memory. When the records of the cached file are read, no ADABAS calls will be sent to the nucleus. This means no interregion communication and no I/Os. Also, these cached blocks are not transferred via the nucleus and therefore these blocks cannot monopolize the buffer pool.
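
    The following Python sketch captures the idea just described: a local buffer in front of the nucleus call answers reads for a cached file without any inter-region communication. All names are illustrative assumptions; this is not the actual AFP interface.

        class LinkModuleWithLocalBuffer:
            def __init__(self, nucleus_call, cached_files):
                self.nucleus_call = nucleus_call        # the normal inter-region call
                self.local = {}                         # (file, isn) -> record
                self.cached_files = set(cached_files)

            def preload(self, file_no, records):
                # at program start the relevant file is copied into the local buffer
                for isn, record in records.items():
                    self.local[(file_no, isn)] = record

            def read(self, file_no, isn):
                key = (file_no, isn)
                if file_no in self.cached_files and key in self.local:
                    return self.local[key]              # no call to the nucleus, no I/O
                return self.nucleus_call(file_no, isn)  # everything else goes to ADABAS

        def nucleus(file_no, isn):                      # stand-in for the ADABAS nucleus
            print(f"nucleus call: file {file_no}, ISN {isn}")
            return {"isn": isn}

        link = LinkModuleWithLocalBuffer(nucleus, cached_files=[17])
        link.preload(17, {1: {"code": "DE"}, 2: {"code": "US"}})  # a small lookup table
        link.read(17, 1)     # served from the local buffer
        link.read(99, 5)     # not cached, goes through the nucleus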

    With the installation of a global buffer, multiple address spaces can access this cached memory.

    Updates, deletes, and adds can cause problems by changing values in the AFP buffer and on DASD. However, it is possible to control how update commands are handled by means of parameters. Different levels determine the flexibility of access: ignore updates, disallow updates, allow updates and purge the set(s) from AFP processing, or allow updates via the ISN index.

    ADACSH - Dynamic Caching

    With ADABAS Dynamic Caching, it is possible to allocate additional cache storage in different areas of the operating system. Under MVS-XA, extended storage (XA cache) is available; under MVS-ESA, either extended storage (XA cache) or expanded storage (ESA cache) can be used.

    The fundamental mode of operation of both caches is identical. The goal of the XA cache and the ESA cache is to improve ADABAS performance by augmenting the ADABAS 5 buffer manager. Dynamic caching improves performance by reducing the number of read EXCPs to the database.

    ASSO and DATA blocks remain in the ADABAS buffer pool for as long as possible. When no more space is available for new blocks, the oldest one is overwritten. Very costly accesses in terms of I/Os are necessary to read these blocks again. Caching can eliminate or reduce the number of I/Os required. In other words, if the buffer pool does not contain enough space for new blocks, the unused blocks are moved into a cache area. This cache area is like an overflow area or a working set in the buffer pool. The max. sizes of these cache areas are 2 Gbyte for the data space, 2 Gbyte for the hiperspace and 2 Gbyte minus 16 Mbyte for extended memory.
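
    The overflow behaviour can be sketched as follows; the sizes and the FIFO policies are illustrative assumptions, and the cache dictionary simply stands in for a data space, hiperspace, or extended-memory area:

        from collections import OrderedDict

        class BufferPoolWithCacheArea:
            def __init__(self, pool_blocks, cache_blocks):
                self.pool = OrderedDict()       # rabn -> block
                self.cache = OrderedDict()      # overflow area (e.g. a hiperspace)
                self.pool_blocks, self.cache_blocks = pool_blocks, cache_blocks
                self.disk_reads = 0

            def get(self, rabn, dasd):
                if rabn in self.pool:
                    return self.pool[rabn]
                if rabn in self.cache:          # logical I/O: copy back from the cache area
                    block = self.cache.pop(rabn)
                else:                           # physical I/O: read from DASD
                    self.disk_reads += 1
                    block = dasd[rabn]
                self._put(rabn, block)
                return block

            def _put(self, rabn, block):
                self.pool[rabn] = block
                if len(self.pool) > self.pool_blocks:
                    old_rabn, old_block = self.pool.popitem(last=False)
                    self.cache[old_rabn] = old_block          # displaced, not discarded
                    if len(self.cache) > self.cache_blocks:
                        self.cache.popitem(last=False)        # cache area full: drop oldest

        dasd = {rabn: f"block-{rabn}" for rabn in range(100)}
        bp = BufferPoolWithCacheArea(pool_blocks=4, cache_blocks=16)
        for rabn in [1, 2, 3, 4, 5, 6, 1, 2]:   # 1 and 2 come back from the cache area
            bp.get(rabn, dasd)
        print("physical reads:", bp.disk_reads)  # 6 instead of 8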

    Completely new is the possibility of creating cache areas for the ADABAS work dataset (WORK) parts 2 and 3, in order to improve performance in environments that service large numbers of complex queries. As you know, the sort work space (LS) is part of the work pool (LWP), and overflows are moved to WORK part 2. WORK I/Os on part 3 can be reduced for non-selective FINDs with high ISN quantity values by applying the parameter NSISN.

    All blocks held in the cache must be moved back into the buffer pool before ADABAS read or write commands can be performed on them.

    New ADARUN session parameters determine the type of cache for ASSO, DATA, and WORK parts 2 and 3, the size of the cache, which ASSO and DATA RABNs are cached, and so on.

    Comparison

    What are the advantages and disadvantages of the different kinds of storage and cache? Which storage should be used? Which technology will hold up in the future, and what can be expected from further developments by the hardware and software vendors?

    Cache controllers with read cache

    If little or no main storage is available and the accesses are predominantly reads, a cache controller with read cache is well suited. Using this cache for ASSO and DATA can be an advantage if only a small buffer pool is available. The more ADABAS itself caches, the less effective a controller with read cache becomes. Using the cache for the WORK and PLOG datasets is not recommended, because they mostly generate write I/Os.

    The rule of thumb is: if the read hit ratio exceeds 70 %, a cache controller with read cache can improve performance. But note that cached DASD can put a greater load on the whole I/O subsystem than uncached DASD, because read misses keep the disks and the controller busy with staging.
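
    A short calculation shows what such a hit ratio means for the effective access time. The hit and miss times are assumptions picked from the 3990 ranges quoted earlier; the hit ratios are arbitrary:

        HIT_TIME_MS  = 3.5    # read hit served from the controller cache (assumed)
        MISS_TIME_MS = 20.0   # read miss: normal DASD access, plus staging load (assumed)

        for hit_ratio in (0.5, 0.7, 0.9):
            effective = hit_ratio * HIT_TIME_MS + (1 - hit_ratio) * MISS_TIME_MS
            print(f"hit ratio {hit_ratio:.0%}: effective access {effective:.1f} ms")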

    Cache controller with write cache and DASD fast write

    DASD fast write only has an effect on write hit operations. The block that ADABAS writes back must already be in the cache; therefore a previous read miss for the same track must have occurred before a write hit can be expected, and even then the block must not have been overwritten in the cache in the meantime.

    ASSO and DATA blocks in the write cache can reduce the duration of a buffer flush. The nonvolatile storage in the cache controller should be big enough to hold all the blocks of the next buffer flush. The 4 Mbyte in the IBM controller should be enough.

    Since ADABAS 5.2, asynchronous buffer flushes are possible. After the I/O pool has been filled from the buffer pool, the ADABAS nucleus regains control and the disk I/Os are started asynchronously. In this case the duration of a buffer flush no longer matters, and the flush is not a candidate for DASD fast write either.

    What about the other ADABAS components? The protection log and WORK part 1 are not candidates, because a read hit presupposes a previous read miss, and PLOG and WORK part 1 are rarely read. WORK parts 2 and 3 are written first and only possibly read afterwards.

    Therefore, ADABAS is a very poor candidate for caching in the controller.

    Solid state device

    In the past, ADABAS users with large numbers of I/Os and long response times have solved their problems by acquiring a solid state device. These users were not able to create big buffers, because older ADABAS versions did not support big buffers, the operating system did not support XA, or big buffers increased the paging rate dramatically. For these users an SSD was the solution, but the price paid for this solution was very high.

    When using such SSDs, you have to determine exactly which ADABAS components are candidates for caching. Highly frequented parts of ASSO that are not held in the buffer pool are candidates. WORK part 1 with a high update rate and WORK part 2 with many complex FINDs are candidates, too.

    However, it is hard work to find the candidates. You need an ADABAS performance monitor that collects the I/Os by file number, by ASSO component (AC, NI, UI), and by WORK parts 1-3.

    Main storage, expanded memory, and extended memory

    Since ADABAS Version 5.1 the basic rule is: the bigger the buffer pool, the fewer the physical I/Os. However, reducing the physical I/Os also means higher CPU time. With release 5.2 the buffer algorithm was improved and an additional upper-upper-header was created. Now, increasing the buffer pool provides greater performance benefits at lower CPU cost.

    What is the benefit of ADABAS dynamic caching? Krupp Mak Kiel started with ADABAS 5.2.2 in a test environment in March 1992. ADABAS dynamic caching (ADACSH) will be released with 5.2.3, and we will install ADACSH next week. So I expect to be able to report on our experiences at the Users' Conference in New Orleans in May 1992.

    The SAG Users' Group of Germany discussed this new feature with SAG-DA during the last meeting of the SIG DBA, and we pointed out that increasing the buffer pool is the best thing to do in most installations.

