Groby Computers Ltd - Leicester

Oracle Database Administration

Useful Stuff Oracle Scripts Unix Scripts

ORA-04031: unable to allocate 270424 bytes of shared memory
("large pool","unknown object","large pool","KSFQ Buffers")




Symptoms:

=========

RMAN backup fails with: ORA-04031: unable to allocate 270424 bytes of shared memory ("large pool","unknown object","large pool","KSFQ Buffers")

Oracle version – 10gR2

Actions:

========

Check database memory configuration parameters:

select name, value/1024/1024 "Size Mb"
from   v$parameter
where  name in
       (
           'sga_target', 'java_pool_size',
           'streams_pool_size', 'shared_pool_size',
           'large_pool_size', 'db_cache_size'
       );
 
NAME                         Size Mb
------------------------- ----------
shared_pool_size                   0
large_pool_size                    8
java_pool_size                     0
streams_pool_size                  0
sga_target                       512
db_cache_size                      0
 
6 rows selected.

This show that ASMM ( automatic shared memory management ) has been enabled and that the large pool should be at least 8Mb.

Find the actual size of the memory pools (example output from a later time):

select a.ksppinm  "Parameter",
       b.ksppstvl "Session Value",
       c.ksppstvl "Instance Value",
       a.ksppdesc "Description"
from   x$ksppi a, x$ksppcv b, x$ksppsv c
where  a.indx = b.indx and a.indx = c.indx
and    substr(a.ksppinm,1,2)='__'
and    a.ksppinm like '%size'
order  by a.ksppinm;
 
Parameter                   Session Value     Instance Value    Description
--------------------------- ----------------- ----------------- ------------------------------------------------------------
__db_cache_size             176160768         176160768         Actual size of DEFAULT buffer pool for standard block size b
__java_pool_size            4194304           4194304           Actual size in bytes of java pool
__large_pool_size           25165824          25165824          Actual size in bytes of large pool
__shared_io_pool_size       0                 0                 Actual size of shared IO pool
__shared_pool_size          314572800         314572800         Actual size in bytes of shared pool
__streams_pool_size         4194304           4194304           Actual size in bytes of streams pool

This SQL was ran at a later time so the sizes shown here do not reflect the sizes at the time of failure but it demonstrates the SQL needed to see if the large pool is big enough. If it is, then the size of the large pool per se is not the problem. Next we must check to see how much free space there is in the large pool.

Check free memory (example output from a later time):

select name, sum(bytes)/1024/1024 "Size Mb"
from   v$sgastat
where  pool = 'large pool'
group  by name;
 
 
NAME                                   Size Mb
----------------------------------- ----------
free memory                         15.1679688
KSFQ Buffers                        1.01953125
PX msg pool                             7.8125

Typically when this problem occurs, the large pool has plenty of free space – so why did the RMAN backup fail?

Dump the large pool contents to trace-file:

When the RMAN backup fails it will ( hopefully ) create a trace file containing all the necessary information to confirm what the problem is but if not, you can force a HEAPDUMP of the large pool using the statement below but note this may not demonstrate the issue under discussion, especially if the instance has recently been started ...

ALTER SESSION SET EVENTS 'immediate trace name heapdump level 32';

( see Julian dyke for more info )

Looking in the trace file generated it tells us why it was unable to allocate the required amount of space but it is not obvious and at first sight appears contradictory...

  1. 1) In the section headed ‘HEAP DUMP heap name="large pool"’ we can see
    1. a. how many extents were allocated to the large pool – e.g. 29 - EXTENTS 0 to 28
    2. b. size of large pool i.e. 29 * granule size ( 4M ) = 116Mb
  2. 2) Following on after the dump of the extents, the trace file shows the Free Lists – a sort of ‘index’ of the free memory and at the bottom of this is the total that was free at the time the trace file was generated:
    1. a. Total free space = 74682072 bytes
  3. 3) Result of RMAN backup – fail!
    1. a. ORA-04031: unable to allocate 270424 bytes of shared memory ("large pool","unknown object","large pool","KSFQ Buffers")

Despite the fact that there was 74682072 bytes of free memory in the large pool, the backup failed trying to allocate 270424 bytes. We can see why if we go back to the section headed ‘HEAP DUMP heap name="large pool"’. For example, here is part of the trace file showing what was stored in the beginning of Extent 26 ...

EXTENT 26 addr=0x62800000
  Chunk        062800058 sz=     8072    free      "               "
  Chunk        062801fe0 sz=     8224    no access "KGH: NO ACCESS
  Chunk        062804000 sz=     8160    free      "               "
  Chunk        062805fe0 sz=    16416    no access "KGH: NO ACCESS
  Chunk        06280a000 sz=     8160    free      "               "
  Chunk        06280bfe0 sz=    16416    no access "KGH: NO ACCESS
  Chunk        062810000 sz=    32736    free      "               "
  Chunk        062817fe0 sz=    16416    no access "KGH: NO ACCESS
  Chunk        06281c000 sz=     8160    free      "               "
  Chunk        06281dfe0 sz=     8224    no access "KGH: NO ACCESS

The KGH: NO ACCESS chunks are ASMM allocating space to somewhere else leaving lots of little bits of free space. So the backup fails due to fragmentation of the large pool caused by ASMM!

Solution:

=========

The position is now clearer and might have implications for all databases using ASMM and maybe even AMM in 11g.

To stop ASMM from fragmenting the space allocated to the large pool, it must be configured to have enough to do the RMAN backup. When RMAN does a full database backup including the controlfile the amount of space it requires is [ see metalink note id 336313.1 ] ...

size = 32Mb + ( channels * 4 * ) # table blocksize may be 1Mb

Comment:

========

If the only reason you have a large pool is to help RMAN and the backups run during quiet times, then it seems rather a waste to permanently allocate space in this way as it won’t then be available for other pools during the working day.

In practice, on the site that experienced the problem as described above – it was possible to set the large_pool_size to 24M and for RMAN to work fine but I guess this approach will always be a bit of a gamble.

Page Updated Thu Oct 13 21:43:39 BST 2011