Tuesday, December 7, 2010

Solaris Error: 32: Broken pipe or Linux Error: 32: broken pipe

Yesterday I found the warnings below in the alert log file.
At the same time, some clients informed us that they could not log in
to the application; after some time they were able to log in to the
database again without any intervention. So we tried to find the cause and a solution. See below.



It is a dedicated server environment.

In parameter file
----------------------
processes = 4000



===========================================
Content of alert log file.
===========================================


Mon Dec 6 16:03:00 2010
Thread 1 advanced to log sequence 67690 (LGWR switch)
Current log# 3 seq# 67690 mem# 0: /d01/oracle/oradata/stlbas/redo03.log
Mon Dec 6 16:20:39 2010
Process J000 died, see its trace file
Mon Dec 6 16:20:39 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:20:39 2010
Errors in file /d04/admin/stlbas/bdump/stlbas_cjq0_1885.trc:

Process J000 died, see its trace file
Mon Dec 6 16:20:45 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:20:45 2010
Errors in file /d04/admin/stlbas/bdump/stlbas_cjq0_1885.trc:

Mon Dec 6 16:21:01 2010
Process J000 died, see its trace file
Mon Dec 6 16:21:01 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:21:01 2010
Errors in file /d04/admin/stlbas/bdump/stlbas_cjq0_1885.trc:

Mon Dec 6 16:21:17 2010
Process J000 died, see its trace file
Mon Dec 6 16:21:17 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:21:17 2010
Errors in file /d04/admin/stlbas/bdump/stlbas_cjq0_1885.trc:

Mon Dec 6 16:36:24 2010
Process J000 died, see its trace file
Mon Dec 6 16:36:24 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:36:24 2010
Errors in file /d04/admin/stlbas/bdump/stlbas_cjq0_1885.trc:

Process J000 died, see its trace file
Mon Dec 6 16:36:30 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:36:30 2010
Errors in file /d04/admin/stlbas/bdump/stlbas_cjq0_1885.trc:

Mon Dec 6 16:36:41 2010
Process J000 died, see its trace file
Mon Dec 6 16:36:41 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:36:41 2010
Errors in file /d04/admin/stlbas/bdump/stlbas_cjq0_1885.trc:

Mon Dec 6 16:38:18 2010
Thread 1 advanced to log sequence 67691 (LGWR switch)
Current log# 4 seq# 67691 mem# 0: /d01/oracle/oradata/stlbas/redo04.log
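
To see how the spawn failures cluster in time, the alert log can be summarized per minute. The following is only a minimal sketch: the function name spawn_failures_per_minute is hypothetical, and it assumes every message is preceded by a timestamp line in the format shown above.

```python
import re
from collections import Counter
from datetime import datetime

# Alert-log timestamp lines look like "Mon Dec 6 16:20:39 2010".
TS_RE = re.compile(r"^[A-Z][a-z]{2} [A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}$")
SPAWN_MSG = "unable to spawn jobq slave process"


def spawn_failures_per_minute(lines):
    """Count 'unable to spawn jobq slave process' messages per minute."""
    counts = Counter()
    current = None  # timestamp of the most recent timestamp line
    for raw in lines:
        line = raw.strip()
        if TS_RE.match(line):
            # Collapse repeated spaces so single-digit days parse cleanly.
            current = datetime.strptime(" ".join(line.split()), "%a %b %d %H:%M:%S %Y")
        elif SPAWN_MSG in line and current is not None:
            counts[current.strftime("%Y-%m-%d %H:%M")] += 1
    return counts


# A few lines copied from the alert log above.
sample = """\
Mon Dec 6 16:20:39 2010
Process J000 died, see its trace file
Mon Dec 6 16:20:39 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:20:45 2010
kkjcre1p: unable to spawn jobq slave process
Mon Dec 6 16:21:01 2010
kkjcre1p: unable to spawn jobq slave process
""".splitlines()

for minute, n in sorted(spawn_failures_per_minute(sample).items()):
    print(minute, n)
```

For a real alert log, pass the open file object instead of the sample list.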



====================================
Contents of the trace file
====================================


/d04/admin/stlbas/bdump/XXXX_cjq0_1885.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, Oracle Label Security, OLAP, Data Mining Scoring Engine
and Real Application Testing options
ORACLE_HOME = /d04/oracle/ora102
System name: SunOS
Node name:XXXX
Release: 5.10
Version: Generic_142900-07
Machine: sun4v
Instance name: XXXX
Redo thread mounted by this instance: 1
Oracle process number: 26
Unix process pid: 1885, image: oracle@XXXX (CJQ0)

*** SERVICE NAME:(SYS$BACKGROUND) 2010-12-01 13:24:30.248
*** SESSION ID:(2191.1) 2010-12-01 13:24:30.248
*** 2010-12-01 13:24:30.248
Process J000 is dead (pid=25006, state=3):
*** 2010-12-01 13:24:37.277
Process J000 is dead (pid=25012, state=3):
*** 2010-12-01 13:59:56.397
Process J000 is dead (pid=1185, state=3):
*** 2010-12-01 15:55:42.277
Process J000 is dead (pid=15686, state=3):
*** 2010-12-06 16:20:39.526
Process J000 is dead (pid=9458, state=3):
*** 2010-12-06 16:20:45.565
Process J000 is dead (pid=9480, state=3):
*** 2010-12-06 16:21:01.641
Process J000 is dead (pid=9518, state=3):
*** 2010-12-06 16:21:17.712
Process J000 is dead (pid=9550, state=3):
*** 2010-12-06 16:36:24.213
Process J000 is dead (pid=11942, state=3):
*** 2010-12-06 16:36:30.238
Process J000 is dead (pid=11970, state=3):
*** 2010-12-06 16:36:41.289
Process J000 is dead (pid=11998, state=3):



===============================================================
Contents of the listener log file (for that time period only)
===============================================================

06-DEC-2010 16:16:21 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=001zohur))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1771)) * establish * STLBAS * 0
06-DEC-2010 16:16:21 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\RWRBE60.exe)(HOST=APPLICATION-02)(USER=154shahadat))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.32)(PORT=1528)) * establish * STLBAS * 0
06-DEC-2010 16:16:21 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\RWRBE60.exe)(HOST=APPLICATION-07)(USER=043sathekur))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1758)) * establish * STLBAS * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12547: TNS:lost contact
TNS-12560: TNS:protocol adapter error
TNS-00517: Lost contact
Solaris Error: 32: Broken pipe
06-DEC-2010 16:16:21 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=038alfee))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1772)) * establish * STLBAS * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12547: TNS:lost contact
TNS-12560: TNS:protocol adapter error
TNS-00517: Lost contact
Solaris Error: 32: Broken pipe
06-DEC-2010 16:16:22 * service_update * stlbas * 0
06-DEC-2010 16:16:24 * service_update * stlbas * 0
06-DEC-2010 16:16:25 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=003aftab))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1794)) * establish * STLBAS * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12547: TNS:lost contact
TNS-12560: TNS:protocol adapter error
TNS-00517: Lost contact
Solaris Error: 32: Broken pipe
06-DEC-2010 16:16:25 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\RWRBE60.exe)(HOST=NEW-SUN-APP)(USER=023sohel))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.37)(PORT=4291)) * establish * STLBAS * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12547: TNS:lost contact
TNS-12560: TNS:protocol adapter error
TNS-00517: Lost contact
Solaris Error: 32: Broken pipe
06-DEC-2010 16:16:27 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=038alfee))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1796)) * establish * STLBAS * 0
06-DEC-2010 16:16:27 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=038alfee))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1797)) * establish * STLBAS * 0
06-DEC-2010 16:16:28 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\RWRBE60.exe)(HOST=APPLICATION-11)(USER=030salah))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.33)(PORT=3982)) * establish * STLBAS * 0
06-DEC-2010 16:16:29 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=stlbas)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=REPORTS_APP)(USER=504refayet))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.233)(PORT=4995)) * establish * stlbas * 0
06-DEC-2010 16:16:29 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=019aporna))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1800)) * establish * STLBAS * 0
06-DEC-2010 16:16:29 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=stlbas)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=REPORTS_APP)(USER=504refayet))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.233)(PORT=4996)) * establish * stlbas * 0
06-DEC-2010 16:16:30 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=003aftab))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1801)) * establish * STLBAS * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12547: TNS:lost contact
TNS-12560: TNS:protocol adapter error
TNS-00517: Lost contact
Solaris Error: 32: Broken pipe
06-DEC-2010 16:16:30 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=019aporna))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1802)) * establish * STLBAS * 12518
TNS-12518: TNS:listener could not hand off client connection
TNS-12547: TNS:lost contact
TNS-12560: TNS:protocol adapter error
TNS-00517: Lost contact
Solaris Error: 32: Broken pipe
06-DEC-2010 16:16:30 * service_update * stlbas * 0
06-DEC-2010 16:16:30 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=stlbas)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=REPORTS_APP)(USER=501azad))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.233)(PORT=4997)) * establish * stlbas * 0
06-DEC-2010 16:16:31 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=stlbas)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=REPORTS_APP)(USER=501azad))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.233)(PORT=4999)) * establish * stlbas * 0
06-DEC-2010 16:16:32 * service_update * stlbas * 0
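
In the excerpt above, 6 of the 16 "establish" entries end with return code 12518 instead of 0. For a longer log, the same tally can be done with a small script. This is a rough sketch, assuming the fields are separated by " * " and the last field is the return code, as in the lines above; the function name tally_establish_codes is hypothetical.

```python
from collections import Counter


def tally_establish_codes(lines):
    """Tally the trailing return code of each 'establish' entry.

    Entries are ' * '-separated and the last field is the return code
    (0 = connection handed off, 12518 = hand-off failed).
    """
    codes = Counter()
    for line in lines:
        fields = [f.strip() for f in line.split(" * ")]
        # establish entries end with: establish * <service> * <code>
        if len(fields) >= 4 and fields[-3] == "establish":
            codes[fields[-1]] += 1
    return codes


# Three entries copied from the listener log above (raw strings keep the backslashes).
sample = [
    r"06-DEC-2010 16:16:21 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\ifrun60.EXE)(HOST=APPLICATION-07)(USER=001zohur))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1771)) * establish * STLBAS * 0",
    r"06-DEC-2010 16:16:21 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=STLBAS)(CID=(PROGRAM=D:\OraNT\BIN\RWRBE60.exe)(HOST=APPLICATION-07)(USER=043sathekur))) * (ADDRESS=(PROTOCOL=tcp)(HOST=10.11.1.36)(PORT=1758)) * establish * STLBAS * 12518",
    "TNS-12518: TNS:listener could not hand off client connection",
    "06-DEC-2010 16:16:22 * service_update * stlbas * 0",
]

for code, n in sorted(tally_establish_codes(sample).items()):
    print(code, n)
```

The service_update lines and the TNS- detail lines are skipped on purpose, so only client connection attempts are counted.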









What is the cause,
and
why does it show "Solaris Error: 32: Broken pipe"?




===================================================
===================================================
solutions:
===================================================
===================================================



1. The listener log file may be too big (greater than 2 GB on Linux).

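2. The instance has reached the limit set by the PROCESSES parameter (in the v$resource_limit output below, MAX_UTILIZATION for processes equals its LIMIT_VALUE of 4000).

Once that limit is reached, no new dedicated server process can be started for an incoming connection, which is consistent with the TNS-12518 hand-off failures and the "Broken pipe" entries in the listener log, and with CJQ0 being unable to spawn J000 job queue slaves.

So increase the PROCESSES and SESSIONS parameters.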


[ Each dedicated session creates a process on the server, so too many sessions can overload the server itself. These parameters should not be set too high. You can also consider using shared server.

Check the CPU/memory utilization of the server before increasing the values.

You may also have to increase other parameters, such as the SGA and PGA sizes, accordingly, so check their utilization too.

You will have to analyze the database load at peak time and find out which processes are creating so many sessions. ]


SQL> select * from v$resource_limit ;

RESOURCE_NAME                  CURRENT_UTILIZATION MAX_UTILIZATION INITIAL_AL LIMIT_VALU
------------------------------ ------------------- --------------- ---------- ----------
processes                                     1607            4000       4000       4000
sessions                                      1596            4005       4405       4405
enqueue_locks                                 1380            6280      57390      57390
enqueue_resources                              563            2901      19600  UNLIMITED
ges_procs                                        0               0          0          0
ges_ress                                         0               0          0  UNLIMITED
ges_locks                                        0               0          0  UNLIMITED
ges_cache_ress                                   0               0          0  UNLIMITED
ges_reg_msgs                                     0               0          0  UNLIMITED
ges_big_msgs                                     0               0          0  UNLIMITED
ges_rsv_msgs                                     0               0          0          0
gcs_resources                                    0               0          0          0
gcs_shadows                                      0               0          0          0
dml_locks                                       43             718      19380  UNLIMITED
temporary_table_locks                            0               3  UNLIMITED  UNLIMITED
transactions                                   306             722       4845  UNLIMITED
branches                                         1              13       4845  UNLIMITED
cmtcallbk                                        2               4       4845  UNLIMITED
sort_segment_locks                            1594            4451  UNLIMITED  UNLIMITED
max_rollback_segments                          136             361       4845      65535
max_shared_servers                               0               0  UNLIMITED  UNLIMITED
parallel_max_servers                          1324            3600       2560       3600

22 rows selected.
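
The check to make on this output is whether MAX_UTILIZATION has reached LIMIT_VALUE for any resource. A minimal sketch of that comparison follows; it assumes the rows have already been fetched as tuples in the column order shown above, and the function name exhausted_resources is hypothetical.

```python
def exhausted_resources(rows):
    """Return names of resources whose MAX_UTILIZATION reached LIMIT_VALUE.

    Each row is (resource_name, current_utilization, max_utilization,
    initial_allocation, limit_value); the last two are strings because
    the view reports them as text (for example 'UNLIMITED').
    """
    hit = []
    for name, _current, max_used, _initial, limit in rows:
        limit = limit.strip().upper()
        if limit == "UNLIMITED":
            continue  # nothing to exhaust
        if int(limit) > 0 and max_used >= int(limit):
            hit.append(name)
    return hit


# A few rows copied from the output above.
rows = [
    ("processes", 1607, 4000, "4000", "4000"),
    ("sessions", 1596, 4005, "4405", "4405"),
    ("enqueue_locks", 1380, 6280, "57390", "57390"),
    ("transactions", 306, 722, "4845", "UNLIMITED"),
    ("parallel_max_servers", 1324, 3600, "2560", "3600"),
]

print(exhausted_resources(rows))
```

With these rows the sketch reports processes and parallel_max_servers, the two resources whose high-water mark equals their limit in the output above.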


or

SQL> select * FROM v$license;


========================
caution:
===========================



PROCESSES
=============================================
Property          Description
Parameter type    Integer
Default value     40 to operating system-dependent
Modifiable        No
Range of values   6 to operating system dependent
Basic             Yes

PROCESSES specifies the maximum number of operating system user processes that can simultaneously connect to Oracle. Its value should allow for all background processes such as locks, job queue processes, and parallel execution processes.

The default values of the SESSIONS and TRANSACTIONS parameters are derived from this parameter. Therefore, if you change the value of PROCESSES, you should evaluate whether to adjust the values of those derived parameters.



SESSIONS
================================================
Property                    Description
Parameter type              Integer
Default value               Derived: (1.1 * PROCESSES) + 5
Modifiable                  No
Range of values             1 to 2^31
Basic                       Yes
Real Application Clusters   Multiple instances can have different values.



SESSIONS specifies the maximum number of sessions that can be created in the system. Because every login requires a session, this parameter effectively determines the maximum number of concurrent users in the system. You should always set this parameter explicitly to a value equivalent to your estimate of the maximum number of concurrent users, plus the number of background processes, plus approximately 10% for recursive sessions.

Oracle uses the default value of this parameter as its minimum. Values between 1 and the default do not trigger errors, but Oracle ignores them and uses the default instead.
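
As a worked example of the derived default and the minimum rule above: with PROCESSES = 4000 the formula gives 4405, which matches the sessions limit in the v$resource_limit output shown earlier. The sketch below uses integer arithmetic and truncates the 1.1 * PROCESSES part; that rounding rule is an assumption, and both function names are hypothetical.

```python
def default_sessions(processes):
    """Derived default: (1.1 * PROCESSES) + 5.

    Computed with integers to avoid floating-point noise; truncating
    the fractional part is an assumption, not documented behaviour.
    """
    return processes * 11 // 10 + 5


def effective_sessions(processes, requested=None):
    """Value in effect: the default acts as a minimum for SESSIONS."""
    default = default_sessions(processes)
    if requested is None:
        return default
    return max(requested, default)


print(default_sessions(4000))          # 4405
print(effective_sessions(4000, 3000))  # 4405, a value below the default is ignored
print(effective_sessions(4000, 6000))  # 6000
```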

The default values of the ENQUEUE_RESOURCES and TRANSACTIONS parameters are derived from SESSIONS. Therefore, if you increase the value of SESSIONS, you should consider whether to adjust the values of ENQUEUE_RESOURCES and TRANSACTIONS as well. (Note that ENQUEUE_RESOURCES is obsolete as of Oracle Database 10g release 2 (10.2).)

In a shared server environment, the value of PROCESSES can be quite small. Therefore, Oracle recommends that you adjust the value of SESSIONS to approximately 1.1 * total number of connections.
