-
-
Notifications
You must be signed in to change notification settings - Fork 222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Server Hangs on I/O error during "open" operation for file "/tmp/firebird/fb_trace_ksVDoc" [CORE2917] #1210
Comments
Commented by: Derryck welas (welas) b.t.w. trace is not enabled on this server. (default fbtrace.conf ) |
Modified by: Derryck welas (welas)Component: Engine [ 10000 ] |
Commented by: @hvlad It is not clear, you wrote : a) nobody is using the system so how it can be at the same time ? And a question : if there was file /tmp/firebird/fb_trace at that time ? |
Commented by: Derryck welas (welas) See your point, but what i mean is there were no people actually testing on the server which means no "load". (It is a test server) What was running are ~ 10 jboss instances connected to this server (10 databases) and there is some activity gooing on when these what was interesting is that all connections from multiple databases were affected. based on the firebird.log this error appeared on Mar 8 ~ 04:05:04 (2 days ago, you can imagine firebird.log was very big) When the server was unreachable (today) i did not see any " /tmp/firebird/fb_trace*** file |
Commented by: Derryck welas (welas) other strange thing: i have "TempDirectories =/local/apps/tmp" in my firebird.conf so dont know how the system comes up with "/tmp/firebird/" |
Commented by: @hvlad Files, which backed shared memory (lock table, events table, monitoring snapshots, trace files, etc) are placed at /tmp/firebird folder independently of TempDirectories setting value |
Commented by: Derryck welas (welas) -its FirebirdCS-2.5.0.25920-ReleaseCandidate2.amd64.rpm currently i dont use the trace api but i will start testing it from here on. |
Commented by: Derryck welas (welas) How to recreate the error: go to /tmp/firebird and delete the file fb_trace_****** file . (the one where the trace config is copied to) i know it should not happen but when that file is not there your server is locked down ! (no new connections possible ) the moment you kill all instances of fb_inet_server , than you can make new connections to the server. i think it would be better to just recreate an empty fb_trace_****** file in such situation ( it is empty in most situations anyway). |
Commented by: @AlexPeshkoff Manually removing file is not a way to reproduce a bug. Removing file from /tmp is a known way to prevent normal OS operation. |
Commented by: Derryck welas (welas) Hey will keep an close eye on this error (if happening on normal DSQL processing over jdbc), but for now not able to reproduce bug. the only things i encounter during load testing (something i already mentioned in CORE2891) eusdlsdbb024.logistics.corp Wed Mar 17 15:02:46 2010 |
Commented by: Yakushev.Y (yyy) OS: RDBMS: Firebird.log: http://IS2.md.net Sat Mar 20 04:02:44 2010 Firebird.conf TempDirectories = /inf/fbtemp The same situation. Server is up for 11 days. Everything is fine till Mar 20. Server load is low at this time. |
Commented by: @hvlad Yakushev.Y, are you sure nobody deleted files in /tmp/firebird ? |
Commented by: Yakushev.Y (yyy) Yes. 20 MArch is saturday. All IT specialists had a day off. This problem was detected on Monday. |
Commented by: @hvlad No cron tasks with disk cleanup ? |
Commented by: Yakushev.Y (yyy) This database is used with a special application in one of the Moscow companies. This company consists of central office (CO) and has about 30 units over Moscow. Each unit has it's own database. By now, most of the units have Firebird 2.0 installed. But the central database(CDB) was recently upgraded to FB 2.5. There were no problems on the CDB server, exept 1. The RS DNS name was the same with the main server http://IS2.md.net. So since 18 March I I saw in the messages, and one of them was near the bug time: |
Commented by: Derryck welas (welas) Hey it happened again ! I think the "fb_trace_*****" file is not deleted firebird.log reports: eusdlsdbb024.logistics.corp Fri Apr 2 04:03:01 2010 While in the /tmp/firebird/ directory there is "fb_trace_kDlCCp" with date 03/17/2010 after i kill the fb_inet_server instances a new trace file is created "fb_trace_GamqsW" with date 04/02/2010 also see attached picture: "error_sequence_of_events.PNG" |
Modified by: Derryck welas (welas)Attachment: error_sequence_of_events.PNG [ 11608 ] |
Commented by: @AlexPeshkoff Guys, I have collected here all known to me (not only from this ticket, fb-devel was also used) times when you had problems. eusdlsdbb024.logistics.corp Mon Mar 8 04:05:04 2010 Pay attention - this is always soon after 4 o'clock in the night time. Be sure firebird DOES NOT have any scheduling to do something special a specific moment of time. But there is one program in unix, which does perform this activity - cron. And one of typical tasks - clear tmp. And the time when this activity is performed - night. When you say 'we have no cron tasks' may be it means 'we did not add cron tasks'? Many linux distro already have /tmp cleanup task when installed. Please check files and dirs /tmp/cron* - I suppose you may find something interesting. |
Commented by: Derryck welas (welas) The 1e incident there was no crontab task on the server. but now i have crontab calling a "java" program every minute for 9 different databases: ( one database even every 30 sec) SELECT ATT.MON$SERVER_PID,ATT.MON$REMOTE_ADDRESS,10000*(CURRENT_TIMESTAMP-STMT.MON$TIMESTAMP) AS ELAPS_TIME,STMT.MON$SQL_TEXT FROM MON$ATTACHMENTS ATT JOIN MON$STATEMENTS STMT ON ATT.MON$ATTACHMENT_ID = STMT.MON$ATTACHMENT_ID WHERE STMT.MON$STATE = 1 "; edit: edit2: here is from "/etc/cron.daily/tmpwatch" .. edit3: the current "stat on fb_trace_GamqsW" based on this it looks like "tempwatch" will remove the file (after 240 hours of no access) |
Commented by: @hvlad Derrick, looks like you've found a correct reason for a problem. So far, you can try as a workaround to - add cron task to touch files in /tmp/firebird/*, or - tune tempwatch to not delete (opened) files in /tmp/firebird/*, if it is possible It seems --fuser option could help : http://linux.about.com/library/cmd/blcmdl8_tmpwatch.htm |
Commented by: Derryck welas (welas) Looking forward to RC3 ! |
Commented by: @AlexPeshkoff Made any running instance of firebird perodically touch mentioned file. |
Modified by: @AlexPeshkoffstatus: Open [ 1 ] => Resolved [ 5 ] resolution: Fixed [ 1 ] Fix Version: 2.5 RC3 [ 10381 ] |
Modified by: @pcisarstatus: Resolved [ 5 ] => Closed [ 6 ] |
Commented by: John Franck (bozzy) Almost 9 years later, I'm experiencing this same behaviour. Firebird 3.0.4 SS 64bit on CentOS 7. As reported here (in russian): http://www.sql.ru/forum/1292505/sever-hangs-on-i-o-error-during-open-operation-for-file-tmp-firebird-fb-trace-ksvdoc I/O error during "open" operation for file "/tmp/firebird/fb_trace_1tdYvH" <<--- (my example) Luckily enough, remote connections continue to work. As this happens on our production server, it would be disastrous if it would block remote connections. On this server, tracking is disabled, so these fb_trace files seems to be just spurious. As others mentioned, it's quite strange that the server stops responding to local connections if a seemingly useless file is not where it pretends to be, and if this is really needed to exist, it could be recreated by the engine if it doesn't find it. Side note: I know that on SuperServer it's not a good idea to connect to databases locally, but for some batch operations that we run overnight when our service is already offline, this doesn't represent a problem, as there are no other connections to the DBs. |
Commented by: @AlexPeshkoff If there are files /tmp/firebird/fb_trace* that means that there are connections to the server. And if some file is deleted that connection(s) remain active for a very long time (about a week). But may be not to database you are going to access locally. On the other hand if there are connections and if it's not super (but classic) missing such file may become a problem. I.e. I see need to fix an issue. Please create new ticket in the tracker. |
Commented by: @AlexPeshkoff BTW - this is not actually a bug in FB but a bug in linux - it may delete opened file dduring regular /tmp cleanup. That actions are performed by script in /tmp/cron.* directory, unfotunately I can't say for sure what script to delete to quickly fix your issue (I use gentoo which by default miss that broken feature) but useally looking at scripts and appluing some common sense to them helps to find that script easily. |
Modified by: @dyemanovassignee: Alexander Peshkov [ alexpeshkoff ] |
Commented by: @dyemanov Alex, I've just discovered that your commit for this issue (touchThreadFunc for trace) was never frontported into v3/v4. This may be the reason why the bug strikes back for v3 users. Or am I missing something? |
Commented by: @hvlad IIRC, there was intention to move touch function to the new timer thread |
Commented by: @AlexPeshkoff yes, will take a look after beta1 |
Fix from 2.5 got broken in 3.0 due to bug (integer overflow) in ITimer code. |
Submitted by: Derryck welas (welas)
Jira_subtask_outward CORE6009
Attachments:
error_sequence_of_events.PNG
The server has been online for ~ 2 weeks then,
hangs while nobody is using the system (4 AM) ( i had to kill all fb_inet_server instances, then connection was possible again)
this is written in the firebird.log (every second)
eusdlsdbb024.logistics.corp Mon Mar 8 04:05:04 2010
Error in isc_attach_database() API call when working with security database
I/O error during "open" operation for file "/tmp/firebird/fb_trace_ksVDoc"
Error while trying to open file
No such file or directory
/tmp seems to have enough space .
Client application jdbc stack error (no connection possible):
Caused by: org.firebirdsql.jdbc.FBSQLException: GDS Exception. 335544344. I/O error for file open "/tmp/firebird/fb_trace_ksVDoc"
Error while trying to open file
null, error code: HY000
Reason: I/O error for file open "/tmp/firebird/fb_trace_ksVDoc"
Error while trying to open file
null
i have ODS Version 11.2 and 11.1 databases running on this instance.
Commits: 0979b7c e8d1eea 6ea33a5 f7f1def
The text was updated successfully, but these errors were encountered: