[Encore] forked task stops after a few weeks
Kevin Jepson
kevijeps at telusplanet.net
Thu Feb 25 07:21:33 MST 2010
Daniel
I'm curious as to why you used a Fork for this and not just repeated
suspends?
Not that I think it would behave differently, I'm interested in the how and
why of the code as I'm always on the lookout for ideas :-)
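For comparison, here is a minimal sketch of what I mean by repeated suspends:
one long-lived task that does a pass, sleeps, and loops, instead of forking a
fresh copy each time. The verb name run_players_loop and the
this:check_player() helper are made up for illustration, and I'm assuming the
verb runs with wizard permissions (server_log() needs them).

```moo
"Hypothetical $housekeeper:run_players_loop(delay).";
"One task does a pass, sleeps, and repeats; nothing is forked.";
delay = args[1];
while (1)
  for p in (players())
    $command_utils:suspend_if_needed(0);
    try
      "this:check_player is a placeholder for the move-home logic.";
      this:check_player(p);
    except e (ANY)
      "Log the error message and carry on with the next player.";
      server_log(tostr("run_players_loop: ", e[2], " on ", p));
    endtry
  endfor
  "The task resumes from suspend() with a fresh tick and seconds quota.";
  suspend(delay);
endwhile
```

Either way there is a single chain, so one uncaught failure ends it in both
versions; the difference is only whether each pass gets a new task id.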
Is it possible that occasionally it does run out of ticks, but more than one
task had already been forked, so your logs only show the last one and not the
aborted one?
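One way to check would be to look at the task queue directly. A rough sketch,
assuming a wizard evaluates it (other programmers only see their own tasks)
and relying on the standard queued_tasks() entry layout, where element 1 is
the task id, 2 the start time, 7 the verb name, 8 the line number and 9 the
`this' object:

```moo
"Count pending $housekeeper:run_players tasks in the queue.";
count = 0;
for t in (queued_tasks())
  if (t[9] == $housekeeper && t[7] == "run_players")
    count = count + 1;
    player:tell("task ", t[1], " starts at time ", t[2], ", line ", t[8]);
  endif
endfor
player:tell(count, " run_players task(s) queued.");
```

If this ever reports more than one task, there are parallel chains, and the
debug log alone would not tell you which one died.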
Ciao
KJ
-----Original Message-----
From: encore-bounces at encore-consortium.org
[mailto:encore-bounces at encore-consortium.org] On Behalf Of Daniel Jung
Sent: Tuesday, February 23, 2010 6:57 AM
To: General MOO discussion.; encore at encore-consortium.org
Subject: [Encore] forked task stops after a few weeks
Hi all
I need help tracking down why a repeatedly forked task suddenly stops
running.
I have a verb that checks for purloined objects (players) and moves them
home, then forks/suspends and runs again. Every once in a while, the process
stops, and I don't know why. The code has a debugging feature which prints
errors (code line numbers, callers, stacks, comments ... ) to a debugger; it
also counts and prints the ticks left (which are replenished by suspending
anyway) to rule out the task dying from too few ticks.
I think I noticed that with the fork delay set to 5 minutes, the task was
likely to abort sooner than it does now with a 10-minute delay. Last time,
the task ran smoothly for 17 days (one instance every ten minutes), moving
guests and players home every now and then, and then it stopped.
A rough sketch of the structure:
--------------------------
$housekeeper:run_players()
--------------------------
9: for p in (players())
10: $command_utils:suspend_if_needed(0);
11: try
/* if p is "not connected" */
/* check p's home for validity etc. */
/* PRINT DEBUG on error */
/* move p to its home or suitable place */
/* PRINT DEBUG on actual moving, or on error */
52: except error (ANY)
/* PRINT DEBUG on error */
57: endtry
58: endfor
/* PRINT DEBUG telling fork */
61: fork (delay)
62: this:(verb)(@args);
63: endfork
65: return result;
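
The try block in the sketch above only covers the body of the loop, so an
error raised elsewhere in the next run would not reach the debugger. A
minimal variant of lines 61-63 that would also catch that case is shown
below; this:debug() stands in for the PRINT DEBUG steps and this.last_fork is
an assumed property, neither is part of the actual verb.

```moo
"Hypothetical replacement for the fork at lines 61-63.";
"Record the forked task id and catch errors raised by the next run.";
fork tid (delay)
  try
    this:(verb)(@args);
  except e (ANY)
    "this:debug is a placeholder; e[2] is the message, e[4] the traceback.";
    this:debug(tostr("forked run failed: ", e[2]), e[4]);
  endtry
endfork
"Runs in the parent task: remember which task should come next.";
this.last_fork = {tid, time()};
```

Running out of ticks or seconds cannot be caught by try, so that case would
still end the chain; comparing this.last_fork with queued_tasks() afterwards
would at least show which instance went missing.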
The last debug entry is this one, exactly like the preceding ones.
**************** DEBUG *****************
Thu Feb 18 00:09:36 2010 CST
Daniel: debug index 2477
----------------------------------------
running $housekeeper:run_players, line 60
running $housekeeper:run_players, line 62
----------------------------------------
about to fork, ticks left: 10450
****************************************
There are no indications, no error reports, and no weird objects to handle
(as far as I can see) that could have caused the task to drop out. The
indexing began at 0 when I started the first task.
I see that the property gets fairly crowded with 2477 indices, but after
restarting the procedure it adds the new ones without complaint, and we're
at index 2874 now (meaning: writing to the long property doesn't seem to
have stopped the task).
The ticks left vary between 4012 and 14334.
The server log has nothing for the time this occurred (and the ten following
minutes):
Feb 18 00:04:33: DISCONNECTED: #-509150 on port 7878 from spider36.yandex.ru, port 61218
Feb 18 00:09:06: ACCEPT: #-509151 on port 7878 from 66-199-234-66.reverse.ezzi.net, port 34716
Feb 18 00:09:07: DISCONNECTED: #-509151 on port 7878 from 66-199-234-66.reverse.ezzi.net, port 34716
Feb 18 00:10:22: ACCEPT: #-509152 on port 7878 from msnbot-65-55-207-94.search.msn.com, port 56255
Feb 18 00:10:22: DISCONNECTED: #-509152 on port 7878 from msnbot-65-55-207-94.search.msn.com, port 56255
Feb 18 00:12:30: ACCEPT: #-509153 on port 7878 from 66-199-234-66.reverse.ezzi.net, port 41118
Feb 18 00:12:50: CLIENT DISCONNECTED: #-509153 on port 7878 from 66-199-234-66.reverse.ezzi.net, port 41118
Feb 18 00:14:22: ACCEPT: #-509154 on port 7878 from msnbot-65-55-207-72.search.msn.com, port 25311
Feb 18 00:14:22: DISCONNECTED: #-509154 on port 7878 from msnbot-65-55-207-72.search.msn.com, port 25311
Feb 18 00:15:09: ACCEPT: #-509155 on port 7878 from b3091297.crawl.yahoo.net, port 50798
Feb 18 00:15:09: DISCONNECTED: #-509155 on port 7878 from b3091297.crawl.yahoo.net, port 50798
Feb 18 00:19:36: CLIENT DISCONNECTED: mzr (#1462) on port 8200 from 124.160.46.113, port 31601
Feb 18 00:29:06: ACCEPT: #-509156 on port 7878 from spider36.yandex.ru, port 64753
Feb 18 00:29:06: DISCONNECTED: #-509156 on port 7878 from spider36.yandex.ru, port 64753
There is a checkpoint at 00:36, but there is one every hour and they don't
seem to harm the tasks. In any case, the next instances should already have
run at 00:19 and 00:29.
Thank you for any input.
Greetings,
- Daniel