Work Unit of 80+ hours??
Author | Message |
---|---|
Joined: 19 Jul 08 Posts: 19 Credit: 2,963,018 RAC: 0 |
Hi, I have almost completed a WU that has taken more than 80 hours to compute. Even on recent hardware, I think this is way too much: C@H is not multi-threaded, so a WU runs on one core only and needs that long to finish. Can we please get somewhat smaller WUs? |
.clair. Joined: 4 Nov 07 Posts: 651 Credit: 14,555,207 RAC: 594 |
That was a very long-running work unit, though at least it validated OK. It makes me think that something odd happened to it during processing. I don't like it when they run longer than 100k seconds on my old Athlon XP 3000 machine, and it's a snail compared to recent PCs. Edit: as for smaller work units, I think we have two chances there, and one of them is no chance. |
Volunteer moderator Project administrator Project scientist Joined: 24 Jun 07 Posts: 192 Credit: 15,273 RAC: 0 |
Hi clive and microchip - Could you explain to me why you prefer short work units? It's an honest question. From the project manager's side, longer work units are almost always better, since they reduce communication and server overhead. On your end, you can always run several work units at the same time, which is a trivial way to multithread. Thank you - Ben, Creator of Cosmology@Home |
.clair. Joined: 4 Nov 07 Posts: 651 Credit: 14,555,207 RAC: 594 |
Sorry Ben, it is not that I am wishing for shorter-running work units; it is much more that I need a faster PC to run them on. My comment was remembering that some time ago it was said that work units may at some point become longer and use more RAM. Sorry for putting it badly. |
Joined: 19 Jul 08 Posts: 19 Credit: 2,963,018 RAC: 0 |
Hi Ben, I'd like somewhat smaller units because when I get such a long one, BOINC goes into "high priority" mode and dedicates one of the cores to that WU for the entire time it takes to compute, so nothing but C@H can run on that core. I also crunch for other projects and would like BOINC to switch between them every hour, but with a C@H WU in high priority mode this doesn't happen. So either send smaller WUs or increase the reporting deadline so that BOINC does not go into high priority mode from the start. |
Volunteer moderator Project administrator Project scientist Joined: 24 Jun 07 Posts: 192 Credit: 15,273 RAC: 0 |
Ok, thanks for the feedback. Sounds like we need to look at increasing the reporting deadline. Happy crunching - Ben Creator of Cosmology@Home |
Volunteer moderator Project administrator Project scientist Joined: 24 Jun 07 Posts: 192 Credit: 15,273 RAC: 0 |
I just checked and our reporting deadline is set to 15 days. 80 hours is about 3.33 days, so it seems like there would be plenty of time. At what point does your client go into high-priority mode? Could this be a setting you can change at your end? Thanks, Ben, Creator of Cosmology@Home |
mickydl* Joined: 4 Jan 10 Posts: 1 Credit: 222,180 RAC: 0 |
3.33 days is only true if you run the machine 24/7. Although many of us do, some don't. My machines are switched off during the night and run an average of 15 hours a day. Another thing I encountered with one of the last WUs I crunched is that the checkpointing seems to be quite inefficient. That WU had reached about 80% when I switched the machine off for the night. When I started everything the next day, it continued from its last checkpoint at 60% (corresponding to several hours of work). I didn't investigate this any further, so I don't know whether this is normal behavior or just an unlucky coincidence. So, let's make a new calculation :-) 80 hours / 15 hours per day = 5.33 days. Add a 5% penalty (maybe 10%?) for checkpointing and you get 5.6 days (5.87). If at least one other project is being crunched on the same machine, double the times. So now we have 11.2 days (or 11.73 days with the 10% penalty). OK, it's still less than the 15-day deadline, but we are getting closer. Add yet another project to BOINC on that machine and you're likely to run into the "high priority" problem. Regards, Michael |
Joined: 19 Jul 08 Posts: 19 Credit: 2,963,018 RAC: 0 |
Forget the 80-hour WU. I just finished one that took 127 hours to compute on an AMD Phenom II X6 1090T CPU, and what did I get for it? A measly 420 credits. Ridiculous. BOINC was running it in high priority mode for the past 3 days. This is enough. Ben, please increase the deadline or send smaller WUs, and also review the credits system. Also, I am with mickydl*. I crunch for a bunch of other projects too and have never seen such a long WU. High priority mode easily becomes a problem when you also crunch for other projects, even though my server runs 24/7. |
.clair. Joined: 4 Nov 07 Posts: 651 Credit: 14,555,207 RAC: 594 |
Do you have "Leave applications in memory while suspended" ticked for "yes" in the BOINC Manager preferences? If not, work units will take a very long time to finish. Checkpoints in Cosmo are a long way apart, and if this is not ticked you also lose work done on all other projects when BM switches between projects. With it ticked, BM will move the data out to swap / virtual memory until it is needed again. |
Joined: 19 Jul 08 Posts: 19 Credit: 2,963,018 RAC: 0 |
"Do you have 'Leave applications in memory while suspended' ticked for 'yes'" Yes, I have it ... |
.clair. Joined: 4 Nov 07 Posts: 651 Credit: 14,555,207 RAC: 594 |
Sorry that it did not help. I had a look at the specs of your PC, and it has everything it needs to run well. You do run a lot of other projects on it. Do the work units of those other projects run for similar times as on other people's systems, or is it just Cosmo that is running slow? |
Joined: 19 Jul 08 Posts: 19 Credit: 2,963,018 RAC: 0 |
"Sorry that it did not help," It is just Cosmology@Home that has such long work units. All other projects I'm attached to have WUs between 1 hour and 34 hours (excluding the GPU projects). I really don't mind running very long WUs, but the deadline should be extended for those. Also, the credits system on here definitely needs a review. :) |
Joined: 19 Jan 08 Posts: 180 Credit: 2,500,290 RAC: 0 |
Work units here usually run much shorter: between 4 and 12 hours would be normal (lately more of the shorter ones occur), and I haven't had such a long-running one so far. I guess there was something wrong with the work unit, or the host somehow had a problem with it. Normal Cosmo result credits are not too low on average, and deadlines are quite comfortable. If your next one runs that long again, I would rather suspect that something on your host collides with the Cosmo resource requirements. |
Joined: 19 Jul 08 Posts: 19 Credit: 2,963,018 RAC: 0 |
"Workunits here usually run way shorter, between 4 and 12 hours (lately more of the shorter ones occur) would be normal and I haven't had such a long running one so far. I guess there has been something wrong with the workunit or the host somehow had a problem with it." Actually, I tried a different host and got the same result: WUs between 36 and 70 hours (which I aborted, as it was a test), so it's not just this one host that gets them. About the credits, I have to disagree. I favor a credits system (also used by virtually all other projects) that gives credit based on the amount of work you do, not a fixed-credit system like on here. If you run other projects as well and get such a long WU, the deadline definitely becomes a problem, which is why I asked to increase it a bit :) |
Joined: 19 Jul 08 Posts: 19 Credit: 2,963,018 RAC: 0 |
OK, after further investigation I found out why C@H is running such long tasks on my host. It is because of a shitty checkpointing issue. Suppose a WU has crunched to 85%. When BOINC suspends it to run a WU from some other project and then goes back to the C@H WU, it resumes crunching from 70% instead of continuing from 85%. I guess Ben doesn't care much about adding more checkpoints (I've seen others complain as well), so I'll also not care much and am detaching from C@H. Enough is enough. Goodbye! |
Joined: 31 May 10 Posts: 234 Credit: 4,896,378 RAC: 0 |
The WUs are as big as they have to be. Work that has to be done. And that is the way it should be. |
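
For anyone who wants to redo the deadline arithmetic from the thread with their own numbers, here is a minimal sketch in Python. It only reproduces the back-of-the-envelope reasoning in Michael's post (uptime per day, a flat checkpoint penalty, and doubling the time when a second project shares the machine) and the 85%-to-70% rollback described above. The function and parameter names are made up for illustration, and the model is an assumption: it is not how the BOINC scheduler actually decides when to enter high priority mode.

```python
def wall_clock_days(cpu_hours, hours_per_day=24.0,
                    checkpoint_penalty=0.0, projects_sharing=1):
    """Rough calendar days needed to finish one work unit.

    cpu_hours          -- compute time the work unit needs
    hours_per_day      -- hours the machine is actually switched on per day
    checkpoint_penalty -- fraction of extra work redone after restarts (0.05 = 5%)
    projects_sharing   -- projects taking equal turns on the same core
                          (a simplification taken from the thread)
    """
    if hours_per_day <= 0 or projects_sharing < 1:
        raise ValueError("hours_per_day must be > 0 and projects_sharing >= 1")
    days = cpu_hours / hours_per_day
    days *= 1.0 + checkpoint_penalty
    return days * projects_sharing


def rollback_hours(cpu_hours, progress_at_stop, last_checkpoint):
    """Hours of work redone when a task restarts from its last checkpoint.

    Progress values are fractions between 0 and 1.
    """
    if not 0.0 <= last_checkpoint <= progress_at_stop <= 1.0:
        raise ValueError("need 0 <= last_checkpoint <= progress_at_stop <= 1")
    return cpu_hours * (progress_at_stop - last_checkpoint)


if __name__ == "__main__":
    deadline_days = 15  # reporting deadline quoted in the thread
    scenarios = [
        ("24/7, one project", wall_clock_days(80)),
        ("15 h/day, 5% penalty, two projects", wall_clock_days(80, 15, 0.05, 2)),
        ("15 h/day, 10% penalty, two projects", wall_clock_days(80, 15, 0.10, 2)),
    ]
    for label, days in scenarios:
        print(f"{label}: {days:.2f} of {deadline_days} days")
    lost = rollback_hours(80, 0.85, 0.70)
    print(f"Work redone after one rollback from 85% to 70%: {lost:.1f} h")
```

Setting `projects_sharing=3` with the same 15 hours per day pushes the estimate past the 15-day deadline, which matches the point that adding yet another project is where the trouble starts.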