Waste of computation power.

Believer
Joined: 2010-09-08

If I look in some of my WUs, they look like this one:

application Blender
created 12 Oct 2011 10:53:32 UTC
name ses0000004116frm0000000320prt00001
canonical result 4196916
granted credit 6.23
minimum quorum 3
initial replication 14
max # of error/total/success tasks 25, 35, 20
Task ID  Computer  Sent                      Time reported or deadline  Server state  Outcome  Client state  CPU time (sec)  Claimed credit  Granted credit
4195692  6205      12 Oct 2011 10:53:45 UTC  12 Oct 2011 10:57:40 UTC   Over          Success  Done          187.42          ---             0.00
4195693  5872      12 Oct 2011 10:53:38 UTC  12 Oct 2011 11:10:41 UTC   Over          Success  Done          503.02          ---             0.00
4195694  5675      12 Oct 2011 10:53:42 UTC  12 Oct 2011 11:01:11 UTC   Over          Success  Done          281.21          ---             0.00
4196938  6253      13 Oct 2011 6:50:13 UTC   13 Oct 2011 7:03:35 UTC    Over          Success  Done          381.81          ---             0.00
4196968  3941      13 Oct 2011 8:21:33 UTC   13 Oct 2011 8:24:04 UTC    Over          Success  Done          333.53          ---             0.00
4196993  1852      13 Oct 2011 9:31:30 UTC   13 Oct 2011 18:35:10 UTC   Over          Success  Done          281.14          2.50            0.00
4204637  3897      13 Oct 2011 19:56:09 UTC  13 Oct 2011 20:04:01 UTC   Over          Success  Done          694.07          ---             0.00
4196916  5053      13 Oct 2011 5:50:36 UTC   13 Oct 2011 5:52:31 UTC    Over          Success  Done          423.38          ---             6.23
4204707  5261      14 Oct 2011 0:04:33 UTC   14 Oct 2011 0:16:56 UTC    Over          Success  Done          486.54          ---             6.23
4196131  3779      12 Oct 2011 11:06:12 UTC  12 Oct 2011 11:22:34 UTC   Over          Success  Done          700.63          ---             0.00
4196132  6029      12 Oct 2011 11:06:12 UTC  12 Oct 2011 11:11:02 UTC   Over          Success  Done          415.63          ---             0.00
4204697  6307      13 Oct 2011 23:28:21 UTC  13 Oct 2011 23:31:44 UTC   Over          Success  Done          471.58          ---             0.00
4204665  6370      13 Oct 2011 21:32:12 UTC  13 Oct 2011 21:40:35 UTC   Over          Success  Done          345.12          ---             0.00
4204651  1452      13 Oct 2011 20:39:34 UTC  13 Oct 2011 20:42:12 UTC   Over          Success  Done          364.26          2.01            0.00

As you can see, 14 were initially sent. Most of them were ditched and did nothing but heat the CPU.

Why do you waste resources so excessively? Is the app really so crappy that you need them all because 80% of the results are plain rubbish?
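A rough tally of the table above can be sketched in Python. The CPU times and granted credits are copied from the rows above. Treating every task with 0.00 granted credit as "uncredited" is a simplifying assumption made for this illustration, and `uncredited_share` is a hypothetical helper name, not something from the project.

```python
# (cpu_seconds, granted_credit) per task, copied from the table above.
tasks = [
    (187.42, 0.00), (503.02, 0.00), (281.21, 0.00), (381.81, 0.00),
    (333.53, 0.00), (281.14, 0.00), (694.07, 0.00), (423.38, 6.23),
    (486.54, 6.23), (700.63, 0.00), (415.63, 0.00), (471.58, 0.00),
    (345.12, 0.00), (364.26, 0.00),
]


def uncredited_share(rows):
    """Return (total CPU s, uncredited CPU s, uncredited fraction)."""
    total = sum(cpu for cpu, _ in rows)
    # Assumption: a task with zero granted credit counts as uncredited work.
    uncredited = sum(cpu for cpu, credit in rows if credit == 0.0)
    return total, uncredited, uncredited / total


total, uncredited, share = uncredited_share(tasks)
print(f"{uncredited:.2f} of {total:.2f} CPU seconds ({share:.0%}) earned no credit")
```

Under that assumption, 12 of the 14 tasks, or a bit under 85% of the CPU time, earned no credit, which is in line with the "80%" estimate.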

Greetings from Saenger

Maintainer, developer
Joined: 2009-07-21

This is the animation in question. As I said in the other thread, we are currently trying to get our validator set up properly. The animation had a very low sample count for the approximate ambient occlusion method, which left the frames with a large amount of "messy" content. This content was a bit different on every host, and the validator rejected most of the frames. I have tuned the validator's error tolerance up, but it is better if users use more samples and eliminate the artifacts in the first place.
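The effect of the tolerance setting can be sketched as follows. This is a minimal illustration and not the project's validator: frames are modelled as flat lists of grey values between 0 and 1, per-host sampling noise is simulated with a seeded random generator, and the names `mean_abs_diff`, `frames_match`, `noisy_render` and all numeric values are assumptions chosen for the example.

```python
import random


def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized frames."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames differ in size")
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def frames_match(frame_a, frame_b, tolerance):
    """Two frames validate against each other if they differ by at most `tolerance`."""
    return mean_abs_diff(frame_a, frame_b) <= tolerance


def noisy_render(base, noise, seed):
    """Simulate one host's render: the ideal frame plus host-specific noise."""
    rng = random.Random(seed)
    return [min(1.0, max(0.0, p + rng.uniform(-noise, noise))) for p in base]


base = [0.5] * 1000  # hypothetical ideal frame

# Low sample count: strong noise that differs from host to host.
host_a = noisy_render(base, noise=0.2, seed=1)
host_b = noisy_render(base, noise=0.2, seed=2)
strict_ok = frames_match(host_a, host_b, tolerance=0.01)  # rejected
loose_ok = frames_match(host_a, host_b, tolerance=0.2)    # accepted

# Higher sample count: little noise, so even the strict tolerance passes.
clean_a = noisy_render(base, noise=0.005, seed=1)
clean_b = noisy_render(base, noise=0.005, seed=2)
clean_ok = frames_match(clean_a, clean_b, tolerance=0.01)

print(strict_ok, loose_ok, clean_ok)
```

Raising the tolerance makes noisy frames pass, but it also lets genuinely wrong frames through more easily, which is why reducing the noise at the source is the better fix.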

I wrote a bit more about this here.

- Jesse Kaukonen, www.jessekaukonen.net

Believer
Joined: 2010-09-08

So they were probably all fine, you just can't get the validator to agree that they are all fine.

Can't you just send one of each, then, and simply accept them all, instead of stealing so many resources from other projects for crunching that you know beforehand will be futile?

Greetings from Saenger

Maintainer, developer
Joined: 2009-07-21

Interestingly, you are somewhat correct here. From what I understand, many BOINC projects have already dropped redundancy (computing multiple results to eliminate errors), and some folks at the BOINC workshop this summer were asking why we were still using it. The reason is that we do get a lot of erroneous results, which forces us to render multiple samples, compare the results and pick the successful one. In general our results are correct, especially now that some desperately needed bug fixes, including libraries for Linux systems, have been pushed to BURP.

Even with these fixes we at times see something really odd at the backend. One example is a render where one computer had an error downloading the .blend file. For some reason the download stopped partway, which ultimately left the file corrupted (we think this is what happened). When the computer started rendering, it couldn't open the file and did what Blender does in this case: it opened the default scene. The computer rendered the default cube at a different resolution, returned the result to the service and moved on to the next file.

This error occurred before we got our validator up, and it took some effort to fix. Had the validator been working back then, it would have received the two other results and used them instead, rejecting the one with the odd resolution. I don't think we are ready to drop redundancy yet, because of oddities like this. Our computation also differs a bit from that of most BOINC projects, as computers render things slightly differently when running something as large as Blender. As the service stands now, I have tuned the validator tolerance up so that it allows more differences. This isn't a solution, though.
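The default-cube case can be illustrated with a small quorum check. This is only a sketch under stated assumptions, not the BURP or Renderfarm.fi validator: each result is reduced to a resolution and a coarse `fingerprint` string, exact equality stands in for the tolerance-based comparison a real validator would need, and `pick_canonical` and the dictionary keys are hypothetical names.

```python
from collections import Counter


def pick_canonical(results, min_quorum):
    """Return (canonical_id, valid_ids), or (None, []) if no group reaches quorum.

    Each result is a dict with the hypothetical keys 'id', 'resolution'
    and 'fingerprint' (a coarse summary of the image content).
    """
    if not results:
        return None, []
    # Results agree only if both resolution and fingerprint are identical.
    keys = [(r["resolution"], r["fingerprint"]) for r in results]
    best_key, votes = Counter(keys).most_common(1)[0]
    if votes < min_quorum:
        return None, []
    valid = [r["id"] for r, k in zip(results, keys) if k == best_key]
    # The first agreeing result becomes the canonical one.
    return valid[0], valid


results = [
    {"id": 1, "resolution": (1920, 1080), "fingerprint": "scene"},
    {"id": 2, "resolution": (800, 600), "fingerprint": "default-cube"},
    {"id": 3, "resolution": (1920, 1080), "fingerprint": "scene"},
]
canonical, valid = pick_canonical(results, min_quorum=2)
print(canonical, valid)  # 1 [1, 3]
```

With a single result per work unit there is nothing to compare against, so the default-cube render in this example would have been accepted as it stands.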

We apologize for wasting resources like this. We are still in Beta, and these problems happen at times.

- Jesse Kaukonen, www.jessekaukonen.net

Believer
Joined: 2010-09-08

Would stricter homogeneous redundancy be of any value, or do those differences occur even with similar setups?

I know that LHC had to do this before the restart; I don't know whether they still use it.

Greetings from Saenger

Maintainer, developer
Joined: 2009-07-21

Yes, that would help us eliminate some issues, especially with particles, but it might not help with the low ambient occlusion sample counts. At BURP, Janus largely uses one OS type per session. We might go for that at Renderfarm.fi as well.
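The "one OS type per session" idea can be sketched like this. It is an illustration only and does not show how BOINC or BURP implement homogeneous redundancy: the host IDs and OS names are made up, and `pick_hosts_same_os` is a hypothetical helper.

```python
from collections import defaultdict


def pick_hosts_same_os(hosts, replicas):
    """Pick `replicas` hosts that all share one OS class.

    `hosts` is a list of (host_id, os_name) pairs. The OS class with the
    most available hosts is chosen; ties are broken by name so the choice
    is deterministic.
    """
    by_os = defaultdict(list)
    for host_id, os_name in hosts:
        by_os[os_name].append(host_id)
    if not by_os:
        raise ValueError("no hosts available")
    os_name, members = max(by_os.items(), key=lambda kv: (len(kv[1]), kv[0]))
    if len(members) < replicas:
        raise ValueError("not enough hosts in any single OS class")
    return os_name, members[:replicas]


# Made-up hosts for illustration.
hosts = [(1, "windows"), (2, "linux"), (3, "windows"), (4, "osx"), (5, "windows")]
chosen_os, chosen_hosts = pick_hosts_same_os(hosts, replicas=3)
print(chosen_os, chosen_hosts)  # windows [1, 3, 5]
```

All replicas of a session then come from similar setups, which removes platform differences from the comparison but does nothing about random sampling noise.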

- Jesse Kaukonen, www.jessekaukonen.net

Fellow
Joined: 2010-05-31
New Host Credit

I've added a new host recently, and have been crunching away, but have received no credit at all...

http://renderfarm.fi/show_host_detail.php?hostid=6669

Processed about 50 tasks, for about 9 different sessions.

http://renderfarm.fi/results.php?hostid=6669

I can't figure out a pattern. Could this be the validation thing? Or the OS thing?

jbk

BURP Ninja
Joined: 2008-10-05

At the bottom of a result like http://renderfarm.fi/result.php?resultid=4558171 it says "validate state: Invalid". So something about the output of your host is different (enough) from the other machines that rendered the same workunit.

Fellow
Joined: 2010-05-31

Thanks for checking.

I'm surprised that every session worked out that way. I thought it was only the sessions with low samples that had problems validating(?).

Now that I looked, my other hosts are doing it also!

http://renderfarm.fi/results.php?hostid=1161

http://renderfarm.fi/results.php?hostid=1158

http://renderfarm.fi/results.php?hostid=1358

Is this something you guys could correct later (i.e. grant credit)?

I haven't gotten any credit for a week.

Maintainer, developer
Joined: 2009-07-21

I've been a bit busy with university / guest visiting / herding camels and haven't really done much Renderfarm stuff as of late. Sorry about late responses.

I did a few quick checks on the database. I was wondering if there was something wrong with Linux clients claiming credits, but apparently not. On average, Linux, OS X and Windows machines have the same recent average credit, so at least some machines do get credit. Overkill, I'll launch a few tests today with no AO or that sort of stuff. Please render some of that and see if you can get anything validated.

- Jesse Kaukonen, www.jessekaukonen.net

Maintainer, developer
Joined: 2009-07-21
@Overkill:

- Jesse Kaukonen, www.jessekaukonen.net
