Waste of computation power.
When I look at some of my WUs, they look like this one:
| application | Blender |
| created | 12 Oct 2011 10:53:32 UTC |
| name | ses0000004116frm0000000320prt00001 |
| canonical result | 4196916 |
| granted credit | 6.23 |
| minimum quorum | 3 |
| initial replication | 14 |
| max # of error/total/success tasks | 25, 35, 20 |
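| task ID | computer | sent | time reported | server state | outcome | client state | CPU time (sec) | claimed credit | granted credit |
| 4195692 | 6205 | 12 Oct 2011 10:53:45 UTC | 12 Oct 2011 10:57:40 UTC | Over | Success | Done | 187.42 | --- | 0.00 |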
| 4195693 | 5872 | 12 Oct 2011 10:53:38 UTC | 12 Oct 2011 11:10:41 UTC | Over | Success | Done | 503.02 | --- | 0.00 |
| 4195694 | 5675 | 12 Oct 2011 10:53:42 UTC | 12 Oct 2011 11:01:11 UTC | Over | Success | Done | 281.21 | --- | 0.00 |
| 4196938 | 6253 | 13 Oct 2011 6:50:13 UTC | 13 Oct 2011 7:03:35 UTC | Over | Success | Done | 381.81 | --- | 0.00 |
| 4196968 | 3941 | 13 Oct 2011 8:21:33 UTC | 13 Oct 2011 8:24:04 UTC | Over | Success | Done | 333.53 | --- | 0.00 |
| 4196993 | 1852 | 13 Oct 2011 9:31:30 UTC | 13 Oct 2011 18:35:10 UTC | Over | Success | Done | 281.14 | 2.50 | 0.00 |
| 4204637 | 3897 | 13 Oct 2011 19:56:09 UTC | 13 Oct 2011 20:04:01 UTC | Over | Success | Done | 694.07 | --- | 0.00 |
| 4196916 | 5053 | 13 Oct 2011 5:50:36 UTC | 13 Oct 2011 5:52:31 UTC | Over | Success | Done | 423.38 | --- | 6.23 |
| 4204707 | 5261 | 14 Oct 2011 0:04:33 UTC | 14 Oct 2011 0:16:56 UTC | Over | Success | Done | 486.54 | --- | 6.23 |
| 4196131 | 3779 | 12 Oct 2011 11:06:12 UTC | 12 Oct 2011 11:22:34 UTC | Over | Success | Done | 700.63 | --- | 0.00 |
| 4196132 | 6029 | 12 Oct 2011 11:06:12 UTC | 12 Oct 2011 11:11:02 UTC | Over | Success | Done | 415.63 | --- | 0.00 |
| 4204697 | 6307 | 13 Oct 2011 23:28:21 UTC | 13 Oct 2011 23:31:44 UTC | Over | Success | Done | 471.58 | --- | 0.00 |
| 4204665 | 6370 | 13 Oct 2011 21:32:12 UTC | 13 Oct 2011 21:40:35 UTC | Over | Success | Done | 345.12 | --- | 0.00 |
| 4204651 | 1452 | 13 Oct 2011 20:39:34 UTC | 13 Oct 2011 20:42:12 UTC | Over | Success | Done | 364.26 | 2.01 | 0.00 |
As you can see, 14 were initially sent; most of them were ditched and did nothing but heat the CPU.
Why such an excessive waste of resources? Is the app really so crappy that you need them all, because 80% of the results are plain rubbish?
So they were probably all fine; you just can't get the validator to agree that they are all fine.
Can't you just send one of each and simply accept them all, instead of stealing so many resources from other projects for crunching that you know beforehand will be futile?
Interestingly, you are somewhat correct here. From what I understand, many BOINC projects have already dropped redundancy (computing multiple results to eliminate errors), and some folks at the BOINC workshop this summer were asking why we were still using it. The reason is that we do get a lot of erroneous results, which forces us to render multiple samples, compare the results and pick a successful one. In general our results are correct, especially with some desperately needed bug fixes recently pushed to BURP and with libraries now included for Linux systems.
Even with these fixes we at times see something really odd at the backend. One example was a render where one computer had an error downloading the .blend file. For some reason the download stopped partway, which left the file corrupted (we think this is what happened). When the computer started rendering, it couldn't open the file and did what Blender does in this case: it opened the default scene. The computer rendered the default cube at a different resolution, returned the result to the service and moved on to the next file.
This error occurred before we got our validator up, and it took some effort to fix. Had the validator been working back then, it would have received the two other results and used them instead, rejecting the one with the odd resolution. Because of oddities like this, I don't think we are ready to drop redundancy yet. Our computation also differs a bit from most BOINC projects: computers render things slightly differently because we use something as vast as Blender. For now I have turned the validator tolerance up so that it allows more differences. This isn't a solution, though.
We apologize for wasting resources like this. We are still in Beta, and these problems happen at times.
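To make the compare-and-pick step a bit more concrete, here is a minimal sketch of quorum validation with a tolerance. It is not the actual BURP or Renderfarm.fi validator; the function names, the mean-absolute-difference metric, the tolerance value and the toy pixel data are all assumptions made up for illustration.

```python
def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference, or None if the sizes differ."""
    if len(a) != len(b):
        return None  # e.g. a frame rendered at the wrong resolution
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pick_canonical(results, min_quorum=3, tolerance=0.02):
    """Return (canonical_id, agreeing_ids), or (None, []) if there is no quorum.

    results maps a result id to a flat list of pixel values in [0, 1].
    The first result that at least min_quorum results (itself included)
    agree with, within the tolerance, is taken as canonical.
    """
    for candidate_id, candidate in results.items():
        agreeing = [candidate_id]
        for other_id, other in results.items():
            if other_id == candidate_id:
                continue
            diff = mean_abs_diff(candidate, other)
            if diff is not None and diff <= tolerance:
                agreeing.append(other_id)
        if len(agreeing) >= min_quorum:
            return candidate_id, agreeing
    return None, []


# Hypothetical data: three similar frames and one "default cube" frame
# that came back at a different resolution.
results = {
    "a": [0.50, 0.50, 0.50, 0.50],
    "b": [0.51, 0.49, 0.50, 0.50],
    "c": [0.50, 0.50, 0.51, 0.50],
    "cube": [0.20, 0.20],
}
canonical, valid = pick_canonical(results)
print(canonical, valid)
```

In this toy case the wrong-resolution frame never matches anything, so it is left out of the agreeing set. Raising `tolerance` lets noisier frames agree, which is also why it is only a workaround: a large enough tolerance would eventually accept frames that really are wrong.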
Would a stricter homogeneous redundancy be of any value, or do those differences appear even with similar setups?
I know that LHC had to do this before the restart; I don't know whether they still use it.
Yes, that would help us eliminate some issues, especially with particles, but it might not help with the low ambient occlusion sample counts. Janus is using one OS type per session at BURP to a great degree. We might go for that at Renderfarm.fi as well.
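For anyone wondering what "one OS type per session" amounts to, here is a rough sketch of the idea behind homogeneous redundancy: the first host that gets a task fixes the platform class of the workunit, and later tasks only go to hosts of the same class. This is not BOINC's scheduler code; the `WorkUnit` class, the `hr_class` function and the host records are hypothetical, and using only the OS name as the class is a simplifying assumption.

```python
def hr_class(host):
    """Coarse platform class of a host; here just the OS family (an assumption)."""
    return host["os"]


class WorkUnit:
    def __init__(self, name):
        self.name = name
        self.hr_class = None  # unset until the first task is sent
        self.hosts = []

    def try_assign(self, host):
        """Send a task to the host only if it matches the workunit's class."""
        cls = hr_class(host)
        if self.hr_class is None:
            self.hr_class = cls  # the first host locks the class
        if cls != self.hr_class:
            return False
        self.hosts.append(host["id"])
        return True


# Made-up hosts, only to show the behaviour.
wu = WorkUnit("example-frame")
print(wu.try_assign({"id": 1, "os": "linux"}))    # first host, sets the class
print(wu.try_assign({"id": 2, "os": "windows"}))  # different class, refused
print(wu.try_assign({"id": 3, "os": "linux"}))    # same class, accepted
```

The trade-off is that a workunit locked to a rare platform can wait longer for enough matching hosts to reach its quorum.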
I've added a new host recently, and have been crunching away, but have received no credit at all...
http://renderfarm.fi/show_host_detail.php?hostid=6669
Processed about 50 tasks, for about 9 different sessions.
http://renderfarm.fi/results.php?hostid=6669
I can't figure out a pattern. Could this be the validation thing? Or the OS thing?
At the bottom of a result like http://renderfarm.fi/result.php?resultid=4558171 it says "validate state: Invalid". So something about the output of your host is different (enough) from the other machines that rendered the same workunit.
Thanks for checking.
I'm surprised that every session worked out that way. I thought it was only the sessions with low samples that had problems validating(?).
Now that I looked, my other hosts are doing it also!
http://renderfarm.fi/results.php?hostid=1161
http://renderfarm.fi/results.php?hostid=1158
http://renderfarm.fi/results.php?hostid=1358
Is this something you guys could correct later (i.e. grant credit)?
I haven't gotten any credit for a week.

I've been a bit busy with university / guest visiting / herding camels and haven't really done much Renderfarm stuff as of late. Sorry about late responses.
I did a few quick checks on the database. I was wondering if there was something wrong with Linux clients claiming credits, but apparently not. On average, Linux, OS X and Windows machines have about the same recent average credit, so at least some machines do get credit. Overkill, I'll launch a few tests today with no AO or that sort of stuff. Please render some of those and see if you can get anything validated.
@Overkill: http://www.renderfarm.fi/animations/4586

This is the animation in question. Like I said in the other thread, we are currently attempting to get our validator set up properly. The animation had a very low sample count for the approximate ambient occlusion method, which caused the frames to contain a large amount of "messy" content. This content was a bit different on every host, and the validator rejected most of the frames. I have turned the validator error tolerance up, but it is better if users use more samples and eliminate the artifacts in the first place.
I wrote a bit more about this here.
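As a toy illustration of why low sample counts make hosts disagree, the sketch below estimates a per-pixel occlusion value from random samples on two "hosts" with different random seeds and measures how far apart their frames end up. It is a generic Monte Carlo experiment, not Blender's ambient occlusion code; the scene, the seeds and the sample counts are all made up.

```python
import random


def render_pixel(true_occlusion, samples, rng):
    """Estimate occlusion as the fraction of random samples that are blocked."""
    hits = sum(1 for _ in range(samples) if rng.random() < true_occlusion)
    return hits / samples


def render_frame(true_values, samples, seed):
    """Render every pixel with a host-specific random sequence."""
    rng = random.Random(seed)
    return [render_pixel(v, samples, rng) for v in true_values]


def mean_abs_diff(a, b):
    """Mean absolute per-pixel difference between two equal-sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


scene = [0.5] * 500  # toy scene: every pixel is half occluded
for samples in (4, 64, 256):
    host_a = render_frame(scene, samples, seed=1)
    host_b = render_frame(scene, samples, seed=2)
    print(samples, round(mean_abs_diff(host_a, host_b), 3))
```

The difference between the two hosts shrinks as the sample count grows, roughly with the square root of the number of samples, so a frame rendered with more samples is far more likely to fall within a fixed validator tolerance.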
- Jesse Kaukonen, www.jessekaukonen.net