|
Duff’s Device循环算法在AS中的应用
The Jargon 描述了作为C中很喜剧性用法的 Duff's Device 算法. 这个非凡的算法是由 Tom Duff在Lucasfilm时发现的。 Well, doesn't it justify the "everyone working at LucasFilm is a Real Programmer" dogma?
As the author explains, it is based on *unrolling a loop* and *fall through* of a switch statement. The result is both horrible and elegant, but overall magnificently efficient. Jeff Greenberg ported the algorithm to JavaScript, and an unknown contributor optimized it a bit more. I have just tested the device in ActionScript, and found out that it really makes difference. Okay, here is the code and my explanations: First, an average loop (the *muggle* way). We just increment an integer 99999 times. var MAX_VALUE = 99999; var testValue = 0; function loopNormal() { var startTime = getTimer(); trace("started regular version!"); for (var i = 0; i testValue++; } trace("it took me :" + (getTimer() - startTime) + " ms"); } loopNormal(); When I run the script, the result is 4992 ms (player WIN 6,0,4,0). Okay, now let's see Greenberg's port of Duff's device: var MAX_VALUE = 99999; var testValue = 0; function duffsDevice(iterations) { // Original JS Implementation by Jeff Greenberg 2/2001 var startTime = getTimer(); trace("started Duff's Device!"); var n = iterations / 8; var caseTest = iterations % 8; do { switch (caseTest) { case 0: testValue++; case 7: testValue++; case 6: testValue++; case 5: testValue++; case 4: testValue++; case 3: testValue++; case 2: testValue++; case 1: testValue++; } caseTest=0; } while (--n > 0); trace("it took me :" + (getTimer() - startTime) + " ms"); } duffsDevice(MAX_VALUE); Well, now it only took 2314 ms -less than the half of the time that the regular for loop took. So what is this device actually? First, we have to understand "loop unrolling"; this technique reduces the cost of loop overhead by decreasing the number of times we check the looping condition. Instead of writing a regular loop like this: var iterations = n; // number of iterations for (var i=0; i < iterations; i++) { doSomeMagic(); } We unroll the loop by performing the same iteration more than once in one iteration and we increment the counter every time we perform the iteration. Like this: var iterations = m; // a multiple of number of unroll statements for (var i=0; i < iterations; ) { doSomeMagic(); i++; // 1 doSomeMagic(); i++; // 2 doSomeMagic(); i++; // 3 // so m will be a multiple of 3 } In this sample we have unrolled the loop three times. Number of iterations will be a multiple of three. Duff's Device is a solution for guessing the number of unroll statements. First we decide how many times we will unroll the loop, and we put that many case labels in the switch statement. In the above sample, "caseTest" is the remainder of "iterations" divided by 8 (number of unroll statements). There are 8 test cases, they start with 0 and then go up to n-1 (7 in our case), and so on (n-2, n-3, ...) until we hit 1. "n" is the number of times we can execute all unroll statements without having any remainder. and "testCase" is the remainder of "iterations" divided by number of unroll statements. The switch statement tests the value of the remainder. In our sample code, the first time the do loop is called, "testCase" is 7 (99999 modulus 8), and "n" is 12499 (99992/8). We skip case 0 and *fall back* to 7 (note the lack of "break" statement). We execute 7 statements. Before we leave the switch statement, we set "caseTest" to 0 (the default statement). Now we decrement the "n" counter" (with prefix operator). Now we have 12498 (12499-1) groups of 8 to go through. This time "testCase" is set to 2 (12498 % 8), the switch statement switches to case 2 and all the cycle continues - until we reach the end (do...while loop checks the condition). An anonymous author posted an optimized version of Greenberg's port of Duff's device: var MAX_VALUE = 99999; var testValue = 0;
function optDuffsDevice(iterations) { var startTime = getTimer(); trace("started optimized Duff's Device!"); var n = iterations % 8; while (n--) { testValue++; } n = parseInt(iterations / 8); while (n--) { testValue++; testValue++; testValue++; testValue++; testValue++; testValue++; testValue++; testValue++; } trace("it took me :" + (getTimer() - startTime) + " ms"); } optDuffsDevice(MAX_VALUE); This took 1995 ms - even faster. Basically, Duff's Device also works in the SWF interpreter. You might wonder how it looks in FLASM. Here is the Flasm byte-code dump: frame 0 constants 'n', 'iterations', 'testVal', 'parseInt', 'MAX_VALUE', 'testValue', 'optDuffsDevice' function optDuffsDevice ('iterations') push 'n', 'iterations' getVariable push 8 modulo varEquals label1: push 'n' getVariable push 'n', 'n' getVariable decrement setVariable not branchIfTrue label2 push 'testVal', 'testVal' getVariable increment setVariable branch label1 label2: push 'n', 'iterations' getVariable push 8 divide push 1, 'parseInt' callFunction setVariable label3: push 'n' getVariable push 'n', 'n' getVariable decrement setVariable not branchIfTrue label4 push 'testVal', 'testVal' getVariable increment setVariable push 'testVal', 'testVal' getVariable increment setVariable push 'testVal', 'testVal' getVariable increment setVariable push 'testVal', 'testVal' getVariable increment setVariable push 'testVal', 'testVal' getVariable increment setVariable push 'testVal', 'testVal' getVariable increment setVariable push 'testVal', 'testVal' getVariable increment setVariable push 'testVal', 'testVal' getVariable increment setVariable branch label3 label4: end // of function optDuffsDevice
push 'MAX_VALUE', 99999 varEquals push 'testValue', 0.0 varEquals push 'MAX_VALUE' getVariable push 1, 'optDuffsDevice' callFunction pop end // of frame 0
|