--- /dev/null
+This directory contains a C source file and shell scripts
+
+ tstaes.c
+ makegenarm.sh
+ makegenx86.sh
+ makeoptx86.sh
+
+that can be used to build executables. These executables are used to validate the implementation
+and to benchmark the performance of the aes functions in the kernel. This directory also serves
+as a development environment for porting the aes functions to new architectures.
+
+On xnu-1699.20.6 (on which this work is based), the generic aes source code sits at bsd/crypto/aes/gen. The x86_64
+and i386 architecture-specific optimizations are in bsd/crypto/aes/i386.
+
+After some code corrections (to aes.h and most of the assembly code in i386), you can now build a test executable
+that is functionally equivalent to aes in the kernel code.
+
+To generate a test executable for the aes in the x86_64/i386 kernel,
+
+ $ makeoptx86.sh
+
+This will build a test executable tstaesoptx86 (x86_64/i386). The executable automatically detects the
+maximum CPU clock rate. You specify the number of iterations and the number of 16-byte blocks per call.
+The executable fills the test data with random numbers, calls aes_encrypt_cbc to encrypt the plain data
+into cipher data, and then calls aes_decrypt_cbc to decrypt the cipher data into decrypted data. It then compares
+the decrypted data against the plain data. If there is a mismatch, the program stops and exits.
+Otherwise, it measures the time the system spends in each of the 2 functions under test and prints out
+the performance profiling data.
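+
+The round-trip check described above can be sketched roughly as follows. This is an illustration only, not
+the code in tstaes.c: toy_cipher is a hypothetical XOR stand-in for aes_encrypt_cbc / aes_decrypt_cbc, and
+run_roundtrip and now_usecs are names assumed for this sketch.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BLOCK_BYTES 16

/* Hypothetical stand-in for the CBC routines under test: a byte-wise XOR, NOT AES.
   A real harness would call aes_encrypt_cbc / aes_decrypt_cbc here instead. */
static void toy_cipher(const uint8_t *in, uint8_t *out, size_t len, uint8_t key)
{
    for (size_t i = 0; i < len; i++)
        out[i] = in[i] ^ key;
}

/* Monotonic wall-clock time in microseconds. */
static double now_usecs(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec * 1e6 + (double)ts.tv_nsec / 1e3;
}

/* Runs `iterations` encrypt/decrypt round trips over `num_blocks` 16-byte blocks.
   Returns 0 if every round trip reproduces the plain data, -1 on a mismatch or
   allocation failure. Mean per-call times go to *enc_usecs and *dec_usecs. */
int run_roundtrip(int iterations, size_t num_blocks, double *enc_usecs, double *dec_usecs)
{
    assert(iterations > 0 && num_blocks > 0);
    size_t len = num_blocks * BLOCK_BYTES;
    uint8_t *plain = malloc(len), *cipher = malloc(len), *decrypted = malloc(len);
    double enc_total = 0.0, dec_total = 0.0;
    int status = (plain && cipher && decrypted) ? 0 : -1;

    for (int it = 0; status == 0 && it < iterations; it++) {
        /* Fresh random plain data for every iteration. */
        for (size_t i = 0; i < len; i++)
            plain[i] = (uint8_t)rand();

        double t0 = now_usecs();
        toy_cipher(plain, cipher, len, 0x5a);      /* "encrypt" */
        double t1 = now_usecs();
        toy_cipher(cipher, decrypted, len, 0x5a);  /* "decrypt" */
        double t2 = now_usecs();

        /* Stop at the first mismatch between decrypted and plain data. */
        if (memcmp(plain, decrypted, len) != 0)
            status = -1;
        enc_total += t1 - t0;
        dec_total += t2 - t1;
    }

    free(plain);
    free(cipher);
    free(decrypted);
    *enc_usecs = enc_total / iterations;
    *dec_usecs = dec_total / iterations;
    return status;
}
```

+Calling run_roundtrip(1000, 2560, &enc, &dec) would mirror the "1000 2560" command-line arguments used in the
+runs shown in this file; the timings it returns describe the toy cipher only, not AES.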
+
+On K5,
+
+$ tstaesoptx86 1000 2560
+device max CPU clock rate = 2659.00 MHz
+40960 bytes per cbc call
+ aes_encrypt_cbc : time elapsed = 220.24 usecs, 177.37 MBytes/sec, 14.30 cycles/byte
+ best iteration : time elapsed = 218.30 usecs, 178.94 MBytes/sec, 14.17 cycles/byte
+ worst iteration : time elapsed = 286.14 usecs, 136.51 MBytes/sec, 18.58 cycles/byte
+
+ aes_decrypt_cbc : time elapsed = 199.85 usecs, 195.46 MBytes/sec, 12.97 cycles/byte
+ best iteration : time elapsed = 198.17 usecs, 197.12 MBytes/sec, 12.86 cycles/byte
+ worst iteration : time elapsed = 228.12 usecs, 171.23 MBytes/sec, 14.81 cycles/byte
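+
+The MBytes/sec and cycles/byte columns can be reproduced from the elapsed time, the bytes per call, and the
+clock rate. The sketch below assumes that one MByte is 2^20 bytes, which is what the printed figures appear
+consistent with; derive_metrics and struct cbc_metrics are names made up for this illustration.

```c
#include <assert.h>

/* Derived throughput figures for one cbc call (hypothetical helper type). */
struct cbc_metrics {
    double mbytes_per_sec;   /* assumes 1 MByte = 2^20 bytes */
    double cycles_per_byte;
};

/* elapsed_usecs: time for one call in microseconds
   bytes:         bytes processed per call (16 * number of blocks)
   clock_mhz:     CPU clock rate in MHz */
struct cbc_metrics derive_metrics(double elapsed_usecs, double bytes, double clock_mhz)
{
    assert(elapsed_usecs > 0.0 && bytes > 0.0);
    struct cbc_metrics m;
    /* bytes -> MBytes, microseconds -> seconds */
    m.mbytes_per_sec = (bytes / (1024.0 * 1024.0)) / (elapsed_usecs * 1e-6);
    /* MHz times microseconds gives the number of clock cycles elapsed */
    m.cycles_per_byte = (clock_mhz * elapsed_usecs) / bytes;
    return m;
}
```

+For the aes_encrypt_cbc line above, derive_metrics(220.24, 40960.0, 2659.0) gives about 177.36 MBytes/sec and
+14.30 cycles/byte. The small difference from the printed 177.37 is expected, since the printed elapsed time
+is itself rounded.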
+
+On K5B (with aesni)
+
+$ tstaesoptx86 1000 256
+device max CPU clock rate = 2400.00 MHz
+4096 bytes per cbc call
+ aes_encrypt_cbc : time elapsed = 6.69 usecs, 583.67 MBytes/sec, 3.92 cycles/byte
+ best iteration : time elapsed = 6.38 usecs, 612.46 MBytes/sec, 3.74 cycles/byte
+ worst iteration : time elapsed = 9.72 usecs, 401.96 MBytes/sec, 5.69 cycles/byte
+
+ aes_decrypt_cbc : time elapsed = 2.05 usecs, 1902.65 MBytes/sec, 1.20 cycles/byte
+ best iteration : time elapsed = 1.96 usecs, 1997.06 MBytes/sec, 1.15 cycles/byte
+ worst iteration : time elapsed = 4.60 usecs, 849.00 MBytes/sec, 2.70 cycles/byte
+
+You can also build a test executable using the generic source code for the i386/x86_64 architecture.
+
+ $ makegenx86.sh
+
+When run on K5,
+
+$ tstaesgenx86 1000 2560
+device max CPU clock rate = 2659.00 MHz
+40960 bytes per cbc call
+ aes_encrypt_cbc : time elapsed = 278.05 usecs, 140.49 MBytes/sec, 18.05 cycles/byte
+ best iteration : time elapsed = 274.63 usecs, 142.24 MBytes/sec, 17.83 cycles/byte
+ worst iteration : time elapsed = 309.70 usecs, 126.13 MBytes/sec, 20.10 cycles/byte
+
+ aes_decrypt_cbc : time elapsed = 265.43 usecs, 147.17 MBytes/sec, 17.23 cycles/byte
+ best iteration : time elapsed = 262.20 usecs, 148.98 MBytes/sec, 17.02 cycles/byte
+ worst iteration : time elapsed = 296.19 usecs, 131.88 MBytes/sec, 19.23 cycles/byte
+
+Comparing the best iterations on K5, the current AES implementation in the x86_64 kernel has improved from
+17.83/17.02 down to 14.17/12.86 cycles/byte for aes_encrypt_cbc and aes_decrypt_cbc, respectively
+(about 21% and 24% fewer cycles per byte).
+
+
+ --------- iOS ---------
+
+Similarly, you can build a test executable for the aes in the armv7 kernel (which uses the generic source code):
+
+ $ makegenarm.sh
+
+Note that you need the iOS SDK installed. You can then copy this executable to an iOS device and run it there.
+
+On N88,
+
+iPhone:~ root# ./tstaesgenarm 1000 2560
+device max CPU clock rate = 600.00 MHz
+40960 bytes per cbc call
+ aes_encrypt_cbc : time elapsed = 2890.18 usecs, 13.52 MBytes/sec, 42.34 cycles/byte
+ best iteration : time elapsed = 2692.00 usecs, 14.51 MBytes/sec, 39.43 cycles/byte
+ worst iteration : time elapsed = 18248.33 usecs, 2.14 MBytes/sec, 267.31 cycles/byte
+
+ aes_decrypt_cbc : time elapsed = 3078.20 usecs, 12.69 MBytes/sec, 45.09 cycles/byte
+ best iteration : time elapsed = 2873.33 usecs, 13.59 MBytes/sec, 42.09 cycles/byte
+ worst iteration : time elapsed = 9664.79 usecs, 4.04 MBytes/sec, 141.57 cycles/byte
+