numactl --interleave=all ./testing_cgeev -RN -N 100 -N 1000 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
MAGMA 1.6.0  compiled for CUDA capability >= 3.5
CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.3, MKL threads 16. 
device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
Usage: ./testing_cgeev [options] [-h|--help]

    N   CPU Time (sec)   GPU Time (sec)   |W_magma - W_lapack| / |W_lapack|
===========================================================================
  100     ---               0.1390
 1000     ---               0.9120
   10     ---               0.0004
   20     ---               0.0006
   30     ---               0.0009
   40     ---               0.0029
   50     ---               0.0036
   60     ---               0.0037
   70     ---               0.0057
   80     ---               0.0078
   90     ---               0.0087
  100     ---               0.0113
  200     ---               0.0726
  300     ---               0.0941
  400     ---               0.1416
  500     ---               0.1990
  600     ---               0.3623
  700     ---               0.4699
  800     ---               0.5551
  900     ---               0.7022
 1000     ---               0.7844
 2000     ---               2.4553
 3000     ---               7.3229
 4000     ---              11.6109
 5000     ---              17.5744
 6000     ---              31.7312
 7000     ---              43.0479
 8000     ---              56.3757
 9000     ---              70.3988
10000     ---              83.0759
12000     ---             124.8967
14000     ---             169.4146
16000     ---             230.7517
18000     ---             292.7448
20000     ---             376.6332

numactl --interleave=all ./testing_cgeev -RV -N 100 -N 1000 --range 10:90:10 --range 100:900:100 --range 1000:9000:1000 --range 10000:20000:2000
MAGMA 1.6.0  compiled for CUDA capability >= 3.5
CUDA runtime 7000, driver 7000. OpenMP threads 16. MKL 11.2.3, MKL threads 16. 
device 0: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 1: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
device 2: Tesla K40c, 745.0 MHz clock, 11519.6 MB memory, capability 3.5
Usage: ./testing_cgeev [options] [-h|--help]

    N   CPU Time (sec)   GPU Time (sec)   |W_magma - W_lapack| / |W_lapack|
===========================================================================
  100     ---               0.0200
 1000     ---               1.1048
   10     ---               0.0012
   20     ---               0.0022
   30     ---               0.0026
   40     ---               0.0046
   50     ---               0.0054
   60     ---               0.0057
   70     ---               0.0086
   80     ---               0.0112
   90     ---               0.0131
  100     ---               0.0165
  200     ---               0.0654
  300     ---               0.1203
  400     ---               0.1975
  500     ---               0.2633
  600     ---               0.4727
  700     ---               0.5677
  800     ---               0.7201
  900     ---               0.9013
 1000     ---               1.0094
 2000     ---               3.6737
 3000     ---              10.3041
 4000     ---              18.0482
 5000     ---              29.0504
 6000     ---              46.3775
 7000     ---              65.3780
 8000     ---              94.4202
 9000     ---             112.5593
10000     ---             147.0741
12000     ---             219.7016
14000     ---             328.8260
16000     ---             456.3178
18000     ---             594.9366
20000     ---             785.5734
