author Harel Ben-Attia <harelba@gmail.com> 2018-12-19 14:10:20 +0200
committer Harel Ben-Attia <harelba@gmail.com> 2018-12-19 14:10:20 +0200
commit 03075c3e6c4c4fa5fb74fb43c9be02d86ac0a342 (patch)
tree 770059f31236c1c164749af05a00ad032e4a2714
parent f1155377a6115fb45f0603e0a59bbd8f1e992165 (diff)
wip
-rw-r--r-- test/BENCHMARK.md 8
-rwxr-xr-x test/test-suite 4
-rw-r--r-- test/unit-file.csv.gz bin 0 -> 390046 bytes
3 files changed, 9 insertions, 3 deletions
diff --git a/test/BENCHMARK.md b/test/BENCHMARK.md
index fc225d3..5b729a7 100644
--- a/test/BENCHMARK.md
+++ b/test/BENCHMARK.md
@@ -1,7 +1,11 @@
-# Benchmark
+
*Please don't use or publish this benchmark data yet, it's still alpha, i'm checking the validity of the results, and python 3 q version has not been merged yet.*
+**NOTE**
+This just a preliminary benchmark, and the results I got are somewhat surprising. I would love to validate these results by having other people run the benchmark as well and send me emails with their results. If you're interested, please follow the "Running the benchmark" part, and send me the `all.benchmark-results` file, along with some details about your hardware. <harelba@gmail.com>
+
+# Benchmark
This is an initial version of the benchmark, along with some results. The following is compared:
* q running on python 2.7.11
* q running on python 3.6.4
@@ -16,7 +20,7 @@ The idea was to compare the time sensitivity of row and column count.
* Row counts: 1,10,100,1000,10000,100000,1000000
* Column counts: 1,5,10,20,50,100
-* Iterations for each combination: 3
+* Iterations for each combination: 10
The benchmark executes simple `select count(*) from <file>` queries for each combination, calculating the mean and stddev of each set of iterations. The stddev is used in order to measure the validity of the results.
diff --git a/test/test-suite b/test/test-suite
index bd589bd..44bd334 100755
--- a/test/test-suite
+++ b/test/test-suite
@@ -17,6 +17,7 @@ import random
import sys
import time
import unittest
+from gzip import GzipFile
from subprocess import PIPE, Popen
from tempfile import NamedTemporaryFile
@@ -2374,7 +2375,8 @@ class BenchmarkTests(AbstractQTestCase):
if os.path.exists('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR)):
return
- d = open('unit-file.csv').read()
+ g = GzipFile('unit-file.csv.gz')
+ d = g.read().decode('utf-8')
f = open('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR), 'w')
for i in range(100):
f.write(d)
diff --git a/test/unit-file.csv.gz b/test/unit-file.csv.gz
new file mode 100644
index 0000000..ade23b7
--- /dev/null
+++ b/test/unit-file.csv.gz
Binary files differ
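
Note on the test/test-suite hunk: the new code opens `unit-file.csv.gz` with `GzipFile`, decodes the bytes as UTF-8, and writes the text 100 times into the benchmark file. The sketch below is not part of the commit; it shows one way the same step could be written so that both file handles are closed deterministically. The function name `build_benchmark_file` and its parameters are hypothetical, and passing `newline=''` is an assumption made here so that line endings are copied through unchanged.

```python
import gzip


def build_benchmark_file(gz_path, out_path, copies=100):
    """Decompress gz_path once and write its text `copies` times to out_path.

    Hypothetical helper, not the code in test/test-suite.
    Returns the number of characters written.
    """
    # Text mode ('rt') decodes UTF-8 while reading; newline='' disables
    # newline translation, so the text matches the decompressed bytes.
    with gzip.open(gz_path, 'rt', encoding='utf-8', newline='') as src:
        data = src.read()

    # The context manager closes the output file even if a write fails.
    with open(out_path, 'w', encoding='utf-8', newline='') as dst:
        for _ in range(copies):
            dst.write(data)

    return len(data) * copies
```

A call such as `build_benchmark_file('unit-file.csv.gz', 'benchmark-file.csv')` would correspond to the loop in the hunk above, assuming the working directory contains the compressed unit file.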