wip

author: Harel Ben-Attia <harelba@gmail.com> 2018-12-19 14:10:20 +0200
committer: Harel Ben-Attia <harelba@gmail.com> 2018-12-19 14:10:20 +0200
commit: 03075c3e6c4c4fa5fb74fb43c9be02d86ac0a342
tree: 770059f31236c1c164749af05a00ad032e4a2714
parent: f1155377a6115fb45f0603e0a59bbd8f1e992165
3 files changed, 9 insertions, 3 deletions
diff --git a/test/BENCHMARK.md b/test/BENCHMARK.md
index fc225d3..5b729a7 100644
--- a/test/BENCHMARK.md
+++ b/test/BENCHMARK.md
@@ -1,7 +1,11 @@
 
-# Benchmark
+
 *Please don't use or publish this benchmark data yet, it's still alpha, i'm checking the validity of the results, and python 3 q version has not been merged yet.*
 
+**NOTE**
+This is just a preliminary benchmark, and the results I got are somewhat surprising. I would love to validate these results by having other people run the benchmark as well and email me their results. If you're interested, please follow the "Running the benchmark" part, and send me the `all.benchmark-results` file, along with some details about your hardware. <harelba@gmail.com>
+
+# Benchmark
 This is an initial version of the benchmark, along with some results. The following is compared:
 * q running on python 2.7.11
 * q running on python 3.6.4
@@ -16,7 +20,7 @@ The idea was to compare the time sensitivity of row and column count.
 
 * Row counts: 1,10,100,1000,10000,100000,1000000
 * Column counts: 1,5,10,20,50,100
-* Iterations for each combination: 3
+* Iterations for each combination: 10
 
 The benchmark executes simple `select count(*) from <file>` queries for each combination, calculating the mean and stddev of each set of iterations. The stddev is used in order to measure the validity of the results.
 
diff --git a/test/test-suite b/test/test-suite
index bd589bd..44bd334 100755
--- a/test/test-suite
+++ b/test/test-suite
@@ -17,6 +17,7 @@ import random
 import sys
 import time
 import unittest
+from gzip import GzipFile
 from subprocess import PIPE, Popen
 from tempfile import NamedTemporaryFile
 
@@ -2374,7 +2375,8 @@ class BenchmarkTests(AbstractQTestCase):
         if os.path.exists('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR)):
             return
 
-        d = open('unit-file.csv').read()
+        g = GzipFile('unit-file.csv.gz')
+        d = g.read().decode('utf-8')
         f = open('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR), 'w')
         for i in range(100):
             f.write(d)
diff --git a/test/unit-file.csv.gz b/test/unit-file.csv.gz
new file mode 100644
index 0000000..ade23b7
--- /dev/null
+++ b/test/unit-file.csv.gz
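
The `test/test-suite` hunk above switches the benchmark setup from reading `unit-file.csv` directly to reading the compressed `unit-file.csv.gz` through `GzipFile`, then writing the decoded text 100 times. A minimal sketch of that idea follows, assuming the same UTF-8 encoding the commit uses. The function name `expand_unit_file` and its parameters are hypothetical and not part of the commit; the sketch also uses `gzip.open` in text mode and `with` blocks, which the commit itself does not.

```python
import gzip


def expand_unit_file(gz_path, out_path, copies=100):
    """Decompress gz_path and write its text `copies` times to out_path.

    Hypothetical helper; it mirrors the loop in the diff but closes
    both files deterministically.
    """
    # Text mode ('rt') decompresses and decodes in one step.
    # newline='' disables newline translation so the data is copied as-is.
    with gzip.open(gz_path, 'rt', encoding='utf-8', newline='') as g:
        data = g.read()

    with open(out_path, 'w', encoding='utf-8', newline='') as f:
        for _ in range(copies):
            f.write(data)

    # Number of characters written, handy for a quick sanity check.
    return len(data) * copies
```

Calling `expand_unit_file('unit-file.csv.gz', 'benchmark-file.csv')` would correspond to the 100-copy expansion in the diff, with the paths here being placeholders.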
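
The `BENCHMARK.md` text in this commit says the benchmark calculates the mean and stddev of each set of iterations (now 10 per combination). The sketch below shows one way such a summary could be computed with the standard library. The names `summarize` and `time_callable` are hypothetical, and the choice of the sample standard deviation (`statistics.stdev`) is an assumption, since the source does not say which variant the benchmark uses.

```python
import statistics
import time


def summarize(timings):
    """Return (mean, stddev) for a list of per-iteration durations in seconds."""
    mean = statistics.mean(timings)
    # The sample standard deviation needs at least two data points.
    stddev = statistics.stdev(timings) if len(timings) > 1 else 0.0
    return mean, stddev


def time_callable(fn, iterations=10):
    """Run fn `iterations` times and return the wall-clock duration of each run."""
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations
```

A large stddev relative to the mean would suggest that the timings for that row and column combination are noisy and should be rerun before being compared.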