6 files changed, 359 insertions, 9 deletions
diff --git a/.gitignore b/.gitignore
index 8f68670..27c87cc 100644
--- a/.gitignore
+++ b/.gitignore
@@ -11,3 +11,5 @@ win_build
 packages
 .idea/
 dist/windows/
+_benchmark_data*
+*.benchmark-results
diff --git a/test/BENCHMARK.md b/test/BENCHMARK.md
new file mode 100644
index 0000000..80146b2
--- /dev/null
+++ b/test/BENCHMARK.md
@@ -0,0 +1,97 @@
+
+
+*Please don't use or publish this benchmark data yet. It is still alpha: I'm checking the validity of the results, and the Python 3 version of q has not been merged yet.*
+
+**NOTE**
+This is just a preliminary benchmark, and the results I got are somewhat surprising. I would love to validate them by having other people run the benchmark as well and email me their results. If you're interested, follow the "Running the benchmark" section. After the benchmark finishes, send me the `all.benchmark-results` file, along with some details about your hardware, and I'll add it to the spreadsheet. <harelba@gmail.com>
+
+# Benchmark
+This is an initial version of the benchmark, along with some results. The following is compared:
+* q running on python 2.7.11
+* q running on python 3.6.4
+* textql 2.0.3
+* octosql
+
+The q version used for the benchmark is still on the python2/3 compatibility branch (hash f0b62b15b91583cd944ea2e8daf6f730198959fa)
+
+This is by no means a scientific benchmark: it focuses only on data loading time and does not attempt any usability comparison between q and the other tools. I originally created it to compare q running on Python 2 and Python 3, and only later decided it would be interesting to compare the results to textql and octosql as well.
+
+## Methodology
+The idea is to measure how sensitive the running time is to the number of rows and the number of columns.
+
+* Row counts: 1,10,100,1000,10000,100000,1000000
+* Column counts: 1,5,10,20,50,100
+* Iterations for each combination: 10
+
+The benchmark executes a simple `select count(*) from <file>` query for each combination and calculates the mean and standard deviation (stddev) of each set of iterations. The stddev serves as a rough check on how stable, and therefore how trustworthy, each measurement is.
+
+The graphs below compare only the means of the results; the standard deviations are recorded in the Google Sheet and can be viewed there if needed.
+
+## Hardware
+macOS Sierra on a mid-2015 15" MacBook Pro with 16GB of RAM and a 256GB internal flash drive.
+
+
+## Running the benchmark
+
+Please note that the initial run generates big files, so you'll need more than 3GB of free disk space. It also means that the first run takes much longer than later runs. This is expected and does not affect the benchmark results. All the generated files reside in the `_benchmark_data/` folder.
+
+### Preparations
+Make sure you have pyenv and pyenv-virtualenv installed.
+
+* $ `git clone git@github.com:harelba/q.git`
+* $ `cd q && git checkout q-benchmark`
+* $ `cd test/`
+* $ `pyenv install 2.7.11`
+* $ `pyenv virtualenv 2.7.11 py2-q`
+* $ `pyenv activate py2-q`
+* $ `pip install -r ../requirements.txt`
+* $ `pyenv install 3.6.4`
+* $ `pyenv virtualenv 3.6.4 py3-q`
+* $ `pyenv activate py3-q`
+* $ `pip install -r ../requirements.txt`
+* $ `wget "https://s3.amazonaws.com/harelba-q-public/benchmark_data.tar.gz"`
+* $ `tar xvzf benchmark_data.tar.gz`
+* Install [`textql`](https://github.com/dinedal/textql#install)
+* Install [`octosql`](https://github.com/cube2222/octosql#installation)
+
+### Execution
+* $ `pyenv activate py2-q`
+* $ `./test-all BenchmarkTests.test_q_matrix` 
+* $ `pyenv activate py3-q`
+* $ `./test-all BenchmarkTests.test_q_matrix`
+* $ `./test-all BenchmarkTests.test_textql_matrix`
+* $ `./test-all BenchmarkTests.test_octosql_matrix`
+
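+Each benchmark writes its results to its own file: the q runs write to `<virtual-env-name>.benchmark-results` (`py2-q.benchmark-results` and `py3-q.benchmark-results` with the virtualenvs above), the textql run to `textql.benchmark-results`, and the octosql run to `octosql.benchmark-results`. Then combine them side by side into a single file: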
+
+* $ `paste py2-q.benchmark-results py3-q.benchmark-results textql.benchmark-results octosql.benchmark-results > all.benchmark-results`
+
+## Updating the benchmark markdown document file
+The results should reside in the following [google sheet](https://docs.google.com/spreadsheets/d/1Ljr8YIJwUQ5F4wr6ATga5Aajpu1CvQp1pe52KGrLkbY/edit?usp=sharing). 
+
+* Duplicate the baseline tab inside the spreadsheet.
+* Paste the content of `all.benchmark-results` to the new tab, near "Fill raw results here".
+
+* All the graphs below will be updated automatically.
+
+## Results
+(Results are automatically updated from the baseline tab in the google spreadsheet).
+
+### 1 Column Table
+![1 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1119350798&format=image)
+
+### 5 Column Table
+![5 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=599223098&format=image)
+
+### 10 Column Table
+![10 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=82695414&format=image)
+
+### 20 Column Table
+![20 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=1573199483&format=image)
+
+### 50 Column Table
+![50 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=448568670&format=image)
+
+### 100 Column Table
+![100 column table](https://docs.google.com/spreadsheets/d/e/2PACX-1vQy9Zm4I322Tdf5uoiFFJx6Oi3Z4AMq7He3fUUtsEQVQIdTGfWgjxFD6k8PAy9wBjvFkqaG26oBgNTP/pubchart?oid=2101488258&format=image)
+
diff --git a/test/results/benchmark-results-2018-12-17 b/test/results/benchmark-results-2018-12-17
new file mode 100644
index 0000000..8d40754
--- /dev/null
+++ b/test/results/benchmark-results-2018-12-17
@@ -0,0 +1,48 @@
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev
+1	1	0.06731581688	0.005270230559	1	1	0.09322199821	0.008088911233	1	1	0.01541593075	0.00846248027
+10	1	0.06453447342	0.003110529879	10	1	0.0952757597	0.01068078746	10	1	0.01273214817	0.001517273708
+100	1	0.06692070961	0.004081653457	100	1	0.09462814331	0.00550010348	100	1	0.01279251575	0.0007315880067
+1000	1	0.0703766346	0.002271640626	1000	1	0.09908235073	0.0085850761	1000	1	0.01575729847	0.001170010368
+10000	1	0.1229094744	0.005485221564	10000	1	0.1375562668	0.009702295105	10000	1	0.04378418922	0.001448525422
+100000	1	0.598156023	0.01721054649	100000	1	0.522838521	0.01662262184	100000	1	0.3162255287	0.01030908105
+1000000	1	5.372911286	0.0425664739	1000000	1	4.312362194	0.04878944441	1000000	1	3.042521834	0.02222183573
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev
+1	5	0.06542704105	0.001973147455	1	5	0.09278903008	0.007920553711	1	5	0.01264638901	0.0009375946825
+10	5	0.06713621616	0.003302711249	10	5	0.09266264439	0.006464956796	10	5	0.01264002323	0.0005921679139
+100	5	0.07043097019	0.003513428229	100	5	0.09614286423	0.006232406135	100	5	0.01298532486	0.001484074702
+1000	5	0.07853364944	0.002677513043	1000	5	0.1007899046	0.009419248049	1000	5	0.01899263859	0.0005582728364
+10000	5	0.1847445965	0.006918806414	10000	5	0.151746726	0.007045195955	10000	5	0.07659320831	0.00297289199
+100000	5	1.206378174	0.01569912364	100000	5	0.6551784992	0.02468335852	100000	5	0.6256412745	0.009538934388
+1000000	5	11.4774132	0.2737370571	1000000	5	5.54825387	0.06392730387	1000000	5	6.174384165	0.0396257937
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev
+1	10	0.06635277271	0.003224367089	1	10	0.09342534542	0.003372803039	1	10	0.01265852451	0.00115658081
+10	10	0.06949725151	0.004236749478	10	10	0.09139561653	0.00361962951	10	10	0.01304826736	0.0009077163448
+100	10	0.07332832813	0.003211229764	100	10	0.09613847733	0.002976111632	100	10	0.01362993717	0.0003077883843
+1000	10	0.09426920414	0.004147375078	1000	10	0.10503757	0.004323166227	1000	10	0.02448859215	0.001551123656
+10000	10	0.26318748	0.007391059562	10000	10	0.1713474512	0.004400747258	10000	10	0.1165221453	0.004626763279
+100000	10	1.939086366	0.01711379803	100000	10	0.8509856939	0.01451489164	100000	10	1.03131845	0.0154166
+1000000	10	19.16211414	0.3417997674	1000000	10	7.636127377	0.06577367856	1000000	10	10.22023973	0.0443451077
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev
+1	20	0.06688520908	0.003686408801	1	20	0.0937997818	0.00504618112	1	20	0.01299088001	0.00130498302
+10	20	0.06709973812	0.003909120415	10	20	0.09303014278	0.004256698801	10	20	0.01291837692	0.001043654863
+100	20	0.0813845396	0.005158197903	100	20	0.1016526461	0.004238640414	100	20	0.01500227451	0.001216417242
+1000	20	0.1107584953	0.006723338286	1000	20	0.1139468193	0.005867712372	1000	20	0.03420743942	0.003094073019
+10000	20	0.4188146114	0.01474904378	10000	20	0.2173264027	0.005747071741	10000	20	0.1986592293	0.006588276071
+100000	20	3.461091924	0.1043205869	100000	20	1.287664986	0.0221862172	100000	20	1.829260516	0.01414616471
+1000000	20	33.20876031	0.3190789024	1000000	20	11.84579525	0.1406809832	1000000	20	18.15644448	0.1474355796
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev
+1	50	0.06706497669	0.003487010206	1	50	0.09036362171	0.00392337182	1	50	0.0134802103	0.001043321639
+10	50	0.0721385479	0.00526657204	10	50	0.09356541634	0.003705587568	10	50	0.01397790909	0.001008071038
+100	50	0.1015130758	0.003524910234	100	50	0.1168865919	0.002810940717	100	50	0.01766057014	0.0008818513382
+1000	50	0.1666964769	0.006661858999	1000	50	0.1373265505	0.004538848823	1000	50	0.05760366917	0.003787637225
+10000	50	0.8726647139	0.04817920962	10000	50	0.3499189854	0.006489403179	10000	50	0.4113406658	0.00551681222
+100000	50	7.659929824	0.1190133198	100000	50	2.486357236	0.04149367418	100000	50	4.023236489	0.02935989293
+1000000	50	75.64912643	1.036366669	1000000	50	23.88283024	0.4251339799	1000000	50	40.02736287	0.3879349969
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev
+1	100	0.06666021347	0.001720503522	1	100	0.09272692204	0.005532725603	1	100	0.01451745033	0.0009589603269
+10	100	0.0746655941	0.004541222011	10	100	0.09874138832	0.007172096503	10	100	0.0155831337	0.001020332488
+100	100	0.1330797672	0.004335602846	100	100	0.1412571669	0.008253862291	100	100	0.02391133308	0.001714142787
+1000	100	0.2642062426	0.01022737492	1000	100	0.1779050112	0.006555498616	1000	100	0.09285030365	0.002734967858
+10000	100	1.570353174	0.01475258288	10000	100	0.5818499565	0.01616512044	10000	100	0.779653573	0.01021001276
+100000	100	14.70140581	0.3328709764	100000	100	4.601756811	0.05434568891	100000	100	7.700500083	0.06577229359
+1000000	100	148.4634018	7.316550329	1000000	100	44.62859902	0.4333388333	1000000	100	77.977897	0.7301257528
diff --git a/test/results/benchmark-results-2019-12-02 b/test/results/benchmark-results-2019-12-02
new file mode 100644
index 0000000..e9dca7f
--- /dev/null
+++ b/test/results/benchmark-results-2019-12-02
@@ -0,0 +1,48 @@
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev	lines	columns	octosql_mean	octosql_stddev
+1	1	0.0734721899033	0.00342279013601	1	1	0.10051469802856446	0.004675328349147358	1	1	0.0173349380493	0.0059959206152	1	1	0.011228728294372558	0.0010877179127881723
+10	1	0.0746278762817	0.00468414353387	10	1	0.10234739780426025	0.00510078119096311	10	1	0.014651632309	0.00217845165708	10	1	0.011713194847106933	0.0017938878071954913
+100	1	0.0754479169846	0.00367546265314	100	1	0.10537784099578858	0.0035228973459241267	100	1	0.0151803731918	0.00224971341816	100	1	0.014325213432312012	0.0017290050723256997
+1000	1	0.0827184200287	0.00332749977518	1000	1	0.10914053916931152	0.0037876126560058765	1000	1	0.0185441970825	0.00154583625692	1000	1	0.02007620334625244	0.003841671637009388
+10000	1	0.130123448372	0.00398276082559	10000	1	0.1471630811691284	0.0056107748124805115	10000	1	0.0520985126495	0.00227488114922	10000	1	0.06009321212768555	0.0018045935981669575
+100000	1	0.612298583984	0.0185709541475	100000	1	0.5399166822433472	0.02213469033463378	100000	1	0.337541723251	0.0116086194325	100000	1	0.43014986515045167	0.005839166941421165
+1000000	1	5.59862473011	0.0905480166939	1000000	1	4.39980182647705	0.0884813733818434	1000000	1	3.17139401436	0.0444820658987	1000000	1	4.267914342880249	0.11698217726499018
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev	lines	columns	octosql_mean	octosql_stddev
+1	5	0.0694166183472	0.00307281713923	1	5	0.10455994606018067	0.008311956974184905	1	5	0.0131057500839	0.00132499528147	1	5	0.010100650787353515	0.0008495662523858508
+10	5	0.070539522171	0.00196090167509	10	5	0.10231781005859375	0.0050317627429269955	10	5	0.014551782608	0.00205475359694	10	5	0.010378241539001465	0.00042382931551291064
+100	5	0.0742300033569	0.00302154129771	100	5	0.10598726272583008	0.006187299813626734	100	5	0.0150702953339	0.0019565274703	100	5	0.011428117752075195	0.001054487577015793
+1000	5	0.087014746666	0.00431789522004	1000	5	0.11044230461120605	0.00632368195279581	1000	5	0.0254506826401	0.00232872935772	1000	5	0.023230981826782227	0.0013638854413874789
+10000	5	0.187808656693	0.00512575898848	10000	5	0.16487712860107423	0.010076056131490768	10000	5	0.0847299337387	0.00413949339091	10000	5	0.103983473777771	0.002566703779142417
+100000	5	1.24647183418	0.0307551525876	100000	5	0.6653818368911744	0.017578506494383438	100000	5	0.647140431404	0.00484863670427	100000	5	0.9367039680480957	0.047583277674755294
+1000000	5	11.6488220453	0.222469120228	1000000	5	5.654011297225952	0.08764196721029975	1000000	5	6.31902601719	0.0585787838282	1000000	5	8.689867305755616	0.20061665098923728
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev	lines	columns	octosql_mean	octosql_stddev
+1	10	0.0732004165649	0.00524696166897	1	10	0.10312862396240234	0.006586403661254048	1	10	0.0135506629944	0.00154682138063	1	10	0.010560154914855957	0.0012062897475765952
+10	10	0.0719322681427	0.00337980529655	10	10	0.10103726387023926	0.005139217305955802	10	10	0.0143553495407	0.00141737842486	10	10	0.01032114028930664	0.0007034635668424652
+100	10	0.0793414115906	0.0047871454186	100	10	0.10384261608123779	0.004850772126615192	100	10	0.0148341178894	0.00143514697436	100	10	0.012691855430603027	0.0009712232784515944
+1000	10	0.098956155777	0.00314928094914	1000	10	0.11434323787689209	0.0052855049216250505	1000	10	0.0275867700577	0.00200118141767	1000	10	0.034005475044250486	0.001425221820132235
+10000	10	0.273002624512	0.00871803130738	10000	10	0.18594975471496583	0.008426937757921716	10000	10	0.126932358742	0.00620581702113	10000	10	0.19539403915405273	0.00401993825688173
+100000	10	2.03795661926	0.0744729489785	100000	10	0.8921735525131226	0.02259783771356152	100000	10	1.04192745686	0.0149334046633	100000	10	1.8566447257995606	0.08845727656371252
+1000000	10	19.7247032404	0.66605468687	1000000	10	7.7266138076782225	0.10439505885940377	1000000	10	10.2687769413	0.0682723749151	1000000	10	18.230213975906373	0.9985511456352485
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev	lines	columns	octosql_mean	octosql_stddev
+1	20	0.0691435098648	0.00232124488632	1	20	0.10138421058654785	0.0059525018871562215	1	20	0.014045715332	0.00116432652787	1	20	0.01030263900756836	0.0014017045651869215
+10	20	0.072181892395	0.00306747549853	10	20	0.1015845537185669	0.002972749458583174	10	20	0.013680934906	0.000910383697657	10	20	0.010272622108459473	0.0006771441928063938
+100	20	0.0876452922821	0.00404266708701	100	20	0.11257178783416748	0.005037816595320132	100	20	0.0164495944977	0.00154197876987	100	20	0.015861248970031737	0.0010913014445132199
+1000	20	0.116324877739	0.00424430086321	1000	20	0.12467055320739746	0.005266059173902396	1000	20	0.0401841163635	0.00349693991299	1000	20	0.05414586067199707	0.0018546178376686003
+10000	20	0.427709841728	0.0133665186407	10000	20	0.23156797885894775	0.011922511384004917	10000	20	0.204241681099	0.00279346321711	10000	20	0.4071432828903198	0.007885384401472337
+100000	20	3.53898899555	0.145285257829	100000	20	1.2966086864471436	0.020653793768142525	100000	20	1.83605823517	0.0237800648849	100000	20	3.930004286766052	0.10273588479658016
+1000000	20	34.4587288141	0.882682659759	1000000	20	12.197622799873352	0.38353366422310053	1000000	20	18.2444090366	0.13051035911	1000000	20	39.0564279794693	1.6574754268177938
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev	lines	columns	octosql_mean	octosql_stddev
+1	50	0.0733374118805	0.00664954688392	1	50	0.1022254228591919	0.003813871305051591	1	50	0.0140640974045	0.00200548518545	1	50	0.010728073120117188	0.001559272868841953
+10	50	0.0744789838791	0.00281544448238	10	50	0.10403745174407959	0.00443303536428869	10	50	0.0147454023361	0.00148454350858	10	50	0.011526155471801757	0.0009739846530934967
+100	50	0.108049821854	0.00568025365269	100	50	0.124765944480896	0.0053517076254703125	100	50	0.0197550296783	0.0027634472647	100	50	0.023745250701904298	0.0011332207108003655
+1000	50	0.176298546791	0.00606499912326	1000	50	0.144227933883667	0.006044948688936146	1000	50	0.0628623962402	0.00234005612486	1000	50	0.10975453853607178	0.0016775696306391506
+10000	50	0.891832685471	0.0473835751534	10000	50	0.3600123167037964	0.007161627385086743	10000	50	0.427824187279	0.0069294988011	10000	50	0.9561779499053955	0.009561843743429211
+100000	50	7.77155239582	0.101585372824	100000	50	2.6051604032516478	0.0878884131862843	100000	50	4.03805820942	0.0408635603507	100000	50	9.653993058204652	0.2270682921633226
+1000000	50	78.4816464186	2.09936579528	1000000	50	25.2284695148468	0.7603793472233193	1000000	50	40.3812385321	0.160958877387	1000000	50	95.60885927677154	1.751052379968784
+lines	columns	py2-q_mean	py2-q_stddev	lines	columns	py3-q_mean	py3-q_stddev	lines	columns	textql_mean	textql_stddev	lines	columns	octosql_mean	octosql_stddev
+1	100	0.0753021240234	0.00621659499913	1	100	0.10945203304290771	0.009525392011882291	1	100	0.0166680812836	0.00209574297652	1	100	0.011182069778442383	0.0008911394102061748
+10	100	0.0799629449844	0.00431510030729	10	100	0.11035494804382324	0.0023842770363513535	10	100	0.0161526203156	0.000944392845294	10	100	0.01617884635925293	0.0019236690464235176
+100	100	0.141041827202	0.00473456838003	100	100	0.16715807914733888	0.013143750801118107	100	100	0.0285645484924	0.00499557241643	100	100	0.040108680725097656	0.003924007500439625
+1000	100	0.272631931305	0.00899845563948	1000	100	0.19988524913787842	0.004237481791359729	1000	100	0.101301050186	0.00408546610666	1000	100	0.22757408618927003	0.004142308919104949
+10000	100	1.61969444752	0.0338148564151	10000	100	0.6452171087265015	0.02634034109327908	10000	100	0.795653438568	0.0110871290583	10000	100	2.1363088846206666	0.03476917431930926
+100000	100	14.9034232616	0.177666674893	100000	100	5.090956687927246	0.17384895247428786	100000	100	7.78771924973	0.0423887501838	100000	100	21.054430747032164	0.24453457049973806
+1000000	100	147.973981094	2.41177161281	1000000	100	47.093635368347165	1.4756281250192291	1000000	100	78.1040684938	0.212101848957	1000000	100	211.90868167877198	1.6401403292528614
diff --git a/test/test-suite b/test/test-suite
index e17afcd..836676c 100755
--- a/test/test-suite
+++ b/test/test-suite
@@ -8,24 +8,25 @@
 # to be executed from the current folder
 #
 #
+from __future__ import print_function
 
-import unittest
+import codecs
+import locale
+import os
 import random
-import json
-from json import JSONEncoder
-from subprocess import PIPE, Popen, STDOUT
 import sys
-import os
 import time
+import unittest
+from gzip import GzipFile
+from subprocess import PIPE, Popen
 from tempfile import NamedTemporaryFile
-import locale
-import pprint
+
 import six
 from six.moves import range
-import codecs
 
 sys.path.append(os.path.join(os.path.abspath(os.path.dirname(sys.argv[0])),'..','bin'))
-from qtextasdata import QTextAsData,QOutput,QOutputPrinter,QInputParams
+from qtextasdata import QTextAsData, QInputParams
+import itertools
 
 # q uses this encoding as the default output encoding. Some of the tests use it in order to 
 # make sure that the output is correctly encoded
@@ -2343,6 +2344,160 @@ class BasicModuleTests(AbstractQTestCase):
         self.assertTrue(table_structure.materialized_files['my_data'].is_stdin)
 
 
+class BenchmarkAttemptResults(object):
+    def __init__(self, attempt, lines, columns, duration,return_code):
+        self.attempt = attempt
+        self.lines = lines
+        self.columns = columns
+        self.duration = duration
+        self.return_code = return_code
+
+    def __str__(self):
+        return "{}".format(self.__dict__)
+    __repr__ = __str__
+
+class BenchmarkResults(object):
+    def __init__(self, lines, columns, attempt_results, mean, stddev):
+        self.lines = lines
+        self.columns = columns
+        self.attempt_results = attempt_results
+        self.mean = mean
+        self.stddev = stddev
+
+    def __str__(self):
+        return "{}".format(self.__dict__)
+    __repr__ = __str__
+
+class BenchmarkTests(AbstractQTestCase):
+
+    BENCHMARK_DIR = './_benchmark_data'
+
+    def _ensure_benchmark_data_dir_exists(self):
+        try:
+            os.mkdir(BenchmarkTests.BENCHMARK_DIR)
+        except OSError:
+            pass  # the directory most likely exists already
+
+    def _create_benchmark_file_if_needed(self):
+        self._ensure_benchmark_data_dir_exists()
+
+        if os.path.exists('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR)):
+            return
+
+        # Concatenate 100 copies of the bundled unit-file.csv.gz into one large source file.
+        with GzipFile('unit-file.csv.gz') as g:
+            d = g.read().decode('utf-8')
+        with open('{}/benchmark-file.csv'.format(BenchmarkTests.BENCHMARK_DIR), 'w') as f:
+            for i in range(100):
+                f.write(d)
+
+    def _prepare_test_file(self, lines, columns):
+
+        filename = '{}/_benchmark_data__lines_{}_columns_{}.csv'.format(BenchmarkTests.BENCHMARK_DIR,lines, columns)
+
+        if os.path.exists(filename):
+            return filename
+
+        c = ['c{}'.format(x + 1) for x in range(columns)]
+
+        # write a header line
+        ff = open(filename,'w')
+        ff.write(",".join(c))
+        ff.write('\n')
+        ff.close()
+
+        r, o, e = run_command('head -{} {}/benchmark-file.csv | ../bin/q -d , "select {} from -" >> {}'.format(lines, BenchmarkTests.BENCHMARK_DIR, ','.join(c), filename))
+        self.assertEqual(r, 0)
+        return filename
+
+    def _decide_result(self,attempt_results):
+
+        failed = list(filter(lambda a: a.return_code != 0,attempt_results))
+
+        if len(failed) == 0:
+            mean = sum([x.duration for x in attempt_results]) / len(attempt_results)
+            sum_squared = sum([(x.duration - mean)**2 for x in attempt_results])
+            ddof = 0  # population standard deviation (divide by N, not N - 1)
+            pvar = sum_squared / (len(attempt_results) - ddof)
+            stddev = pvar ** 0.5
+        else:
+            mean = None
+            stddev = None
+
+        return BenchmarkResults(
+            attempt_results[0].lines,
+            attempt_results[0].columns,
+            attempt_results,
+            mean,
+            stddev
+        )
+
+    def _perform_test_performance_matrix(self,name,generate_cmd_function):
+        results = []
+
+        self._create_benchmark_file_if_needed()
+        for columns in [1, 5, 10, 20, 50, 100]:
+            for lines in [1, 10, 100, 1000, 10000, 100000, 1000000]:
+                attempt_results = []
+                for attempt in range(10):
+                    filename = self._prepare_test_file(lines, columns)
+                    if DEBUG:
+                        print("Testing {}".format(filename))
+                    t0 = time.time()
+                    r, o, e = run_command(generate_cmd_function(filename,lines,columns))
+                    duration = time.time() - t0
+                    attempt_result = BenchmarkAttemptResults(attempt, lines, columns, duration, r)
+                    attempt_results += [attempt_result]
+                    if DEBUG:
+                        print("Results: {}".format(attempt_result.__dict__))
+                final_result = self._decide_result(attempt_results)
+                results += [final_result]
+
+        series_fields = [six.u('lines'),six.u('columns')]
+        value_fields = [six.u('mean'),six.u('stddev')]
+
+        all_fields = series_fields + value_fields
+
+        output_filename = '{}.benchmark-results'.format(name)
+        output_file = open(output_filename,'w')
+        for columns,g in itertools.groupby(sorted(results,key=lambda x:x.columns),key=lambda x:x.columns):
+            x = six.u("\t").join(series_fields + [six.u('{}_{}').format(name, f) for f in value_fields])
+            print(x,file=output_file)
+            for result in g:
+                print(six.u("\t").join(map(str,[getattr(result,f) for f in all_fields])),file=output_file)
+        output_file.close()
+
+        print("results have been written to : {}".format(output_filename))
+        if DEBUG:
+            print("RESULTS FOR {}".format(name))
+            print(open(output_filename,'r').read())
+
+    def test_q_matrix(self):
+        venv = os.path.basename(os.environ.get('VIRTUAL_ENV') or 'unknown-virtual-env')
+
+        def generate_q_cmd(data_filename,line_count,column_count):
+            if column_count == 1:
+                additional_params = '-c 1'
+            else:
+                additional_params = ''
+            return '../bin/q -H -d , {} "select count(*) from {}"'.format(additional_params, data_filename)
+        self._perform_test_performance_matrix(venv,generate_q_cmd)
+
+    def test_textql_matrix(self):
+        def generate_textql_cmd(data_filename,line_count,column_count):
+            return 'textql -dlm , -sql "select count(*)" {}'.format(data_filename)
+        self._perform_test_performance_matrix('textql',generate_textql_cmd)
+
+    def test_octosql_matrix(self):
+        config_fn = self.random_tmp_filename('octosql', 'config')
+        def generate_octosql_cmd(data_filename,line_count,column_count):
+            j = """dataSources:\n  - name: bmdata\n    type: csv\n    config:\n      path: "{}"\n""".format(data_filename)
+            f = open(config_fn,'w')
+            f.write(j)
+            f.close()
+            return 'octosql -c {} -o table "select count(*) from bmdata a"'.format(config_fn)
+        self._perform_test_performance_matrix('octosql',generate_octosql_cmd)
+
 def suite():
     tl = unittest.TestLoader()
     basic_stuff = tl.loadTestsFromTestCase(BasicTests)
diff --git a/test/unit-file.csv.gz b/test/unit-file.csv.gz
new file mode 100644
index 0000000..ade23b7
--- /dev/null
+++ b/test/unit-file.csv.gz