author Harel Ben-Attia <harelba@gmail.com> 2014-07-12 19:18:05 -0400
committer Harel Ben-Attia <harelba@gmail.com> 2014-07-12 19:18:05 -0400
commit f24723d21de94df4f300c2f3655dd27585012e4b
tree b712a00b0c8d375f2910651ceb1ad1d176e18694
parent 09ed012991cf7e9da6113744fd41ee542992f1f2
Moved all docs to the new q web site
README now contains just a link to the web site itself.
-rw-r--r-- README.markdown 143
1 files changed, 1 insertions, 142 deletions
diff --git a/README.markdown b/README.markdown
index 104d248..aad8202 100644
--- a/README.markdown
+++ b/README.markdown
@@ -1,148 +1,7 @@
# q - Text as Data
q is a command line tool that allows direct execution of SQL-like queries on CSVs/TSVs (and any other tabular text files).
-Main features:
-* Seamless multi-table SQL support, including joins. filenames are just used instead of table names (use - for stdin)
-* Automatic column name and column type detection (Allows working more naturally with the data)
-* Full encoding support (input, output and query)
+q's web site is [here](http://harelba.github.io/q/).
-## Examples
-A beginner's tutorial can be found [here](examples/EXAMPLES.markdown).
-
-__Example 1:__
-
- q -H -t "select count(distinct(uuid)) from ./clicks.csv"
-
-__Output 1:__
-```bash
-229
-```
-
-__Example 2:__
-
- q -H -t "select request_id,score from ./clicks.csv where score > 0.7 order by score desc limit 5"
-
-__Output 2:__
-```bash
-2cfab5ceca922a1a2179dc4687a3b26e 1.0
-f6de737b5aa2c46a3db3208413a54d64 0.986665809568
-766025d25479b95a224bd614141feee5 0.977105183282
-2c09058a1b82c6dbcf9dc463e73eddd2 0.703255121794
-```
-
-__Example 3:__
-
- q -t -H "select strftime('%H:%M',date_time) hour_and_minute,count(*) from ./clicks.csv group by hour_and_minute"
-
-__Output 3:__
-```bash
-07:00 138148
-07:01 140026
-07:02 121826
-```
-
-__Usage Example 4:__
-
- q -t -H "select hashed_source_machine,count(*) from ./clicks.csv group by hashed_source_machine"
-
-__Output 4:__
-```bash
-47d9087db433b9ba.domain.com 400000
-```
-
-__Example 5 (total size per user/group in the /tmp subtree):__
-
- sudo find /tmp -ls | q "select c5,c6,sum(c7)/1024.0/1024 as total from - group by c5,c6 order by total desc"
-
-__Output 5:__
-```bash
-mapred hadoop 304.00390625
-root root 8.0431451797485
-smith smith 4.34389972687
-```
-
-__Example 6 (top 3 user ids with the largest number of owned processes, sorted in descending order):__
-
-Note the usage of the autodetected column name UID in the query.
-
- ps -ef | q -H "select UID,count(*) cnt from - group by UID order by cnt desc limit 3"
-
-__Output 6:__
-```bash
-root 152
-harel 119
-avahi 2
-```
-
-## Installation
-Current stable version is `1.4.0`.
-
-Requirements: Just Python 2.5 and up or Python 2.4 with sqlite3 module installed. Python 3.x is not supported yet.
-
-### Mac Users
-Make sure you run `brew update` first and then just run `brew install q`.
-
-Thanks [@stuartcarnie](https://github.com/stuartcarnie) for the initial homebrew formula
-
-### RPM-Base Linux distributions
-Download the version `1.4.0` RPM here **[here](https://github.com/harelba/packages-for-q/raw/master/rpms/q-text-as-data-1.4.0-1.noarch.rpm)**.
-
-Install using `rpm -ivh <rpm-name>`.
-
-RPM Releases also contain a man page. Just enter `man q`.
-
-**NOTE** In Version `1.4.0`, the RPM package name has been changed to `q-text-as-data`. If you already have the old version, just remove it with `rpm -e q` before installing.
-
-### Manual installation (very simple, since there are no dependencies)
-
-1. Download the main q executable from **[here](https://raw.github.com/harelba/q/1.4.0/bin/q)** into a folder in the PATH.
-2. Make the file executable.
-
-For `Windows` machines, also download q.bat **[here](https://raw.github.com/harelba/q/1.4.0/bin/q.bat)** into the same folder and use it to run q.
-
-### Debian-based Linux distributions
-If you're interested in Debian packaing, please drop me a line to harelba@gmail.com.
-
-## Overview
-Have you ever stared at a text file on the screen, hoping it would have been a database so you could ask anything you want about it? I had that feeling many times, and I've finally understood that it's not the _database_ that I want. It's the language - SQL.
-
-SQL is a declarative language for data, and as such it allows me to define what I want without caring about how exactly it's done. This is the reason SQL is so powerful, because it treats data as data and not as bits and bytes (and chars).
-
-The goal of this tool is to provide a bridge between the world of text files and of SQL.
-
-## Usage
-q's basic usage is very simple:`q <flags> <query>`, but it has lots of features under the hood and in the flags that can be passed to the command.
-
-Simplest execution is q "SELECT * FROM myfile" which prints the entire file.
-
-Complete information can be found [here](doc/USAGE.markdown)
-
-## Implementation
-Some implementation details can be found [here](doc/IMPLEMENTATION.markdown)
-
-## Limitations
-* No checks and bounds on data size
-* Spaces in file names are not supported yet. I'm working on it.
-* It is possible that some rare cases of subqueries are not supported yet. Please open an issue if you find such a case. This will be fixed once the tool performs its own full-blown SQL parsing.
-
-## Future Ideas
-* Faster reuse of previous data loading
-* Allow working with external DB
-* Real parsing of the SQL, allowing smarter execution of queries.
-* Smarter batch insertion to the database
-* Provide mechanisms beyond SELECT - INSERT and CREATE TABLE SELECT and such.
-
-## Rationale
-Some information regarding the rationale for this tool and related philosophy can be found [here](doc/RATIONALE.markdown)
-
-## Change log
-History of changes can be found [here](doc/CHANGELOG.markdown)
-
-## Contact
-Any feedback/suggestions/complaints regarding this tool would be much appreciated. Contributions are most welcome as well, of course.
-
-Harel Ben-Attia, harelba@gmail.com, [@harelba](https://twitter.com/harelba) on Twitter
-
-q on twitter: #qtextasdata