Diffstat (limited to 'docs/content/3.manual/manual.yml')
-rw-r--r-- | docs/content/3.manual/manual.yml | 688 |
1 files changed, 688 insertions, 0 deletions
diff --git a/docs/content/3.manual/manual.yml b/docs/content/3.manual/manual.yml
new file mode 100644
index 00000000..20793976
--- /dev/null
+++ b/docs/content/3.manual/manual.yml
@@ -0,0 +1,688 @@
+headline: jq Manual
+sections:
+  - title: Basics
+    body: "{ some *intro* text \n\n\n}\n"
+    entries:
+      - title: "`.`"
+        body: |
+
+          The absolute simplest (and least interesting) jq expression
+          is `.`. This is a jq expression that takes its input and
+          produces it unchanged as output.
+
+          Since jq by default pretty-prints all output, this trivial
+          program can be a useful way of formatting JSON output from,
+          say, `curl`.
+
+        examples:
+          - program: '.'
+            input: '"Hello, world!"'
+            output: ['"Hello, world!"']
+
+      - title: "`.foo`"
+        body: |
+
+          The simplest *useful* jq expression is `.foo`. When given a
+          JSON object (aka dictionary or hash) as input, it produces
+          the value at the key "foo", or null if there's none present.
+
+        examples:
+          - program: '.foo'
+            input: '{"foo": 42, "bar": "less interesting data"}'
+            output: [42]
+          - program: '.foo'
+            input: '{"notfoo": true, "alsonotfoo": false}'
+            output: ['null']
+
+      - title: "`.[foo]`"
+        body: |
+
+          You can also look up fields of an object using syntax like
+          `.["foo"]` (`.foo` above is a shorthand version of this). This
+          one works for arrays as well, if the key is an
+          integer. Arrays are zero-based (like JavaScript), so `.[2]`
+          returns the third element of the array.
+
+        examples:
+          - program: '.[0]'
+            input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
+            output: ['{"name":"JSON", "good":true}']
+
+          - program: '.[2]'
+            input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
+            output: ['null']
+
+      - title: "`.[]`"
+        body: |
+
+          If you use the `.[foo]` syntax, but omit the index
+          entirely, it will return *all* of the elements of an
+          array. Running `.[]` with the input `[1,2,3]` will produce the
+          numbers as three separate results, rather than as a single
+          array.
+
+        examples:
+          - program: '.[]'
+            input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
+            output:
+              - '{"name":"JSON", "good":true}'
+              - '{"name":"XML", "good":false}'
+
+          - program: '.[]'
+            input: '[]'
+            output: []
+
+      - title: "`,`"
+        body: |
+
+          If two jq expressions are separated by a comma, then the
+          input will be fed into both and there will be multiple
+          outputs: first, all of the outputs produced by the left
+          expression, and then all of the outputs produced by the
+          right. For instance, the jq expression `.foo, .bar` produces
+          both the "foo" fields and "bar" fields as separate outputs.
+
+        examples:
+          - program: '.foo, .bar'
+            input: '{"foo": 42, "bar": "something else", "baz": true}'
+            output: ['42', '"something else"']
+
+          - program: ".user, .projects[]"
+            input: '{"user":"stedolan", "projects": ["jq", "wikiflow"]}'
+            output: ['"stedolan"', '"jq"', '"wikiflow"']
+
+          - program: '.[4,2]'
+            input: '["a","b","c","d","e"]'
+            output: ['"e"', '"c"']
+
+      - title: "`|`"
+        body: |
+          The `|` operator combines two jq expressions by feeding the output(s) of
+          the one on the left into the input of the one on the right. It's
+          pretty much the same as the Unix shell's pipe, if you're used to
+          that.
+
+          If the one on the left produces multiple results, the one on
+          the right will be run for each of those results. So, the
+          expression `.[] | .foo` retrieves the "foo" field of each
+          element of the input array.
+
+        examples:
+          - program: '.[] | .name'
+            input: '[{"name":"JSON", "good":true}, {"name":"XML", "good":false}]'
+            output: ['"JSON"', '"XML"']
+
+  - title: Types and Values
+    body: |
+
+      jq supports the same set of datatypes as JSON - numbers,
+      strings, booleans, arrays, objects (which in JSON-speak are
+      hashes with only string keys), and "null".
+
+      Booleans, null, strings and numbers are written the same way as
+      in JavaScript. Just like everything else in jq, these simple
+      values take an input and produce an output - `42` is a valid jq
+      expression that takes an input, ignores it, and returns 42
+      instead.
+
+    entries:
+      - title: Array construction - `[]`
+        body: |
+
+          As in JSON, `[]` is used to construct arrays, as in
+          `[1,2,3]`. The elements of the arrays can be any jq
+          expression. All of the results produced by all of the
+          expressions are collected into one big array. You can use it
+          to construct an array out of a known quantity of values (as
+          in `[.foo, .bar, .baz]`) or to "collect" all the results of a
+          jq expression into an array (as in `[.items[].name]`).
+
+          Once you understand the "," operator, you can look at jq's array
+          syntax in a different light: the expression `[1,2,3]` is not using a
+          built-in syntax for comma-separated arrays, but is instead applying
+          the `[]` operator (collect results) to the expression `1,2,3` (which
+          produces three different results).
+
+          If you have a jq expression `X` that produces four results,
+          then the expression `[X]` will produce a single result, an
+          array of four elements.
+
+        examples:
+          - program: "[.user, .projects[]]"
+            input: '{"user":"stedolan", "projects": ["jq", "wikiflow"]}'
+            output: ['["stedolan", "jq", "wikiflow"]']
+      - title: Objects - `{}`
+        body: |
+
+          Like JSON, `{}` is for constructing objects (aka
+          dictionaries or hashes), as in: `{"a": 42, "b": 17}`.
+
+          If the keys are "sensible" (all alphabetic characters), then
+          the quotes can be left off. The value can be any expression
+          (although you may need to wrap it in parentheses if it's a
+          complicated one), which gets applied to the `{}` expression's
+          input (remember, all jq expressions have an input and an
+          output).
+
+              {foo: .bar}
+
+          will produce the JSON object `{"foo": 42}` if given the JSON
+          object `{"bar":42, "baz":43}`. You can use this to select
+          particular fields of an object: if the input is an object
+          with "user", "title", "id", and "content" fields and you
+          just want "user" and "title", you can write
+
+              {user: .user, title: .title}
+
+          Because that's so common, there's a shortcut syntax: `{user, title}`.
+
+          If one of the expressions produces multiple results,
+          multiple dictionaries will be produced. If the input is
+
+              {"user":"stedolan","titles":["JQ Primer", "More JQ"]}
+
+          then the expression
+
+              {user, title: .titles[]}
+
+          will produce two outputs:
+
+              {"user":"stedolan", "title": "JQ Primer"}
+              {"user":"stedolan", "title": "More JQ"}
+
+          Putting parentheses around the key means it will be evaluated as an
+          expression. With the same input as above,
+
+              {(.user): .titles}
+
+          produces
+
+              {"stedolan": ["JQ Primer", "More JQ"]}
+
+        examples:
+          - program: '{user, title: .titles[]}'
+            input: '{"user":"stedolan","titles":["JQ Primer", "More JQ"]}'
+            output:
+              - '{"user":"stedolan", "title": "JQ Primer"}'
+              - '{"user":"stedolan", "title": "More JQ"}'
+          - program: '{(.user): .titles}'
+            input: '{"user":"stedolan","titles":["JQ Primer", "More JQ"]}'
+            output: ['{"stedolan": ["JQ Primer", "More JQ"]}']
+
+  - title: Builtin operators and functions
+    body: |
+
+      Some jq operators (for instance, `+`) do different things
+      depending on the type of their arguments (arrays, numbers,
+      etc.). However, jq never does implicit type conversions. If you
+      try to add a string to an object you'll get an error message and
+      no result.
+
+    entries:
+      - title: Addition - `+`
+        body: |
+
+          The operator `+` takes two jq expressions, applies them both
+          to the same input, and adds the results together. What
+          "adding" means depends on the types involved:
+
+          - **Numbers** are added by normal arithmetic.
+
+          - **Arrays** are added by being concatenated into a larger array.
+
+          - **Strings** are added by being joined into a larger string.
+
+          - **Objects** are added by merging, that is, inserting all
+            the key-value pairs from both objects into a single
+            combined object. If both objects contain a value for the
+            same key, the object on the right of the `+` wins.
+
+        examples:
+          - program: '.a + 1'
+            input: '{"a": 7}'
+            output: ['8']
+          - program: '.a + .b'
+            input: '{"a": [1,2], "b": [3,4]}'
+            output: ['[1,2,3,4]']
+          - program: '{a: 1} + {b: 2} + {c: 3} + {a: 42}'
+            input: 'null'
+            output: ['{"a": 42, "b": 2, "c": 3}']
+
+      - title: Subtraction - `-`
+        body: |
+
+          As well as normal arithmetic subtraction on numbers, the `-`
+          operator can be used on arrays to remove all occurrences of
+          the second array's elements from the first array.
+
+        examples:
+          - program: '4 - .a'
+            input: '{"a":3}'
+            output: ['1']
+          - program: '. - ["xml", "yaml"]'
+            input: '["xml", "yaml", "json"]'
+            output: ['["json"]']
+
+      - title: Multiplication, division - `*` and `/`
+        body: |
+
+          These operators only work on numbers, and behave as expected.
+
+        examples:
+          - program: '10 / . * 3'
+            input: 5
+            output: [6]
+
+      - title: "`length`"
+        body: |
+
+          The builtin function `length` gets the length of various
+          different types of value:
+
+          - The length of a **string** is the number of Unicode
+            codepoints it contains (which will be the same as its
+            JSON-encoded length in bytes if it's pure ASCII).
+
+          - The length of an **array** is the number of elements.
+
+          - The length of an **object** is the number of key-value pairs.
+
+          - The length of **null** is zero.
+
+        examples:
+          - program: '.[] | length'
+            input: '[[1,2], "string", {"a":2}, null]'
+            output: [2, 6, 1, 0]
+
+      - title: "`tonumber`"
+        body: |
+
+          The `tonumber` function parses its input as a number. It
+          will convert correctly-formatted strings to their numeric
+          equivalent, leave numbers alone, and give an error on all other input.
+
+        examples:
+          - program: '.[] | tonumber'
+            input: '[1, "1"]'
+            output: [1,1]
+
+      - title: "`tostring`"
+        body: |
+
+          The `tostring` function prints its input as a
+          string. Strings are left unchanged, and all other values are
+          JSON-encoded.
+
+        examples:
+          - program: '.[] | tostring'
+            input: '[1, "1", [1]]'
+            output: ['"1"', '"1"', '"[1]"']
+
+      - title: "String interpolation - `@(text)`"
+        body: |
+
+          jq supports an alternative syntax for strings. Instead of
+          "foo", you can write `@(foo)`. When using this syntax,
+          `%(expression)` may be used to insert the value of
+          `expression` into the string (converted with `tostring`).
+
+          String interpolation does not occur for normal double-quoted
+          strings (like `"foo"`) in order to be fully compatible with
+          JSON.
+
+          All of the usual JSON escapes (`\n`, `\r` and the like) work
+          inside `@()`-quoted strings, as well as `\%` and `\)` if
+          those characters are needed literally.
+
+        examples:
+          - program: '@(The input was %(.), which is one less than %(.+1))'
+            input: '42'
+            output: ['"The input was 42, which is one less than 43"']
+
+
+
+
+
+
+  - title: Conditionals and Comparisons
+    entries:
+      - title: "`==`, `!=`"
+        body: |
+
+          The expression `a == b` will produce `true` if the results of `a` and `b`
+          are equal (that is, if they represent equivalent JSON documents) and
+          `false` otherwise. In particular, strings are never considered equal
+          to numbers. If you're coming from JavaScript, jq's `==` is like
+          JavaScript's `===` - considering values equal only when they have the
+          same type as well as the same value.
+
+          `!=` is "not equal", and `a != b` returns the opposite value of `a == b`.
+
+        examples:
+          - program: '.[] == 1'
+            input: '[1, 1.0, "1", "banana"]'
+            output: ['true', 'true', 'false', 'false']
+
+      - title: if-then-else
+        body: |
+
+          `if A then B else C end` will act the same as `B` if `A`
+          produces a value other than false or null, but act the same
+          as `C` otherwise.
+
+          Checking for false or null is a simpler notion of
+          "truthiness" than is found in JavaScript or Python, but it
+          means that you'll sometimes have to be more explicit about
+          the condition you want: you can't test whether, e.g., a
+          string is empty using `if .name then A else B end`; you'll
+          need something more like `if (.name | length) > 0 then A else
+          B end` instead.
+
+          If the condition A produces multiple results, it is
+          considered "true" if any of those results is not false or
+          null. If it produces zero results, it's considered false.
+
+          More cases can be added to an if using `elif A then B` syntax.
+
+        examples:
+          - program: |-
+              if . == 0 then
+                "zero"
+              elif . == 1 then
+                "one"
+              else
+                "many"
+              end
+            input: 2
+            output: ['"many"']
+
+      - title: and/or/not
+        body: |
+
+          jq supports the normal Boolean operators and/or/not. They have the
+          same standard of truth as if expressions - false and null are
+          considered "false values", and anything else is a "true value".
+
+          If an operand of one of these operators produces multiple
+          results, the operator itself will produce a result for each input.
+
+          `not` is in fact a builtin function rather than an operator,
+          so it is called as a filter to which things can be piped
+          rather than with special syntax, as in `.foo and .bar |
+          not`.
+
+          These three only produce the values "true" and "false", and
+          so are only useful for genuine Boolean operations, rather
+          than the common Perl/Python/Ruby idiom of
+          "value_that_may_be_null or default". If you want to use this
+          form of "or", picking between two values rather than
+          evaluating a condition, see the "//" operator below.
+
+        examples:
+          - program: '42 and "a string"'
+            input: 'null'
+            output: ['true']
+          - program: '(true, false) or false'
+            input: 'null'
+            output: ['true', 'false']
+          - program: '(true, false) and (true, false)'
+            input: 'null'
+            output: ['true', 'false', 'false', 'false']
+          - program: '[true, false | not]'
+            input: 'null'
+            output: ['[false, true]']
+
+      - title: Alternative operator - `//`
+        body: |
+
+          A jq expression of the form `a // b` produces the same
+          results as `a`, if `a` produces results other than `false`
+          and `null`. Otherwise, `a // b` produces the same results as `b`.
+
+          This is useful for providing defaults: `.foo // 1` will
+          evaluate to `1` if there's no `.foo` element in the
+          input. It's similar to how `or` is sometimes used in Python
+          (jq's `or` operator is reserved for strictly Boolean
+          operations).
+
+        examples:
+          - program: '.foo // 42'
+            input: '{"foo": 19}'
+            output: [19]
+          - program: '.foo // 42'
+            input: '{}'
+            output: [42]
+
+  - title: Variables and Functions
+    body: |
+      Variables are an absolute necessity in most programming languages, but
+      they're relegated to an "advanced feature" in jq.
+
+      In most languages, variables are the only means of passing around
+      data. If you calculate a value, and you want to use it more than once,
+      you'll need to store it in a variable. To pass a value to another part
+      of the program, you'll need that part of the program to define a
+      variable (as a function parameter, object member, or whatever) in
+      which to place the data.
+
+      It is also possible to define functions in jq, although this
+      is a feature whose biggest use is defining jq's standard library
+      (many jq functions such as `map` and `find` are in fact written
+      in jq).
+
+    entries:
+      - title: Variables
+        body: |
+
+          In jq, all jq expressions have an input and an output, so manual
+          plumbing is not necessary to pass a value from one part of a program
+          to the next. Many expressions, for instance `a + b`, pass their input
+          to two distinct subexpressions (here `a` and `b` are both passed the
+          same input), so variables aren't usually necessary in order to use a
+          value twice.
+
+          For instance, calculating the average value of an array of numbers
+          requires a few variables in most languages - at least one to hold the
+          array, perhaps one for each element or for a loop counter. In jq, it's
+          simply `add / length` - the `add` expression is given the array and
+          produces its sum, and the `length` expression is given the array and
+          produces its length.
+
+          So, there's generally a cleaner way to solve most problems in jq than
+          defining variables. Still, sometimes they do make things easier, so jq
+          lets you define variables using `expression as $variable`. All
+          variable names start with `$`. Here's a slightly uglier version of the
+          array-averaging example:
+
+              length as $array_length | add / $array_length
+
+          We'll need a more complicated problem to find a situation where using
+          variables actually makes our lives easier.
+
+
+          Suppose we have an array of blog posts, with "author" and "title"
+          fields, and another object which is used to map author usernames to
+          real names. Our input looks like:
+
+              {"posts": [{"title": "Frist psot", "author": "anon"},
+                         {"title": "A well-written article", "author": "person1"}],
+               "realnames": {"anon": "Anonymous Coward",
+                             "person1": "Person McPherson"}}
+
+          We want to produce the posts with the author field containing a real
+          name, as in:
+
+              {"title": "Frist psot", "author": "Anonymous Coward"}
+              {"title": "A well-written article", "author": "Person McPherson"}
+
+          We use a variable, $names, to store the realnames object, so that we
+          can refer to it later when looking up author usernames:
+
+              .realnames as $names | .posts[] | {title, author: $names[.author]}
+
+          The expression `asdf as $x` runs `asdf`, puts the result in `$x`, and
+          returns the original input. Apart from the side-effect of binding the
+          variable, it has the same effect as `.`.
+
+          Variables are scoped over the rest of the expression that defines
+          them, so
+
+              .realnames as $names | (.posts[] | {title, author: $names[.author]})
+
+          will work, but
+
+              (.realnames as $names | .posts[]) | {title, author: $names[.author]}
+
+          won't.
+
+        examples:
+          - program: '.bar as $x | .foo | . + $x'
+            input: '{"foo":10, "bar":200}'
+            output: ['210']
+
+      - title: 'Defining Functions'
+        body: |
+
+          You can give a jq expression a name using "def" syntax:
+
+              def increment: . + 1;
+
+          From then on, `increment` is usable as a filter just like a
+          builtin function (in fact, this is how some of the builtins
+          are defined). A function may take arguments:
+
+              def map(f): [.[] | f];
+
+          Arguments are passed as jq expressions, not as values. The
+          same argument may be referenced multiple times with
+          different inputs (here `f` is run for each element of the
+          input array). Arguments to a function work more like
+          callbacks than like value arguments.
+
+          If you want the value-argument behaviour for defining simple
+          functions, you can just use a variable:
+
+              def addvalue(f): f as $value | map(. + $value);
+
+          With that definition, `addvalue(.foo)` will add the current
+          input's `.foo` field to each element of the array.
+
+        examples:
+          - program: 'def addvalue(f): . + [f]; map(addvalue(.[0]))'
+            input: '[[1,2],[10,20]]'
+            output: ['[[1,2,1], [10,20,10]]']
+          - program: 'def addvalue(f): f as $x | map(. + $x); addvalue(.[0])'
+            input: '[[1,2],[10,20]]'
+            output: ['[[1,2,1,2], [10,20,1,2]]']
+
+
+  - title: Assignment
+    body: |
+
+      Assignment works a little differently in jq than in most
+      programming languages. jq doesn't distinguish between references
+      to and copies of something - two objects or arrays are either
+      equal or not equal, without any further notion of being "the
+      same object" or "not the same object".
+
+      If an object has two fields which are arrays, `.foo` and `.bar`,
+      and you append something to `.foo`, then `.bar` will not get
+      bigger, even if you've just set `.bar = .foo`. If you're used to
+      programming in languages like Python, Java, Ruby, JavaScript,
+      etc. then you can think of it as though jq does a full deep copy
+      of every object before it does the assignment (for performance,
+      it doesn't actually do that, but that's the general idea).
+
+    entries:
+      - title: "`=`"
+        body: |
+
+          The jq expression `.foo = 1` will take as input an object
+          and produce as output an object with the "foo" field set
+          to 1. There is no notion of "modifying" or "changing" something
+          in jq - all jq values are immutable. For instance,
+
+              .foo = .bar | .foo.baz = 1
+
+          will not have the side-effect of setting `.bar.baz` to
+          1, as the similar-looking program in JavaScript, Python,
+          Ruby or other languages would. Unlike these languages (but
+          like Haskell and some other functional languages), there is
+          no notion of two arrays or objects being "the same array" or
+          "the same object". They can be equal, or not equal, but if
+          we change one of them, in no circumstances will the other
+          change behind our backs.
+
+          This means that it's impossible to build circular values in
+          jq (such as an array whose first element is itself). This is
+          quite intentional, and ensures that anything a jq program
+          can produce can be represented in JSON.
+
+      - title: "`|=`"
+        body: |
+          As well as the assignment operator `=`, jq provides the "update"
+          operator `|=`, which takes a jq expression on the right-hand side and
+          works out the new value for the property being assigned to by running
+          the old value through this expression. For instance, `.foo |= .+1` will
+          build an object with the "foo" field set to the input's "foo" plus 1.
+
+          This example should show the difference between `=` and `|=`:
+
+          Provide input `{"a": {"b": 10}, "b": 20}` to the programs:
+
+              .a = .b
+              .a |= .b
+
+          The former will set the "a" field of the input to the "b" field of the
+          input, producing {"a": 20, "b": 20}. The latter will set the "a" field
+          of the input to the "a" field's "b" field, producing {"a": 10, "b": 20}.
+
+      - title: "`+=`, `-=`, `*=`, `/=`, `//=`"
+        body: |
+
+          jq has a few operators of the form `a op= b`, which are all
+          equivalent to `a |= . op b`. So, `+= 1` can be used to increment values.
+
+        examples:
+          - program: '.foo += 1'
+            input: '{"foo": 42}'
+            output: ['{"foo": 43}']
+
+      - title: Complex assignments
+        body: |
+          Lots more things are allowed on the left-hand side of a jq assignment
+          than in most languages. We've already seen simple field accesses on
+          the left hand side, and it's no surprise that array accesses work just
+          as well:
+
+              .posts[0].title = "JQ Manual"
+
+          What may come as a surprise is that the expression on the left may
+          produce multiple results, referring to different points in the input
+          document:
+
+              .posts[].comments |= . + ["this is great"]
+
+          That example appends the string "this is great" to the "comments"
+          array of each post in the input (where the input is an object with a
+          field "posts" which is an array of posts).
+
+          When jq encounters an assignment like `a = b`, it records the "path"
+          taken to select a part of the input document while executing `a`. This
+          path is then used to find which part of the input to change while
+          executing the assignment. Any jq expression may be used on the
+          left-hand side of an equals - whichever paths it selects from the
+          input will be where the assignment is performed.
+
+          This is a very powerful operation. Suppose we wanted to add a comment
+          to blog posts, using the same "blog" input above. This time, we only
+          want to comment on the posts written by "stedolan". We can find those
+          posts using the "select" function described earlier:
+
+              .posts[] | select(.author == "stedolan")
+
+          The paths provided by this operation point to each of the posts that
+          "stedolan" wrote, and we can comment on each of them in the same way
+          that we did before:
+
+              (.posts[] | select(.author == "stedolan") | .comments) |= . + ["terrible."]
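
The path-based update described in the "Complex assignments" entry can be modelled outside jq. The following is a minimal Python sketch of the idea, not jq's actual implementation: the helper names `update_paths` and `comment_paths` and the sample `blog` value (including its "comments" fields) are hypothetical and exist only for illustration. It assumes a path is a list of keys and indices, and that an update returns a modified copy while leaving the input untouched.

```python
import copy


def update_paths(doc, paths, fn):
    """Return a copy of doc with fn applied to the value at each path.

    A path is a list of object keys and array indices (an assumption of
    this sketch). The input is deep-copied so it is never modified,
    mirroring the manual's statement that jq values are immutable.
    """
    result = copy.deepcopy(doc)
    for path in paths:
        target = result
        for key in path[:-1]:
            target = target[key]
        target[path[-1]] = fn(target[path[-1]])
    return result


def comment_paths(doc, author):
    """Paths a filter like
    (.posts[] | select(.author == AUTHOR) | .comments) would select."""
    return [
        ["posts", i, "comments"]
        for i, post in enumerate(doc["posts"])
        if post["author"] == author
    ]


# Hypothetical input: posts with "author" and "comments" fields.
blog = {
    "posts": [
        {"author": "stedolan", "comments": []},
        {"author": "anon", "comments": ["first"]},
    ]
}

# Model of: (... | .comments) |= . + ["terrible."]
updated = update_paths(
    blog,
    comment_paths(blog, "stedolan"),
    lambda old: old + ["terrible."],
)

print(updated["posts"][0]["comments"])  # ['terrible.']
print(updated["posts"][1]["comments"])  # ['first']
print(blog["posts"][0]["comments"])     # []
```

Only the selected paths change, and the original `blog` value is left as it was, which is the behaviour the manual attributes to `|=` with a multi-result left-hand side.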