PQL EXTRACT()
The Extract()
query allows the user to extract the values of specified fields for specified records. The Extract()
query is similar the SELECT
argument in SQL – see example 1 below for the PQL equivalent to SELECT * FROM table
in SQL.
The task of retrieving all data for a record is generally more expensive in FeatureBase as compared to relational databases because of the way data is stored. For that reason, using the Extract()
query for many fields and records is not advisable.
Extract()
does not currently support filters on the backing rows call – i.e. Rows()
arguments should look exactly like Rows(FIELD_NAME)
. Additionally, contrary to the standard definition of Rows(), any field type can be used as the argument to Rows()
when Rows()
is used as an argument to Extract()
.
Time fields are treated as Set fields – i.e. the Extract()
output will include all values for a record that were set at any time.
Call Definition
Extract(ROW_CALL, ROWS_CALL, ... )
Mandatory Arguments
ROW_CALL
: the row call used to filter recordsROWS_CALL
: any rows call (set of record IDs / keys). Again, contrary to the standard definition of Rows(), any field type can be used as the argument toRows()
whenRows()
is used as an argument toExtract()
.
Optional Arguments
...
: Any number of additional rows calls separated by commas. contrary to the standard definition of Rows(), any field type can be used as the argument toRows()
whenRows()
is used as an argument toExtract()
.
Returns
An object with:
- a list of the selected fields under the “fields” key
- a list of “column” / “rows” pair under the “columns” key
- “column” contains the record ID / key
- “rows” contains the field values associated with the record in “column”
Field values are typed as such:
Field Type | type | JSON Value Type |
---|---|---|
Mutex (unkeyed) | "uint64" | positive integer ID or null |
Mutex (keyed) | "string" | string or null |
Integer | "int64" | integer or null |
Integer (foreign-index) | "string" | string or null |
Decimal | "decimal" | FeatureBase decimal value or null |
Set (unkeyed) | "[]uint64" | array of positive integer IDs |
Set (keyed) | "[]string" | array of strings |
Time (unkeyed) | "[]uint64" | array of positive integer IDs |
Time (keyed) | "[]string" | array of strings |
Examples
Data:
Index: customer (non keyed index)
_id | age (Int) | has_purchased (Set) | last_purchase (Timestamp)
-----+-----------+---------------------+---------------------------
0 | 23 | ["brand1","brand2"] | 2021-01-05T08:30:00Z
1 | 31 | ["brand1","brand3"] | 2020-09-12T12:30:00Z
2 | 28 | ["brand1","brand3"] | 2021-08-06T16:15:00Z
3 | 19 | [] | null
4 | 25 | ["brand1","brand4"] | 2021-10-01T20:45:00Z
5 | 40 | ["brand4"] | 2022-01-13T11:00:00Z
Example 1
Extract all the data from the customer index (i.e. the PQL equivalent to SQL SELECT * FROM customer)?
Query
[customer]Extract(All(), Rows(age), Rows(has_purchased), Rows(last_purchase))
Tabular Response
_id | age | has_purchased | last_purchase
-----+-----+---------------------+------------------------
0 | 23 | ["brand1","brand2"] | "2021-01-05T08:30:00Z"
1 | 31 | ["brand1","brand3"] | "2020-09-12T12:30:00Z"
2 | 28 | ["brand1","brand3"] | "2021-08-06T16:15:00Z"
3 | 19 | [] | null
4 | 25 | ["brand1","brand4"] | "2021-10-01T20:45:00Z"
5 | 40 | ["brand4"] | "2022-01-13T11:00:00Z"
HTTP Response
{
"results": [
{
"fields": [
{
"name": "age",
"type": "int64"
},
{
"name": "has_purchased",
"type": "[]string"
},
{
"name": "last_purchase",
"type": "timestamp"
}
],
"columns": [
{
"column": 0,
"rows": [
23,
[
"brand1",
"brand2"
],
"2021-01-05T08:30:00Z"
]
},
{
"column": 1,
"rows": [
31,
[
"brand1",
"brand3"
],
"2020-09-12T12:30:00Z"
]
},
{
"column": 2,
"rows": [
28,
[
"brand1",
"brand3"
],
"2021-08-06T16:15:00Z"
]
},
{
"column": 3,
"rows": [
19,
[],
null
]
},
{
"column": 4,
"rows": [
25,
[
"brand1",
"brand4"
],
"2021-10-01T20:45:00Z"
]
},
{
"column": 5,
"rows": [
40,
[
"brand4"
],
"2022-01-13T11:00:00Z"
]
}
]
}
]
}
Explanation
Similar to SELECT * FROM customer
, we get all the data in the index.
Example 2
Output the list of brands each customer has purchased from for customers younder than 26.
[customer]Extract(Row(age<26), Rows(has_purchased))
Tabular Response
_id | has_purchased
-----+---------------------
0 | ["brand1","brand2"]
3 | []
4 | ["brand1","brand4"]
HTTP Response
{
"results": [
{
"fields": [
{
"name": "has_purchased",
"type": "[]string"
}
],
"columns": [
{
"column": 0,
"rows": [
[
"brand1",
"brand2"
]
]
},
{
"column": 3,
"rows": [
[]
]
},
{
"column": 4,
"rows": [
[
"brand1",
"brand4"
]
]
}
]
}
]
}
Explanation
Row(age<26)
returns records [0,3,4]
. Extract(..., Rows(has_purchased))
pulls the list of brands those records / customers have purchased from.