PQL EXTRACT()

The Extract() query allows the user to extract the values of specified fields for specified records. The Extract() query is similar the SELECT argument in SQL – see example 1 below for the PQL equivalent to SELECT * FROM table in SQL.

The task of retrieving all data for a record is generally more expensive in FeatureBase as compared to relational databases because of the way data is stored. For that reason, using the Extract() query for many fields and records is not advisable.

Extract() does not currently support filters on the backing rows call – i.e. Rows() arguments should look exactly like Rows(FIELD_NAME). Additionally, contrary to the standard definition of Rows(), any field type can be used as the argument to Rows() when Rows() is used as an argument to Extract().

Time fields are treated as Set fields – i.e. the Extract() output will include all values for a record that were set at any time.

Call Definition

Extract(ROW_CALL, ROWS_CALL, ... )

Mandatory Arguments

ROW_CALL : the row call used to filter records
ROWS_CALL : any rows call (set of record IDs / keys). Again, contrary to the standard definition of Rows(), any field type can be used as the argument to Rows() when Rows() is used as an argument to Extract().

Optional Arguments

... : Any number of additional rows calls separated by commas. contrary to the standard definition of Rows(), any field type can be used as the argument to Rows() when Rows() is used as an argument to Extract().

Returns

An object with:

a list of the selected fields under the “fields” key
a list of “column” / “rows” pair under the “columns” key
- “column” contains the record ID / key
- “rows” contains the field values associated with the record in “column”

Field values are typed as such:

Field Type	`type`	JSON Value Type
Mutex (unkeyed)	`"uint64"`	positive integer ID or `null`
Mutex (keyed)	`"string"`	string or `null`
Integer	`"int64"`	integer or `null`
Integer (foreign-index)	`"string"`	string or `null`
Decimal	`"decimal"`	FeatureBase decimal value or `null`
Set (unkeyed)	`"[]uint64"`	array of positive integer IDs
Set (keyed)	`"[]string"`	array of strings
Time (unkeyed)	`"[]uint64"`	array of positive integer IDs
Time (keyed)	`"[]string"`	array of strings

Examples

Data:

Index: customer (non keyed index)

 _id | age (Int) | has_purchased (Set) | last_purchase (Timestamp)
-----+-----------+---------------------+---------------------------
 0   |    23     | ["brand1","brand2"] | 2021-01-05T08:30:00Z
 1   |    31     | ["brand1","brand3"] | 2020-09-12T12:30:00Z
 2   |    28     | ["brand1","brand3"] | 2021-08-06T16:15:00Z
 3   |    19     | []                  | null
 4   |    25     | ["brand1","brand4"] | 2021-10-01T20:45:00Z
 5   |    40     | ["brand4"]          | 2022-01-13T11:00:00Z

Example 1

Extract all the data from the customer index (i.e. the PQL equivalent to SQL SELECT * FROM customer)?

Query

[customer]Extract(All(), Rows(age), Rows(has_purchased), Rows(last_purchase))

Tabular Response

 _id | age |    has_purchased    |     last_purchase
-----+-----+---------------------+------------------------
 | 23  | ["brand1","brand2"] | "2021-01-05T08:30:00Z"
 | 31  | ["brand1","brand3"] | "2020-09-12T12:30:00Z"
 | 28  | ["brand1","brand3"] | "2021-08-06T16:15:00Z"
 | 19  | []                  | null
 | 25  | ["brand1","brand4"] | "2021-10-01T20:45:00Z"
 | 40  | ["brand4"]          | "2022-01-13T11:00:00Z"

HTTP Response

{
  "results": [
    {
      "fields": [
        {
          "name": "age",
          "type": "int64"
        },
        {
          "name": "has_purchased",
          "type": "[]string"
        },
        {
          "name": "last_purchase",
          "type": "timestamp"
        }
      ],
      "columns": [
        {
          "column": 0,
          "rows": [
            23,
            [
              "brand1",
              "brand2"
            ],
            "2021-01-05T08:30:00Z"
          ]
        },
        {
          "column": 1,
          "rows": [
            31,
            [
              "brand1",
              "brand3"
            ],
            "2020-09-12T12:30:00Z"
          ]
        },
        {
          "column": 2,
          "rows": [
            28,
            [
              "brand1",
              "brand3"
            ],
            "2021-08-06T16:15:00Z"
          ]
        },
        {
          "column": 3,
          "rows": [
            19,
            [],
            null
          ]
        },
        {
          "column": 4,
          "rows": [
            25,
            [
              "brand1",
              "brand4"
            ],
            "2021-10-01T20:45:00Z"
          ]
        },
        {
          "column": 5,
          "rows": [
            40,
            [
              "brand4"
            ],
            "2022-01-13T11:00:00Z"
          ]
        }
      ]
    }
  ]
}

Explanation

Similar to SELECT * FROM customer, we get all the data in the index.

Example 2

Output the list of brands each customer has purchased from for customers younder than 26.

[customer]Extract(Row(age<26), Rows(has_purchased))

Tabular Response

 _id |    has_purchased
-----+---------------------
 0   | ["brand1","brand2"]
 3   | []
 4   | ["brand1","brand4"]

HTTP Response

{
  "results": [
    {
      "fields": [
        {
          "name": "has_purchased",
          "type": "[]string"
        }
      ],
      "columns": [
        {
          "column": 0,
          "rows": [
            [
              "brand1",
              "brand2"
            ]
          ]
        },
        {
          "column": 3,
          "rows": [
            []
          ]
        },
        {
          "column": 4,
          "rows": [
            [
              "brand1",
              "brand4"
            ]
          ]
        }
      ]
    }
  ]
}

Explanation

Row(age<26) returns records [0,3,4]. Extract(..., Rows(has_purchased)) pulls the list of brands those records / customers have purchased from.