PQL ROW()
The Row()
query returns a set of record IDs / keys for records that have been filtered in some way. You can think of Row()
as serving a similar function to the WHERE
clause in SQL.
There are two different call definitions for Row()
. One for Set, Mutex, and Time fields and another for Int, Decimal, and Timestamp fields. For more information on field types in FeatureBase, go here.
Call Definition
Set, Mutex, and Time Fields
Row(FIELD = FIELD_VALUE, from = TIMESTAMP, to = TIMESTAMP)
Note: Time fields are similar to Set fields except there is one or more date and time associated with each relationship between a record and field value. For that reason, you can only use the from and to arguments on Time fields.
Mandatory Arguments
FIELD
: the name of a field in the index being queriedFIELD_VALUE
: a field value within a Set, Mutex or Time field, FIELD
Optional Arguments
from
: the start date and time when querying Time fields (TIMESTAMP
formatted like'2006-01-02T00:00:00Z'
).TIMESTAMP
here is an inclusive value - i.e. records with relationships made on or after this time will be included in the result – for Time fields only.to
: the end date and time when querying Time fields (TIMESTAMP
formatted like'2006-01-02T00:00:00Z'
).TIMESTAMP
here is an exclusive value - i.e. records with relationships made before this time (but not on this time) will be included in the result – for Time fields only.
Returns
- list of record IDs or record keys
Int, Decimal, Timestamp Fields
Row(VALUE OPERATOR FIELD OPERATOR VALUE)
Mandatory Arguments
FIELD
: the name of a field in the index being queriedVALUE
: a integer, decimal, or timestamp value - consistent with the field type of FIELD (e.g. 1, 10.2, ‘2006-01-02T05:01:00Z’) – the firstVALUE
is optionalOPERATOR
: >, <, <=, >=, ==, or != – the firstOPERATOR
is optional
Optional Arguments
Returns
- list of record IDs or record keys
Examples
Raw Event Data:
Each row below represents a purchase event by a customer in some system. There is an event ID (_id
), customer ID (customer_id
), brand that customer purchased from (brand
), and the time the purchase was made (time
).
_id | brand | customer_id | time
-----+--------+-------------+----------------------
1 | brand1 | 0 | 2021-01-02T02:45:00Z
2 | brand2 | 0 | 2021-01-05T08:30:00Z
3 | brand3 | 1 | 2019-07-24T11:15:00Z
4 | brand1 | 1 | 2020-09-12T12:30:00Z
5 | brand1 | 2 | 2021-07-07T18:25:00Z
6 | brand3 | 2 | 2021-08-06T16:15:00Z
7 | brand1 | 4 | 2020-10-01T20:45:00Z
8 | brand4 | 4 | 2021-10-01T20:45:00Z
9 | brand4 | 5 | 2022-01-13T11:00:00Z
10 | brand1 | 0 | 2022-03-15T02:55:00Z
Raw Customer Data:
Each row represents a customer. Note the _id
column in this data corresponse to a customer_id
column in the data above.
_id | age
----+-----
0 | 23
1 | 31
2 | 28
3 | 19
4 | 25
5 | 40
The raw data above can be ingested to FeatureBase in at least three ways.
- An events index and a seperate users index. In this case, the two indexes would look like raw data looks above. One index would be built around purchase events (i.e. each record is a different purchase event) and one index would be built around users (i.e. each record is a different user).
- A users index where the time associated with purchase relationships (customer / brand) doesn’t matter. Here each record would be a user and we’d use a Set field to track which users have bought from which brands. Note we lose the time information when using a Set field. This was done in the
customer_set
index below. - A users index where the time associated with each relationship (customer / brand) does matter. Here each record would be a user and a Time field to track which users have bought from which brands. This was done in the
customer_time
index below.
FeatureBase Data
Index: customer_set (non-keyed index)
_id | age (Int) | has_purchased (Set) | last_purchase (Timestamp)
-----+-----------+---------------------+---------------------------
0 | 23 | ["brand1","brand2"] | 2021-01-05T08:30:00Z
1 | 31 | ["brand1","brand3"] | 2020-09-12T12:30:00Z
2 | 28 | ["brand1","brand3"] | 2021-08-06T16:15:00Z
3 | 19 | [] | null
4 | 25 | ["brand1","brand4"] | 2021-10-01T20:45:00Z
5 | 40 | ["brand4"] | 2022-01-13T11:00:00Z
Index: customer_time (non-keyed index)
_id | age (Int) | has_purchased (Time)
-----+-----------+---------------------
0 | 23 | ["brand1","brand2"]
1 | 31 | ["brand1","brand3"]
2 | 28 | ["brand1","brand3"]
3 | 19 | []
4 | 25 | ["brand1","brand4"]
5 | 40 | ["brand4"]
In both indexes above, a customer table was create with a customer ID as the record ID. Some information about the customer (e.g. age) was included. In the first index (customer_set
), a Set field is used to store purchase history. Here we’re tacking the brands each customer has purchased from but there is no time associated with the relationship. In the second index (customer_time
), a Time field is used to store purchase history. This means there is at least one time associated with each relationship that can be used for range queries.
Example 1
What customers have purchased from brand1?
Query
[customer_set]Row(has_purchased=brand1)
Tabular Response
_id
-----
0
1
2
4
HTTP Response
{
"results": [
{
"columns": [
0,
1,
2,
4
]
}
]
}
Explanation
Customers 0, 1, 2, and 4 are the only customer that had a purchase event with brand1.
Example 2
Which customers have purchased from brand3 during 2021?
[customer_time]Row(has_purchased=brand3, from='2021-01-01', to='2022-01-01')
Tabular Response
_id
-----
2
HTTP Response
{
"results": [
{
"columns": [
2
]
}
]
}
Explanation
Note the primary index is customer_time
. In that index, the has_purchased
field is a Time field. This allows us to use the from / to arguments to search for relationship made during a given time window. This is not possible in the customer_set
index which uses a Set field for the has_purchased
field. Looking at the raw events data, only customer 2 purchased from brand 3 during 2021.
Example 3
What customer are between the age of 20 and 30?
[customer_set]Row(20 < age < 30)
Tabular Response
_id
-----
0
2
4
HTTP Response
{
"results": [
{
"columns": [
0,
2,
4
]
}
]
}
Explanation
Customer 0, 2, and 4 are between the ages of 20 and 30.
Example 4
What customers made their last purchase in after the start 2022?
[customer_set]Row(last_purchase > "2022-01-01T00:00:00Z")
Tabular Response
_id
-----
5
HTTP Response
{
"results": [
{
"columns": [
5
]
}
]
}
Explanation
Only customer 5’s last purchase was made after the start of 2022.