Join Group First

🛑 👉🏻👉🏻👉🏻 INFORMATION AVAILABLE CLICK HERE👈🏻👈🏻👈🏻

Возможно, сайт временно недоступен или перегружен запросами. Подождите некоторое время и попробуйте снова.
Если вы не можете загрузить ни одну страницу – проверьте настройки соединения с Интернетом.
Если ваш компьютер или сеть защищены межсетевым экраном или прокси-сервером – убедитесь, что Firefox разрешён выход в Интернет.

Время ожидания ответа от сервера docs.microsoft.com истекло.

db=# SELECT * FROM event ;
id | account | type | happened_at | data
----+---------+------------+------------------------+-----------------------------------
1 | 1 | created | 2019-08-01 15:14:13+03 | {"credit": 0, "delta_balance": 0}
2 | 1 | deposited | 2019-08-01 15:15:15+03 | {"delta_balance": 100}
3 | 1 | withdraw | 2019-08-02 09:35:33+03 | {"delta_balance": -50}
4 | 1 | credit_set | 2019-08-03 16:14:12+03 | {"credit": 50}
5 | 1 | withdraw | 2019-08-03 14:45:44+03 | {"delta_balance": -30}
6 | 1 | credit_set | 2019-08-03 16:14:12+03 | {"credit": 100}
7 | 1 | withdraw | 2019-08-03 16:15:09+03 | {"delta_balance": -50}
(7 rows)

CREATE TABLE event (
id serial primary key ,
account int ,
type text ,
happened_at timestamp with time zone ,
data jsonb
);

SET dateformat YMD ;

INSERT INTO event (
id , account , type , happened_at , data
)
VALUES
( 1 , 1 , 'created' , '2019-08-01 15:14:13' , '{"delta_balance": 0, "credit": 0}' ),
( 2 , 1 , 'deposited' , '2019-08-01 15:15:15' , '{"delta_balance": 100}' ),
( 3 , 1 , 'withdraw' , '2019-08-02 09:35:33' , '{"delta_balance": -50}' ),
( 4 , 1 , 'credit_set' , '2019-08-03 16:14:12' , '{"credit": 50}' ),
( 5 , 1 , 'withdraw' , '2019-08-03 14:45:44' , '{"delta_balance": -30}' ),
( 6 , 1 , 'credit_set' , '2019-08-03 16:14:12' , '{"credit": 100}' ),
( 7 , 1 , 'withdraw' , '2019-08-03 16:15:09' , '{"delta_balance": -50}' )
;

db=# SELECT
account ,
SUM (( data -> 'delta_balance' ) :: int ) AS balance
FROM
event
GROUP BY
account ;

account | balance
---------+---------
1 | -30

-- Oracle
SELECT
account ,
MAX ( CASE WHEN type = 'credit_set' THEN data ELSE null END )
KEEP ( DENSE_RANK LAST ORDER BY id ) AS credit
FROM
event
GROUP BY
account ;

db=# SELECT
account ,
LAST_VALUE ( data ) OVER (
PARTITION BY account ORDER BY id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS credit
FROM
event
WHERE
type = 'credit_set'
GROUP BY
account ;

ERROR: column "event.data" must appear in the GROUP BY clause or be used in an aggregate function
LINE 3: LAST_VALUE(data) OVER (

db=# SELECT
account ,
LAST_VALUE ( data ) OVER (
PARTITION BY account ORDER BY id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS credit
FROM
event
WHERE
type = 'credit_set' ;

account | credit
---------+-----------------
1 | {"credit": 100}
1 | {"credit": 100}

db=# SELECT DISTINCT ON ( account )
account ,
data -> 'credit' AS credit
FROM
event
WHERE
type = 'credit_set'
ORDER BY
account ,
id DESC ;

account | credit
---------+--------
1 | 100

db=# SELECT
a . account ,
a . balance ,
b . credit
FROM
(
SELECT
account ,
SUM (( data -> 'delta_balance' ) :: int ) AS balance
FROM
event
GROUP BY
account

) AS a
JOIN
(

SELECT DISTINCT ON ( account )
account ,
data -> 'credit' AS credit
FROM
event
WHERE
type = 'credit_set'
ORDER BY
account ,
id DESC

) AS b

ON a . account = b . account ;

account | balance | credit
---------+---------+--------
1 | -30 | 100

QUERY PLAN
---------------------------------------------------------------------------------------------
Hash Join (cost=44.53..49.07 rows=4 width=44)
Hash Cond: (event.account = b.account)
-> HashAggregate (cost=25.00..27.00 rows=200 width=12)
Group Key: event.account
-> Seq Scan on event (cost=0.00..17.50 rows=750 width=36)
-> Hash (cost=19.49..19.49 rows=4 width=36)
-> Subquery Scan on b (cost=19.43..19.49 rows=4 width=36)
-> Unique (cost=19.43..19.45 rows=4 width=40)
-> Sort (cost=19.43..19.44 rows=4 width=40)
Sort Key: event_1.account, event_1.id DESC
-> Seq Scan on event event_1 (cost=0.00..19.39 rows=4 width=40)
Filter: (type = 'credit_set'::text)

db#=> SELECT
account ,
SUM (( data -> 'delta_balance' ) :: int ) AS balance ,
( MAX ( ARRAY [ id , ( data -> 'credit' ) :: int ]) FILTER ( WHERE type = 'credit_set' ))[ 2 ] AS credit
FROM
event
GROUP BY
account ;

account | balance | credit
---------+---------+--------
1 | -30 | 100

db=# SELECT greatest ( ARRAY [ 1 , 200 ], ARRAY [ 2 , 100 ]);
greatest
----------
{2,100}

db=# SELECT greatest ( ARRAY [ 1 , 200 ], ARRAY [ 1 , 201 ]);
greatest
----------
{1,201}
(1 row)

MAX ( ARRAY [ id , ( data -> 'credit' ):: int ])

MAX ( ARRAY [ id , ( data -> 'credit' ):: int ]) FILTER ( WHERE type = 'credit_set' )

db#=> SELECT
account ,
MAX ( ARRAY [ id , ( data -> 'credit' ) :: int ]) FILTER ( WHERE type = 'credit_set' ))
FROM
event
GROUP BY
account ;

account | max
---------+---------
1 | {6,100}

( MAX ( ARRAY [ id , ( data -> 'credit' ):: int ]) FILTER ( WHERE type = 'credit_set' ))[ 2 ]

QUERY PLAN
---------------------------------------------------------------
HashAggregate (cost=32.50..34.50 rows=200 width=16)
Group Key: account
-> Seq Scan on event (cost=0.00..17.50 rows=750 width=72)

db=# SELECT
account ,
SUM (( data -> 'delta_balance' ) :: int ) AS balance ,
( MAX (
ARRAY [ happened_at , ( data -> 'credit' ) :: int ]
) FILTER ( WHERE type = 'credit_set' ))[ 2 ] AS credit
FROM
event
GROUP BY
account ;

ERROR: ARRAY types timestamp with time zone and integer cannot be matched
LINE 4: (MAX(ARRAY[happened_at, (data-> 'credit' ) :: int ]) FILTER ...

db=# SELECT
account ,
SUM (( data -> 'delta_balance' ) :: int ) AS balance ,
( MAX (
ARRAY [ EXTRACT ( 'EPOCH' FROM happened_at ), ( data -> 'credit' ) :: int ]
) FILTER ( WHERE type = 'credit_set' ))[ 2 ] AS credit
FROM
event
GROUP BY
account ;

account | balance | credit
---------+---------+--------
1 | -30 | 100

( MAX ( ARRAY [
id ,
( data -> 'credit' ):: int ,
EXTRACT ( 'EPOCH' FROM happened_at )
]) FILTER ( WHERE type = 'credit_set' ))[ 2 :]

db=# SELECT
account ,
(( ARRAY_AGG ( data ORDER BY happened_at DESC )
FILTER ( WHERE type = 'credit_set' ))[ 1 ] -> 'credit' ) :: int AS credit
FROM
event
GROUP BY account ;

account | credit
---------+--------
1 | 50

QUERY PLAN
------------------------------------------------------------------
GroupAggregate ( cost = 1.17..1.36 rows = 7 width = 8 )
Group Key : account
-> Sort ( cost = 1.17..1.19 rows = 7 width = 76 )
Sort Key : account
-> Seq Scan on event ( cost = 0.00..1.07 rows = 7 width = 76 )

SELECT DISTINCT ON ( account )
account ,
FIRST_VALUE (( data -> 'credit' ):: int ) OVER w ,
LAST_VALUE (( data -> 'credit' ):: int ) OVER w ,
SUM (( data -> 'credit' ):: int ) OVER w
FROM
event
WHERE
type = 'credit_set'
WINDOW w AS (
PARTITION BY account
ORDER BY id
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)

db=# SELECT DISTINCT ON ( account )
account ,
LAST_VALUE (( data -> 'credit' ) :: int ) OVER (
PARTITION BY account
ORDER BY (
CASE
WHEN type = 'credit_set' THEN happened_at
ELSE null
END
) ASC NULLS FIRST
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) as credit ,
SUM ( COALESCE (( data -> 'delta_balance' ) :: int , 0 )) OVER ( PARTITION BY account ) AS balance
FROM
event ;

account | credit | balance
---------+--------+---------
1 | 100 | -30

Unique ( cost = 1.19..1.55 rows = 7 width = 24 )
-> WindowAgg ( cost = 1.19..1.54 rows = 7 width = 24 )
-> WindowAgg ( cost = 1.19..1.38 rows = 7 width = 48 )
-> Sort ( cost = 1.19..1.20 rows = 7 width = 44 )
Sort Key : account , ( CASE WHEN ( type = 'credit_set' :: text )
THEN happened_at ELSE NULL :: timestamp with time zone
END ) NULLS FIRST
-> Seq Scan on event ( cost = 0.00..1.09 rows = 7 width = 44 )

SELECT
account ,
SUM ( COALESCE (( data ->> 'delta_balance' ):: int , 0 )) AS balance ,
LAST (( data ->> 'credit' ):: int ) FILTER ( WHERE type = 'credit_set' ) AS credit
FROM
event
GROUP BY
account ;

SELECT
account ,
SUM ( COALESCE (( data ->> 'delta_balance' ):: int , 0 )) AS balance ,
LAST (( data ->> 'credit' ):: int , 1 ORDER BY happened_at )
FILTER ( WHERE type = 'credit_set' ) AS credit
FROM
event
GROUP BY
account ;

I recently had to produce reports on a table containing events of a user's account balance. The user can deposit and withdraw from their account, and support personnel can set the account's credit, which is the maximum amount the user can overdraw.
The table looked roughly like this:
To get the current balance of an account, we sum the changes in delta_balance :
The data field contains information specific to each type of event. To extract the value of delta_balance from the data column we use the arrow operator provided by PostgreSQL .
The result of the query shows that the current balance of account 1 is -30. This means the account is in overdraft. To check if this is within the allowed range, we need to compare it to the credit set for this account. The credit for account 1 was initially set to 0 when the account was created. The credit was then adjusted twice, and is currently set to 100.
To get the current state of an account, we need its aggregated balance and the latest credit that was set for it.
In Oracle there is a function called last we can be use to get the last credit_set event. A query using last might look like this:
PostgreSQL also has a LAST_VALUE analytic function. Analytics functions cannot be used in a group by the way aggregate functions do:
The error tells us that the data field used in the analytic function must be used in the group by. This is not really what we want. To use PostgreSQL's LAST_VALUE function, we need to remove the group by:
These are not exactly the results we need. Analytic, or window functions, operate on a set of rows, and not in a group by.
PostgreSQL doesn't have a built-in function to obtain the first or last value in a group using group by.
To get the last value in a group by, we need to get creative!
The plain SQL solution is to divide and conquer. We already have a query to get the current balance of an account. If we write another query to get the credit for each account, we can join the two together and get the complete state of an account.
To get the last event for each account in PostgreSQL we can use DISTINCT ON :
Great! Using DISTINCT ON we got the last credit set for each account.
The next step is to join the two queries, and get the complete account state:
Before we move on, let's take a glance at the execution plan:
The event table is being scanned twice, once for each subquery. The DISTINCT ON subquery also requires a sort by account and id . The two subqueries are then joined using a hash-join.
PostgreSQL is unable to combine the two subqueries into a single scan of the table. If the event table is very large, performing two full table scans, and a sort and a hash join, might become slow and consume a lot of memory.
It's tempting to use common table expression (CTE) to make the query more readable. But, CTE's are currently optimization fences , and using it here will most definitely prevent PostgreSQL from performing any optimization that involves both subqueries.
Using good ol' SQL got us the answer, but it took two passes on the table. We can do better with the following trick:
This is so much simpler than the previous question, so let's break it down.
To understand what exactly is going on here, we first need to understand how PostgreSQL compares arrays :
Array comparisons compare the array contents element-by-element, using the default B-tree comparison function for the element data type.
When comparing arrays, PostgreSQL will go element by element and compare the values according to their type. To demonstrate:
The first element of the second array (2) is larger than the first element of the first array (1), so it's the greatest.
The first elements of both arrays are equal (1), so PostgreSQL moves on to the next element. In this case, the second element of the second array (201) is the greatest.
Using this feature of PostgreSQL, we construct an array where the first element is the value to sort by, and the second element is the value we want to keep. In our case, we want to get the credit by the max id :
Not all events set credit, so we need to restrict the result to credit_set events:
The result of this expression is an array:
We only want the second element, the value of credit :
And this is it! This way we can get the last credit set for each account.
Next, let's examine the execution plan:
The main benefit of this approach is that it only needs one pass of the table and no sort.
This approach is very useful, but it has some restrictions:
All elements must be of the same type.
PostgreSQL does not support arrays with different types of elements . As an example, if we wanted to get the last credit set by date, and not by id:
PostgreSQL tells us that an array cannot contain both timestamps and integers.
We can overcome this restriction in some cases by casting one of the elements:
We converted the timestamp to epoch , which is the number of seconds since 1970. Once both elements are of the same type, we can use the array trick.
Query might consume a little more memory.
This one if a bit of a stretch, but as we demonstrated in the
past ,
large group and sort keys consume more memory in joins and sorts. Using the array trick, the group key is an array, which is a bit larger than the plain fields we usually sort by.
Also, the array trick can be used to "piggyback" more than one value:
This query will return both the last credit set, and the date in which it was set. The entire array is used for sorting, so the more values we put in the array, the larger the group key gets.
The array trick is very useful, and can significantly simplify complicated queries and improve performance. We use it to produce reports on time series data, and to generate read models from event tables.
I'm pretty sure using user-defined aggregate function in PostgreSQL it should be possible to create a function with the signature MAX_BY(key, value) . I haven't had time to dig deep into custom aggregate functions, but if any of the readers do, please share your implementation and i'll be happy to post it here.
In the few days following the publication of this article, I received several suggestions and comments from readers. This is a summary of the comments I received, and my thoughts on them.
One commenter on Reddit suggested using ARRAY_AGG :
This approach obviously works, and it doesn't require the key and the value to be of the same type, which is a big limitation of the array trick.
The downside to this approach is that it requires a sort which might become expensive with very large data sets:
Another commenter on Reddit suggested using window functions in combination with DISTINCT ON . This is the original suggestion:
The query uses both DISTINCT ON and window functions. It works by calculating the aggregates using the window function on the entire set (all rows of the account), and then fetch the first or last row using DISTINCT ON .
To make the window functions behave like a "group by" and calculate the aggregates on the entire set, the bound is defined as BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING , meaning "for the entire partition".
To avoid repeating the window for every aggregate, the query uses a WINDOW clause to define a named window that can be used multiple times in the query.
This query however, is not really working because the where clause is restricted to events with type credit_set . To get the complete status of the account, we also need to aggregate the balance of all events.
To actually make this approach work, we need to make the following adjustments:
The plan also gotten more complicated:
Two sorts, and several aggregates. The bottom line, in my opinion, is that this approach is harder to maintain and it yields a significantly more complicated plan. I wouldn't use it in this
https://docs.microsoft.com/en-us/dotnet/csharp/linq/perform-grouped-joins
https://hakibenita.com/sql-group-by-first-last-value
Vr Box Porn Hd
Dog Zoo Porn Women S
Naked Barefoot Porno
Perform grouped joins (LINQ in C#) | Microsoft Docs
How to Get the First or Last Value in a Group Using Group ...
4 Ways to Join Only The First Row in SQL | Sisense
SQL Join With First Matching Rows. Choosing the Best ...
SQL join tables with group by and order by - w3resource
Join a group in Outlook - support.microsoft.com
GroupJoin - Joining Operator - TutorialsTeacher
How To Join An Existing Group Of Friends | www ...
How to combine first and last name in Excel
Join Group First

Join Group First

Report Page