8000 feat(coderd/database): track user status changes over time by SasSwart · Pull Request #16019 · coder/coder · GitHub
[go: up one dir, main page]

Skip to content

feat(coderd/database): track user status changes over time #16019

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Jan 13, 2025
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
cd953a3
add user_status_changes table
SasSwart Dec 14, 2024
ec16728
add GetUserStatusCountsByDay
SasSwart Dec 14, 2024
0d97e82
rename unused variable
SasSwart Dec 14, 2024
eb6e249
Test GetUserStatusCountsByDay
SasSwart Dec 17, 2024
7b2c259
make gen
SasSwart Dec 17, 2024
89177b2
fix dbauthz tests
SasSwart Dec 17, 2024
877517f
do the plumbing to get sql, api and frontend talking to one another
SasSwart Dec 23, 2024
0b3e0e6
rename migration
SasSwart Dec 23, 2024
6462cc2
move aggregation logic for GetUserStatusChanges into the SQL
SasSwart Dec 24, 2024
ccd0cdf
use window functions for efficiency
SasSwart Dec 24, 2024
12a274f
ensure we use the same time zone as the start_time param
SasSwart Dec 24, 2024
fcfd76e
ensure we use the same time zone as the start_time param
SasSwart Dec 24, 2024
7c0cade
make gen
SasSwart Dec 24, 2024
ecffc8b
update field names and fix tests
SasSwart Dec 24, 2024
f3a2ce3
exclude deleted users from the user status graph
SasSwart Dec 27, 2024
5067a63
GetUserStatusChanges now passes all querier tests
SasSwart Jan 2, 2025
254d436
renumber migrations
SasSwart Jan 2, 2025
3e86522
add partial fixture for CI
SasSwart Jan 3, 2025
2e49e4c
fix migration numbers
SasSwart Jan 3, 2025
ff59729
rename and document sql function
SasSwart Jan 3, 2025
b1ad074
make gen
SasSwart Jan 3, 2025
de4081f
Remove unwanted comments from the generated interface
SasSwart Jan 9, 2025
4de334f
review notes
SasSwart Jan 9, 2025
8fca0c5
make gen
SasSwart Jan 9, 2025
b06179c
remove frontend changes
SasSwart Jan 9, 2025
213b288
rename GetUserStatusCountsOverTime to GetUserStatusCounts
SasSwart Jan 9, 2025
9457ac8
fix tests
SasSwart Jan 9, 2025
63128a3
Update coderd/database/queries/insights.sql
SasSwart Jan 10, 2025
89f0a11
Provide basic durability against multiple deletions
SasSwart Jan 13, 2025
89ebab2
Provide basic durability against multiple deletions
SasSwart Jan 13, 2025
012f14c
populate the user_deleted_table
SasSwart Jan 13, 2025
c2efd97
formatting
SasSwart Jan 13, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
add GetUserStatusCountsByDay
  • Loading branch information
SasSwart committed Jan 9, 2025
commit ec16728f3677328a069f007b2145d0f48820fcf4
7 changes: 7 additions & 0 deletions coderd/database/dbauthz/dbauthz.go
Original file line number Diff line number Diff line change
Expand Up @@ -2421,6 +2421,13 @@ func (q *querier) GetUserNotificationPreferences(ctx context.Context, userID uui
return q.db.GetUserNotificationPreferences(ctx, userID)
}

func (q *querier) GetUserStatusCountsByDay(ctx context.Context, arg database.GetUserStatusCountsByDayParams) ([]database.GetUserStatusCountsByDayRow, error) {
if err := q.authorizeContext(ctx, policy.ActionRead, rbac.ResourceSystem); err != nil {
return nil, err
}
return q.db.GetUserStatusCountsByDay(ctx, arg)
}

func (q *querier) GetUserWorkspaceBuildParameters(ctx context.Context, params database.GetUserWorkspaceBuildParametersParams) ([]database.GetUserWorkspaceBuildParametersRow, error) {
u, err := q.db.GetUserByID(ctx, params.OwnerID)
if err != nil {
Expand Down
9 changes: 9 additions & 0 deletions coderd/database/dbmem/dbmem.go
Original file line number Diff line number Diff line change
Expand Up @@ -5669,6 +5669,15 @@ func (q *FakeQuerier) GetUserNotificationPreferences(_ context.Context, userID u
return out, nil
}

func (q *FakeQuerier) GetUserStatusCountsByDay(ctx context.Context, arg database.GetUserStatusCountsByDayParams) ([]database.GetUserStatusCountsByDayRow, error) {
err := validateDatabaseType(arg)
if err != nil {
return nil, err
}

panic("not implemented")
}

func (q *FakeQuerier) GetUserWorkspaceBuildParameters(_ context.Context, params database.GetUserWorkspaceBuildParametersParams) ([]database.GetUserWorkspaceBuildParametersRow, error) {
q.mutex.RLock()
defer q.mutex.RUnlock()
Expand Down
7 changes: 7 additions & 0 deletions coderd/database/dbmetrics/querymetrics.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

15 changes: 15 additions & 0 deletions coderd/database/dbmock/dbmock.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions coderd/database/querier.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

103 changes: 103 additions & 0 deletions coderd/database/queries.sql.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

68 changes: 68 additions & 0 deletions coderd/database/queries/insights.sql
Original file line number Diff line number Diff line change
Expand Up @@ -771,3 +771,71 @@ SELECT
FROM unique_template_params utp
JOIN workspace_build_parameters wbp ON (utp.workspace_build_ids @> ARRAY[wbp.workspace_build_id] AND utp.name = wbp.name)
GROUP BY utp.num, utp.template_ids, utp.name, utp.type, utp.display_name, utp.description, utp.options, wbp.value;

-- name: GetUserStatusCountsByDay :many
WITH dates AS (
-- Generate a series of dates between start and end
SELECT
day::date
FROM
generate_series(
date_trunc('day', @start_time::timestamptz),
date_trunc('day', @end_time::timestamptz),
'1 day'::interval
) AS day
),
initial_statuses AS (
-- Get the status of each user right before the start date
SELECT DISTINCT ON (user_id)
user_id,
new_status as status
FROM
user_status_changes
WHERE
changed_at < @start_time::timestamptz
ORDER BY
user_id,
changed_at DESC
),
relevant_changes AS (
-- Get only the status changes within our date range
SELECT
date_trunc('day', changed_at)::date AS day,
user_id,
new_status as status
FROM
user_status_changes
WHERE
changed_at >= @start_time::timestamptz
AND changed_at <= @end_time::timestamptz
),
daily_status AS (
-- Combine initial statuses with changes
SELECT
d.day,
COALESCE(rc.status, i.status) as status,
COALESCE(rc.user_id, i.user_id) as user_id
FROM
dates d
CROSS JOIN
initial_statuses i
LEFT JOIN
relevant_changes rc
ON
rc.day = d.day
AND rc.user_id = i.user_id
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried about the number of CTEs and performance, but I think this is fine for now and let's not prematurely optimize. Just raising this so you're aware that in some cases a CTE performs worse than a pure query with joins and subqueries. That's mainly because the result of a CTE lacks indexes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering about performance too, but user count and user status changes is going to be way less data than say workspace builds. Like on the order of 1000s. 🤷

Might be worth a benchmark, but we don't have a good way to populate "large" datasets atm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did it this way for legibility. Trying to write the optimal solution here made my own query inscrutable even to myself. I don't consider this to be on a particularly hot path. We do have metrics for this query. If it needs to be optimised, we can do so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just raising this so you're aware that in some cases a CTE performs worse than a pure query with joins and subqueries

Also these CTEs were designed to reduce the amount of data being handled as early as possible. We use the existing indices while we have them to identify exactly the data we need ASAP and then once we need to join it all the hope is that we've brought down the volume low enough that it's performant regardless of the lack of indices on CTEs.

SELECT
day,
status,
COUNT(*) AS count
FROM
daily_status
WHERE
status IS NOT NULL
GROUP BY
day,
status
ORDER BY
day ASC,
status ASC;
0