Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Messages category correlation index is a separate index #45

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

aaronjensen
Copy link
Contributor

message_store=> explain analyze select * from message_store.messages where message_store.category(stream_name) = 'someCategory' and message_store.category(metadata->>'correlationStreamName') = 
'someCorrelationCategory' order by global_position limit 1;

 Limit  (cost=0.42..1.59 rows=1 width=775) (actual time=0.026..0.031 rows=1 loops=1)
   ->  Index Scan using messages_category on messages  (cost=0.42..120534.43 rows=103760 width=775) (actual time=0.022..0.024 rows=1 loops=1)
         Index Cond: (((message_store.category((stream_name)::character varying))::text = 'someCategory'::text) AND ((message_store.category(((metadata ->> 'correlationStreamName'::text))::char
acter varying))::text = 'someCorrelationCategory'::text))
 Planning Time: 0.710 ms
 Execution Time: 0.980 ms
(5 rows)

message_store=> explain analyze select * from message_store.messages where message_store.category(stream_name) = 'someCategory' and message_store.category(metadata->>'correlationStreamName') = 
'someOtherCorrelationCategory' order by global_position limit 1;

 Limit  (cost=0.42..11.44 rows=1 width=775) (actual time=37.120..37.129 rows=1 loops=1)
   ->  Index Scan using messages_category on messages  (cost=0.42..11502.05 rows=1044 width=775) (actual time=37.114..37.117 rows=1 loops=1)
         Index Cond: (((message_store.category((stream_name)::character varying))::text = 'someCategory'::text) AND ((message_store.category(((metadata ->> 'correlationStreamName'::text))::char
acter varying))::text = 'someOtherCorrelationCategory'::text))
 Planning Time: 0.150 ms
 Execution Time: 37.170 ms
(5 rows)

message_store=> CREATE INDEX messages_category_correlation ON message_store.messages (
message_store(>   message_store.category(stream_name),
message_store(>   message_store.category(metadata->>'correlationStreamName'),
message_store(>   global_position
message_store(> );

CREATE INDEX


message_store=> explain analyze select * from message_store.messages where message_store.category(stream_name) = 'someCategory' and message_store.category(metadata->>'correlationStreamName') = 
'someCorrelationCategory' order by global_position limit 1;

 Limit  (cost=0.42..1.59 rows=1 width=775) (actual time=0.097..0.103 rows=1 loops=1)
   ->  Index Scan using messages_category on messages  (cost=0.42..120534.43 rows=103760 width=775) (actual time=0.091..0.093 rows=1 loops=1)
         Index Cond: (((message_store.category((stream_name)::character varying))::text = 'someCategory'::text) AND ((message_store.category(((metadata ->> 'correlationStreamName'::text))::char
acter varying))::text = 'someCorrelationCategory'::text))
 Planning Time: 0.321 ms
 Execution Time: 0.143 ms
(5 rows)

message_store=> explain analyze select * from message_store.messages where message_store.category(stream_name) = 'someCategory' and message_store.category(metadata->>'correlationStreamName') = 
'someOtherCorrelationCategory' order by global_position limit 1;

 Limit  (cost=0.55..2.58 rows=1 width=775) (actual time=0.028..0.031 rows=1 loops=1)
   ->  Index Scan using messages_category_correlation on messages  (cost=0.55..2115.43 rows=1044 width=775) (actual time=0.025..0.026 rows=1 loops=1)
         Index Cond: (((message_store.category((stream_name)::character varying))::text = 'someCategory'::text) AND ((message_store.category(((metadata ->> 'correlationStreamName'::text))::char
acter varying))::text = 'someOtherCorrelationCategory'::text))
 Planning Time: 0.107 ms
 Execution Time: 0.057 ms
(5 rows)

@aaronjensen
Copy link
Contributor Author

aaronjensen commented Oct 30, 2022

100,000 Writes Benchmark

All units in ms

master

2584
2753
2760
2557
2593

Avg: 2649 +/- 99

This branch

2775
2995
2858
3001
2796

Avg: 2885 +/- 108

Result

1-17.3% (8.9% avg) slower (stats here is probably bad, I just compared the edges of avg +/- std dev)

@aaronjensen
Copy link
Contributor Author

aaronjensen commented Oct 30, 2022

Timing when invoking get_category_messages. Note that the closer "position" is to the end of the category, the less impact the new index has (because the global position filter narrows the search space down enough that the correlation stream name index matters less).

This means that active category/correlation combos likely won't see much benefit from this index addition. The only thing that would is category/correlation combos that are infrequent but have a very active category w/ other (or no) correlations.

With New Index

message_store=> explain analyze select * from message_store.get_category_messages('someCategory', batch_size => 1000, correlation => 'someOtherCorrelationCategory', "position" => 9999);
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Function Scan on get_category_messages  (cost=0.25..10.25 rows=1000 width=184) (actual time=7.150..9.539 rows=1000 loops=1)
 Planning Time: 0.051 ms
 Execution Time: 14.867 ms
(3 rows)

message_store=> explain analyze select * from message_store.get_category_messages('someCategory', batch_size => 1000, correlation => 'someOtherCorrelationCategory', "position" => 9999);
                                                         QUERY PLAN                                                          
-----------------------------------------------------------------------------------------------------------------------------
 Function Scan on get_category_messages  (cost=0.25..10.25 rows=1000 width=184) (actual time=6.805..7.773 rows=1000 loops=1)
 Planning Time: 0.057 ms
 Execution Time: 8.665 ms
(3 rows)

Without New Index

message_store=> drop index message_store.messages_category_correlation;
DROP INDEX
message_store=> explain analyze select * from message_store.get_category_messages('someCategory', batch_size => 1000, correlation => 'someOtherCorrelationCategory', "position" => 9999);
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Function Scan on get_category_messages  (cost=0.25..10.25 rows=1000 width=184) (actual time=58.951..60.637 rows=1000 loops=1)
 Planning Time: 0.041 ms
 Execution Time: 62.777 ms
(3 rows)

message_store=> explain analyze select * from message_store.get_category_messages('someCategory', batch_size => 1000, correlation => 'someOtherCorrelationCategory', "position" => 9999);
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Function Scan on get_category_messages  (cost=0.25..10.25 rows=1000 width=184) (actual time=40.870..42.443 rows=1000 loops=1)
 Planning Time: 0.049 ms
 Execution Time: 43.981 ms
(3 rows)

message_store=> explain analyze select * from message_store.get_category_messages('someCategory', batch_size => 1000, correlation => 'someOtherCorrelationCategory', "position" => 9999);
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Function Scan on get_category_messages  (cost=0.25..10.25 rows=1000 width=184) (actual time=32.055..33.033 rows=1000 loops=1)
 Planning Time: 0.112 ms
 Execution Time: 33.952 ms
(3 rows)

@sbellware
Copy link
Contributor

Side note on correlation data: it remains an open issue as to whether the correlationStreamName should be its own property on the messages table, and not an Eventide-specific piece of metadata. The jury is still out on that issue. The current implementation is a useful one, but should be considered a placeholder - though one that might be permanent.

We'll know by the time the next product generation of Message DB goes into development next year.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants