Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate entries #224

Open
N-R-K opened this issue Jun 7, 2024 · 8 comments · May be fixed by #227
Open

Deduplicate entries #224

N-R-K opened this issue Jun 7, 2024 · 8 comments · May be fixed by #227

Comments

@N-R-K
Copy link
Contributor

N-R-K commented Jun 7, 2024

Currently, duplicate snips get multiple entries clogging up the list of clips. This is especially problematic when you're using a small number of max clips (e.g 32). It'd be nice to avoid duplicate entries from being registered.

@cdown
Copy link
Owner

cdown commented Jun 7, 2024

If they are contiguous indexes they should not duplicate, that's handled by is_possible_partial. Could you give an example?

@N-R-K
Copy link
Contributor Author

N-R-K commented Jun 8, 2024

Could you give an example?

Sure, here's a test case:

$ xclip -sel c <<< "dup"
$ xclip -sel c <<< "clone"
$ xclip -sel c <<< "dup"
$ xclip -sel c <<< "clone"

Now if I open clipmenu, there's 2 "dup" and 2 "clone" entries in there.

image

@cdown
Copy link
Owner

cdown commented Jun 8, 2024

Yeah, these are not contiguous entries. This is actually intentional: one of the pieces of feedback from "old" clipmenu was that deduplication broke the sense of time of copying. In fact we have code specifically to preserve this behaviour.

@N-R-K
Copy link
Contributor Author

N-R-K commented Jun 8, 2024

deduplication broke the sense of time of copying

I'm not sure what this means, but I'm assuming it means the following:

  1. You have a snip "A" at index 10
  2. You copy "A" again
  3. "A" remains at index 10 instead of becoming index 0

Is that correct? If yes, then it does not seem like people want duplicate entries but rather they want new duplicate's "timestamp" to be bumped/renewed.

I haven't looked into src/store.c yet, but I can start digging to see how to make that work if I understood the problem correctly.

@cdown
Copy link
Owner

cdown commented Jun 8, 2024

It means you have entries:

A
B
C
D
E

You then copy B again. In your suggestion, the new entries are:

A
C
D
E
B

But the desired behaviour is actually:

A
B
C
D
E
B

Obviously one can make an argument in favour of either approach, but we used to do deduplication across the entire line cache and people would complain that their clips "disappeared", when in fact they just moved around after being copied again.

@N-R-K
Copy link
Contributor Author

N-R-K commented Jun 8, 2024

but we used to do deduplication across the entire line cache and people would complain that their clips "disappeared", when in fact they just moved around after being copied again.

Ah I see, so they do want duplicated entries.

Obviously one can make an argument in favour of either approach

Seems like there's no "objective" right behavior here, so would a config var to do de-duplication be acceptable then? It would serve both cases, user gets to pick the behavior he prefers.

@cdown
Copy link
Owner

cdown commented Jun 8, 2024

A config option is fine as long as it doesn't blow up the code complexity/size too much. Thanks!

@N-R-K
Copy link
Contributor Author

N-R-K commented Jun 10, 2024

I've poked a bit into this. The clip_store thing seems pretty delicate and I haven't looked through it fully, but something like the following seems to be working, in pseudo-code:

// in cs_content_add
if (dupe && do_dedup) {
	_drop_ ... = cs_ref(..);
	int dup_index = find_hash(cs->snips);
	move_to_end(cs->snips, dup_index);
	return 1;
}

Is this okay? Or will it cause some problem.

N-R-K added a commit to N-R-K/clipmenu that referenced this issue Jun 11, 2024
uses a linear search to find the duplicate and a memmove to move
it over to the end.

not perticularly efficient since each snip is 256 bytes while
we're only interested in the first 8. and so iterating over it
linearly like this isn't very cache friendly. the memmove worst
case can also be around 250KiB with the default config.

Closes: cdown#224
@N-R-K N-R-K linked a pull request Jun 11, 2024 that will close this issue
N-R-K added a commit to N-R-K/clipmenu that referenced this issue Jun 18, 2024
N-R-K added a commit to N-R-K/clipmenu that referenced this issue Jun 18, 2024
N-R-K added a commit to N-R-K/clipmenu that referenced this issue Jun 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants