Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: word wrapping in preview #1159

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

ArtyomArtamonov
Copy link

@ArtyomArtamonov ArtyomArtamonov commented Jun 16, 2024

This PR tries to add feature described in #1089

I tried to implement feature without introducing new dependencies, like it was asked in #1142. Also I used #1142 commits as baseline to my changes (configuration from yazi.toml, highlight signature).

I tried to separate new logic with two methods: before_after and before_after_wrapped. before_after is almost the same with minor change in readability no_more_scroll.
before_after_wrapped is more complicated. I had to read whole file first and apply logic afterwards. We have to decide if file is plaintext or not because we only apply indentation (tab_size in config) to plaintext files (in this method at least).
After that I chunk every line of file in chunks by column width using unicode-width crate as was said. I tried my solution in some cases including ascii characters, cyrrilic alphabet, emojis. All of them looks fine.

It's my first PR using rust, I tried to come up with optimized solution, but this is as far as I could go. If you have any suggestions how we can improve it, please tell, I'm open to suggestions.

@mskvsk
Copy link

mskvsk commented Jun 17, 2024

Thank you for the PR! It works perfectly, the only thing I would change is hard wrap to soft wrap.

Hard wrap example:
The quick brown fox jum
ps over the lazy dog

Soft wrap example:
The quick brown fox
jumps over the lazy dog

As an extra suggestion, the crate I have used previously provides some algorithm that tries to optimize the wrapping so it looks more even. But that's completely optional, just regular soft wrap will be enough.

Appreciate the solution!

@ArtyomArtamonov ArtyomArtamonov marked this pull request as draft June 18, 2024 16:26
@ArtyomArtamonov ArtyomArtamonov marked this pull request as ready for review June 18, 2024 21:51
@ArtyomArtamonov
Copy link
Author

Instead of hard wrap I tried to implement soft wrap. It tries to wrap lines on characters ,.; . If a line does not contain any of these characters, we use the old logic - hard wrapping

@mskvsk
Copy link

mskvsk commented Jun 19, 2024

Works like a charm, thank you.

Please, DM me your Ethereum wallet so I could send you 50 USDT (let me know if you prefer USDC).

As far as I am concerned, the problem is solved so looking forward to @sxyazi 's review.

Comment on lines 203 to 205
} else if !line.ends_with(b"\n") {
line.push(b'\n')
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this branch isn't in the original code - can you explain why you added it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every line in original code had \n at the end, but if we want to wrap long line, we have to place \n at the end of each piece of smaller line. There should not be any issues with old solution because all lines there already have \n at the end

false
}

fn chunk_by_width(line: Vec<u8>, width: usize) -> Vec<Vec<u8>> {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please combine all instances of String::from_utf8 and String::from_utf8_lossy in this function into a single one, as it is a time-consuming operation.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to optimize this solution a bit by using two buffers: one is string and second is Vec. First one is used only to calculate width every iteration. Second is used to take slices when slicing a line

let mut buf = vec![];
// If we want to indent plain text, we have to decide if it is plain
while reader.read_until(b'\n', &mut buf).await.is_ok() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need both while and for loops here? It seems like while will keep reading until the end of the file without stopping early, I think we should exit the loop as soon as we have enough lines for preview.

Is it possible to move if long_line.is_empty() || lines_handled > skip + area.height as usize { up into the while loop and combine the while and for into one loop?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I thought we have to go line by line twice, but now it seems it's possible to do it once. I tried to refactor code according to this. Now we traverse a file once, like before

}
long_lines.push(mem::take(&mut buf));
buf.clear()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
buf.clear()

This is unnecessary. After mem::take, buf will be initialized as a new Vec.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 198 to 202
if line.as_str().ends_with("\r\n") {
let mut chars = line.chars();
chars.next_back();
chars.next_back();
line = chars.as_str().to_string();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please keep the original code here. The new implementation using to_string() will cause unnecessary memory reallocations. Since \r\n are known to be ASCII characters, there's no need to treat them as Unicode chars.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e.

		if line.ends_with("\r\n") {
			line.pop();
			line.pop();
			line.push('\n');

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

chars.next_back();
chars.next_back();
line = chars.as_str().to_string();
} else if !line.as_str().ends_with('\n') {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else if !line.as_str().ends_with('\n') {
} else if !line.ends_with('\n') {

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also fixed this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants