Output one-dimensional lists for CSVReader #902

Open
SpeedoPasanen opened this issue Jun 4, 2024 · 1 comment

SpeedoPasanen commented Jun 4, 2024

Use case: read PDF, DOC, CSV, etc. from a buffer or string, without access to a file system.

I think there's also a need for a new CSVReader that can output one-dimensional lists like this:

A1: A2
B1: B2

A1: A3
B1: B3
And so on, still with either one row per doc or all rows joined into one doc.
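
To make the requested output concrete, here is a minimal sketch of the transformation. It assumes the CSV has already been parsed into rows of cells with the headers in the first row. `rowsToKeyValueTexts` and its `joinToOneDoc` flag are hypothetical names used for illustration, not part of any existing reader.

```typescript
// Hypothetical helper, not an existing API.
// Assumes `rows` is already-parsed CSV data whose first row holds the headers.
function rowsToKeyValueTexts(
  rows: string[][],
  joinToOneDoc = false,
): string[] {
  const [headers, ...dataRows] = rows;
  if (!headers || dataRows.length === 0) return [];

  // One "header: value" line per column, one block per data row.
  // Missing cells are rendered as an empty value.
  const blocks = dataRows.map((row) =>
    headers.map((header, i) => `${header}: ${row[i] ?? ""}`).join("\n"),
  );

  // Either one text per row, or all rows joined with a blank line in between.
  return joinToOneDoc ? [blocks.join("\n\n")] : blocks;
}

const exampleRows = [
  ["A1", "B1"],
  ["A2", "B2"],
  ["A3", "B3"],
];
console.log(rowsToKeyValueTexts(exampleRows)); // one text per data row
console.log(rowsToKeyValueTexts(exampleRows, true)[0]); // all rows in one text
```

Each returned string could then be wrapped in a document by the reader, which keeps the formatting step independent of how the CSV is parsed.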

I don't think modifying PapaCSVReader for this is practical: its constructor already takes too many boolean arguments, and adding one more would make it even more confusing. It would need to change to a single config object with clear overloads, which would be a breaking change.
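
As a rough illustration of the single-config-object idea, a constructor could accept a partial options object and merge it over defaults. Everything in this sketch is hypothetical: the option names, their defaults, and the `ConfigurableCsvReader` class are assumptions for illustration and do not reflect the current PapaCSVReader signature.

```typescript
// Hypothetical options type; names and defaults are illustrative only.
interface CsvReaderOptions {
  concatRows: boolean; // join all rows into one document
  colJoiner: string; // separator between cells of a row
  rowJoiner: string; // separator between rows
  keyValueOutput: boolean; // emit "header: value" lines per row
}

const defaultCsvReaderOptions: CsvReaderOptions = {
  concatRows: true,
  colJoiner: ", ",
  rowJoiner: "\n",
  keyValueOutput: false,
};

// Hypothetical reader skeleton showing only the constructor shape.
class ConfigurableCsvReader {
  readonly options: CsvReaderOptions;

  constructor(options: Partial<CsvReaderOptions> = {}) {
    // Caller-supplied values override the defaults; omitted keys keep them.
    this.options = { ...defaultCsvReaderOptions, ...options };
  }
}

// Call sites name what they change instead of passing positional booleans.
const keyValueReader = new ConfigurableCsvReader({ keyValueOutput: true });
console.log(keyValueReader.options);
```

With named options, adding a new flag later does not shift the position of existing arguments, which is the main source of confusion with long boolean parameter lists.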

Off topic, but something to think about: it's confusing that the PDF reader assigns id_ to the docs it produces while the other readers do not. I think either all readers should do it (if clearly configured to do so) or none should. Hidden functionality like that is potentially dangerous.

@marcusschiesser
Collaborator

Thanks for your good feedback @SpeedoPasanen.

Agreed. I just unified ID and metadata handling across readers and added support for reading a file's content from a Buffer, see 73819bf.
Luckily, this is also a non-breaking change.

About your other request: agreed, PapaCSVReader has too many parameters, and a single config object would help. I think it's an acceptable breaking change. You're welcome to send a PR and add your feature.

marcusschiesser changed the title from "Extract utility functions from file readers for standalone usage and make a better CSVReader" to "Output one-dimensional lists for CSVReader" on Jun 6, 2024