Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add Train feature for Crews #686

Open
wants to merge 22 commits into
base: main
Choose a base branch
from
Open

Conversation

pythonbyte
Copy link
Collaborator

The objective of this PR is to add training functionality for Crews.

With this functionality it will be possible to run the command crewai train -n 2 and 2 training iterations will be run so that future tasks can have greater quality and consistency.

@atb29
Copy link

atb29 commented May 26, 2024

The objective of this PR is to add training functionality for Crews.

With this functionality it will be possible to run the command crewai train -n 2 and 2 training iterations will be run so that future tasks can have greater quality and consistency.

Isn't that achievable by the memory feature instead??

@joaomdmoura
Copy link
Owner

@atb29 this is memory on steroids, it uses some of the memory work but allows you to train the crew before shipping, something we are testing internally, super excited about it, running some benchmarks now

self, output: AgentFinish, human_feedback: str | None = None
) -> None:
"""Function to handle the process of the training data."""
training_data_filename = "training_data.pkl"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have these pkl file names in consts, maybe? This way we have a single place to find them.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, I'll add it. Thanks @gvieira


if (
not self.should_ask_for_human_input
and PickleHandler(training_data_filename).load()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked how you've used := above to prevent double-calling the load(). Should we try to do the same here?

Copy link
Collaborator Author

@pythonbyte pythonbyte Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks, I'll change.

human_feedback = self._ask_human_input(output.return_values["output"])

if self.crew._train:
self._training_handler(output, human_feedback)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above, you've used a verb to name the method (_ask_human_input). This is perfect, because that's exactly what it does – there's even no need for documentation on the method in this case. For _training_handler however, it doesn't feel the same. How could we name this method to make it more clear what it's doing?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing, I didn't have the proper time to review it, but I agree.
I'll review and change it to make it more intuitive.

training_data := PickleHandler(TRAINING_DATA_FILE).load()
and not self.should_ask_for_human_input
):
training_data = PickleHandler(TRAINING_DATA_FILE).load()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this line anymore, right!?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure thing 🤦🏽

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could swear I had this file renamed to properly match the convention: file_handler.py. 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want me to rename it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think it makes sense as we don't use pascalCase for file names.

file.write(message + "\n")


class PickleHandler:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I got it right, this has started as a generic handler/helper for dealing with pickle marshaling, right!? However, I believe it got really tied up to the training concept – we have a lot of train-related functions. Should we maybe move this to a training module maybe, renaming it to something like TrainingHandler or something like that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see your point, makes sense. I could have the PickleHandler as a utility file handler class.
And add a TrainingHandler that can Inherit the class and add the save_trained_data/append methods, what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be perfect! Loved the idea!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants