Load GUFF from assets? #13

Open
LondonX opened this issue Feb 21, 2024 · 1 comment

LondonX commented Feb 21, 2024

Hi,

Say I put my GGUF in assets and declare it in the pubspec YAML like this:

flutter:
  assets:
    - assets/bin/

Is it possible, or will it become possible, to use it with something like talkAsync, without first dumping the file into the app's files dir?

Owner

BrutalCoding commented Feb 24, 2024

I would suggest not including big files as assets, because of the following issues (off the top of my head):

  • Build time increases, because the asset files have to be copied from the dev machine to the target device (e.g. your phone).
  • You will run into memory issues during debug sessions, which requires raising the Java memory limit from the default 2048 MB to, for example, 4096 MB in Android projects. The file to change is gradle.properties, on the line org.gradle.jvmargs=-Xmx2048M -Dkotlin.daemon.jvm.options\="-Xmx2048M". Note that this only goes so far: don't expect to be able to add a yi-34b GGUF, for example. I think the Android build process will just crash no matter how much you increase that JVM argument.
  • Your app size will increase by the size of the model, e.g. a 2 GB model adds 2 GB to your app.
  • App stores have a maximum size limit. I commented on this before in another issue but forgot the details; I guess it's somewhere between 500 MB and 1.5 GB, depending on whether it's the Play Store or the App Store. So you're already limited by these restrictions, even if there were no technical difficulties on your dev machine.
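
For illustration, the gradle.properties change mentioned in the second point might look like the fragment below. This is only a sketch: it assumes the file sits at android/gradle.properties (the usual location in a Flutter project) and simply swaps the 2048 MB values from the line quoted above for 4096 MB. Whether 4096 MB is enough for a given model is not something this thread confirms.

```properties
# android/gradle.properties (assumed location)
# Raise the Gradle and Kotlin daemon heap limits from 2048 MB to 4096 MB.
org.gradle.jvmargs=-Xmx4096M -Dkotlin.daemon.jvm.options\="-Xmx4096M"
```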

Instead, I suggest looking at 2 other approaches:

  1. BYOM (just made this up): Bring Your Own Model. That is, like my example app, where users use the file picker to select their own GGUF file, which is then copied to the app's cache folder. You can verify this by looking at the size of the app on Android before and after selecting a GGUF. Spoiler alert: my example app doesn't delete the previously selected GGUF, so if you used my app with 5 different GGUFs of 2 GB each, the app size will be [default app size] + (5 x 2 GB = 10 GB).

  2. Like how games work, simply let your app download the asset (your GGUF) while the app is running.

  • Pros: The app size stays small, thus users will be more likely to download/try your app.
  • Cons: Users can't use the app right away the very first time, because your app needs to download the GGUF file first. Background downloads are probably limited, especially on iOS, so you might need to ask your users to keep the app open while [amount] GB is being downloaded. Not a good experience, but there's no other solution I can think of. Games can get away with downloading deltas / small chunks of the game, but for LLM inference you need the whole file before you can interact with it.
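
A minimal sketch of the second approach, assuming plain dart:io and no extra packages. The names ensureModel and modelFileFor are hypothetical and not part of this library. The target directory is passed in by the caller; in a Flutter app it would typically come from something like path_provider's getApplicationSupportDirectory(). The code streams the response to a temporary .part file and renames it once complete, so an interrupted download is not mistaken for a finished model.

```dart
import 'dart:io';

/// Hypothetical helper: the local file a model URL maps to inside [dir].
File modelFileFor(Directory dir, Uri url) {
  final last = url.pathSegments.isEmpty ? '' : url.pathSegments.last;
  final name = last.isEmpty ? 'model.gguf' : last;
  return File('${dir.path}${Platform.pathSeparator}$name');
}

/// Hypothetical helper: downloads [url] into [dir] unless the file is
/// already there, and returns the local file.
Future<File> ensureModel(Directory dir, Uri url) async {
  final target = modelFileFor(dir, url);
  if (await target.exists()) return target;

  await dir.create(recursive: true);
  final partial = File('${target.path}.part');
  final client = HttpClient();
  try {
    final request = await client.getUrl(url);
    final response = await request.close();
    if (response.statusCode != HttpStatus.ok) {
      throw HttpException('Unexpected status ${response.statusCode}', uri: url);
    }
    // Stream straight to disk so the multi-GB body is never held in memory.
    await response.pipe(partial.openWrite());
    // Only a fully written file gets the final name.
    return await partial.rename(target.path);
  } finally {
    client.close(force: true);
  }
}
```

The returned file's path can then be handed to whatever the app already uses to load a GGUF from disk. This sketch does not address background downloads, resuming, or progress reporting, so the caveats in the cons above still apply.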

I hope this answer helps you in some way. Otherwise, shoot me another question.

BrutalCoding self-assigned this Feb 24, 2024
BrutalCoding added the "good first issue" (Good for newcomers) label Feb 24, 2024