Large table runs into memory limit #15
Hi @dyenzer, yeah, there is definitely a scalability concern here, which you've described above. It's not that the SQL transactions are loaded into memory - those are actually written to disk (to a temporary file in the working directory) before they are loaded into the target DB via the DB's bulk import tool (i.e. mysqlimport, COPY FROM, etc.). The issue is that all data from the source database is loaded into memory, and is then iterated over row by row to create the .sql file that is bulk loaded into the database. It shouldn't be too difficult a fix, but we basically need to use generators to "stream" the data through ETLAlchemy, rather than naively loading all of the data into a list (the current implementation). I foresee the need to iterate over each table's rows twice: once to analyze the data, and once to write the rows out to the bulk-import file.
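For illustration, a generator-based version of that loop might look roughly like the following. This is only a sketch: `stream_rows` and `dump_table_to_sql` are placeholder names, not existing etlalchemy functions, and it assumes SQLAlchemy 1.4+ with a DB driver that supports server-side cursors.

```python
# Hypothetical sketch only: stream_rows / dump_table_to_sql are illustrative names,
# not part of etlalchemy's current API. Assumes SQLAlchemy 1.4+ and a driver that
# supports server-side cursors (stream_results).
from sqlalchemy import MetaData, Table, create_engine

def stream_rows(connection, table, chunk_size=10000):
    """Yield rows one at a time, keeping only one chunk buffered in memory."""
    result = connection.execution_options(stream_results=True).execute(table.select())
    while True:
        chunk = result.fetchmany(chunk_size)
        if not chunk:
            break
        for row in chunk:
            yield row

def dump_table_to_sql(engine, table_name, out_path):
    """Write a delimited bulk-import file without materializing the whole table."""
    table = Table(table_name, MetaData(), autoload_with=engine)
    with engine.connect() as conn, open(out_path, "w") as f:
        for row in stream_rows(conn, table):
            # One line per row, later consumed by mysqlimport / COPY FROM.
            f.write("\t".join("" if v is None else str(v) for v in row) + "\n")

# Usage (connection string and table name are placeholders):
# engine = create_engine("oracle+cx_oracle://user:pass@host/db")
# dump_table_to_sql(engine, "BIG_TABLE", "big_table.sql")
```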
I will leave this open, as I may have some time coming up this summer (June) to implement these changes. In the meantime, feel free to take a crack at it yourself and open up a PR!
Hey @seanharr11, when will you fix this issue?
Hi, I am using MSSQL and need to fetch roughly 25 million records.
@seanharr11 When will you resolve this issue?
@Anmol-Tuple this is a pretty big undertaking... It will take considerable time, and without tests in place it will be very difficult to ensure this won't break etlalchemy. This is, sadly, the by-product of me not knowing a great deal about building maintainable OSS two years ago. That said, if you'd like to take a shot at beginning to implement this on a new branch, feel free to do so while I develop tests. I would encourage this, as I would encourage ANY other maintainer to help in this effort (especially to develop tests!).
@seanharr11
Hi @rajrohith, you can try the tool initially and see how it works. If the program exits with a memory dump (i.e. the OS kills the process), then you'll know the tool won't support the size of your tables. There is definitely a need to add functionality to support large tables (i.e. by writing to disk intermittently and only loading buffers of data). This should be done on a feature branch, and merged in once tests are in place.
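A minimal sketch of that buffer-and-flush idea, assuming a plain DB-API cursor (pymssql, cx_Oracle, etc.); the function name and the 50,000-row buffer size below are arbitrary choices, not etlalchemy internals:

```python
# Illustrative only: fetch a bounded buffer of rows, flush it to disk, repeat.
def export_in_buffers(cursor, table_name, out_path, buffer_rows=50000):
    # Note: table_name is interpolated directly; only use trusted table names.
    cursor.execute("SELECT * FROM %s" % table_name)
    with open(out_path, "w") as f:
        while True:
            rows = cursor.fetchmany(buffer_rows)  # only this buffer is held in memory
            if not rows:
                break
            f.writelines(
                "\t".join("" if v is None else str(v) for v in row) + "\n"
                for row in rows
            )
```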
Hi @seanharr11, this is a great project - thank you for your efforts. I am trying to transfer an Oracle 11.2 database to MySQL 5.7, but some of my tables are very big and do not fit into memory. I have tried to rewrite the migrate() method into a two-phase analyze-and-write operation. I am planning to send you a pull request, but there are still some issues with my code.
Best regards, @josteinl
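For reference, the two-phase idea described above might look roughly like the outline below. This is a sketch, not the actual pull-request code; it assumes the source table can be scanned twice and that the analyze pass exists to size the target columns.

```python
# Rough outline only -- not the code from the pull request mentioned above.
def migrate_table(src_conn, table_name, out_path, chunk_size=10000):
    cur = src_conn.cursor()

    # Phase 1 (analyze): scan the rows to collect per-column statistics, e.g. the
    # longest string per column, so the target schema can use right-sized VARCHARs.
    cur.execute("SELECT * FROM %s" % table_name)
    columns = [d[0] for d in cur.description]
    max_len = {col: 0 for col in columns}
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        for row in rows:
            for col, val in zip(columns, row):
                if isinstance(val, str):
                    max_len[col] = max(max_len[col], len(val))

    # ... create the target table here, using max_len to size its columns ...

    # Phase 2 (write): re-read the table and stream rows straight to the bulk-import
    # file, so memory use stays bounded by chunk_size regardless of table size.
    cur.execute("SELECT * FROM %s" % table_name)
    with open(out_path, "w") as f:
        while True:
            rows = cur.fetchmany(chunk_size)
            if not rows:
                break
            for row in rows:
                f.write("\t".join("" if v is None else str(v) for v in row) + "\n")
```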
I've had the same issue migrating a database with a table that was over 20GB in size. The easiest thing to do is increase your swap space. This can be done dynamically/temporarily by doing the following:
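On Linux, the usual sequence is roughly the following; the file path and the size are placeholders to adjust for your own table:

```bash
# Create and enable a temporary swap file (size and path are placeholders).
sudo fallocate -l 110G /swapfile_migration
sudo chmod 600 /swapfile_migration
sudo mkswap /swapfile_migration
sudo swapon /swapfile_migration

# Once the migration has finished, remove it again:
sudo swapoff /swapfile_migration
sudo rm /swapfile_migration
```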
In my situation, increasing swap to ~110GB "resolved" this issue and allowed the table to be migrated. You may need to go higher than that. My rule of thumb is:
And by
First, I'd like to say what a great tool this is. It would have saved a lot of time if it had worked for me out of the gate (I know nothing about SQL/DBA work and inherited a rather large database to manage). Now on to the problem...
I have a table of about 105 million rows that I'm trying to transfer to a newer database (Oracle 11g to MySQL 5.7). It looks like your tool loads all of the SQL transactions into memory, and with 40GB RAM / 20GB swap on the target server, it reaches roughly row 60 million before the process simply says 'Killed'. I'm not sure if this is something you can easily fix, like having the option to write to a file instead of memory, but I wanted to at least make you aware of it as a limitation.