It seems to be impossible to perform client-side paging of full table scans within MySQL.
For example, say you want to take a 1GB file and page through it 10MB at a time.
With a flat file I could just read 10MB off disk, read the next 10MB, etc.
This would be amazingly efficient as you could write the data sequentially, without updating indexes, and then read the data back out by byte offset.
You could page through MySQL tables by adding an index to a table:
SELECT * FROM FOO WHERE PTR > 10000 LIMIT 10000;
for example … but I REALLY want to avoid an index step because it is not cheap, and it's only required because MySQL doesn’t support cursors.
This index slows down the import stage and I have to buffer the index data in memory which is just a waste.
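Spelled out, the index-based paging described above would look something like the following (the PTR column and page size of 10000 are assumptions carried over from the example query):

```sql
-- Hypothetical setup: PTR is a monotonically increasing column, indexed
-- solely so that we can page through the table.
ALTER TABLE FOO ADD INDEX (PTR);

-- Page 1:
SELECT * FROM FOO WHERE PTR > 0     ORDER BY PTR LIMIT 10000;
-- Page 2 (resume from the last PTR value seen):
SELECT * FROM FOO WHERE PTR > 10000 ORDER BY PTR LIMIT 10000;
-- ...and so on until a page comes back empty.
```

Each page is an efficient range read, but only because we paid for the index up front, which is exactly the cost being complained about here.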
I could use LIMIT with OFFSET, but this isn’t going to be amazingly efficient either, because it will either require a temporary on-disk table or force MySQL to read (and discard) every row before the offset.
Technically, one could implement OFFSET efficiently for a fixed-width table, since you can compute the byte offset as row_width*N, but I don’t think MySQL implements this optimization.
Further, my tables are variable-width.
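To illustrate the OFFSET problem concretely (again using the hypothetical FOO table from above): each successive page forces MySQL to re-scan and throw away all the rows before the offset, so paging through the whole table is quadratic in the number of rows.

```sql
-- Page 1: reads 10000 rows.
SELECT * FROM FOO LIMIT 10000 OFFSET 0;
-- Page 2: reads 20000 rows, discards the first 10000.
SELECT * FROM FOO LIMIT 10000 OFFSET 10000;
-- Page 100: reads ~1M rows, discards ~990K of them.
SELECT * FROM FOO LIMIT 10000 OFFSET 990000;
```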
If MySQL supported cursors then I could just tell MySQL to perform a table scan and read the data from a cursor.
I’m implementing something similar to a Map Reduce job in MySQL and Java. It would work VERY well on Hadoop, but it’s nearly impossible to implement efficiently in MySQL, since I can’t page through data without over-indexing my tables.
Thoughts? I would love it if there’s some stupid solution that I’m just missing.
I suspect this will become more and more of a problem as more developers want to just flat out Map Reduce their data.
I was wrong. The solution is HANDLER. Ryan Thiessen nailed it in the comments.
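For reference, a sketch of the HANDLER approach (using the FOO table and page size from the example above): HANDLER opens a direct interface to the storage engine and reads rows in natural row order, with no index, no OFFSET, and no server-side sorting.

```sql
HANDLER FOO OPEN;
HANDLER FOO READ FIRST LIMIT 10000;  -- first page, natural row order
HANDLER FOO READ NEXT  LIMIT 10000;  -- next page; repeat until an empty result
HANDLER FOO CLOSE;
```

The handler keeps its position between READ NEXT calls on the same connection, which is effectively the cursor-style table scan asked for above.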