Best way to export large ResultSet


#1

Hi all, we have an option in our application to export data. The table has a lot of information, and also relations (that I need to export, for example, for a user, I want to export the company name he belongs to). The resultset is about 50,000 records. I have tried many workarounds: disable query buffer ($query->enableBufferedResults(false);, that throws an error beause relations: Cannot execute queries while other unbuffered queries are active), use $q->each, etc. without success, I always get a memory exhausted error (with 256M); I wouldn’t like to increase PHP memory, but find a better way to do this. Any idea? Thanks! (CakePHP 3.6, PHP 7.2, Ubuntu 18.04 x64)


#2

Chunk them, Id 1 - 1000, then the next ID that’s higher than 1000 -2000.

Write a small routine and use some if statements with some less than and greater than statements.


#3

or instead of manual id compare use sql/cake limit or try paginator and write limited data to file in loop.
above of https://book.cakephp.org/3.0/en/orm/query-builder.html#selecting-specific-fields


#4

I would just paginate query and increase php execution time.
You can do it directly in model like:

$cnt = $yourQuery->count();
$recordsLimit = 1000;
$pages = ceil($cnt / $recordsLimit);

 for( $page = 1; $page <= $pages; $page++) {
    $yourQuery->limit($limit)->page($page);
    $result = $query->toArray();
 }

#5

I’d also try to limit the related data to only the fields you really need - like this:

$users = $this->Users->find()
    ->select(['id', 'name', 'customer_alias' => 'Customers.alias', 'certificate'])
    ->contain([
	'Roles' => [
		'fields' => ['Roles.name', 'RolesUsers.user_id']
	],
	'Customers'
    ])
    ->all()
;

This can drastically shrink the fetched result size and therefore the used memory.


#6

Thanks all for your tips. I’ll try to implement these workarounds and give you some feedback.


#7

You’re likely at some point to hit a timeout (php, apache/nginx, or the user browser).

The best way is to trigger a task, make the export via cli and alert the user that the task is finished, with a link to the generated export.

you can use queues, eg: cakephp-queue