Query builder inWhere and notInWhere conditions slow query

I have a problem with phalcon query builder. When I use inWhere/notInWhere conditions, performaces go down and execution of queries is very very slow.

Following two examples of the same query, the first with notInWhere, the second with array implosion:

$arrayIDs; // array contains about 16k integer IDs. Result of this query: SELECT DISTINCT(Table2.idTable2) AS idTable2 FROM NAMESPACE\idTable2 AS idTable2 WHERE idTable2.Attr1 IS NOT NULL OR idTable2.Attr2 = 1;

// VERSION 1
$builder = $modelsManager->createBuilder()
    ->columns(array('COUNT(*) as CountTable1'))
    ->from(array('Table1'=>'NAMESPACE\Table1'))
     ->where('Table1.Attr1 = :Attr1:', array('Attr1' => 1))
     ->notInWhere('Table1.idTable1', $arrayIDs);
$builder->getQuery()->execute()->setHydrateMode(Resultset::HYDRATE_ARRAYS); //execution completed in more than 30 seconds.
// VERSION 2
$query = 'SELECT COUNT(*) AS CountTable1 FROM Table1 WHERE Table1.Attr1 = 1 AND Table1.idTable1 NOT IN (' . implode(',', $arrayIDs) . ')';
$modelsManager->executeQuery($query); //execution completed in more than 30 seconds.

Instead if I write the query selecting IDs and Count in single statement, the execution is faster.

Following the example code:

// VERSION 3
$query = 'SELECT COUNT(*) AS CountTable1 FROM NAMESPACE\Table1 AS Table1 WHERE Table1.Attr1 = 1 AND Table1.idTable1 NOT IN (SELECT DISTINCT(Table2.idTable2) AS idTable2 FROM NAMESPACE\Table2 AS Table2 WHERE Table2.Attr1 IS NOT NULL OR Table2.Attr2 = 1)';
$$modelsManager->executeQuery($query);

Where is the problem? What am I doing wrong? It is possible to obtain the same performaces with query builder?

Finally, is normal that query with inWhere and notInWhere conditions is so slow with query builder?

System configuration:
  • Phalcon: 2.0.10
  • PHP: 5.6.26
  • MySql: 5.7.12
  • OS: CentOS 7.4
edited 16d ago

Passing 16 thousand ID's into query builder is simply crazy. You shoudnt be suprised it's working slow. It will be slow in every query builder you would use (like Doctrine DBAL).



66.8k

maybe its just the ping ? are php server and mysql close to each other ( same machine maybe ) ?



183
edited 16d ago

>maybe its just the ping ? are php server and mysql close to each other ( same machine maybe ) ?

php server and mysq are on different machines but on the same network, so no ping problem.

Are you sure this is is about phalcon? First test it with pdo without framework then on database itself. I know that some old mysqldb old version has sometimes huge performance problems, I had on old version queries for 3 mins after update few ms



183

The problem does not seem to be mysql. I tried to write the query directly on mysql by replacing the id select to exclude a string containing the 16k IDs (like version 2), and the performance is great, 57ms.

Example of launched query:

SELECT COUNT(*) AS CountTable1 FROM Table1 WHERE Table1.Attr1 = 1 AND Table1.idTable1 NOT IN (1,2,3, ....., 16000)

Is there any serious reason to do it this way? Why don't you use subquery like in Version 3 (or even better - JOIN)?

Passing such huge numbers to NOT IN closure is never a good thing: first you need to fetch it from DB, then pass back to mysql server in form of SQL query.

It's much faster if whole thing is at mysql server side.