Solved thread


Insert 1,000-2,000 rows

Hello,

As the title says, I am reading an XML file and need to insert 1,000-2,000 rows into a database.

What would be the best way to do this, and are there any server issues I need to be aware of?

Thanks!

Use prepared statements for a start and you should be fine.

Or use raw SQL via the db service with a batch insert, I think.



Accepted answer (edited Jan '17)

Prepared statements offer two major benefits:

  • The query only needs to be parsed (or prepared) once, but can be executed multiple times with the same or different parameters. When the query is prepared, the database will analyze, compile and optimize its plan for executing the query. For complex queries this process can take up enough time that it will noticeably slow down an application if there is a need to repeat the same query many times with different parameters. By using a prepared statement the application avoids repeating the analyze/compile/optimize cycle. This means that prepared statements use fewer resources and thus run faster.
  • The parameters to prepared statements don't need to be quoted; the driver automatically handles this. If an application exclusively uses prepared statements, the developer can be sure that no SQL injection will occur (however, if other portions of the query are being built up with unescaped input, SQL injection is still possible).

http://php.net/manual/en/pdo.prepared-statements.php
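As a minimal, framework-free sketch of the prepare-once/execute-many pattern described above (using PDO with an in-memory SQLite database so it is self-contained; the table, column names, and sample data are made up for illustration):

```php
<?php
// In-memory SQLite keeps the sketch self-contained; in a real app this
// would be the existing MySQL/PostgreSQL connection.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE cars (name TEXT, brand_id INTEGER, year INTEGER, style TEXT)');

// Stand-in for whatever the XML parser yields; in practice 1,000-2,000 of these.
$rowsFromXml = [
    ['name' => 'Lamborghini Espada', 'brand_id' => 7, 'year' => 1969, 'style' => 'Grand Tourer'],
    ['name' => 'Lamborghini Miura',  'brand_id' => 7, 'year' => 1966, 'style' => 'Sports Car'],
];

// Parsed/planned by the database exactly once...
$stmt = $pdo->prepare(
    'INSERT INTO cars (name, brand_id, year, style) VALUES (:name, :brand_id, :year, :style)'
);

// ...then executed many times with different parameters.
foreach ($rowsFromXml as $row) {
    $stmt->execute([
        ':name'     => $row['name'],
        ':brand_id' => $row['brand_id'],
        ':year'     => $row['year'],
        ':style'    => $row['style'],
    ]);
}
```

Note that only the parameters travel through `execute()`; the statement text never changes, which is what saves the repeated parse/plan cycle and removes the need for manual quoting.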

@Jonathan, thanks for posting that info. More people should know about these :)




You will not be able to use Phalcon model hooks if you do raw SQL inserts, and raw SQL is only appropriate when the rows are pure database entries (no related data on the file system or other processing). Also, it might be a good idea to flesh out other parts of your application first (depending on how critical this is to development) before doing this optimization, because it will take far less programming to do it with Phalcon models than to write that lower-level code.
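For contrast, the straightforward Phalcon-models version is roughly this (a sketch only; it assumes a `Cars` model mapped to the table already exists and that `$rowsFromXml` holds the parsed XML rows):

```php
<?php
foreach ($rowsFromXml as $data) {
    $car = new Cars();

    // assign() mass-assigns the array onto the model's columns,
    // so hooks and validation still run on save().
    $car->assign($data);

    if ($car->save() === false) {
        // Handle validation/DB errors, e.g. inspect $car->getMessages()
    }
}
```

This is slower per row than prepared raw SQL, but it keeps model events, validation, and behaviors intact, which is the trade-off being described above.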

Also I'll take this opportunity to plug my Phalcon Cookbook, where in Chapter 4 we discuss approaches ranging from the full Phalcon ORM down to leveraging Phalcon model metadata for field and table names, so that you don't have to hard-code these constants into your SQL. I think it's very safe to say that the book would save you 2-3 hours of reinventing the wheel.

Thank you to everyone who replied, it really helps a lot.

I'm not that good at Phalcon yet, so let's say I foreach over the XML file and it has 2,000 iterations; could this be the way to do it?

foreach ($rows as $row) {
    $phql = "INSERT INTO Cars (name, brand_id, year, style) "
          . "VALUES (:name:, :brand_id:, :year:, :style:)";
    $manager->executeQuery(
        $phql,
        [
            "name"     => "Lamborghini Espada",
            "brand_id" => 7,
            "year"     => 1969,
            "style"    => "Grand Tourer",
        ]
    );
}

Thank you again!



edited Feb '17

Move $phql above the loop and, inside the loop, just bind the parameters and execute the query. That's why it's called a prepared statement.

edited Feb '17

Even better, create the query object outside the loop and just execute it inside the loop. But a faster way would be a raw query plus batch inserts: insert 10-50 rows at once, running 20-100 queries instead of 1,000-2,000.
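The batch-insert idea above can be sketched like this: group rows into chunks and send one multi-row INSERT per chunk. This uses PDO with in-memory SQLite so the sketch is self-contained; the table, columns, sample data, and chunk size are illustrative:

```php
<?php
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE cars (name TEXT, brand_id INTEGER, year INTEGER, style TEXT)');

// Stand-in for the rows parsed from the XML file.
$rows = [
    ['name' => 'Lamborghini Espada', 'brand_id' => 7, 'year' => 1969, 'style' => 'Grand Tourer'],
    ['name' => 'Lamborghini Miura',  'brand_id' => 7, 'year' => 1966, 'style' => 'Sports Car'],
    ['name' => 'Lamborghini Urraco', 'brand_id' => 7, 'year' => 1972, 'style' => 'Sports Car'],
];

foreach (array_chunk($rows, 50) as $chunk) {
    // One "(?, ?, ?, ?)" group per row in this chunk.
    $placeholders = implode(', ', array_fill(0, count($chunk), '(?, ?, ?, ?)'));
    $sql = "INSERT INTO cars (name, brand_id, year, style) VALUES $placeholders";

    // Flatten the chunk into a single positional-parameter list.
    $params = [];
    foreach ($chunk as $row) {
        array_push($params, $row['name'], $row['brand_id'], $row['year'], $row['style']);
    }
    $pdo->prepare($sql)->execute($params);
}
```

The statement is re-prepared per chunk because the last chunk may be smaller than 50 and therefore needs a different number of placeholder groups; with 2,000 rows in chunks of 50 this still means only 40 round trips instead of 2,000.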

I don't see why we discuss this further. Prepared statements are the way to go; there's no need to make things more complicated for no reason. The very basic point of prepared statements is exactly that: build the SQL statement once and execute it many times. PDO itself will pass that buffer to the RDBMS engine and you don't have to worry about it; the RDBMS will take care of the batch inserts.

I have used this before when inserting from Excel files with 50,000+ rows. Don't know if it's the best or most correct way, but it works wonders:

 $db = $this->getDi()->getShared('db');
 $db->begin();
 foreach ($content as $row) {
     $db->query(
         "INSERT INTO table (field1, field2, field3) VALUES (?, ?, ?)",
         [$row['field1'], $row['field2'], $row['field3']]
     );
 }
 $db->commit();

@cabo1956 You're using transactions. It sure works, but that's much slower and should only be used when you have an exact need for transactions, i.e. a mission-critical operation where you can afford as much slowdown as required.

Thank you all for answering!

I ended up with this code:

$phql = "INSERT INTO Cars (name, brand_id, year, style) "
      . "VALUES (:name:, :brand_id:, :year:, :style:)";

foreach ($rows as $row) {
    $manager->executeQuery(
        $phql,
        [
            "name"     => "Lamborghini Espada",
            "brand_id" => 7,
            "year"     => 1969,
            "style"    => "Grand Tourer",
        ]
    );
}

Hope this is what you guys mean!

edited Feb '17

Well, it's not bad, but you can optimize it further:

$phql = "INSERT INTO Cars (name, brand_id, year, style) "
      . "VALUES (:name:, :brand_id:, :year:, :style:)";
$query = $manager->createQuery($phql);

foreach ($rows as $row) {
    $query->execute([
        "name"     => "Lamborghini Espada",
        "brand_id" => 7,
        "year"     => 1969,
        "style"    => "Grand Tourer",
    ]);
}

Or just use raw SQL.

Thank you!

Would it make any difference to use raw SQL?