We have moved our forum to GitHub Discussions. For questions about Phalcon v3/v4/v5 you can visit here and for Phalcon v6 here.

Data consistency with findFirst then ->create()

I'm consistently having an issue in a production environment where I'm doing a query to see if the row exists, if so update, if not insert.

Here is the code

$hasExistingVisit = \Model\UserVisit::findFirst(array(
    'fromUserId = :fromUserId: AND toUserId = :toUserId:',
    "bind" => array(
        'fromUserId' => $this->currentUser->userId,
        'toUserId' => $this->profileUser->userId,
    ),
));

if($hasExistingVisit) {
    $hasExistingVisit->visitCount++;
    $hasExistingVisit->seen = 0;
    $hasExistingVisit->lastVisit = new \Phalcon\Db\RawValue('NOW()');               
    $hasExistingVisit->save();              
} else {
    $newVisit = new \Model\UserVisit;
    $newVisit->fromUserId = $this->currentUser->userId;
    $newVisit->toUserId = $this->profileUser->userId;
    $newVisit->lastVisit = new \Phalcon\Db\RawValue('NOW()');
    $newVisit->create();
}

About .001% of the time it will get this error:

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '977070-935378' for key 'PRIMARY'

I've tried everything, including using $hasExistingVisit !== false

This is probably a symptom of another problem, but I'm not calling it twice, and even if it were a race condition, its directly after the query, so it would be unlikely to trigger an error.

Any help/thoughts would be appreciated, thank you!

Simply - if people will refresh two times before adding to database in first time this condition will be ture. But when actually making query it can be already there - this is why you have error. For this case you simply should put this has existing visit to some kind of cache to be honest.

What about locking beforehand?

Altenatively you could go down to raw SQL and do a REPLACE or INSERT...IGNORE or INSERT ... ON DUPLICATE KEY UPDATE. This page explains the differences: https://chartio.com/resources/tutorials/how-to-insert-if-row-does-not-exist-upsert-in-mysql/

If it were me, I'd do an INSERT...IGNORE. In this case, the data is only going to be off by one, and you don't really need to record both visits anyway.

Answering myself after several years: the challenge with this problem was actually an unavoidable race condition. The solution for these kinds of problems are to publish the change to a queue and run the queries by a worker which guarantees they will run in order and not have a race condition as long as the worker consumer is single-threaded.