Solved thread

PHP5-FPM on Nginx crash Zend/zend_language_scanner.c: no such file or directory

I'm posting this question here in case the problem is caused by Phalcon itself, although I don't think it is.

Our current configuration:

  • Debian 8.6
  • Nginx web server
  • PHP5-FPM v5.6.27-0+deb8u1
  • Zend Engine v2.6.0 with Zend OPcache v7.0.6-dev

We use Phalcon framework v2.0.11 (and v2.0.13 on the test environment, where we see the same errors). Phalcon has to be compiled, after which it is loaded as a PHP module (phalcon.so). The build uses the Zend libraries, among others.

We also use Memcached (as service and as PHP module).

The application runs normally except that Nginx randomly throws:

502 Bad Gateway

errors during navigation. After reloading the page (F5) or pressing the browser's "Back" button, the page loads without any error.

At times the 502 errors are more frequent than at others, apparently regardless of the load or traffic on the server.

The only errors we can find in the logs are not very informative:

php5-fpm.log:

WARNING: [pool www] child 2183 exited on signal 7 (SIGBUS) after 0.120012 seconds from start
WARNING: [pool www] child 1391 exited on signal 7 (SIGBUS) after 59.871442 seconds from start
WARNING: [pool www] child 12836 exited on signal 7 (SIGBUS - core dumped) after 560.364868 seconds from start
WARNING: [pool www] child 10874 exited on signal 7 (SIGBUS - core dumped) after 38.964131 seconds from start
...
...

nginx/error.log:

[error] 8428#0: *368771 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: xxx.xxx.xxx.xxx, server: xxxxxx.xxxxxxxxx.xxx, request: "POST /abc/def_ghi HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "xxxxxx.xxxxxxxxx.xxx", referrer: "https://xxxxxx.xxxxxxxxx.xxx/abc"
...
...

After days of research, we applied every suggestion we could find online. The parameters we modified, tested and checked in Nginx and PHP-FPM include:

(on php.ini)
output_buffering
max_execution_time
memory_limit

(on www.conf)
listen = /var/run/php5-fpm.sock or listen = 127.0.0.1:9000
pm = ondemand/static/etc.....
pm.max_children 500/30/1/100/etc....
pm.start_servers = 30/50/1/etc......
pm.min_spare_servers
pm.max_spare_servers
pm.max_requests

(on nginx virtual server conf file)
fastcgi_pass
fastcgi_buffers
fastcgi_buffer_size
fastcgi_connect_timeout
fastcgi_send_timeout
fastcgi_read_timeout

No combination of values for the parameters above changed the behaviour of the 502 errors. They still appear intermittently.

So we had PHP dump core on process crash and inspected the dumps with GDB. Every time a 502 error is thrown, we get the same dump, with the same error. Here is an example:

GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/sbin/php5-fpm...Reading symbols from /usr/lib/debug/.build-id/d4/62618919aec6e5b126ad219b9d08046ef6b875.debug...done.
done.
[New LWP 17814]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `php-fpm: pool www                                                       '.
Program terminated with signal SIGBUS, Bus error.
#0  lex_scan (zendlval=zendlval@entry=0x7fff14a7b0b8) at Zend/zend_language_scanner.c:1082
1082    Zend/zend_language_scanner.c: no such file or directory.

The error is this one:

#0  lex_scan (zendlval=zendlval@entry=0x7fff14a7b0b8) at Zend/zend_language_scanner.c:1082
1082    Zend/zend_language_scanner.c: no such file or directory.

Searching for this error online turns up little or nothing.

We tried recompiling Phalcon, after an upgrade of PHP (from PHP 5.6.24 to 5.6.27), but the error keeps appearing.

We honestly don't know what else to do to explain this error and fix it for good.

Thank you for your help.

Post the result from gdb. It looks like you are missing some PHP package, as if you don't have the full PHP installed.



Post the result from gdb. It looks like you are missing some PHP package, as if you don't have the full PHP installed.

You can see the gdb results in my post. I understand that some PHP packages could be missing, but even upgrading to PHP 5.6.27 didn't solve the problem.

edited Nov '16

I mean the backtrace. Post the result of the bt command in gdb.

Are you using a Unix socket or TCP/IP? Maybe try switching to TCP/IP.

Please provide steps to reproduce. What exactly should I do on a new machine to reproduce this?

edited Nov '16

I mean the backtrace. Post the result of the bt command in gdb.

Are you using a Unix socket or TCP/IP? Maybe try switching to TCP/IP.

That's very unlikely to be the issue here.

Random crashes like these usually have a deeper cause. Problems with the default kernel /proc/sys/net/core/somaxconn value would show up as different errors in the FPM/nginx logs, not as segfaults.

I'm also using Phalcon v2.0.13, with a manually compiled PHP 5.6.28, without any issue. So in the OP's case it must be something buggy in the app itself.

It could also be that Memcached has gone away at some point during app execution.



Accepted answer

After all the tests we ran, we found that the problem is related to Volt with the compileAlways option set to true.

https://github.com/phalcon/cphalcon/issues/1949

https://github.com/phalcon/cphalcon/issues/11507

We noticed that when PHP-FPM is configured with pm.max_children greater than 1 (so it can serve several requests at the same time), the 502s (the Zend/zend_language_scanner.c error) occur. Setting pm.max_children = 1 (so PHP-FPM serves at most one request at a time) apparently prevents them.

After some more research online, we found the two bugs linked above. Setting compileAlways=false in Volt (with pm.max_children > 1) did solve the problem.

So it seems that Volt can't compile the same template on concurrent requests.
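
The thread does not establish why concurrent compilation crashes the workers. One plausible reading (an assumption, not confirmed here or in the linked issues) is that one worker reads a compiled template while another is rewriting it in place. A common way to avoid exposing a half-written file is to write to a temporary file and rename it over the target. The sketch below illustrates that general technique in plain PHP; writeFileAtomically is a hypothetical helper and is not how Volt is implemented.

```php
<?php
// Hypothetical helper, not part of Phalcon or Volt.
// Writes $contents to $targetPath so that readers see either the complete
// old file or the complete new file, never a partially written one.
function writeFileAtomically($targetPath, $contents)
{
    // Create the temporary file in the target directory so that rename()
    // stays on one filesystem, where it replaces the target atomically.
    $tempPath = tempnam(dirname($targetPath), 'tmp');
    if ($tempPath === false) {
        throw new RuntimeException("Cannot create a temporary file for $targetPath");
    }

    if (file_put_contents($tempPath, $contents) === false) {
        unlink($tempPath);
        throw new RuntimeException("Cannot write $tempPath");
    }

    // tempnam() creates the file readable by its owner only;
    // widen the mode in case other users must read the result.
    chmod($tempPath, 0644);

    if (!rename($tempPath, $targetPath)) {
        unlink($tempPath);
        throw new RuntimeException("Cannot move $tempPath to $targetPath");
    }
}

// Usage with a made-up path and payload.
$compiledPath = sys_get_temp_dir() . '/example.volt.php';
writeFileAtomically($compiledPath, "<?php echo 'hello';\n");
```

With this pattern, a process that already has the old file open keeps reading the old contents, while new readers get the new file.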

Keeping compileAlways set to true in a production environment was of course a careless mistake. The option is false now and everything works well.
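
For reference, a minimal sketch of a Volt registration that keeps compileAlways off outside development. The APP_ENV variable, the directory paths and the service layout are assumptions for illustration; only the compiledPath and compileAlways option names come from Volt itself.

```php
<?php
use Phalcon\Di\FactoryDefault;
use Phalcon\Mvc\View;
use Phalcon\Mvc\View\Engine\Volt;

$di = new FactoryDefault();

// Assumption: the environment is exposed through an APP_ENV variable.
$isDevelopment = (getenv('APP_ENV') === 'development');

$di->setShared('view', function () use ($isDevelopment) {
    $view = new View();
    // Assumed project layout.
    $view->setViewsDir(__DIR__ . '/../app/views/');

    $view->registerEngines(array(
        '.volt' => function ($view, $di) use ($isDevelopment) {
            $volt = new Volt($view, $di);
            $volt->setOptions(array(
                // Assumed cache directory; it must be writable by PHP-FPM.
                'compiledPath'  => __DIR__ . '/../app/cache/volt/',
                // Recompile on every request only in development.
                'compileAlways' => $isDevelopment,
            ));
            return $volt;
        },
    ));

    return $view;
});
```

Tying the flag to the environment means a development setting cannot reach production just because a config file was copied unchanged.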

But that is beside the point.

The point is:

  • Why did we have to go as far as core dumps to discover the problem?
  • Why doesn't Volt throw an Exception when it can't compile a template?
  • This is obviously a bug: why were those two bug reports ignored?

Thank you



I opened a new issue:

#12506