
Capturing any character

Hello

I'm having a problem capturing any single character (UTF-8). I've tried:

/{oneLetter:[^/]}
/{oneLetter:.}
/{oneLetter:[\p{L}\p{M}0-9]{1}}

It looks like the regex is compiled without the u flag. Using just /{oneLetter} matches any character, but it also matches longer strings.

Is it possible to achieve this with the current Router?
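
For reference, the byte-versus-character behaviour described above can be reproduced with plain preg_match(), outside of Phalcon. This is only a minimal sketch: the sample path and the variable names are assumptions made for illustration, and the patterns are hand-written stand-ins, not what the Router actually compiles.

```php
<?php
// "Ե" (U+0535) is two bytes in UTF-8: 0xD4 0xB5.
$path = "/\xD4\xB5";

// Without the u flag, "." matches a single byte, so the second byte
// of the character is left over and the anchored pattern fails.
$withoutU = preg_match('#^/(.)$#', $path);

// With the u flag, subject and pattern are treated as UTF-8 and "."
// matches one whole code point.
$withU = preg_match('#^/(.)$#u', $path, $m);

var_dump($withoutU);            // int(0)
var_dump($withU);               // int(1)
var_dump($m[1] === "\xD4\xB5"); // bool(true)
```
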




Hey Piotr,

Do you want to match the route if it is something like https://www.examplae.de/u/ but not https://www.examplae.de/uv/?

Hello Richi,

Something like that, but with a UTF-8 character that is longer than one byte.



edited Mar '14

Bad news:

I tried to match one UTF-8 letter:

$router->add(
    "#^/([Ե])/(/.*)*$#",
    array(
        "char" => 1,
        "controller" => "Index",
        "params" => 2
    )
);

I tested it with this URL:
https://localhost/Ե/
BUT the problem is not Phalcon. The browser URL-encodes the whole URI before sending it, so the URI will look like this:
https://localhost/%D4%B5/
which will render as this:
https://localhost/Ôµ/
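
As a side note, the percent-encoded form can be inspected directly in PHP. The sketch below assumes the encoded segment %D4%B5 from the URI above and only uses core functions. It suggests the two escapes are simply the UTF-8 bytes of the letter, so the two-character rendering is presumably those bytes being displayed in a single-byte encoding.

```php
<?php
// Percent-encoded path segment, as it appears in the URI above.
$encoded = '%D4%B5';

// rawurldecode() turns each %XX escape into one raw byte.
$bytes = rawurldecode($encoded);

var_dump(strlen($bytes));        // int(2): two bytes, not two characters
var_dump(bin2hex($bytes));       // string(4) "d4b5"
var_dump($bytes === "\xD4\xB5"); // bool(true): the UTF-8 encoding of U+0535

// An empty pattern with the u flag only succeeds on valid UTF-8 input.
$isValidUtf8 = preg_match('//u', $bytes) === 1;
var_dump($isValidUtf8);          // bool(true)
```
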

I'm sorry, but you might need a translation table and a very long regex to accept UTF-8. This might help with your plans: https://www.utf8-zeichentabelle.de/

edited Mar '14

Hello Richi

The problem IS Phalcon. By default, Phalcon does not add the Unicode flag to the regex. Try your route again with the u flag added at the end of the regex (and with mb_internal_encoding set to UTF-8). I was able to make it work by defining two routes: one defined the standard way, and a second one used only for matching with the u flag. This is not what I would like, though, because it's hard to maintain so many routes.
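
To illustrate what such a u-flagged pattern does, here is a minimal sketch in plain PHP. matchOneLetter() is a hypothetical helper written for this example, not part of Phalcon, and it assumes PHP 7.1+ for the nullable return type. The character class is the one from the first post. The same pattern string could presumably be passed to $router->add() as a raw regex, as in the earlier reply, but that is not verified here.

```php
<?php
// Hypothetical helper, not a Phalcon API: returns the single character
// from a path such as "/Ե", or null if the path does not match.
function matchOneLetter(string $path): ?string
{
    // \p{L} = any letter, \p{M} = combining mark; the trailing u
    // switches the pattern and subject to UTF-8 mode.
    $pattern = '#^/([\p{L}\p{M}0-9])$#u';

    return preg_match($pattern, $path, $m) === 1 ? $m[1] : null;
}

var_dump(matchOneLetter("/\xD4\xB5") === "\xD4\xB5"); // two-byte letter "Ե" matches
var_dump(matchOneLetter("/a") === "a");               // ASCII letter matches
var_dump(matchOneLetter("/ab"));                      // NULL: longer strings are rejected
```
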