特别转义序列
------------
PCRE 的转义符号例如 `\d`,`\s` 以及 `\w` 等需要特别注意,因为在字符串语义中,反斜线字符 `\` 会被 Lua 语言解析器和 Nginx 配置文件解析器在执行前同时处理掉,所以以下代码片段将无法按预期运行:
```nginx
# nginx.conf
? location /test {
? content_by_lua '
? local regex = "\d+" -- 这里是错的!!
? local m = ngx.re.match("hello, 1234", regex)
? if m then ngx.say(m[0]) else ngx.say("not matched!") end
? ';
? }
# 结果为 "not matched!"
```
为避免这个问题,需要双重转义反斜线符号:
```nginx
# nginx.conf
location /test {
content_by_lua '
local regex = "\\\\d+"
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
';
}
# 结果为 "1234"
```
这里的 `\\\\d+`,先被 Nginx 配置文件解析器处理成 `\\d+` ,再被 Lua 语言解析器处理成 `\d+`,之后才被执行。
或者,正则表达式模板可以使用 Lua 字符串"长括号"语义写出,其语法形式为 `[[...]]`,在这种情况下,反斜线仅需为 Nginx 配置文件解析器转义一次。
```nginx
# nginx.conf
location /test {
content_by_lua '
local regex = [[\\d+]]
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
';
}
# 结果为 to "1234"
```
这里,`[[\\d+]]` 被 Nginx 配置文件解析器处理成 `[[\d+]]`,符合预期。
注意,当正则表达式模板中包括 `[...]` 序列时,Lua 语言中“更长的长括号”形式 `[=[...]=]` 是必要的。如果需要,可以将`[=[...]=]` 作为默认形式。
```nginx
# nginx.conf
location /test {
content_by_lua '
local regex = [=[[0-9]+]=]
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
';
}
# 结果为 "1234"
```
还有一种转义 PCRE 序列的方法是把 Lua 代码放到外部脚本文件中,通过各种 `*_by_lua_file` 指令执行。在这种方法中,反斜线仅被 Lua 语言解析器处理,因此只需要转义一次。
```lua
-- test.lua
local regex = "\\d+"
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
-- 结果为 "1234"
```
在外部脚本文件中,PCRE 序列如果使用“长括号”形式 Lua 字符串,则无需修改。
```lua
-- test.lua
local regex = [[\d+]]
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
-- 结果为 "1234"
```
> English source:
PCRE sequences such as `\d`, `\s`, or `\w`, require special attention because in string literals, the backslash character, `\`, is stripped out by both the Lua language parser and by the Nginx config file parser before processing. So the following snippet will not work as expected:
```nginx
# nginx.conf
? location /test {
? content_by_lua '
? local regex = "\d+" -- THIS IS WRONG!!
? local m = ngx.re.match("hello, 1234", regex)
? if m then ngx.say(m[0]) else ngx.say("not matched!") end
? ';
? }
# evaluates to "not matched!"
```
To avoid this, *double* escape the backslash:
```nginx
# nginx.conf
location /test {
content_by_lua '
local regex = "\\\\d+"
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
';
}
# evaluates to "1234"
```
Here, `\\\\d+` is stripped down to `\\d+` by the Nginx config file parser and this is further stripped down to `\d+` by the Lua language parser before running.
Alternatively, the regex pattern can be presented as a long-bracketed Lua string literal by encasing it in "long brackets", `[[...]]`, in which case backslashes have to only be escaped once for the Nginx config file parser.
```nginx
# nginx.conf
location /test {
content_by_lua '
local regex = [[\\d+]]
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
';
}
# evaluates to "1234"
```
Here, `[[\\d+]]` is stripped down to `[[\d+]]` by the Nginx config file parser and this is processed correctly.
Note that a longer from of the long bracket, `[=[...]=]`, may be required if the regex pattern contains `[...]` sequences.
The `[=[...]=]` form may be used as the default form if desired.
```nginx
# nginx.conf
location /test {
content_by_lua '
local regex = [=[[0-9]+]=]
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
';
}
# evaluates to "1234"
```
An alternative approach to escaping PCRE sequences is to ensure that Lua code is placed in external script files and executed using the various `*_by_lua_file` directives.
With this approach, the backslashes are only stripped by the Lua language parser and therefore only need to be escaped once each.
```lua
-- test.lua
local regex = "\\d+"
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
-- evaluates to "1234"
```
Within external script files, PCRE sequences presented as long-bracketed Lua string literals do not require modification.
```lua
-- test.lua
local regex = [[\d+]]
local m = ngx.re.match("hello, 1234", regex)
if m then ngx.say(m[0]) else ngx.say("not matched!") end
-- evaluates to "1234"
```
[返回目录](#nginx-api-for-lua)
- Name yuansheng-8.4 WenMing(√)
- Status yuansheng-8.6 WenMing(√)
- Version
- Synopsis yuansheng-8.6 WenMing(√)
- Description yuansheng-8.16 WenMing(√)
- Typical Uses yuansheng-8.16 WenMing(√)
- Nginx Compatibility yuansheng-8.17 WenMing(√)
- Installation yuansheng-8.17 WenMing(√)
- C Macro Configurations yuansheng-8.17 WenMing(√)
- Installation on Ubuntu 11.10 yuansheng-8.18
- Community
- English Mailing List
- Chinese Mailing List
- Code Repository yuansheng-8.20 WenMing(√)
- Bugs and Patches yuansheng-8.20 WenMing(√)
- Lua/LuaJIT bytecode support yuansheng-8.31
- System Environment Variable Support yuansheng-9.1
- HTTP 1.0 support lance-2015.8.13
- Statically Linking Pure Lua Modules yuansheng-9.1
- Nginx Worker内的数据共享 lance-2015.8.5
- Known Issues
- TCP socket connect operation issues yuansheng-9.2
- Lua Coroutine Yielding/Resuming yuansheng-9.2
- Lua Variable Scope yuansheng-9.2
- Locations Configured by Subrequest Directives of Other Modules lance-2015.8.12
- Cosockets Not Available Everywhere yuansheng-9.2
- 特别转义序列 lance-2015.8.5
- Mixing with SSI Not Supported yuansheng-9.2
- SPDY Mode Not Fully Supported yuansheng-9.2
- Missing data on short circuited requests yuansheng-9.2
- TODO yuansheng-9.3
- Changes yuansheng-9.3 WenMing(√)
- Test Suite yuansheng-9.3
- Copyright and License yuansheng-9.3
- See Also yuansheng-9.3
- Directives yuansheng-9.3
- Nginx API for Lua yuansheng-9.3
- Obsolete Sections yuansheng-9.3
- lua_use_default_type hambut 2015.8.5
- lua_code_cache hambut 2015.8.5
- lua_regex_cache_max_entries yuansheng-9.3
- lua_regex_match_limit yuansheng-9.3
- lua_package_path yuansheng-9.4
- lua_package_cpath yuansheng-9.4
- init_by_lua yuansheng-9.4
- init_by_lua_file yuansheng-9.4
- init_worker_by_lua yuansheng-9.6
- init_worker_by_lua_file yuansheng-9.6
- set_by_lua yuansheng-9.6
- set_by_lua_file yuansheng-9.6
- content_by_lua dengshiyong 2015.8.12 WenMing(√)
- content_by_lua_file yuansheng-9.19
- rewrite_by_lua yuansheng-9.19
- rewrite_by_lua_file yuansheng-9.19
- access_by_lua yuansheng-9.27
- access_by_lua_file yuansheng-9.28
- header_filter_by_lua liujinxuan 2015.9.1
- header_filter_by_lua_file yuansheng-9.28
- body_filter_by_lua yuansheng-9.28
- body_filter_by_lua_file yuansheng-9.28
- log_by_lua yuansheng-9.28
- log_by_lua_file yuansheng-9.28
- lua_need_request_body yuansheng-9.28
- lua_shared_dict lance-2015.8.20
- lua_socket_connect_timeout yuansheng-9.28
- lua_socket_send_timeout yuansheng-9.28
- lua_socket_send_lowat yuansheng-9.28
- lua_socket_read_timeout yuansheng-9.28
- lua_socket_buffer_size yuansheng-9.28
- lua_socket_pool_size yuansheng-9.28
- lua_socket_keepalive_timeout yuansheng-9.28
- lua_socket_log_errors yuansheng-9.28
- lua_ssl_ciphers yuansheng-9.29
- lua_ssl_crl yuansheng-9.29
- lua_ssl_protocols yuansheng-9.29
- lua_ssl_trusted_certificate yuansheng-9.29
- lua_ssl_verify_depth yuansheng-9.29
- lua_http10_buffering yuansheng-9.29
- rewrite_by_lua_no_postpone yuansheng-9.29
- lua_transform_underscores_in_response_headers yuansheng-9.29
- lua_check_client_abort yuansheng-9.29
- lua_max_pending_timers yuansheng-9.29
- lua_max_running_timers yuansheng-9.29
- ngx.arg lance-2015.8.19
- ngx.var.VARIABLE lance-2015.8.19
- Core constants lance-2015.8.14 WenMing(√)
- HTTP method constants lance-2015.8.13
- HTTP status constants lance-2015.8.13
- Nginx log level constants lance-2015.8.13
- print lance-2015.8.14
- ngx.ctx lance-2015.8.14
- ngx.location.capture lance-2015.8.11
- ngx.location.capture_multi lance-2015.8.11
- ngx.status lance-2015.8.18 yuansheng(√)
- ngx.header.HEADER lance-2015.9.6 yuansheng(√)
- ngx.resp.get_headers lance-2015.9.7 yuansheng(√)
- ngx.req.start_time lance-2015.9.9 yuansheng(√)
- ngx.req.http_version lance-2015.9.9 yuansheng(√)
- ngx.req.raw_header lance-2015.9.9 yuansheng(√)
- ngx.req.get_method lance-2015.9.9 yuansheng(√)
- ngx.req.set_method lance-2015.9.9 yuansheng(√)
- ngx.req.set_uri lance-2015.9.9
- ngx.req.set_uri_args lance-2015.9.10
- ngx.req.get_uri_args lance-2015.9.10
- ngx.req.get_post_args lance-2015.9.10
- ngx.req.get_headers lance-2015.9.11
- ngx.req.set_header lance-2015.9.14
- ngx.req.clear_header lance-2015.9.14
- ngx.req.read_body lance-2015.9.16
- ngx.req.discard_body lance-2015.9.24
- ngx.req.get_body_data lance-2015.9.24
- ngx.req.get_body_file lance-2015.9.28
- ngx.req.set_body_data lance-2015.9.28
- ngx.req.set_body_file lance-2015.9.28
- ngx.req.init_body lance-2015.9.28
- ngx.req.append_body lance-2015.9.28
- ngx.req.finish_body lance-2015.9.28
- ngx.req.socket yuansheng-10.12
- ngx.exec yuansheng-10.12
- ngx.redirect yuansheng-10.12
- ngx.send_headers yuansheng-10.12
- ngx.headers_sent yuansheng-10.12
- ngx.print lance-2015.8.7
- ngx.say lance-2015.8.7
- ngx.log lance-2015.8.13
- ngx.flush lance-2015.8.13
- ngx.exit lance-2015.8.13
- ngx.eof lance-2015.8.18
- ngx.sleep lance-2015.8.18
- ngx.escape_uri lance-2015.8.18
- ngx.unescape_uri lance-2015.8.18
- ngx.encode_args lance-2015.8.18
- ngx.decode_args lance-2015.8.18
- ngx.encode_base64 hambut-2015.9.9
- ngx.decode_base64 hambut-2015.9.9
- ngx.crc32_short hambut-2015.9.9
- ngx.crc32_long hambut-2015.9.9
- ngx.hmac_sha1 hambut-2015.9.9
- ngx.md5 hambut-2015.9.9
- ngx.md5_bin hambut-2015.9.9
- ngx.sha1_bin hambut-2015.9.9
- ngx.quote_sql_str hambut-2015.9.9
- ngx.today bells-2015.8.8
- ngx.time bells-2015.8.22
- ngx.now bells-2015.8.22
- ngx.update_time bells-2015.8.16
- ngx.localtime bells-2015.8.22
- ngx.utctime bells-2015.8.22
- ngx.cookie_time yuansheng-10.12
- ngx.http_time yuansheng-10.10
- ngx.parse_http_time yuansheng-10.10
- ngx.is_subrequest yuansheng-10.8
- ngx.re.match lance-2015.8.6
- ngx.re.find lance-2015.8.6
- ngx.re.gmatch lance-2015.8.6
- ngx.re.sub lance-2015.8.6
- ngx.re.gsub lance-2015.8.6
- ngx.shared.DICT lance-2015.8.10
- ngx.shared.DICT.get lance-2015.8.10
- ngx.shared.DICT.get_stale lance-2015.8.10
- ngx.shared.DICT.set lance-2015.8.10
- ngx.shared.DICT.safe_set lance-2015.8.10
- ngx.shared.DICT.add lance-2015.8.10
- ngx.shared.DICT.safe_add lance-2015.8.10
- ngx.shared.DICT.replace lance-2015.8.10
- ngx.shared.DICT.delete lance-2015.8.10
- ngx.shared.DICT.incr lance-2015.8.10
- ngx.shared.DICT.flush_all lance-2015.8.10
- ngx.shared.DICT.flush_expired lance-2015.8.10
- ngx.shared.DICT.get_keys lance-2015.8.10
- ngx.socket.udp yuansheng-10.8
- udpsock:setpeername yuansheng-10.8
- udpsock:send yuansheng-10.8
- udpsock:receive yuansheng-10.8
- udpsock:close yuansheng-10.8
- udpsock:settimeout yuansheng-10.8
- ngx.socket.tcp yuansheng-10.7
- tcpsock:connect yuansheng-10.7
- tcpsock:sslhandshake yuansheng-10.7
- tcpsock:send yuansheng-10.7
- tcpsock:receive yuansheng-10.5
- tcpsock:receiveuntil yuansheng-10.5
- tcpsock:close yuansheng-10.5
- tcpsock:settimeout yuansheng-10.5
- tcpsock:setoption yuansheng-10.5
- tcpsock:setkeepalive yuansheng-10.5
- tcpsock:getreusedtimes yuansheng-10.5
- ngx.socket.connect yuansheng-10.4
- ngx.get_phase yuansheng-10.2
- ngx.thread.spawn yuansheng-10.2
- ngx.thread.wait yuansheng-10.4
- ngx.thread.kill yuansheng-10.4
- ngx.on_abort yuansheng-10.2
- ngx.timer.at yuansheng-10.1
- ngx.config.debug yuansheng-9.30
- ngx.config.prefix yuansheng-9.30
- ngx.config.nginx_version yuansheng-9.30
- ngx.config.nginx_configure yuansheng-9.30
- ngx.config.ngx_lua_version yuansheng-9.30
- ngx.worker.exiting yuansheng-9.30
- ngx.worker.pid yuansheng-9.30
- ndk.set_var.DIRECTIVE yuansheng-9.30
- coroutine.create yuansheng-9.30
- coroutine.resume yuansheng-9.30
- coroutine.yield yuansheng-9.30
- coroutine.wrap yuansheng-9.30
- coroutine.running yuansheng-9.30
- coroutine.status yuansheng-9.30