A more powerful substitution module than ngx_http_substitutions_filter_module: the sregex-based replace-filter-nginx-module

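I previously wrote a tutorial on substituting content in nginx reverse-proxy responses (link), which used the ngx_http_substitutions_filter_module module. However, that module can only replace text within a single line, which is quite limiting -_-#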

Now a more powerful substitution module has arrived: replace-filter-nginx-module

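Below I only translate the documentation and add an installation guide, because I have not fully figured out how to use it myself = =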

 

1. Installing this module requires installing the sregex runtime library first

apt-get update;
apt-get install git make gcc -y  
# On CentOS, use yum instead of apt-get
git clone https://github.com/agentzh/sregex
cd sregex
make
make install
cd ..
git clone https://github.com/agentzh/replace-filter-nginx-module
wget http://nginx.org/download/nginx-1.2.6.tar.gz
tar zxvf nginx-1.2.6.tar.gz
cd nginx-1.2.6
./configure --add-module=../replace-filter-nginx-module  # add your other configure options as needed
make
make install
Example usage in nginx.conf:


location /t {
    default_type text/html;
    echo abc;
    replace_filter 'ab|abc' X;
}
 
location / {
    # proxy_pass/fastcgi_pass/...
 
    # caseless global substitution:
    replace_filter '\d+' 'blah blah' 'ig';
    replace_filter_types text/plain text/css;
}
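
To get a feel for what these two rules do, here is a minimal sketch that uses Python's `re` module as a stand-in for sregex. It assumes the second pattern is meant to be `\d+`, and that a rule without the `g` flag replaces only the first match. Python's `re` is not sregex, so this only approximates the module's behaviour; the sample response bodies are made up for illustration.

```python
import re

# location /t: `echo abc` emits "abc\n". No g flag, so (by assumption)
# only the first match is replaced. The alternation tries "ab" first.
body_t = "abc\n"
out_t = re.sub(r"ab|abc", "X", body_t, count=1)

# location /: the 'ig' flags mean case-insensitive and global.
body = "id 42, port 8080\n"
out = re.sub(r"\d+", "blah blah", body, flags=re.IGNORECASE)

print(repr(out_t))
print(repr(out))
```

Because the alternation is tried left to right, the shorter `ab` branch wins in Python even though `abc` would also match.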
Syntax (the following Perl 5 regex syntax features have already been implemented):

^             match the beginning of lines
$             match the end of lines

\A            match only at beginning of stream
\z            match only at end of stream

\b            match a word boundary
\B            match except at a word boundary

.             match any char
\C            match a single C-language char (octet)

[ab0-9]       character classes (positive)
[^ab0-9]      character classes (negative)

\d            match a digit character ([0-9])
\D            match a non-digit character ([^0-9])

\s            match a whitespace character ([ \f\n\r\t])
\S            match a non-whitespace character ([^ \f\n\r\t])

\h            match a horizontal whitespace character
\H            match a character that isn't horizontal whitespace

\v            match a vertical whitespace character
\V            match a character that isn't vertical whitespace

\w            match a "word" character ([A-Za-z0-9_])
\W            match a non-"word" character ([^A-Za-z0-9_])

\cK           control char (example: VT)

\N            match a character that isn't a newline

ab            concatenation; first match a, and then b
a|b           alternation; match a or b

(a)           capturing parentheses
(?:a)         non-capturing parentheses

a?            match 1 or 0 times, greedily
a*            match 0 or more times, greedily
a+            match 1 or more times, greedily

a??           match 1 or 0 times, not greedily
a*?           match 0 or more times, not greedily
a+?           match 1 or more times, not greedily

a{n}          match exactly n times
a{n,m}        match at least n but not more than m times, greedily
a{n,}         match at least n times, greedily

a{n}?         match exactly n times, not greedily (redundant)
a{n,m}?       match at least n but not more than m times, not greedily
a{n,}?        match at least n times, not greedily
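
A few of these constructs can be tried out quickly in Python, whose `re` module shares most of this Perl-style syntax. This is only a sketch for building intuition, not a test of sregex itself: Python needs `re.MULTILINE` to make `^` and `$` match at line boundaries, and older Python versions spell the end-of-stream anchor `\Z` rather than `\z`. The sample text is made up.

```python
import re

text = "aaa\nbbb\n"

# Greedy vs. non-greedy quantifiers
greedy = re.match(r"a+", text).group()        # takes as many a's as possible
lazy = re.match(r"a+?", text).group()         # takes as few as possible
bounded = re.match(r"a{1,2}", text).group()   # at most two a's

# ^ matches at the beginning of every line (needs re.MULTILINE in Python)
line_starts = re.findall(r"^\w", text, flags=re.MULTILINE)

# \A matches only at the very beginning of the input
stream_start = re.findall(r"\A\w", text, flags=re.MULTILINE)

# (?:a) groups without capturing, (a) captures
groups = re.match(r"(?:a)(a)", text).groups()

print(greedy, lazy, bounded, line_starts, stream_start, groups)
```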
 

Author:
Yichun “agentzh” Zhang (章亦春) agentzh@gmail.com

The following escaping sequences are supported:

\t          tab
\n          newline
\r          return
\f          form feed
\a          alarm
\e          escape
\b          backspace (in character class only)
\x{}, \x00  character whose ordinal is the given hexadecimal number
\o{}, \000  character whose ordinal is the given octal number

Escaping a regex meta character yields the literal character itself, like \{ and \.

Only the octet mode is supported; no multi-byte character encoding love (yet).
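
To illustrate the escapes and what "octet mode" implies, here is a small sketch using Python's `re` on `bytes` patterns, which also match one octet at a time. Python's `re` is only a stand-in here: it supports `\xHH` and three-digit octal escapes, but not the `\x{}`, `\o{}` or `\e` forms listed above. The sample data is made up.

```python
import re

# In UTF-8, "é" is two octets (0xC3 0xA9), so a byte-wise "." sees two chars.
octets_in_e_acute = len(re.findall(rb".", "é".encode("utf-8")))

body = b"A\tB\n"

hex_hits = re.findall(rb"\x41", body)   # 0x41 is "A"
oct_hits = re.findall(rb"\101", body)   # octal 101 is also "A"
tab_hits = re.findall(rb"\t", body)     # the tab character

# An escaped meta character matches itself literally
escaped = re.sub(rb"\.", b"!", b"a.b{c")

print(octets_in_e_acute, hex_hits, oct_hits, tab_hits, escaped)
```

The first result is the practical consequence of octet mode: a pattern such as `.` or a character class consumes a single byte, so multi-byte characters have to be matched as byte sequences.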

Installation

make
make install

GNU make and gcc are required. (On operating systems like FreeBSD and Solaris, you should type gmake instead of make here.)

It will build libsregex.so (or libsregex.dylib on Mac OS X), libsregex.a, and the command-line utility sregex-cli and install them into the prefix /usr/local/ by default.

If you want to install into a custom location, then just specify the PREFIX variable like this:

make PREFIX=/opt/sregex
make install PREFIX=/opt/sregex

If you are building a binary package (like an RPM package), then you will find the DESTDIR variable handy, as in

make PREFIX=/opt/sregex
make install PREFIX=/opt/sregex DESTDIR=/path/to/my/build/root

If you run make distclean before make, then you also need bison 2.7+ for generating the regex parser files.

Synopsis

    location /a {
        # proxy_pass/fastcgi_pass/root/...

        # remove line-leading spaces and line-trailing spaces,
        # as well as blank lines:
        replace_filter '^\s+|\s+$' '' g;
    }

    location /b {
        # proxy_pass/fastcgi_pass/root/...

        # only remove line-leading spaces and line-trailing spaces:
        replace_filter '^[ \f\t]+|[ \f\t]+$' '' g;
    }

    location ~ '\.cpp$' {
        # proxy_pass/fastcgi_pass/root/...

        replace_filter_types text/plain;

        # skip C/C++ string literals:
        replace_filter "'(?:\\\\[^\n]|[^'\n])*'" $& g;
        replace_filter '"(?:\\\\[^\n]|[^"\n])*"' $& g;

        # remove all those ugly C/C++ comments:
        replace_filter '/\*.*?\*/|//[^\n]*' '' g;
    }
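
The idea behind the last location can be sketched in Python. String and character literals are matched first and put back unchanged (the role of `$&`), so comment markers inside them are never treated as comments. This sketch assumes that the rules in one location are matched together in a single pass, like a tokenizer, and that `.` also matches newlines (hence `re.DOTALL`). The names `TOKEN` and `strip_comments` are hypothetical and not part of the module.

```python
import re

# One combined alternation: literals to keep, comments to drop.
TOKEN = re.compile(
    r"""
    (?P<keep>
        '(?:\\[^\n]|[^'\n])*'      # character literal
      | "(?:\\[^\n]|[^"\n])*"      # string literal
    )
  | (?P<drop>
        /\*.*?\*/                  # block comment, may span lines
      | //[^\n]*                   # line comment
    )
    """,
    re.VERBOSE | re.DOTALL,
)


def strip_comments(src: str) -> str:
    # Literals are emitted unchanged (like $&); comments become ''.
    return TOKEN.sub(lambda m: m.group("keep") or "", src)


src = 'char *s = "/* not a comment */"; // trailing\nint x; /* multi\nline */ int y;\n'
cleaned = strip_comments(src)
print(cleaned)
```

Without the two "skip" rules, the `/* not a comment */` text inside the string literal would be removed as well.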