regexp

Definition:

Searches text for matches to a regular expression. Matching is case sensitive.

startIndex = regexp(str,expression)

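Returns the starting index of each substring of str that matches the character pattern specified by the regular expression. If there are no matches, startIndex is an empty array.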

[startIndex,endIndex] = regexp(str,expression)

Returns both the starting and the ending indices of each match.

out = regexp(str,expression,outkey)

Returns the output specified by outkey. For example, if outkey is 'match', regexp returns the substrings that match the expression instead of their starting indices.

[out1,...,outN] = regexp(str,expression,outkey1,...,outkeyN)

Accepts several output keywords and returns the corresponding outputs, in the order the keywords are given.
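As a minimal sketch of the multi-keyword form, the call below requests three outputs at once. The variable names first, last and matched are illustrative only; the input string is the one used in the examples of these notes.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

% Outputs are returned in the same order as the keywords.
[first,last,matched] = regexp(str,expression,'start','end','match');

% Print each match together with the index range it occupies in str.
for k = 1:numel(matched)
    fprintf('%s spans %d to %d\n', matched{k}, first(k), last(k));
end
```

Swapping the order of the keywords swaps the order of the outputs, so the left-hand side has to be kept in step with the keyword list.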

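outkey:

'start': starting index of each match;
'end': ending index of each match;
'tokenExtents': starting and ending indices of each captured token;
'match': text of each substring that matches expression;
'tokens': text of each captured token (the part of a match enclosed in parentheses);
'names': name and text of each named token;
'split': text of the non-matching substrings of str, i.e. the pieces delimited by the matches.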
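The difference between 'tokens' and 'tokenExtents' is easiest to see on a small input. The sketch below uses a made-up string 'key=value' and illustrative variable names; it is not taken from the original notes.

```matlab
str = 'key=value';
expression = '(\w+)=(\w+)';   % two capture groups, so two tokens per match

[tok,ext] = regexp(str,expression,'tokens','tokenExtents');

% One cell per match. tok{1} holds the token text,
% ext{1} holds one [start end] row per token.
disp(tok{1})
disp(ext{1})

% The extents index back into str and recover the same text.
firstToken = str(ext{1}(1,1):ext{1}(1,2));
```

Here tok{1} is {'key','value'} and ext{1} is [1 3; 5 9], so firstToken is 'key'.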

Examples:

Basic match, returning start indices

str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';
startIndex = regexp(str,expression)

startIndex = 1×2
     5    17

Matching several strings at once (cell array input)

str = {'Madrid, Spain','Romeo and Juliet','MATLAB is great'};
capExpr = '[A-Z]';
capStartIndex = regexp(str,capExpr);
celldisp(capStartIndex)

capStartIndex{1} =
     1     9
capStartIndex{2} =
     1    11
capStartIndex{3} =
     1     2     3     4     5     6

Returning the matched text ('match')

str = 'EXTRA! The regexp function helps you relax.';
expression = '\w*x\w*';
matchStr = regexp(str,expression,'match');
celldisp(matchStr)

matchStr{1} =
regexp
matchStr{2} =
relax

Non-matching text ('split')

str = 'She sells sea shells by the seashore.';
expression = '[Ss]h.';
[match,noMatch] = regexp(str,expression,'match','split')

match = 1×3 cell array
    {'She'}    {'she'}    {'sho'}

noMatch = 1×4 cell array
    {0×0 char}    {' sells sea '}    {'lls by the sea'}    {'re.'}

combinedStr = strjoin(noMatch,match)

combinedStr = 'She sells sea shells by the seashore.'
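A common use of 'split' is to break delimited text into fields. The following is a small sketch with a made-up input string and illustrative variable names; the delimiter pattern accepts a comma or semicolon followed by optional whitespace.

```matlab
str = 'alpha, beta;gamma';
delimiter = '[,;]\s*';   % comma or semicolon, then any whitespace

% 'split' returns the text between the matches of the delimiter.
fields = regexp(str,delimiter,'split');

% 'match' returns the delimiters themselves, so the original text
% can be rebuilt by interleaving the two with strjoin.
seps = regexp(str,delimiter,'match');
rebuilt = strjoin(fields,seps);
```

fields always has one more element than seps, which is why strjoin can interleave them.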

Capturing HTML tag names with tokens

str = '<title>My Title</title><p>Here is some text.</p>';
expression = '<(\w+).*>.*</\1>';
[tokens,matches] = regexp(str,expression,'tokens','match');
celldisp(tokens)
celldisp(matches)

tokens{1}{1} =
title
tokens{2}{1} =
p

matches{1} =
<title>My Title</title>
matches{2} =
<p>Here is some text.</p>

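Enclosing \w+ in parentheses captures the name of the HTML tag in a token. The \1 in the closing tag is a backreference to that token, so the closing tag must carry the same name as the opening tag.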
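To see what the backreference contributes, the sketch below runs the same pattern on two made-up inputs, one with matching tags and one with mismatched tags. The variable names are illustrative.

```matlab
expression = '<(\w+).*>.*</\1>';

% Opening and closing tag names agree, so the pattern matches.
matched = regexp('<b>text</b>',expression,'match');

% The closing tag is </i>, but \1 requires </b>, so nothing matches.
mismatched = regexp('<b>text</i>',expression,'match');

fprintf('matching tags: %d match(es)\n', numel(matched));
fprintf('mismatched tags: %d match(es)\n', numel(mismatched));
```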

Named tokens ('names')

str = '01/11/2000  20-02-2020  03/30/2000  16-04-2020';
expression = ['(?<month>\d+)/(?<day>\d+)/(?<year>\d+)|'...
              '(?<day>\d+)-(?<month>\d+)-(?<year>\d+)'];
tokenNames = regexp(str,expression,'names');
for k = 1:length(tokenNames)
    disp(tokenNames(k))
end

    month: '01'
      day: '11'
     year: '2000'

    month: '02'
      day: '20'
     year: '2020'

    month: '03'
      day: '30'
     year: '2000'

    month: '04'
      day: '16'
     year: '2020'

(?<name>\d+) finds one or more numeric digits and assigns the result to the token indicated by name.
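The 'names' output is a struct, so each named token can be read as a field. A minimal sketch, assuming a single ISO-style date as input (the string and variable names are made up for illustration):

```matlab
expression = '(?<year>\d+)-(?<month>\d+)-(?<day>\d+)';
d = regexp('2020-02-20',expression,'names');

% Token values are text; convert explicitly when a number is needed.
yearNumber = str2double(d.year);

fprintf('%s/%s, year %d\n', d.month, d.day, yearNumber);
```

With several matches the result is a struct array, which is why the loop in the example above indexes tokenNames(k).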

regexpi

Used in the same way as regexp, but matching is case insensitive.
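A short side-by-side sketch of the two functions on the string from the first regexp example. The variable names sensitive and insensitive are illustrative.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

% regexp only accepts lowercase c, vowels and t here.
sensitive = regexp(str,expression,'match');

% regexpi also accepts CUT and the CAT in CAT-scan.
insensitive = regexpi(str,expression,'match');

fprintf('regexp found %d, regexpi found %d\n', ...
    numel(sensitive), numel(insensitive));
```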

regexprep

newStr = regexprep(str,expression,replace)

Replaces the text in str that matches expression with the text described by replace. The regexprep function returns the updated text in newStr.

Examples:

Replacement using a token ($1)

str = 'I walk up, they walked up, we are walking up.';
expression = 'walk(\w*?) up';
replace = 'ascend$1';

newStr = regexprep(str,expression,replace)

newStr = 'I ascend, they ascended, we are ascending.'
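Tokens can also be reordered in the replacement text. The sketch below swaps two captured words; the input 'Doe, John' and the variable names are made up for illustration.

```matlab
str = 'Doe, John';
expression = '(\w+), (\w+)';   % $1 = family name, $2 = given name

% $2 and $1 in the replacement refer to the captured tokens.
swapped = regexprep(str,expression,'$2 $1');

disp(swapped)
```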

Calling a function in the replacement (this seems to work only in MATLAB; it does not apply in other tools such as Notepad++)

str = 'here are two sentences. neither is capitalized.';
expression = '(^|\.)\s*.';
replace = '${upper($0)}';

newStr = regexprep(str,expression,replace)

newStr = 'Here are two sentences. Neither is capitalized.'

The replace expression calls the upper function for the currently matching character ($0).