- regexprep: finds text in a string and replaces it.
regexp
Definition:
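Searches a string for text that matches a regular expression. Matching is case sensitive.
- startIndex = regexp(str,expression)
Returns the starting index of each substring of str that matches the character pattern specified by the regular expression. If there are no matches, startIndex is an empty array.
- [startIndex,endIndex] = regexp(str,expression)
Returns both the starting and the ending indices of every match.
- out = regexp(str,expression,outkey)
Returns the output specified by outkey. For example, if outkey is 'match', regexp returns the substrings that match the expression instead of their starting indices.
- [out1,...,outN] = regexp(str,expression,outkey1,...,outkeyN)
Accepts several output keywords and returns the corresponding outputs in the order the keywords are given.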
outkey:
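- 'start': starting index of each match;
- 'end': ending index of each match;
- 'tokenExtents': starting and ending indices of every token (the text captured by a parenthesized group);
- 'match': text of each match;
- 'tokens': text of each captured token;
- 'names': name and text of each named token, returned as a structure;
- 'split': text of the non-matching substrings of str, i.e. the pieces delimited by the matches of expression.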
examples:
Basic match returning indices
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';
startIndex = regexp(str,expression)
startIndex = 1×2
    5    17
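The two-output form [startIndex,endIndex] = regexp(str,expression) can be tried on the same string. The following is a minimal sketch; the loop and the fprintf formatting are illustrative additions and not part of the original notes.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

% Request both the start and the end index of every match.
[startIndex,endIndex] = regexp(str,expression);

% Slice str with each index pair to recover the matched text.
for k = 1:numel(startIndex)
    matched = str(startIndex(k):endIndex(k));
    fprintf('%d-%d: %s\n', startIndex(k), endIndex(k), matched);
end
```

This prints 5-7: cat and 17-20: coat, so indexing str with each pair gives the same text that the 'match' keyword would return.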
Matching several strings at once
str = {'Madrid, Spain','Romeo and Juliet','MATLAB is great'};
capExpr = '[A-Z]';
capStartIndex = regexp(str,capExpr);
celldisp(capStartIndex)
capStartIndex{1} =
    1    9
capStartIndex{2} =
    1    11
capStartIndex{3} =
    1    2    3    4    5    6
Returning the matched text ('match')
str = 'EXTRA! The regexp function helps you relax.';
expression = '\w*x\w*';
matchStr = regexp(str,expression,'match');
celldisp(matchStr)
matchStr{1} =
    regexp
matchStr{2} =
    relax
Non-matching text ('split')
str = 'She sells sea shells by the seashore.';
expression = '[Ss]h.';
[match,noMatch] = regexp(str,expression,'match','split')
match = 1×3 cell array
    {'She'}    {'she'}    {'sho'}
Joining the non-matching pieces with the matches as delimiters rebuilds the original string:
combinedStr = strjoin(noMatch,match)
combinedStr = 'She sells sea shells by the seashore.'
Capturing HTML tags ('tokens')
str = '<title>My Title</title><p>Here is some text.</p>';
expression = '<(\w+).*>.*</\1>';
[tokens,matches] = regexp(str,expression,'tokens','match');
tokens{1}{1} = title
tokens{2}{1} = p
matches{1} = <title>My Title</title>
matches{2} = <p>Here is some text.</p>
Enclosing \w+ in parentheses captures the name of the HTML tag in a token. The \1 in the closing tag is a back reference to that token, so the closing tag must carry the same name as the opening tag.
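The 'tokenExtents' keyword from the outkey list has no example of its own, so here is a small sketch that reuses the HTML string above. The variable names extents, span and fromIndex are hypothetical and chosen only for illustration.

```matlab
str = '<title>My Title</title><p>Here is some text.</p>';
expression = '<(\w+).*>.*</\1>';

% 'tokenExtents' returns, for each match, an n-by-2 array with one
% [start end] row per token.
[tokens,extents] = regexp(str,expression,'tokens','tokenExtents');

for k = 1:numel(extents)
    span = extents{k}(1,:);             % the single token of match k
    fromIndex = str(span(1):span(2));   % same text as tokens{k}{1}
    fprintf('token %d at %d-%d: %s\n', k, span(1), span(2), fromIndex);
end
```

This prints token 1 at 2-6: title and token 2 at 25-25: p, which shows that the extents index into the original str rather than into the individual match.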
Named tokens ('names')
str = '01/11/2000 20-02-2020 03/30/2000 16-04-2020';
expression = ['(?<month>\d+)/(?<day>\d+)/(?<year>\d+)|'...
    '(?<day>\d+)-(?<month>\d+)-(?<year>\d+)'];
tokenNames = regexp(str,expression,'names');
for k = 1:length(tokenNames)
    disp(tokenNames(k))
end
month: '01'
day: '11'
year: '2000'

month: '02'
day: '20'
year: '2020'

month: '03'
day: '30'
year: '2000'

month: '04'
day: '16'
year: '2020'
(?<name>\d+) finds one or more numeric digits and assigns the result to the token indicated by name.
regexpi
Used in the same way as regexp, except that matching is case insensitive.
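To see the difference, the two functions can be run side by side on the string from the first regexp example. This is only a sketch; the variable names sensitive and insensitive are illustrative.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

% regexp is case sensitive: only the lowercase words match.
sensitive = regexp(str,expression,'match');

% regexpi ignores case: 'CUT' and 'CAT' match as well.
insensitive = regexpi(str,expression,'match');

disp(strjoin(sensitive,', '))
disp(strjoin(insensitive,', '))
```

The first line printed is cat, coat and the second is cat, coat, CUT, CAT.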
regexprep
- newStr = regexprep(str,expression,replace)
Replaces the text in str that matches expression with the text described by replace. The regexprep function returns the updated text in newStr.
examples
Replacement that reuses a captured token ($1)
str = 'I walk up, they walked up, we are walking up.';
expression = 'walk(\w*?) up';
replace = 'ascend$1';
newStr = regexprep(str,expression,replace)
newStr = 'I ascend, they ascended, we are ascending.'
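Calling a function in the replacement (this seems to work only in MATLAB and not in other tools such as Notepad++)
str = 'here are two sentences. neither is capitalized.';
expression = '(^|\.)\s*.';
replace = '${upper($0)}';
newStr = regexprep(str,expression,replace)
newStr = 'Here are two sentences. Neither is capitalized.'
The replace expression calls the upper function on the current match ($0). Each match is the start of the string or a period, any whitespace after it, and the first character that follows, so only that character changes visibly.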