月伴流星's Blog

 
 
 

AutoIt Beginner's Guide, Part 4: Regular Expressions

2009-11-08 18:48:50 | Category: AU3 programming language


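Here is a simple walkthrough to lift the veil on StringRegExp().

StringRegExp( "test", "pattern" [, flag ] )

"test" = The string to search through for matches.
"pattern" = A string made up of certain key characters that tells the function PRECISELY what you want to match. No ifs, ands, or buts: it either matches or it doesn't.
flag [optional] = Tells the function whether you only want to know if "pattern" was found, want the first match returned, or want all matches in the "test" string returned.

As you may have figured out, the "pattern" string is the only difficult part of calling StringRegExp() (forthwith: SRE). The best way I've found to think about it is that "pattern" tells the function to match a string character by character.

There are different ways to find a certain character. If you want to match the string "test", that should be simple enough. You tell SRE to first match a "t". If it finds one, it assumes it has a match, and the rest of the pattern is used to try to prove that what it found is not a match. So, if the next character is an "e", it could still be a match. Now let's say the next character is an "x". SRE knows immediately that what it found isn't a match, because the third character you told it to look for is an "s".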

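Example 1

MsgBox(0, "SRE Example 1 Result", StringRegExp("text", 'test'))

In this example, the message box shows "0", meaning the pattern "test" was not found in the test string "text". I know this seems almost too simple, but by now you should understand why it wasn't found.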

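Next, let's look at sets, written "[ ... ]". You can think of a set as the logical function "OR". Let's reuse the previous example. This time we want to find either the string "test" or the string "text". Here is how I imagine SRE thinking about the pattern: first I want to match the character "t", then the character "e", then a character that is either an "s" or an "x", so that position in the pattern is written as the set "[sx]", meaning one "s" or one "x". Finally, I want the next character to be a "t".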

Example 2

MsgBox(0, "SRE Example 2 Result", StringRegExp("text", 'te[sx]t'))
MsgBox(0, "SRE Example 2 Result", StringRegExp("test", 'te[sx]t'))

Both calls above return "1", because the pattern matches both "test" and "text".

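You can also specify how many times a character should be matched, using "{number of matches}", or give a range, using "{min,max}" (with no space after the comma). The first example below is pointless, but it should show you what I mean:

Example 3

MsgBox(0, "SRE Example 3 Result", StringRegExp("text", 't{1}e{1}[sx]{1}t{1}'))
MsgBox(0, "SRE Example 3 Result", StringRegExp("aaaabbbbcccc", 'b{4}'))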
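
Example 3 only uses the fixed-count form, so here is a minimal sketch of the range form. The test strings are made up for illustration, and the "^" and "$" anchors (start and end of the string) are not covered in this guide; they are added here only so that the pattern cannot match a shorter piece of a longer run. ConsoleWrite() is used instead of MsgBox() so that all three results show up together in the editor's output pane.

```autoit
; "b{2,3}" means: match "b" at least 2 and at most 3 times.
; With flag omitted (0), StringRegExp() returns 1 for a match, 0 otherwise.
ConsoleWrite(StringRegExp("ab", '^ab{2,3}$') & @CRLF)    ; one "b": no match
ConsoleWrite(StringRegExp("abb", '^ab{2,3}$') & @CRLF)   ; two "b"s: match
ConsoleWrite(StringRegExp("abbbb", '^ab{2,3}$') & @CRLF) ; four "b"s: no match
```

Without the anchors, the last call would report a match, because "abbb" inside "abbbb" already satisfies the pattern.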



Not so simple anymore

You may be thinking right now: "Why not just use StringInStr()?" Well, with a "flag" value of 0, most of the time you're right. But SRE is much more powerful than that. As you use SREs more and more, you'll find you often know less and less about the exact text you are looking for, and there are ways to be less specific about each character you put in the pattern.

Take, for example, a line from the chat log of a game: "Gnarly Monster hits you for 18 damage." You want to find out how much damage Gnarly Monster hit you for. Well, you can't use StringInStr() because you aren't looking for "18", you're looking for a number you don't know in advance, where each character could be any digit.

Here's how I would assemble this pattern. Look at what you do and do not know about what you want to find:
1) You know that it will ALWAYS contain nothing but digits.
2) You know that it will SOMETIMES be 2 characters long.
2a) You know from playing the game that the maximum damage a monster can do is 999.
2b) You know that the minimum damage a monster can do is 0.
3) You know that it will ALWAYS be between 1 and 3 characters long.
4) You know that there are no other digits in the test string.

At this point, I'd like to introduce the FLAG value of "1" and the grouping characters "()". The flag value of "1" means that SRE will not only match your pattern, but also return an array, with each element of the array consisting of a captured "group" of characters. So without veering off course too much, take this example:

Example 4

; One group: the array holds a single captured string.
$asResult = StringRegExp("This is a test example", '(test)', 1)
If @error = 0 Then
     MsgBox(0, "SRE Example 4 Result", $asResult[0])
EndIf
; Two groups: each capture gets its own array element.
$asResult = StringRegExp("This is a test example", '(te)(st)', 1)
If @error = 0 Then
     MsgBox(0, "SRE Example 4 Result", $asResult[0] & "," & $asResult[1])
EndIf

So, first the pattern must match somewhere in the test string. If it does, then SRE is told to "capture" any groups ("()") and store them in the return array. You can use multiple captures, as demonstrated by the second piece of code in Example 4.

Ok, back to the Gnarly Monster. Now that we know how to "capture" text, let's construct our pattern. Since you know what you're looking for is digits, there are 3 ways to specify "match any digit": "[[:digit:]]", "[0-9]", and "\d". The first is probably the easiest to understand. There are a few named classes (digit, alnum, space, etc. Check the helpfile for a full list) you can use to specify sets of characters, one of them being digit; a named class is written as "[:digit:]" and has to sit inside a set, hence the doubled brackets. "[0-9]" just specifies a range of all the digits 0 through 9. "\d" is a special character that means the same as the first two. For plain digits like the ones in this log there is no difference between the three, and as with most SREs there are usually at least a couple of ways to construct any pattern.

So, first we know we want to capture the digits, so indicate that with the opening parenthesis "(". Next, we know we want to capture between 1 and 3 characters, all consisting of digits, so our pattern now looks like "([0-9]{1,3}". And finally close it off with the closing parenthesis to indicate the end of our group: "([0-9]{1,3})". Let's try it:

Example 5

; Capture the first run of 1 to 3 digits in the log line.
$asResult = StringRegExp("Gnarly Monster hits you for 18 damage.", _
                               '([0-9]{1,3})', 1)
If @error = 0 Then
     MsgBox(0, "SRE Example 5 Result", $asResult[0])
EndIf

There you go, the message box correctly displays "18".
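
To compare the three digit notations mentioned earlier, here is a small sketch that runs each of them against the same log line. The variable names are only illustrative, and the named class is written with doubled brackets because a class like [:digit:] has to sit inside a set.

```autoit
; Three ways to write "one to three digits", each captured as a group.
Local $sLine = "Gnarly Monster hits you for 18 damage."
Local $asPatterns[3] = ['([[:digit:]]{1,3})', '([0-9]{1,3})', '(\d{1,3})']

For $i = 0 To UBound($asPatterns) - 1
    ; Flag 1 returns an array holding the captured groups of the first match.
    Local $asResult = StringRegExp($sLine, $asPatterns[$i], 1)
    If @error = 0 Then
        ConsoleWrite($asPatterns[$i] & " -> " & $asResult[0] & @CRLF)
    EndIf
Next
```

Each pattern should capture the same "18" here, so which one you use is mostly a matter of which reads best to you.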

Next we need to cover non-capturing groups. The way you indicate these groups is by opening the group with "(?:" instead of just "(". Let's say your log says "You deflect 36 of Gnarly Monster's 279 damage." Now if you run Example 5's SRE on this, you'll come up with "36" instead of "279". Now what I like to do here is just determine what's different between the numbers. One that jumps out at me is that the second number is always followed by a space and then the word "damage". We could just modify our previous pattern to be "([0-9]{1,3} damage)", but what if our script is just looking for the amount of damage, without " damage" tacked onto the end of the number? Here's where you can use a non-capturing group to accomplish this.

Example 6

; " damage" must follow the digits, but (?: ) keeps it out of the result array.
$asResult = StringRegExp("You deflect 36 of Gnarly Monster's 279 damage.", '([0-9]{1,3})(?: damage)', 1)
If @error = 0 Then
     MsgBox(0, "SRE Example 6 Result", $asResult[0])
EndIf
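
To see what the non-capturing group buys you, the sketch below runs both variants discussed above on the same line. The variable names are only illustrative, and the square brackets in the output are there just to make the captured text easy to see.

```autoit
Local $sLine = "You deflect 36 of Gnarly Monster's 279 damage."

; " damage" inside the capturing group: the word comes back with the number.
Local $asWith = StringRegExp($sLine, '([0-9]{1,3} damage)', 1)
If @error = 0 Then ConsoleWrite("[" & $asWith[0] & "]" & @CRLF)

; Non-capturing group: " damage" is still required for a match,
; but only the digits are stored in the array.
Local $asWithout = StringRegExp($sLine, '([0-9]{1,3})(?: damage)', 1)
If @error = 0 Then ConsoleWrite("[" & $asWithout[0] & "]" & @CRLF)
```

The first call yields "279 damage" and the second just "279". In both cases "36" is skipped, because it is not followed by " damage".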

This could get lengthy, but mostly I just wanted to lay out the foundation for how regular expressions work, and mainly how SRE "thinks". A few things to keep in mind:
- Remember to think about the pattern one character at a time
- The StringRegExp() function finds the first character in the pattern, then it's your job to provide enough evidence to "prove" whether or not it truly is a match. Example 6 is a good display of this.
- Remember [ ... ] means OR ([xyz] match an "x", a "y", OR a "z")

If you have any other questions, consult the help file first! It explains in detail all of the nitty gritty syntax that comes along with SREs. One thing to look at in particular is the section on "Repeating Characters". It can make your pattern more readable by substituting certain characters for ranges. For example: "*" is equivalent to {0,}, or the range from 0 to any number of characters.
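
As a quick illustration of those repeating characters, here is a sketch that compares a range with its shorthand. Only "*" is mentioned above; "+" (the same as {1,}) and "?" (the same as {0,1}) are the other standard shorthands. The variable names are only illustrative.

```autoit
Local $sLine = "Gnarly Monster hits you for 18 damage."

; Long form: one or more digits, written as an open-ended range.
Local $asLong = StringRegExp($sLine, 'for ([0-9]{1,}) damage', 1)
If @error = 0 Then ConsoleWrite("{1,} -> " & $asLong[0] & @CRLF)

; Short form: "+" is shorthand for {1,}.
Local $asShort = StringRegExp($sLine, 'for ([0-9]+) damage', 1)
If @error = 0 Then ConsoleWrite("+    -> " & $asShort[0] & @CRLF)
```

Both patterns capture the same "18"; the shorthand is simply easier to read once you know it.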

Good luck! Regular expressions can greatly decrease the length of your code and make it easier to modify later. Corrections and feedback are welcome!

Resources


A GUI for StringRegExp() examples, for testing all kinds of patterns
