Personal Tools: From Plaintext Strings to Regex
February 10th, 2010
Categories: Applications
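Recently at work I have needed to convert a certain kind of plaintext string into simple regular expressions. This can of course be done by hand, but large amounts of repetitive work are better left to the machine. Last night I wrote a script for this, and I am posting it here. It was written to handle my own work, so it is here only as a record and an archive, and it may be of little practical use to anyone else. That said, converting existing plaintext strings into regular expressions should be a nice problem, and interested readers are welcome to dig deeper.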
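It is worth mentioning that the code uses Perl-only features such as `$&` and `(?{ })`, which clarified my thinking and simplified the code. Without them, the code would be about five times longer. On efficiency, I had heard that after `use English`, `$MATCH` is five times faster than using `$&` directly, but `$MATCH` is only an alias for `$&` and carries the same cost; on Perl 5.10 and later, the cheaper alternative is `${^MATCH}` together with the `/p` modifier. For a command-line program that runs once per input, though, `$&` is good enough.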
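For comparison, here is a minimal sketch of another Perl-specific way to compute the run length: the `/e` modifier evaluates the replacement as code, and `/p` makes `${^MATCH}` available. This is not what the script uses, and the helper name `runs_to_class` and its parameters are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: replace each run of a placeholder character with
# a character class followed by a {n} quantifier, where n is the run length.
sub runs_to_class {
    my ($text, $placeholder, $class) = @_;
    # /e evaluates the replacement as Perl code for every match;
    # /p makes ${^MATCH} (the matched text) available on Perl 5.10+.
    $text =~ s/\Q$placeholder\E+/ $class . '{' . length(${^MATCH}) . '}' /gpe;
    return $text;
}

print runs_to_class('000-00', '0', '[0-9]'), "\n";   # prints [0-9]{3}-[0-9]{2}
```

Because the length is computed inside the replacement itself, no variable has to be carried from the pattern into the replacement.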
A real-world example:
```
perl hash2re.pl H:aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA0.zip/H:aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0/aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0/aaa/Aaaaa/aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0.exe
RE 1:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}[0-9]\.zip$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA0.zip"

RE 2:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}-[0-9]$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0"

RE 3:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}-[0-9]$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0"

RE 4:   ^[a-z]{3}$
        Matches: "aaa"

RE 5:   ^[A-Z][a-z]{4}$
        Matches: "Aaaaa"

RE 6:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}-[0-9]\.exe$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0.exe"
```
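The six results above come from six path segments. The following sketch isolates the splitting step, using the same separator pattern as the script (`/` and the `H:` prefix). The function name `split_segments` and the shortened input are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split a path-like string on "/" and "H:" markers
# and drop empty fields, such as the one produced by a leading "H:".
sub split_segments {
    my ($input) = @_;
    return grep { length($_) > 0 } split m{\s*(?:/|H:)+\s*}, $input;
}

my @segments = split_segments('H:aaa-AAA0.zip/H:aaa-AAA-0/aaa/Aaaaa');
print "$_\n" for @segments;
# prints:
# aaa-AAA0.zip
# aaa-AAA-0
# aaa
# Aaaaa
```

A separator such as `/H:` is consumed as one delimiter because the pattern allows one or more consecutive markers.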
Source code:
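```perl
#!/usr/bin/perl
# by rex zhang
# Feb 09 2010 in Shanghai
# usage: split and regexize hashed filename
#

my $lines=$ARGV[0];

# Drop clear-text segments (those starting with "C:") and report them.
while($lines =~ m#(C:[^/]+)#) {
    $c=$1;
    # \Q...\E: treat regex metacharacters in the segment literally,
    # otherwise the removal can fail and the loop never ends.
    $lines =~ s/\Q$c\E//;
    print "ClearText Filename Ignored:\t\"$c\"\n\n";
}

# Split the rest into segments on "/" and "H:".
my @array=split(m!\s*(?:\/|H:)+\s*!, $lines);
my $counter=0;

foreach $line (@array){
    next if not $line;
    my $re=$line;
    local $len;

    # Escape regex metacharacters that appear literally in the segment.
    $re =~ s/(?=[.\[\]()])/\\/g;
    # "?" stands for any single character.
    $re =~ s/\?/./g;
    # Runs of 0, A and a become digit, upper-case and lower-case classes;
    # (?{ }) stores the run length in $len for use in the replacement.
    $re =~ s/0+(?{ $len=length($&)})/[0-9]\{$len\}/g;
    $re =~ s/A+(?{ $len=length($&)})/[A-Z]\{$len\}/g;
    $re =~ s/a+(?{ $len=length($&)})/[a-z]\{$len\}/g;
    # Any other repeated character gets a {n} quantifier.
    $re =~ s/(.)\1+(?{ $len=length($&)})/$1\{$len\}/g;
    # A {1} quantifier is redundant.
    $re =~ s/\{1\}//g;
    $re = "\^$re\$";
    $counter++;

    # Check the generated regex against the segment it was built from.
    if ($line =~ /$re/) {
        print "RE $counter:\t$re\n";
        print "\tMatches: \"$line\"\n";
    }
    else {
        print "RE $counter:\t$re\n";
        print "\tFailed: \"$line\"\n";
    }
    print "\n";
}
```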
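The script applies its substitutions in several passes over the same string. As far as I can tell, the final repeated-character pass also rescans quantifiers produced by earlier passes, so a run length such as 11 would be rewritten as `1{2}`. The sketch below is an alternative single-pass variant that consumes one run at a time and avoids that. It is not the script's method: `template_to_regex` and `%class` are hypothetical names, and `quotemeta` escapes more characters than the script does (for example, `-` becomes `\-`), which does not change what the regex matches.

```perl
use strict;
use warnings;

# Placeholder characters and the class each one stands for
# (same meanings as in the script above).
my %class = ('0' => '[0-9]', 'A' => '[A-Z]', 'a' => '[a-z]', '?' => '.');

# Hypothetical helper: build an anchored regex from a template string,
# handling each run of identical characters exactly once.
sub template_to_regex {
    my ($template) = @_;
    my $body = '';
    # \G continues where the previous match ended; (.)\2* grabs one run.
    while ($template =~ /\G((.)\2*)/gs) {
        my ($run, $char) = ($1, $2);
        my $atom = exists $class{$char} ? $class{$char} : quotemeta($char);
        my $n = length $run;
        $body .= $n > 1 ? $atom . '{' . $n . '}' : $atom;
    }
    return '^' . $body . '$';
}

print template_to_regex(('a' x 11) . '.zip'), "\n";   # prints ^[a-z]{11}\.zip$
```

Since generated text is appended to a separate string and never scanned again, digits inside a quantifier cannot be mistaken for a repeated character.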
Awesome! rex++
rex Reply:
February 11th, 2010 at 12:28 am