Personal Tools: From Plaintext Strings to Regular Expressions

February 10th, 2010 Categories: Applications

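Recently at work I needed to convert a certain kind of plaintext string into simple regular expressions. Doing it by hand is possible, but that much repetitive labor is better left to a machine. Last night I wrote a script for the job and am posting it here. Since it was written for my own work, it is posted mainly as a record and archive, and may be of little practical use to anyone else. That said, converting existing plaintext strings into regular expressions is a nice problem, and interested readers are welcome to dig deeper.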

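Worth mentioning: the code uses $& and (?{}), two Perl-only features that clarified the approach and simplified the code. Without them, by my estimate, the code would be about five times longer. On efficiency, I had heard that after use English, $MATCH is five times faster than $& used directly, but that does not hold: $MATCH is only an alias for $&, so it costs the same. What is true is that in Perl versions before 5.20, using either variable anywhere in a program slows down every regex match in it. For a command-line program that runs once per invocation, $& is good enough.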
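To make the two features concrete, here is a minimal sketch (not taken from the script itself) that collapses a run of the letter a into a counted character class. The first variant records the run length from inside the pattern with (?{ }) and $&, as the script does; the second is a comparison written with the /e modifier. The sub names are made up for this illustration.

```perl
use strict;
use warnings;

# Package variable: visible to code blocks embedded in a pattern.
our $len;

# Variant 1: (?{ ... }) runs once the run has matched;
# at that point $& holds the text matched so far.
sub collapse_with_code_block {
    my ($s) = @_;
    $s =~ s/a+(?{ $len = length($&) })/[a-z]\{$len\}/g;
    return $s;
}

# Variant 2: capture the run and build the replacement as an expression (/e).
sub collapse_with_e {
    my ($s) = @_;
    $s =~ s/(a+)/'[a-z]{' . length($1) . '}'/ge;
    return $s;
}

print collapse_with_code_block('aaa-aa'), "\n";   # [a-z]{3}-[a-z]{2}
print collapse_with_e('aaa-aa'), "\n";            # [a-z]{3}-[a-z]{2}
```

Both variants give the same result here; which one reads better is largely a matter of taste.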

A real-world example:

perl hash2re.pl H:aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA0.zip/H:aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0/aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0/aaa/Aaaaa/aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0.exe
RE 1:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}[0-9]\.zip$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA0.zip"

RE 2:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}-[0-9]$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0"

RE 3:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}-[0-9]$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0"

RE 4:   ^[a-z]{3}$
        Matches: "aaa"

RE 5:   ^[A-Z][a-z]{4}$
        Matches: "Aaaaa"

RE 6:   ^[a-z]{3}-[A-Z][a-z]{3}-[A-Z][a-z]{3}[A-Z][a-z]{6}[A-Z][a-z]{7}-[A-Z]{3}-[0-9]\.exe$
        Matches: "aaa-Aaaa-AaaaAaaaaaaAaaaaaaa-AAA-0.exe"
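The six patterns above correspond to the six path components of the argument. As a rough sketch of how that split works, the snippet below applies the same separator pattern the script uses to a shortened, made-up argument. The helper name split_components is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

# split_components is a hypothetical helper name used only for this sketch.
sub split_components {
    my ($arg) = @_;
    # A separator is any run of "/" and "H:" markers,
    # with optional whitespace on either side.
    my @fields = split m{\s*(?:/|H:)+\s*}, $arg;
    # The leading "H:" leaves an empty first field; drop empty fields.
    return grep { length } @fields;
}

# Prints four components, one per line: aaa0.zip, aaa-0, aaa, Aaaaa
print "$_\n" for split_components('H:aaa0.zip/H:aaa-0/aaa/Aaaaa');
```

Because "/H:" is consumed as a single separator, a hashed component that follows a slash does not produce an empty field in between.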

Source code:

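#!/usr/bin/perl

#   by rex zhang
#   Feb 09 2010 in Shanghai

#   usage: split a hashed filename into components and regexize each one
#

use strict;
use warnings;

my $lines = defined $ARGV[0] ? $ARGV[0] : '';

# Clear-text components (prefixed with "C:") need no regex; report and drop them.
while ($lines =~ m#(C:[^/]+)#)
{
    my $c = $1;
    $lines =~ s/\Q$c\E//;    # \Q...\E: match the component literally
    print "ClearText Filename Ignored:\t\"$c\"\n\n";
}

# Split on "/" and on the "H:" marker in front of hashed components.
my @array = split(m!\s*(?:/|H:)+\s*!, $lines);

our $len;    # package variable, set inside the (?{ }) code blocks below
my $counter = 0;
foreach my $line (@array) {
    next unless length $line;    # skip empty fields, but keep a literal "0"
    my $re = $line;

    $re =~ s/(?=[.\[\](){}+*^\$|])/\\/g;                      # escape regex metacharacters
    $re =~ s/\?/./g;                                          # "?" stands for any character
    $re =~ s/0+(?{ $len = length($&) })/[0-9]\{$len\}/g;      # run of digits
    $re =~ s/A+(?{ $len = length($&) })/[A-Z]\{$len\}/g;      # run of uppercase letters
    $re =~ s/a+(?{ $len = length($&) })/[a-z]\{$len\}/g;      # run of lowercase letters
    $re =~ s/(.)\1+(?{ $len = length($&) })/$1\{$len\}/g;     # any other repeated character
    $re =~ s/\{1\}//g;                                        # a {1} quantifier is redundant
    $re =  "^$re\$";                                          # anchor at both ends

    $counter++;
    # Sanity check: the template itself must match the pattern generated from it.
    my $status = ($line =~ /$re/) ? "Matches" : "Failed";
    print "RE $counter:\t$re\n";
    print "\t$status: \"$line\"\n";
    print "\n";
}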
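For comparison, the same mapping can also be sketched as a single pass over the template, one run of identical characters at a time, instead of a chain of substitutions. This is only an alternative under my own assumptions, not the method used by the script above: the name pattern_for and the lookup table are invented for the illustration, and the set of escaped metacharacters is my choice.

```perl
use strict;
use warnings;

# Template characters and the character classes they stand for.
my %class = ('0' => '[0-9]', 'A' => '[A-Z]', 'a' => '[a-z]', '?' => '.');

# pattern_for is a hypothetical name; it is not part of hash2re.pl.
sub pattern_for {
    my ($template) = @_;
    my $re = '';
    # Consume the template one run of identical characters at a time.
    while ($template =~ /\G((.)\2*)/gs) {
        my ($run, $ch) = ($1, $2);
        my $atom = exists $class{$ch}                ? $class{$ch}  # mapped class
                 : $ch =~ /[.\[\](){}+*^\$|\\]/      ? "\\$ch"      # escaped metacharacter
                 :                                     $ch;         # plain literal
        my $n = length $run;
        $re .= $n > 1 ? "$atom\{$n\}" : $atom;    # omit the redundant {1}
    }
    return "^$re\$";
}

print pattern_for('Aaaaa'), "\n";          # ^[A-Z][a-z]{4}$
print pattern_for('aaa-AAA0.zip'), "\n";   # ^[a-z]{3}-[A-Z]{3}[0-9]\.zip$
```

A single pass avoids one substitution accidentally rewriting text produced by an earlier one, at the cost of losing the one-line-per-rule layout of the original.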

2 Responses to “Personal Tools: From Plaintext Strings to Regular Expressions”

  1. February 10th, 2010 at 17:00

    Awesome! rex++


    rex Reply:

    :)

