Problem description:

Is there a way to get sets of the pattern symbols?

For example, I have a regular expression `[az]+[A-Z]*`. Then the symbol set of the first symbol is `a` and `z`. Then the symbol set of the second symbol is `a` and `z`. Then the symbol set of the third symbol is `a` and `z`. ....

The task is: I have a pattern and a string. Now I want to know whether the string starts with the same characters as one of the strings that match the pattern.

**UPDATE:**

For example, I have a regular expression `[az]\\:[A-Z]*`. Then the symbol set of the first symbol is `a` and `z`. Then the symbol set of the second symbol is `:`. Then the symbol set of the third symbol is `A-Z`. Then the symbol set of the fourth symbol is `A-Z`. ....

It sounds like you are asking for a function that takes a regular expression as an argument and returns a set of characters that could match at a given offset into a string to be matched:

```
Set<Character> getSymbols(String regEx, int offset);
```

This is non-trivial.

Using your example:

```
getSymbols("[az]\\:[A-Z]*", 1)
```

should return ['a', 'z'],

```
getSymbols("[az]\\:[A-Z]*", 2)
```

should return [':'],

```
getSymbols("[az]\\:[A-Z]*", 3)
```

should return ['A', 'B', 'C', ..... 'Y', 'Z']

But this is a trivial input. What if the input was:

```
getSymbols("[abc]*FRED[xzy]*", 5)
```

Now you have to factor in the fact that any number of "abc" characters could precede FRED, which shifts everything after it, leading to a result set like this for offsets 1 through 5:

```
1: ['a', 'b', 'c', 'F']
2: ['a', 'b', 'c', 'F', 'R']
3: ['a', 'b', 'c', 'F', 'R', 'E']
4: ['a', 'b', 'c', 'F', 'R', 'E', 'D']
5: ['a', 'b', 'c', 'x', 'y', 'z', 'F', 'R', 'E', 'D']
```

The code that solves this has to parse regular expressions, which are highly expressive with all their escape sequences (\s for whitespace, \w for word characters, and so on), and then needs a recursive algorithm to build the output set.
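To make the shifting effect concrete, here is a minimal sketch of `getSymbols` for a deliberately tiny regex subset: literal characters, backslash-escaped literals, bracket classes with ranges, and the `*`, `+` and `?` quantifiers. It is not a general solution. Instead of recursion it tracks, offset by offset, the set of pattern elements that could consume the next character. The `Item` class and the `parse` and `addFrom` helpers are hypothetical names introduced for illustration, and offsets are assumed to be 1-based as in the examples above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class SymbolSets {
    // One pattern element: the characters it accepts plus its quantifier flags.
    private static final class Item {
        final Set<Character> chars = new TreeSet<>();
        boolean optional;   // true for '*' and '?'
        boolean repeatable; // true for '*' and '+'
    }

    // Parses the supported subset only; anything else is treated as a literal.
    private static List<Item> parse(String regex) {
        List<Item> items = new ArrayList<>();
        int i = 0;
        while (i < regex.length()) {
            Item item = new Item();
            char c = regex.charAt(i++);
            if (c == '[') {
                while (regex.charAt(i) != ']') {
                    char lo = regex.charAt(i++);
                    char hi = lo;
                    // "x-y" inside a class is a range; a trailing '-' is a literal.
                    if (regex.charAt(i) == '-' && regex.charAt(i + 1) != ']') {
                        hi = regex.charAt(i + 1);
                        i += 2;
                    }
                    for (int x = lo; x <= hi; x++) item.chars.add((char) x);
                }
                i++; // skip the closing ']'
            } else if (c == '\\') {
                item.chars.add(regex.charAt(i++)); // escaped literal, e.g. \:
            } else {
                item.chars.add(c);
            }
            if (i < regex.length()) {
                char q = regex.charAt(i);
                if (q == '*') { item.optional = true; item.repeatable = true; i++; }
                else if (q == '+') { item.repeatable = true; i++; }
                else if (q == '?') { item.optional = true; i++; }
            }
            items.add(item);
        }
        return items;
    }

    // Adds element k, and every following element reachable by skipping optional ones.
    private static void addFrom(List<Item> items, int k, Set<Integer> out) {
        while (k < items.size()) {
            out.add(k);
            if (!items.get(k).optional) break;
            k++;
        }
    }

    static Set<Character> getSymbols(String regEx, int offset) {
        List<Item> items = parse(regEx);
        Set<Integer> active = new TreeSet<>();
        addFrom(items, 0, active);
        // Advance one character at a time until the requested offset is reached.
        for (int pos = 1; pos < offset; pos++) {
            Set<Integer> next = new TreeSet<>();
            for (int k : active) {
                if (items.get(k).repeatable) next.add(k); // may consume again
                addFrom(items, k + 1, next);              // or move on
            }
            active = next;
        }
        Set<Character> result = new TreeSet<>();
        for (int k : active) result.addAll(items.get(k).chars);
        return result;
    }
}
```

With this sketch, `SymbolSets.getSymbols("[abc]*FRED[xzy]*", 5)` yields the ten characters listed for offset 5 above. Supporting groups, alternation, predefined classes or counted repetition would need a real parser and a proper automaton, which is where most of the difficulty lies.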

If this is what you intend, the next question is, "What problem are you really trying to solve?"
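If the underlying problem is the one stated in the question (does a string start the same way as some string that matches the pattern?), computing symbol sets may not be necessary at all. Java's `Matcher.hitEnd()` reports whether the last match attempt ran into the end of the input, so a failed `matches()` combined with `hitEnd()` suggests that more input could still lead to a match. The sketch below uses that idiom; the class and method names `PrefixCheck` and `couldStartMatch` are made up for illustration, and it assumes straightforward patterns (exotic constructs such as lookarounds may make `hitEnd()` less reliable as a prefix test).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrefixCheck {
    // Returns true if the input matches the pattern fully, or if the match
    // attempt failed only because the engine ran out of input.
    static boolean couldStartMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        // matches() attempts to match the entire input against the pattern.
        // hitEnd() is true when that attempt reached the end of the input.
        return m.matches() || m.hitEnd();
    }
}
```

For the pattern `[az]\\:[A-Z]*`, this accepts `"a"` and `"a:AB"` as possible beginnings of a match and rejects `"b"` and `"a:a"`.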