I can't deal with a regular expression to separate the argument from function.

The function takes arguments in following way:

``FunctionName(arg1;arg2;...;argn)``

Now to make the rest of my code work I need to do the following-put every argument in ():

``FunctionName((arg1);(arg2);(arg3))``

The problem is that the arg can be anything- a number, an operator, other function

The test code for the solution is:

The function before regexp:

``Function1((a1^5-4)/2;1/sin(a2);a3;a4)+Function2(a1;a2;1/a3)``

After i needd to get sth like this:

``Function1(((a1^5-4)/2);(1/sin(a2));(a3);(a4))+Function2((a1);(a2);(1/a3))``

Unless I'm missing something, isn't it as simple as replacing `;` with `);(` and surrounding the whole thing in `(` `)` ?

Using `Regex`:

``````(?:([^;()]+);?)+
``````

and `LINQ`:

``````string result = "FunctionName(" +
String.Join(";",
from Capture capture in
Regex.Matches(inputString, @"FunctionName\((?:([^;()]+);?)+\)")[0].Groups[1].
Captures
select "(" + capture.Value + ")") + ")";
``````

This is a far cry from a `Regex` but the potential for nested functions combined with the fact that this is a structured language being modified that a lexer/parser scheme is more appropriate.

Here is an example of a system that processes things of this nature

First, we define something that can be located in the input (the expression to modify)

``````public interface ISourcePart
{
/// <summary>
/// Gets the string representation of the kind of thing we're working with
/// </summary>
string Kind { get; }

/// <summary>
/// Gets the position this information is found at in the original source
/// </summary>
int Position { get; }

/// <summary>
/// Gets a representation of this data as Token objects
/// </summary>
/// <returns>An array of Token objects representing the data</returns>
Token[] AsTokens();
}
``````

Next, we'll define a construct for housing tokens (identifiable portions of the source text)

``````public class Token : ISourcePart
{
public int Position { get; set; }

public Token[] AsTokens()
{
return new[] {this};
}

public string Kind { get; set; }

/// <summary>
/// Gets or sets the value of the token
/// </summary>
public string Value { get; set; }

/// <summary>
/// Creates a new Token
/// </summary>
/// <param name="kind">The kind (name) of the token</param>
/// <param name="match">The Match the token is to be generated from</param>
/// <param name="index">The offset from the beginning of the file the index of the match is relative to</param>
/// <returns>The newly created token</returns>
public static Token Create(string kind, Match match, int index)
{
return new Token
{
Position = match.Index + index,
Kind = kind,
Value = match.Value
};
}

/// <summary>
/// Creates a new Token
/// </summary>
/// <param name="kind">The kind (name) of the token</param>
/// <param name="value">The value to assign to the token</param>
/// <param name="position">The absolute position in the source file the value is located at</param>
/// <returns>The newly created token</returns>
public static Token Create(string kind, string value, int position)
{
return new Token
{
Kind = kind,
Value = value,
Position = position
};
}
}
``````

We'll use `Regex`es to find our tokens in this example (below - Excerpt from Program.cs in my demo project).

``````/// <summary>
/// Breaks an input string into recognizable tokens
/// </summary>
/// <param name="source">The input string to break up</param>
/// <returns>The set of tokens located within the string</returns>
static IEnumerable<Token> Tokenize(string source)
{
var tokens = new List<Token>();
var sourceParts = new[] { new KeyValuePair<string, int>(source, 0) };

}
``````

As you can see, I've defined patterns for open and close parenthesis, semicolons, basic math operators and letters and numbers.

The `Tokenize` method is defined as follows (again from Program.cs in my demo project)

``````/// <summary>
/// Performs tokenization of a collection of non-tokenized data parts with a specific pattern
/// </summary>
/// <param name="tokenKind">The name to give the located tokens</param>
/// <param name="pattern">The pattern to use to match the tokens</param>
/// <param name="untokenizedParts">The portions of the input that have yet to be tokenized (organized as text vs. position in source)</param>
/// <returns>The set of tokens matching the given pattern located in the untokenized portions of the input, <paramref name="untokenizedParts"/> is updated as a result of this call</returns>
static IEnumerable<Token> Tokenize(string tokenKind, string pattern, ref KeyValuePair<string, int>[] untokenizedParts)
{
//Do a bit of setup
var resultParts = new List<KeyValuePair<string, int>>();
var resultTokens = new List<Token>();
var regex = new Regex(pattern);

//Look through all of our currently untokenized data
foreach (var part in untokenizedParts)
{
//Find all of our available matches
var matches = regex.Matches(part.Key).OfType<Match>().ToList();

//If we don't have any, keep the data as untokenized and move to the next chunk
if (matches.Count == 0)
{
continue;
}

//Store the untokenized data in a working copy and save the absolute index it reported itself at in the source file
var workingPart = part.Key;
var index = part.Value;

//Look through each of the matches that were found within this untokenized segment
foreach (var match in matches)
{
//Calculate the effective start of the match within the working copy of the data
var effectiveStart = match.Index - (part.Key.Length - workingPart.Length);

//If we didn't match at the beginning, save off the first portion to the set of untokenized data we'll give back
if (effectiveStart > 0)
{
var value = workingPart.Substring(0, effectiveStart);
}

//Get rid of the portion of the working copy we've already used
if (match.Index + match.Length < part.Key.Length)
{
workingPart = workingPart.Substring(effectiveStart + match.Length);
}
else
{
workingPart = string.Empty;
}

//Update the current absolute index in the source file we're reporting to be at
index += effectiveStart + match.Length;
}

//If we've got remaining data in the working copy, add it back to the untokenized data
if (!string.IsNullOrEmpty(workingPart))
{
}
}

//Update the untokenized data to contain what we couldn't process with this pattern
untokenizedParts = resultParts.ToArray();
//Return the tokens we were able to extract
return resultTokens;
}
``````

Now that we've got the methods and types in place to handle our tokenized data, we need to be able to recognize pieces of larger meaning, like calls to simple functions (like `sin(x)`), complex functions (like `Function1(a1;a2;a3)`), basic mathematical operations (like `+`, `-`, `*`, etc.), and so on. We'll make a simple parser for dealing with that; firstly we'll define a match condition for a parse node.

``````public class ParseNodeDefinition
{
/// <summary>
/// The set of parse node definitions that could be transitioned to from this one
/// </summary>

/// <summary>
/// Creates a new ParseNodeDefinition
/// </summary>
private ParseNodeDefinition()
{
_nextNodeOptions = new List<ParseNodeDefinition>();
}

/// <summary>
/// Gets whether or not this definition is an acceptable ending point for the parse tree
/// </summary>
public bool IsValidEnd { get; private set; }

/// <summary>
/// Gets the name an item must have for it to be matched by this definition
/// </summary>
public string MatchItemsNamed { get; private set; }

/// <summary>
/// Gets the set of parse node definitions that could be transitioned to from this one
/// </summary>
public IEnumerable<ParseNodeDefinition> NextNodeOptions
{
get { return _nextNodeOptions; }
}

/// <summary>
/// Gets or sets the tag that will be associated with the data if matched
/// </summary>
public string Tag { get; set; }

/// <summary>
/// Creates a new ParseNodeDefinition matching items with the specified name/kind.
/// </summary>
/// <param name="matchItemsNamed">The name of the item to be matched</param>
/// <param name="tag">The tag to associate with matched items</param>
/// <param name="isValidEnd">Whether or not the element is a valid end to the parse tree</param>
/// <returns>A ParseNodeDefinition capable of matching items of the given name</returns>
public static ParseNodeDefinition Create(string matchItemsNamed, string tag, bool isValidEnd)
{
return new ParseNodeDefinition { MatchItemsNamed = matchItemsNamed, Tag = tag, IsValidEnd = isValidEnd };
}

{
}

public ParseNodeDefinition AddOption(string matchItemsNamed, string tag)
{
}

/// <summary>
/// Adds an option for a named node to follow this one in the parse tree the node is a part of
/// </summary>
/// <param name="matchItemsNamed">The name of the item to be matched</param>
/// <param name="tag">The tag to associate with matched items</param>
/// <param name="isValidEnd">Whether or not the element is a valid end to the parse tree</param>
/// <returns>The ParseNodeDefinition that has been added</returns>
public ParseNodeDefinition AddOption(string matchItemsNamed, string tag, bool isValidEnd)
{
var node = Create(matchItemsNamed, tag, isValidEnd);
return node;
}

public ParseNodeDefinition AddOption(string matchItemsNamed, bool isValidEnd)
{
}

/// <summary>
/// Links the given node as an option for a state to follow this one in the parse tree this node is a part of
/// </summary>
/// <param name="next">The node to add as an option</param>
{
}
}
``````

This will let us match a single element by name (whether it's a `ParseTree` defined later) or a `Token` as they both implement the `ISourcePart` interface. Next we'll make a `ParseTreeDefinition` that allows us to specify sequences of `ParseNodeDefinitions` for matching.

``````public class ParseTreeDefinition
{
/// <summary>
/// The set of parse node definitions that constitute an initial match to the parse tree
/// </summary>

/// <summary>
/// Creates a new ParseTreeDefinition
/// </summary>
/// <param name="name">The name to give to parse trees generated from full matches</param>
public ParseTreeDefinition(string name)
{
_initialNodeOptions = new List<ParseNodeDefinition>();
Name = name;
}

/// <summary>
/// Gets the set of parse node definitions that constitute an initial match to the parse tree
/// </summary>
public IEnumerable<ParseNodeDefinition> InitialNodeOptions { get { return _initialNodeOptions; } }

/// <summary>
/// Gets the name of the ParseTreeDefinition
/// </summary>
public string Name { get; private set; }

/// <summary>
/// Adds an option for a named node to follow this one in the parse tree the node is a part of
/// </summary>
/// <param name="matchItemsNamed">The name of the item to be matched</param>
/// <returns>The ParseNodeDefinition that has been added</returns>
{
}

/// <summary>
/// Adds an option for a named node to follow this one in the parse tree the node is a part of
/// </summary>
/// <param name="matchItemsNamed">The name of the item to be matched</param>
/// <param name="tag">The tag to associate with matched items</param>
/// <returns>The ParseNodeDefinition that has been added</returns>
public ParseNodeDefinition AddOption(string matchItemsNamed, string tag)
{
}

/// <summary>
/// Adds an option for a named node to follow this one in the parse tree the node is a part of
/// </summary>
/// <param name="matchItemsNamed">The name of the item to be matched</param>
/// <param name="tag">The tag to associate with matched items</param>
/// <param name="isValidEnd">Whether or not the element is a valid end to the parse tree</param>
/// <returns>The ParseNodeDefinition that has been added</returns>
public ParseNodeDefinition AddOption(string matchItemsNamed, string tag, bool isValidEnd)
{
var node = ParseNodeDefinition.Create(matchItemsNamed, tag, isValidEnd);
return node;
}

/// <summary>
/// Adds an option for a named node to follow this one in the parse tree the node is a part of
/// </summary>
/// <param name="matchItemsNamed">The name of the item to be matched</param>
/// <param name="isValidEnd">Whether or not the element is a valid end to the parse tree</param>
/// <returns>The ParseNodeDefinition that has been added</returns>
public ParseNodeDefinition AddOption(string matchItemsNamed, bool isValidEnd)
{
}

/// <summary>
/// Attempts to follow a particular branch in the parse tree from a given starting point in a set of source parts
/// </summary>
/// <param name="parts">The set of source parts to attempt to match in</param>
/// <param name="startIndex">The position to start the matching attempt at</param>
/// <param name="required">The definition that must be matched for the branch to be followed</param>
/// <param name="nodes">The set of nodes that have been matched so far</param>
/// <returns>true if the branch was followed to completion, false otherwise</returns>
private static bool FollowBranch(IList<ISourcePart> parts, int startIndex, ParseNodeDefinition required, ICollection<ParseNode> nodes)
{
if (parts[startIndex].Kind != required.MatchItemsNamed)
{
return false;
}

return parts.Count > (startIndex + 1) && required.NextNodeOptions.Any(x => FollowBranch(parts, startIndex + 1, x, nodes)) || required.IsValidEnd;
}

/// <summary>
/// Attempt to match the parse tree definition against a set of source parts
/// </summary>
/// <param name="parts">The source parts to match against</param>
/// <returns>true if the parse tree was matched, false otherwise. parts is updated by this method to consolidate matched nodes into a ParseTree</returns>
public bool Parse(ref IList<ISourcePart> parts)
{
var partsCopy = parts.ToList();

for (var i = 0; i < parts.Count; ++i)
{
var tree = new List<ParseNode>();

if (InitialNodeOptions.Any(x => FollowBranch(partsCopy, i, x, tree)))
{
partsCopy.RemoveRange(i, tree.Count);
partsCopy.Insert(i, new ParseTree(Name, tree.ToArray(), tree[0].Position));
parts = partsCopy;
return true;
}
}

return false;
}
}
``````

Of course these don't do us much good without having some place to store the results of the matchers we've defined so far, so let's define `ParseTree` and `ParseNode` where a `ParseTree` is simply a collection of `ParseNode` objects where `ParseNode` is a wrapper around a `ParseTree` or `Token` (or more generically any `ISourcePart`).

``````public class ParseTree : ISourcePart
{
/// <summary>
/// Creates a new ParseTree
/// </summary>
/// <param name="kind">The kind (name) of tree this is</param>
/// <param name="nodes">The nodes the tree matches</param>
/// <param name="position">The position in the source file this tree is located at</param>
public ParseTree(string kind, IEnumerable<ISourcePart> nodes, int position)
{
Kind = kind;
ParseNodes = nodes.ToList();
Position = position;
}

public string Kind { get; private set; }

public int Position { get; private set; }

/// <summary>
/// Gets the nodes that make up this parse tree
/// </summary>
public IList<ISourcePart> ParseNodes { get; internal set; }

public Token[] AsTokens()
{
return ParseNodes.SelectMany(x => x.AsTokens()).ToArray();
}
}

public class ParseNode : ISourcePart
{
/// <summary>
/// Creates a new ParseNode
/// </summary>
/// <param name="sourcePart">The data that was matched to create this node</param>
/// <param name="tag">The tag data (if any) associated with the node</param>
public ParseNode(ISourcePart sourcePart, string tag)
{
SourcePart = sourcePart;
Tag = tag;
}

public string Kind { get { return SourcePart.Kind; } }

/// <summary>
/// Gets the tag associated with the matched data
/// </summary>
public string Tag { get; private set; }

/// <summary>
/// Gets the data that was matched to create this node
/// </summary>
public ISourcePart SourcePart { get; private set; }

public int Position { get { return SourcePart.Position; } }

public Token[] AsTokens()
{
return SourcePart.AsTokens();
}
}
``````

That's it for the constructs we need, so we'll move into configuring our parse tree definitions. The code from here on is from Program.cs in my demo.

As you might have noticed in the block above about declaring the patterns for each token, there were some values referenced but not defined, here they are.

``````private const string CloseParen = "CloseParen";
private const string ComplexFunctionCall = "ComplexFunctionCall";
private const string FunctionCallStart = "FunctionCallStart";
private const string Literal = "Literal";
private const string OpenParen = "OpenParen";
private const string Operator = "Operator";
private const string ParenthesisRequiredElement = "ParenthesisRequiredElement";
private const string ParenthesizedItem = "ParenthesizedItem";
private const string Semi = "Semi";
private const string SimpleFunctionCall = "SimpleFunctionCall";
``````

Let's begin by defining a pattern that matches literals (`\w+` pattern) that are followed by open parenthesis; we'll use this to match things like `sin(` or `Function1(`.

``````static ParseTreeDefinition CreateFunctionCallStartTree()
{
var tree = new ParseTreeDefinition(FunctionCallStart);
return tree;
}
``````

Really not a whole lot to it, setup a tree, add an option for the first thing to match as a Literal, add an option of the next thing to match as an open parenthesis and say that it can end the parse tree. Now for one that's a little more complex, binary mathematical operations (couldn't think of any unary operations that would need to be included)

``````static ParseTreeDefinition CreateBinaryOperationResultTree()
{
var tree = new ParseTreeDefinition(Literal);

return tree;
}
``````

Here we say that the parse tree can start with a parenthesized item (like `(1/2)`), a literal (like `a5` or `3`), a simple call (like `sin(4)`) or a complex one (like `Function1(a1;a2;a3)`). In essence we've just defined the options for the left hand operand. Next, we say that the parenthesized item must be followed by an Operator (one of the mathematical operators from the pattern declared way up at the beginning) and, for convenience, we'll say that all of the other options for the left hand operand can progress to that same state (having the operator). Next, the operator must have a right hand side as well, so we give it a duplicate set of options to progress to. Note that they are not the same definitions as the left hand operands, these have the flag set to be able to terminate the parse tree. Notice that the parse tree is named Literal to avoid having to specify yet another kind of element to match all over the place. Next up, parenthesized items:

``````static ParseTreeDefinition CreateParenthesizedItemTree()
{
var tree = new ParseTreeDefinition(ParenthesizedItem);

return tree;
}
``````

Nice and easy with this one, start with a parenthesis, follow it up with pretty much anything, follow that with another parenthesis to close it. Simple calls (like `sin(x)`)

``````static ParseTreeDefinition CreateSimpleFunctionCallTree()
{
var tree = new ParseTreeDefinition(SimpleFunctionCall);

return tree;
}
``````

Complex calls (like `Function1(a1;a2;a3)`)

``````static ParseTreeDefinition CreateComplexFunctionCallTree()
{
var tree = new ParseTreeDefinition(ComplexFunctionCall);

return tree;
}
``````

That's all the trees we'll need, so let's take a look at the code that runs this all

``````static void Main()
{
//The input string
const string input = @"Function1((a1^5-4)/2;1/sin(a2);a3;a4)+Function2(a1;a2;1/a3)";
//Locate the recognizable tokens within the source
IList<ISourcePart> tokens = Tokenize(input).Cast<ISourcePart>().ToList();
//Create the parse trees we'll need to be able to recognize the different parts of the input
var functionCallStartTree = CreateFunctionCallStartTree();
var parenthethesizedItemTree = CreateParenthesizedItemTree();
var simpleFunctionCallTree = CreateSimpleFunctionCallTree();
var complexFunctionCallTree = CreateComplexFunctionCallTree();
var binaryOpTree = CreateBinaryOperationResultTree();

//Parse until we can't parse anymore
while (functionCallStartTree.Parse(ref tokens) || binaryOpTree.Parse(ref tokens) || parenthethesizedItemTree.Parse(ref tokens) || simpleFunctionCallTree.Parse(ref tokens) || complexFunctionCallTree.Parse(ref tokens))
{ }

//Run our post processing to fix the parenthesis in the input
FixParenthesis(ref tokens);

//Collapse our parse tree(s) back to a string
var values = tokens.OrderBy(x => x.Position).SelectMany(x => x.AsTokens()).Select(x => x.Value);

//Print out our results and wait
Console.WriteLine(string.Join(string.Empty, values));
}
``````

The only thing we've got left to define is how to actually do the wrapping of the elements in the argument list of a "complex" call. That's handled by the `FixParenthesis` method.

``````private static void FixParenthesis(ref IList<ISourcePart> items)
{
//Iterate through the set we're examining
for (var i = 0; i < items.Count; ++i)
{
var parseNode = items[i] as ParseNode;

//If we've got a parse node...
if (parseNode != null)
{
var nodeTree = parseNode.SourcePart as ParseTree;
//If the parse node represents a parse tree...
if (nodeTree != null)
{
//Fix parenthesis within the tree
var nodes = nodeTree.ParseNodes;
FixParenthesis(ref nodes);
nodeTree.ParseNodes = nodes;
}

//If this parse node required parenthesis, replace the subtree and add them
if (parseNode.Tag == ParenthesisRequiredElement)
{
var nodeContents = parseNode.AsTokens();
var combined = string.Join(string.Empty, nodeContents.OrderBy(x => x.Position).Select(x => x.Value));
items[i] = Token.Create(parseNode.Kind, string.Format("({0})", combined), parseNode.Position);
}

continue;
}

var parseTree = items[i] as ParseTree;

//If we've got a parse tree...
if (parseTree != null)
{
//Fix parenthesis within the tree
var nodes = parseTree.ParseNodes;
FixParenthesis(ref nodes);
parseTree.ParseNodes = nodes;
}
}
}
``````

At any rate, I hope this has helped or at least provided a fun diversion.

I probably managed to deal with it(now testing). It turned out to be 5-stage operation. Assuming that '{' and ';' cannot occur in function I've done sth like this:

``````        sBuffer = Regex.Replace(sBuffer, @"(?<sep>[;])", "};{");
sBuffer = Regex.Replace(sBuffer, @"([(])(?<arg>.+?)[}]", "({\${arg}}");
sBuffer = Regex.Replace(sBuffer, @"([;])(?<arg>.+?)([)]){1}", ";\${arg}})");
sBuffer = Regex.Replace(sBuffer, @"{", "(");
sBuffer = Regex.Replace(sBuffer, @"}", ")");
``````

0. `function1((a1^5-4)/2;1/sin(a2);a3;a4)+function2(a1;a2;1/a3)'`

1.First line replaces ; with };{

``````function1((a1^5-4)/2};{1/sin(a2)};{a3};{a4)+function2(a1};{a2};{1/a3)
``````

2.For first argument - after ( or (not intended) arguments which contain ')' replace (arg};with ({arg}:

``````function1({(a1^5-4)/2};{1/sin({a2)};{a3};{a4)+function2({a1};{a2};{1/a3)
``````

3. The same at the and of function: {arg) with {arg}:

``````function1({(a1^5-4)/2};{1/sin({a2})};{a3};{a4})+function2({a1};{a2};{1/a3})
``````

4.5. Replace '{' and '}' with '(' ')':

``````function1(((a1^5-4)/2);(1/sin((a2)));(a3);(a4))+function2((a1);(a2);(1/a3))
``````

We have some extra () specially when argument itself is surrounded by '(' ')' (nested function) but it doesn't metter as the code is then proceed by Reversed Polish Notation

This is my first code for regexp(I found out about rgexp just few days ago- I'm a beginer) . I hope it's satisfies all the cases (at least those that can occur in excel formulas)

Top