Nfa (monq packages)

java.lang.Object
- monq.jfa.Nfa

```
public class Nfa
extends java.lang.Object
```
models a non-deterministic finite automaton.

The most frequent use involves several calls to or(CharSequence, FaAction) to bind regular expressions to actions. After having added all necessary re/action pairs, compile() into a Dfa and operate it by a DfaRun to filter text. Filtering means that all regular expressions which were bound to actions compete for matches. The regular expression with the longest match is selected and its associated action will be invoked.
The regular expression syntax depends on the ReParser implementation in use, which can be inspeced and changed with getDefaultParserFactory() and setDefaultParserFactory(). At the time of this writing, the default parser is a ReClassicParser.

Author:

© 2003–2007 Harald Kirsch

Field Summary

Fields
Modifier and Type	Field and Description
`static monq.jfa.Nfa.EmptyNfaType`	`EPSILON` value of an enumeration type to be passed to `Nfa(Nfa.EmptyNfaType)`.
`static monq.jfa.Nfa.EmptyNfaType`	`NOTHING` value of an enumeration type to be passed to `Nfa(Nfa.EmptyNfaType)`.

Constructor Summary

Constructors
Constructor and Description
`Nfa()` creates an automaton that recognizes nothing but is suitable to be filled with calls to any of the `or()` methods.
`Nfa(java.lang.CharSequence regex)` calls `Nfa(CharSequence,FaAction)` with the 2nd parameter set to `null`.
`Nfa(java.lang.CharSequence regex, FaAction a)` creates an `Nfa` from the given regular expression and assigns the given action to the stop state.
`Nfa(monq.jfa.Nfa.EmptyNfaType type)` creates an `Nfa` which does either not match anything or only the empty string.

Method Summary

All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`Nfa`	`addAction(FaAction a)` adds an action to the Nfa (expert only).
`void`	`allPrefixes()` converts the automaton into one which matches all non-empty prefixes of the given automaton.
`Dfa`	`compile(DfaRun.FailedMatchBehaviour fmb)` compiles `this` into a `Dfa` with the given behaviour for non-matching input.
`Dfa`	`compile(DfaRun.FailedMatchBehaviour fmb, FaAction eofAction)` compiles this non-deterministic finite automaton into a deterministic finite automaton.
`Nfa`	`completeToSkip(FaAction noMatchAction)` extends the automaton with an additonal automaton which matches everything but all non-empty prefixes of this automaton.
`Nfa`	`copy()` creates a copy of this Nfa where all the states are duplicated, but the actions are kept such that both Nfas referernce the same actions.
`java.lang.String`	`escape(java.lang.String s)` escapes special characters in `s` according to the rules of the current `ReParser` set for `this`.
`void`	`escape(java.lang.StringBuilder out, java.lang.CharSequence in, int startAt)` is only a pass-through to `getReParser().escape(...)`
`static ReParserFactory`	`getDefaultParserFactory()` returns the default parser factory used by `Nfa`s to obtain a default `ReParser`.
`double`	`getMemoryForSpeedTradeFactor()`
`ReParser`	`getReParser()` returns the regular expression parser currently used by this `Nfa`.
`Nfa`	`invert()` inverts the automaton, such that the resulting automaton will match the set-complement of the set of strings matched by the given automaton.
`boolean`	`markAsSub()` marks this automaton as a reporting subautomaton.
`Nfa`	`not()` implements the request match everthing but X.
`Nfa`	`optional()` transforms this finite automaton to match the empty string too.
`Nfa`	`or(java.lang.CharSequence re)` adds the given regular expression to the automaton (expert only).
`Nfa`	`or(java.lang.CharSequence regex, FaAction action)` adds the given regular expression `re` to the automaton and associates it with the given action.
`Nfa`	`or(Nfa other)` joins the other `Nfa` into this automaton while keeping all stop states and assigned actions.
`Nfa`	`plus()` applies the `+` operator to the NFA.
`Nfa`	`seq(java.lang.CharSequence s)` calls `seq(CharSequence,FaAction)` with `null` as the 2nd parameter.
`Nfa`	`seq(java.lang.CharSequence regex, FaAction a)` extend the current automaton to recognize `regex` as a suffix of the strings already recognized and arrange for the given action to be called if the suffix was recognized.
`Nfa`	`seq(Nfa other)` Concatenates `other` to `this` such that a sequence of characters vw is matched where v is matched by `this` and w is matched by `other`.
`static void`	`setDefaultParserFactory(ReParserFactory rpf)` sets the default parser factory used by `Nfa`s to obtain a default `ReParser`.
`void`	`setMemoryForSpeedTradeFactor(float f)` when creating DFA transitions during NFA to DFA compilation, the transition tables use different implementations, depending on how dense the transition table is.
`Nfa`	`setReParser(ReParser reParser)` sets the parser to be used by this `Nfa`.
`Nfa`	`shortest()` transforms the Nfa into an Nfa which only matches the shortest matches of the given Nfa.
`java.lang.String`	`specialChars()` is a shortcut for `.getReParser().specialChars()`
`Nfa`	`star()` applies the Kleene closure (`*`) operator to the NFA.
`void`	`toDot(java.io.PrintStream out)`
`void`	`toDot(java.lang.String filename)`

Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Field Detail
  - NOTHING
```
public static final monq.jfa.Nfa.EmptyNfaType NOTHING
```
    value of an enumeration type to be passed to Nfa(Nfa.EmptyNfaType).
  - EPSILON
```
public static final monq.jfa.Nfa.EmptyNfaType EPSILON
```
    value of an enumeration type to be passed to Nfa(Nfa.EmptyNfaType).
- Constructor Detail
  - Nfa
```
public Nfa()
```
    creates an automaton that recognizes nothing but is suitable to be filled with calls to any of the or() methods.
  - Nfa
```
public Nfa(monq.jfa.Nfa.EmptyNfaType type)
```
    creates an Nfa which does either not match anything or only the empty string. Most of the time you will use NOTHING as the parameter and later add regular expressions with or(). If, however, you intend to start adding to the automaton with seq(), call this constructor with EPSILON.
  - Nfa
```
public Nfa(java.lang.CharSequence regex,
           FaAction a)
    throws ReSyntaxException
```
    creates an Nfa from the given regular expression and assigns the given action to the stop state. More regex/action pairs are then normally added with or(CharSequence,FaAction).
    
    Throws:
    
    ReSyntaxException
  - Nfa
```
public Nfa(java.lang.CharSequence regex)
    throws ReSyntaxException
```
    calls Nfa(CharSequence,FaAction) with the 2nd parameter set to null.
    
    Throws:
    
    ReSyntaxException
- Method Detail
  - setDefaultParserFactory
```
public static void setDefaultParserFactory(ReParserFactory rpf)
```
    sets the default parser factory used by Nfas to obtain a default ReParser.
    
    See Also:
    
    setReParser(monq.jfa.ReParser)
  - getDefaultParserFactory
```
public static ReParserFactory getDefaultParserFactory()
```
    returns the default parser factory used by Nfas to obtain a default ReParser.
    
    See Also:
    
    setReParser(monq.jfa.ReParser)
  - escape
```
public java.lang.String escape(java.lang.String s)
```
    escapes special characters in s according to the rules of the current ReParser set for this.
    
    See Also:
    
    ReParser.escape(StringBuilder, CharSequence, int)
  - escape
```
public void escape(java.lang.StringBuilder out,
                   java.lang.CharSequence in,
                   int startAt)
```
    is only a pass-through to getReParser().escape(...).
    
    See Also:
    
    ReParser.escape(StringBuilder, CharSequence, int)
  - specialChars
```
public java.lang.String specialChars()
```
    is a shortcut for .getReParser().specialChars()
  - setMemoryForSpeedTradeFactor
```
public void setMemoryForSpeedTradeFactor(float f)
```
    when creating DFA transitions during NFA to DFA compilation, the transition tables use different implementations, depending on how dense the transition table is. The fastest transition table uses an array that is indexed by the transition character. For sparse tables, this is a significant memory overhead, therefore a denser table can be used which uses a binary search for lookup.
    
    The factor given defines how much larger the fast table may be, before the denser table is used. If the factor is 1.0, the implementation with the smaller estimated memory footprint is used. The larger the factor is, the more likely is it that even sparse tables still use the faster implementation. With a factor of several thousand, nearly all transition tables will be fast, possibly using up a lot of heap space.
    The default is 1.0.
  - getMemoryForSpeedTradeFactor
```
public double getMemoryForSpeedTradeFactor()
```
  - getReParser
```
public ReParser getReParser()
```
    returns the regular expression parser currently used by this Nfa. If none was explicitely set, getDefaultParserFactory() is used to set the parser, before it is returned.
  - setReParser
```
public Nfa setReParser(ReParser reParser)
```
    sets the parser to be used by this Nfa.
    
    Returns:
    
    this
  - addAction
```
public Nfa addAction(FaAction a)
```
    adds an action to the Nfa (expert only). This is done by adding an epsilon transition of the last state of this Nfa to a new last state which will carry the action.
    
    If the automaton has free subgraphs, these will be bound to this action.
    
    Note:If the automaton has already one or more actions (i.e. stop states), this operation may easily make the automaton uncompilable due to conflicting actions.
  - toDot
```
public void toDot(java.io.PrintStream out)
```
  - toDot
```
public void toDot(java.lang.String filename)
```
  - markAsSub
```
public boolean markAsSub()
```
    marks this automaton as a reporting subautomaton. It will of course only become a subautomaton, if further operators are applied, or if this automaton is combined with another Nfa.
    
    Marking subgraphs for reporting matching substrings is expensive, therefore the number of subgraphs per action is limited to a single byte. This method returns null if 256 subgraph markings are exceeded before an action is added to this Nfa. In this case, the subgraph marking requested was not performed. The Nfa is not touched in any way in this case.
    
    Returns:
    
    this if the subgraph marking could be performed or null if the marking could not be performed, because the number of possible subgraphs was exceeded.
  - optional
```
public Nfa optional()
```
    transforms this finite automaton to match the empty string too. All actions defining stop states are kept as they are. This method implements the operator ?.
    
    Returns:
    
    the return value is this.
  - star
```
public Nfa star()
```
    applies the Kleene closure (*) operator to the NFA.
  - plus
```
public Nfa plus()
```
    applies the + operator to the NFA.
  - completeToSkip
```
public Nfa completeToSkip(FaAction noMatchAction)
                   throws CompileDfaException
```
    extends the automaton with an additonal automaton which matches everything but all non-empty prefixes of this automaton.
    
    For example, if this automaton matches "abc", the addional automaton matches every string that does not contain "a", "ab" or "abc". The additional automaton will have the given action attached.
    
    The resulting, combined automaton will nearly never hit a no-match situation when used as a filter. The only exception is when the input ends in a true prefix of the original automaton. Informally, the operation performed is as follows:
```
this.or(this.copy().allPrefixes().not())
```
    Parameters:
    
    noMatchAction - the action to call when the additional automaton runs into a match, i.e. when the this does not match.
    
    Throws:
    
    CompileDfaException - if this automaton does not compile.
  - allPrefixes
```
public void allPrefixes()
                 throws CompileDfaException
```
    converts the automaton into one which matches all non-empty prefixes of the given automaton. Any actions already specified will be deleted.
    
    Throws:
    
    CompileDfaException - if the internally called compile operation throws this exception
  - invert
```
public Nfa invert()
           throws CompileDfaException
```
    inverts the automaton, such that the resulting automaton will match the set-complement of the set of strings matched by the given automaton.
    
    Note: Except for rare circumstances you rather want to use not() instead. The behaviour of this method, while theoretically sound, does not implement the request match everthing but X for some regular expression X.
    
    If the automaton has no actions assigned yet, its natural stop state defines the set of strings matched. If it has already actions assigned, those add to the set of recognized strings and no difference is made between them. The resulting automaton will not have any actions.
    
    Returns:
    
    this
    
    Throws:
    
    CompileDfaException - if the automaton can not be compiled. Compilation is a necessary step to invert the automaton.
  - not
```
public Nfa not()
        throws CompileDfaException
```
    implements the request match everthing but X. While invert() implements the more theoretical set complement on the set of recognized strings of an automaton, this method implements what most people intuitively expect when asking to match anything but the strings recognized by the given automaton.
    
    As an example consider the regular expression re="[A-Z]+" which matches all strings
    which are completely uppercase.
    After application of invert(), the result, namely "([A-Z]+)~", will match all strings
    which are not completely uppercase.
    
    In particular "xHallo" or "Hallo" will be matched.
    
    To the contrary, "([A-Z]+)^, i.e. not() applied to re, will not match them. In fact "(re)^" is a shortcut for "((.*re.*)?)~", which matches neither the empty string nor any string containing re.
    
    The following is an example application of not(). It shows a way to enclose words separated by white space with angle brackets:
```
 Nfa n = new Nfa("([ \t\n\r]+)^",
                 new AbstractFaAction.Printf("[%0]"))
 
```
    The caret ("^") is the operator applying not(). The resulting automaton will call the callback for every longest prefix of the input string which does not contain a white space character. The callback will then enclose the matching text in brackets.
    Throws:
    
    CompileDfaException
  - shortest
```
public Nfa shortest()
             throws CompileDfaException
```
    transforms the Nfa into an Nfa which only matches the shortest matches of the given Nfa. This operation is best understood by its implementation, which compiles the Nfa into a Dfa and then removes all outgoing links of any stop states.
    If this has not stop state yet, the operation is actually not well defined. Therefore and to avoid counterintuitive results, the last state of the Nfa is treated like a stop state, even if it is none.
    
    Throws:
    
    CompileDfaException
  - copy
```
public Nfa copy()
```
    creates a copy of this Nfa where all the states are duplicated, but the actions are kept such that both Nfas referernce the same actions.
  - seq
```
public Nfa seq(java.lang.CharSequence regex,
               FaAction a)
        throws ReSyntaxException
```
    extend the current automaton to recognize regex as a suffix of the strings already recognized and arrange for the given action to be called if the suffix was recognized. If the action is null, no action will be added.
    
    Throws:
    
    ReSyntaxException
  - seq
```
public Nfa seq(java.lang.CharSequence s)
        throws ReSyntaxException
```
    calls seq(CharSequence,FaAction) with null as the 2nd parameter.
    
    Throws:
    
    ReSyntaxException
  - seq
```
public Nfa seq(Nfa other)
```
    Concatenates other to this such that a sequence of characters vw is matched where v is matched by this and w is matched by other. The other Nfa is initialized to the same state as if just constructed with Nfa().
    
    FIX ME: unassigned reporting subexpressions in other may create a mess
  - or
```
public Nfa or(java.lang.CharSequence regex,
              FaAction action)
       throws ReSyntaxException
```
    adds the given regular expression re to the automaton and associates it with the given action.
    
    Throws:
    
    ReSyntaxException
  - or
```
public Nfa or(java.lang.CharSequence re)
       throws ReSyntaxException
```
    adds the given regular expression to the automaton (expert only). To make it effective, addAction(monq.jfa.FaAction) should be called before compilation into a Dfa.
    
    Throws:
    
    ReSyntaxException
  - or
```
public Nfa or(Nfa other)
```
    joins the other Nfa into this automaton while keeping all stop states and assigned actions. The other Nfa is initialized to the same state as if just created constructed with Nfa().
    
    FIX ME: unassigned reporting subexpressions in other may create a mess
  - compile
```
public Dfa compile(DfaRun.FailedMatchBehaviour fmb,
                   FaAction eofAction)
            throws CompileDfaException
```
    compiles this non-deterministic finite automaton into a deterministic finite automaton. The given automaton is not changed.
    
    Note: under most circumstances it is not a good idea to compile an Nfa which has no actions added, i.e. which has no stop state, because this automaton matches nothing, not even the empty string. Nevertheless, it is not forbidden to compile such an automaton into a Dfa. It can even be used within a DfaRun but will never produce any output.
    
    Parameters:
    
    fmb - describes the initial behaviour of a DfaRun which operates the Dfa created here
    
    eofAction - describes the action to run at the end of input when the resulting Dfa is used with a DfaRun.
    
    Throws:
    
    CompileDfaException - if stop states with differing actions are found that can not be merged by their mergeWith() methods. To get rid of the exception, either change the regular expressions such that they do not have matches in common, or make sure the actions involved can be merged. One way of merging is to use extend AbstractFaAction and prioritize the assigned actions.
  - compile
```
public Dfa compile(DfaRun.FailedMatchBehaviour fmb)
            throws CompileDfaException
```
    compiles this into a Dfa with the given behaviour for non-matching input.
    
    Throws:
    
    CompileDfaException
    
    See Also:
    
    compile(DfaRun.FailedMatchBehaviour,FaAction)

Class Nfa

Field Summary

Constructor Summary

Method Summary

Methods inherited from class java.lang.Object

Field Detail

NOTHING

EPSILON

Constructor Detail

Nfa

Nfa

Nfa

Nfa

Method Detail

setDefaultParserFactory

getDefaultParserFactory

escape

escape

specialChars

setMemoryForSpeedTradeFactor

getMemoryForSpeedTradeFactor

getReParser

setReParser

addAction

toDot

toDot

markAsSub

optional

star

plus

completeToSkip

allPrefixes

invert

not

shortest

copy

seq

seq

seq

or

or

or

compile

compile