Class RegexLineTokenizer

java.lang.Object
org.springframework.batch.item.file.transform.AbstractLineTokenizer
org.springframework.batch.item.file.transform.RegexLineTokenizer
All Implemented Interfaces:
LineTokenizer

public class RegexLineTokenizer extends AbstractLineTokenizer
Line tokenizer that uses a regular expression to extract data from a line, by means of capturing and non-capturing groups. Each capturing group becomes one token; non-capturing groups are matched but discarded. Consider the following regex, which picks only the first and last name (notice the non-capturing group in the middle):
 (.*?)(?: .*)* (.*)
 
For the names:
  • "Graham James Edward Miller"
  • "Andrew Gregory Macintyre"
  • "No MiddleName"
the output will be (tokens appear in capturing-group order, so the first name comes first):
  • "Graham", "Miller"
  • "Andrew", "Macintyre"
  • "No", "MiddleName"
If the line does not match the pattern, an empty list is returned.
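The matching behaviour described above can be reproduced with plain java.util.regex. The following is a minimal sketch, not the tokenizer's actual source: the class name RegexTokenizeSketch and its tokenize helper are hypothetical, and it assumes that the first match is used and that tokens are collected in capturing-group order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in used only for illustration; not part of Spring Batch.
class RegexTokenizeSketch {

    // Regex from the description: two capturing groups around a non-capturing middle.
    static final Pattern NAME = Pattern.compile("(.*?)(?: .*)* (.*)");

    // Collects every capturing group of the first match, in group order.
    static List<String> tokenize(Pattern pattern, String line) {
        Matcher matcher = pattern.matcher(line);
        if (!matcher.find()) {
            return Collections.emptyList(); // non-match: empty list
        }
        List<String> tokens = new ArrayList<>(matcher.groupCount());
        // Group 0 is the whole match, so start at 1; non-capturing groups are not counted.
        for (int i = 1; i <= matcher.groupCount(); i++) {
            tokens.add(matcher.group(i));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(NAME, "Graham James Edward Miller")); // [Graham, Miller]
        System.out.println(tokenize(NAME, "Andrew Gregory Macintyre"));   // [Andrew, Macintyre]
        System.out.println(tokenize(NAME, "No MiddleName"));              // [No, MiddleName]
        System.out.println(tokenize(NAME, "Cher"));                       // []
    }
}
```

The last input contains no space, so the regex cannot match and the sketch returns an empty list.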
Author:
Costin Leau
  • Constructor Details

    • RegexLineTokenizer

      public RegexLineTokenizer()
  • Method Details

    • doTokenize

      protected List<String> doTokenize(String line)
      Specified by:
      doTokenize in class AbstractLineTokenizer
    • setPattern

      public void setPattern(Pattern pattern)
      Sets the regex pattern to use.
      Parameters:
      pattern - compiled regular expression pattern
    • setRegex

      public void setRegex(String regex)
      Sets the regular expression to use.
      Parameters:
      regex - regular expression (as a String)
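One plausible reason to choose setPattern over setRegex is that a precompiled Pattern can carry compile flags, whereas a plain String carries only the expression text (unless flags are embedded inline, such as (?i)). The sketch below illustrates that difference with java.util.regex alone; the class PatternVersusRegexSketch, its firstGroup helper, and the sample regex are hypothetical and do not come from the library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; it does not call RegexLineTokenizer itself.
class PatternVersusRegexSketch {

    // Returns the first capturing group of the first match, or null on a non-match.
    static String firstGroup(Pattern pattern, String line) {
        Matcher matcher = pattern.matcher(line);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        String regex = "id=(\\d+)";

        // Compiling the bare String, as a String-based setter would presumably do.
        Pattern plain = Pattern.compile(regex);

        // A caller-built Pattern can add flags before handing it to a Pattern-based setter.
        Pattern relaxed = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);

        System.out.println(firstGroup(plain, "ID=42"));   // null (case-sensitive, no match)
        System.out.println(firstGroup(relaxed, "ID=42")); // 42
    }
}
```

Precompiling also lets one Pattern instance be shared across several tokenizers, since Pattern objects are immutable and safe to reuse.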