8.12.2 Replacing Substrings and Splitting Strings

  • re module's sub function replaces the parts of a string that match a pattern
  • re module's split function breaks a string into pieces, using a pattern as the delimiter

Function sub—Replacing Patterns

  • By default, sub replaces all occurrences of a pattern with the replacement text you specify
  • Returns a new string; the original string is not modified
In [ ]:
import re
In [ ]:
re.sub(r'\t', ', ', '1\t2\t3\t4')
  • Three required arguments:
    • the pattern to match (the tab character '\t')
    • the replacement text (', ')
    • the string to be searched ('1\t2\t3\t4')
  • Keyword argument count specifies the maximum number of replacements
In [ ]:
re.sub(r'\t', ', ', '1\t2\t3\t4', count=2)
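
To make the effect of count concrete, the following minimal sketch runs the two calls above and shows what each returns. The variable names (text, all_replaced, first_two) are illustrative only and do not come from the original cells.

```python
import re

text = '1\t2\t3\t4'  # four values separated by tab characters

# Default behavior: every tab is replaced
all_replaced = re.sub(r'\t', ', ', text)
print(all_replaced)  # 1, 2, 3, 4

# count=2: only the first two tabs are replaced; the third tab remains
first_two = re.sub(r'\t', ', ', text, count=2)
print(repr(first_two))  # '1, 2, 3\t4'
```

repr is used in the second print so that the remaining tab is displayed as \t rather than as invisible whitespace.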

Function split

  • split function tokenizes a string, using a regular expression to specify the delimiter
  • Returns a list of strings
In [ ]:
re.split(r',\s*', '1,  2,  3,4,    5,6,7,8')
  • Keyword argument maxsplit specifies the maximum number of splits; the unsplit remainder of the string becomes the last element of the list
In [ ]:
re.split(r',\s*', '1,  2,  3,4,    5,6,7,8', maxsplit=3)
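
As a quick check of the two split calls above, this sketch prints the list each one produces. The names text, tokens and limited are illustrative assumptions, not part of the original cells.

```python
import re

text = '1,  2,  3,4,    5,6,7,8'

# Split at every comma followed by zero or more whitespace characters
tokens = re.split(r',\s*', text)
print(tokens)  # ['1', '2', '3', '4', '5', '6', '7', '8']

# maxsplit=3: stop after three splits; the rest of the string
# is returned unsplit as the final list element
limited = re.split(r',\s*', text, maxsplit=3)
print(limited)  # ['1', '2', '3', '4,    5,6,7,8']
```

Because the delimiter pattern includes \s*, the whitespace after each comma is consumed along with the comma, so the tokens need no further stripping.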

©1992–2020 by Pearson Education, Inc. All Rights Reserved. This content is based on Chapter 8 of the book Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud.
