Regular expression
Generating draft regular expression patterns with GitHub Copilot
Description
GitHub Copilot can draft regular expression patterns for searching or extracting strings. The two examples below show two ways to prompt it: giving a sample input with its expected output, and describing the goal in natural language.
Example
Input and Output Pattern
Sample Code
First, write the sample input and the expected output as comments, and GitHub Copilot can suggest a regular expression pattern:
import re
# Write a regular expression
# - Input: "Hello World"
# - Output: ["H", "W"]
regex
Sample Result
import re
# Write a regular expression
# - Input: "Hello World"
# - Output: ["H", "W"]
regex = r"[A-Z]"
matched = re.findall(regex, "Hello World")
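A single input-output pair leaves the intent ambiguous, so it helps to check the suggestion against the stated output and against one more input. The sketch below does that; the second string and the alternative pattern `\b[A-Z]` are illustrative assumptions, not part of the original example.

```python
import re

# The pattern suggested in the sample result.
regex = r"[A-Z]"
matched = re.findall(regex, "Hello World")
print(matched)  # ['H', 'W']

# The same pattern matches every uppercase letter, not only word-initial ones.
all_caps = re.findall(regex, "Hello McDonald")
print(all_caps)  # ['H', 'M', 'D']

# Assumed alternative: uppercase letters at the start of a word only.
initials = re.findall(r"\b[A-Z]", "Hello McDonald")
print(initials)  # ['H', 'M']
```

Both patterns satisfy the original input-output pair, so a second example in the prompt comments may be needed to tell Copilot which behavior is intended.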
Natural Language Pattern
Second, describe in natural language what you want the regular expression to achieve:
Sample Code
import re
# Write a regular expression
# - From a string like "I have 3 apples and 2 oranges", extract only the numbers into a list
sentence
Sample Result
import re
# Write a regular expression
# - From a string like "I have 3 apples and 2 oranges", extract only the numbers into a list
sentence = "I have 3 apples and 2 oranges"
regex = r"\d+"
matched = re.findall(regex, sentence)
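Note that `re.findall` returns the matches as strings. If numeric values are needed, a conversion step can follow; the variable name `numbers` below is an assumption for illustration.

```python
import re

sentence = "I have 3 apples and 2 oranges"
regex = r"\d+"

# findall returns each run of digits as a string.
matched = re.findall(regex, sentence)
print(matched)  # ['3', '2']

# Convert the matched strings to integers.
numbers = [int(m) for m in matched]
print(numbers)  # [3, 2]
```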
Exercise
Exercise 1: Extract only the lowercase letters from the string "Hello World".
Checklist for Further Learning
Are the regular expression patterns extracting the exact matches from the given strings?
LLM-based tools such as GitHub Copilot may not reliably produce correct complex regular expressions. How would you approach writing a complex regular expression, and how could you use GitHub Copilot to assist you in building it?
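One possible approach to both checklist items is to build a complex pattern from small, commented parts with `re.VERBOSE` and to check it against a list of cases that includes strings that should not match. The date pattern and the test cases below are hypothetical examples chosen for illustration, not something taken from Copilot's output.

```python
import re

# Hypothetical example: a YYYY-MM-DD date, assembled from small commented parts.
DATE_RE = re.compile(
    r"""
    (?P<year>\d{4})                  # four-digit year
    -
    (?P<month>0[1-9]|1[0-2])         # month 01 to 12
    -
    (?P<day>0[1-9]|[12]\d|3[01])     # day 01 to 31 (month length is not checked)
    """,
    re.VERBOSE,
)

# (text, should_match) pairs, including inputs that must be rejected.
cases = [
    ("2024-02-29", True),
    ("2024-13-01", False),  # month out of range
    ("2024-1-01", False),   # month needs two digits
    ("2024-02-31", True),   # known gap: the pattern cannot validate month length
]

# Collect every case where the pattern disagrees with the expectation.
failures = [
    (text, expected)
    for text, expected in cases
    if (DATE_RE.fullmatch(text) is not None) != expected
]
print(failures)  # []
```

Each small part can be requested from Copilot separately and reviewed on its own, and the case list makes it visible when a suggested pattern matches too much or too little.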