开始·结束·分组·替换

 
text = """
SELECT 
    t.PARTY_ID,
FROM ...
SELECT 
    a.PARTY_ID,
WHERE ...
"""

pattern_regex = r"(?i)SELECT\s+[\s\S]*?\.PARTY_ID\s*,"
replacement = "select aa,"

    

 
SELECT 
  t.PARTY_ID,
FROM ...
select aa
WHERE ...
    

 
import re

def replace_last_occurrence(text, pattern_regex, replacement):
    # 查找所有匹配正则表达式的项
    matches = list(re.finditer(pattern_regex, text))
    
    if not matches:
        return text  # 没有匹配项,返回原文本
    
    last_match = matches[-1]  # 最后一个匹配项

    # 替换操作
    return text[:last_match.start()] + replacement + text[last_match.end():]
    
    

 
sql_text = """
SELECT 
    t.PARTY_ID,
FROM customer t
WHERE ...
SELECT 
    a.PARTY_ID,
ORDER BY a.PARTY_ID;
"""

# 匹配以 SELECT 开头,包含 .PARTY_ID 并以逗号结尾的子句块(跨行也支持)
pattern_regex = r"(?i)SELECT\s+[\s\S]*?\.PARTY_ID\s*,"

# 匹配以 SELECT 开头,包含 PARTY_ID,并且PARTY_ID可能有别名的情况
pattern_regex = r"(?i)SELECT\s+[\s\S]*?\w{1,3}\.?PARTY_ID\s*,"

replacement = "select aa,"

result = replace_last_occurrence(sql_text, pattern_regex, replacement)

print(result)

    

 
(?i):忽略大小写

SELECT\s+:匹配 SELECT 后面至少一个空白字符

[\s\S]*?:非贪婪地匹配任意字符(包括换行),用于跨行匹配

\.PARTY_ID:匹配 .PARTY_ID

\s*,:匹配可选空格加逗号

    
如果你希望这个函数适用于多种字段(不只是 PARTY_ID),可以将字段名参数化:

 
def replace_last_select_field(text, field_name, replacement):
  pattern_regex = fr"(?i)SELECT\s+[\s\S]*?\.{field_name}\s*,"
  return replace_last_occurrence(text, pattern_regex, replacement)

replace_last_select_field(sql_text, "PARTY_ID", "select aa,")
    

 
text = """
SELECT 
    t.amt,
FROM ...
SELECT 
    a.PARTY_ID,
WHERE ...
"""

如果是这样的格式,那么示例1会直接将第1个SELECT与最后一个PARTY_ID作为匹配项
毕竟,它符合以SELECT开关,以PARTY_ID结尾 
    

 
pattern_regex = r"(?i)SELECT\s+[\s\S]{1,4}?\w{1,4}\.?PARTY_ID\s*,"

\S 任意非空字符,任意个连续的非空字符都能匹配,中间有空格就无法匹配 

{1,4}这里限定1-4个字符,实际情况是只能有空字符不能有非空字符的,
但1-4这限定也除非前面第1个SELECT被误选的情况

实际上去掉\S应该更符合本意,就是只有空白
    

 
上面的正则匹配出问题了,有段文本没有匹配到最后一个select,反而匹配到中间的一个select
于是就去掉了\S,下面的模式可以正常使用 

pattern_regex = r"(?i)SELECT\s+\w{1,4}\.?PARTY_ID\s*,"

这也说明正则匹配应该精准,并且是精准的模式越好 

    

 

    

 
import re

def extract_insert_part(sql):
    # 匹配从INSERT INTO开始到SELECT之前的部分
    pattern = r'(INSERT\s+INTO\s+[\w_]+\s*\([^)]*\)\s*)(?=SELECT)'
    match = re.search(pattern, sql, re.IGNORECASE | re.DOTALL)
    
    if match:
        return match.group(1).strip()
    else:
        return None

insert_part = extract_insert_part(sql)

 


 


 

  

 


python replace

 
s = "AND p.PARTY_STATUS_CD = '0' "

# 使用字符串的 replace 方法
result = s.replace("AND", "", 1)  # 只替换一次
print(result)
    

 
p.PARTY_STATUS_CD = '0' 
    

 
Signature: s.replace(old, new, count=-1, /)
Docstring:
Return a copy with all occurrences of substring old replaced by new.

    count
    Maximum number of occurrences to replace.
    -1 (the default value) means replace all occurrences.

If the optional argument count is given, only the first count occurrences are
replaced.

    

 

    

 


 

  

 


参考