Saturday, September 12, 2009

use of REgexp_replace

The innermost regexp_replace gets rid of all the non-letter characters out of the string:

SQL> with src as ( select 'Internet + Multimedia Services (KT Whole)' col_1 from dual)
2 select regexp_replace(col_1,'([^A-Za-z ])','')
3 from src;

REGEXP_REPLACE(COL_1,'([^A-ZA-Z])','')
--------------------------------------
Internet Multimedia Services KT Whole


The second Regexp_replace takes the output from the first, and looks for strings matching the pattern '3 letters followed by any number of characters that aren't spaces. It replaces each of these strings with the '3 letters' part of the string.

SQL> with src as ( select 'Internet Multimedia Services KT Whole' col_1 from dual)
2 select regexp_replace(col_1,'([A-Za-z]{3})[^ ]*','\1')
3 from src;

REGEXP_REPLACE(COL_
-------------------
Int Mul Ser KT Who


The final regexp_replace takes this output, and replaces 1 or more contiguous spaces with a single underscore character:

SQL> with src as ( select 'Int Mul Ser KT Who' col_1 from dual)
2 select regexp_replace(col_1,'([ ]+)','_')
3 from src;

REGEXP_REPLACE(COL
------------------
Int_Mul_Ser_KT_Who
Report message to a moderator

--------------------------------------------
Another option using regex:


SQL> select a1
2 , rtrim(regexp_replace(a1,'([^[:alpha:]]*)([[:alpha:]]{1,3})([^ ]*)', '\2_'), '_') str
3 from full_name;

A1 STR
--------------------------------------------- -------------------------
Internet & Multimedia Services (KT Whole) Int_Mul_Ser_KT_Who
ABCD + XYZ Corp (PQ Half) ABC_XYZ_Cor_PQ_Hal
A B C D E F G A_B_C_D_E_F_G


with src as ( select 'Internet + Multimedia Services (KT Whole)' col_1 from dual)
select regexp_replace(regexp_replace(regexp_replace(col_1,'([^A-Za-z ])',''),'([A-Za-z]{3})[^ ]*','\1'),'([ ]+)','_')
from src;


with src as ( select 'Internet + Multimedia Services (KT Whole)' col_1 from dual)
select regexp_replace(regexp_replace(regexp_replace(col_1,'([^A-Za-z ])',''),'([A-Za-z]{3})[^ ]*','\1'),'([ ]+)','_')
from src;

No comments: