A few days ago I wrote the “Some thoughts on denormalization” post. Though concatenating data into one string is easy with the function introduced there, splitting it back apart can be harder. I left that as an exercise, but to make the post complete, I feel it’s worth showing one possible implementation.
```sql
recreate procedure Tokenize(input varchar(1024), token char(1))
returns (result varchar(255))
as
declare newpos int;
declare oldpos int;
begin
  oldpos = 1;
  newpos = 1;
  while (1 = 1) do
  begin
    newpos = position(token, input, oldpos);
    if (newpos > 0) then
    begin
      result = substring(input from oldpos for newpos - oldpos);
      suspend;
      oldpos = newpos + 1;
    end
    else if (oldpos - 1 < char_length(input)) then
    begin
      result = substring(input from oldpos);
      suspend;
      break;
    end
    else
    begin
      break;
    end
  end
end
```
The procedure splits the input string on the specified token. The string may or may not end with the token itself; the procedure handles both cases.
```sql
select * from Tokenize('ab,cd,e', ',')
union all
select * from Tokenize('ab,cd,e,', ',')
union all
select * from Tokenize('ab,cd,e,,', ',');
```
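If you want to trace the loop outside the database, here is a rough Python equivalent of the procedure (my own sketch, not from the original post). Note the index shift: PSQL’s `position` and `substring` are 1-based, while Python’s `str.find` and slicing are 0-based.

```python
def tokenize(input_str, token):
    """Yield the pieces of input_str split on token,
    mirroring the stored procedure's while loop."""
    oldpos = 0
    while True:
        newpos = input_str.find(token, oldpos)
        if newpos >= 0:
            # Token found: emit the piece before it and move past the token.
            yield input_str[oldpos:newpos]
            oldpos = newpos + 1
        elif oldpos < len(input_str):
            # No more tokens, but a trailing piece remains.
            yield input_str[oldpos:]
            break
        else:
            # Input ended exactly on a token: nothing left to emit.
            break

print(list(tokenize('ab,cd,e', ',')))    # → ['ab', 'cd', 'e']
print(list(tokenize('ab,cd,e,', ',')))   # → ['ab', 'cd', 'e']
print(list(tokenize('ab,cd,e,,', ',')))  # → ['ab', 'cd', 'e', '']
```

As the last call shows, a single trailing token is swallowed, but a doubled token still produces an empty element, which matches how the stored procedure behaves.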
What it does not handle is any form of quoting for the case where the token appears inside an element. That is mainly because its intended purpose is to tokenize strings whose content you control (see the previous post), and because quoting support would slow down the execution.
Feel free to improve it (either post a link to your solution or post the code in the comments).