I tried permutations of LIKE and CHAR to find the character as well as REPLACE and CHARINDEX , all to no avail. Then I spotted this -
0x0000 (char(0)) is an undefined character in Windows collations and cannot be included in CHARINDEX.
This was on the MSDN page for CHARINDEX
You can read the whole string character by character however and recreate it, removing those characters. This is what my function below does.
CREATE FUNCTION dbo.StripUnwantedCharacters (@InputString VARCHAR(MAX)) RETURNS VARCHAR(MAX) AS BEGIN DECLARE @OutputString varchar(MAX) DECLARE @CurrentCharacterValue INT DECLARE @Position INT SET @OutputString = '' SET @Position = 1 WHILE @Position < LEN(@InputString)+1 BEGIN SET @CurrentCharacterValue = ASCII(SUBSTRING(@InputString, @Position, 1)) IF @CurrentCharacterValue >= 9 -- Exclude Characters where ASCII value less that 9 BEGIN SET @OutputString = @OutputString + CHAR(@CurrentCharacterValue) END SET @Position = @Position + 1 END RETURN @OutputString END GOCode to use this procedure is as simple as ...
UPDATE tablename SET cleandata = dbo.StripUnwantedCharacters(dirtydata)
Or too generate sql for all columns ...
SELECT 'UPDATE [' + TABLE_SCHEMA + '].[' + TABLE_NAME + '] SET ' + COLUMN_NAME + ' = RTRIM(dbo.StripUnwantedCharacters(' + COLUMN_NAME + '))' from INFORMATION_SCHEMA.COLUMNS WHERE DATA_TYPE IN ('char','varchar') GO
No comments:
Post a Comment