Saturday, June 25, 2005

Comparing numerical data with sed and diff

A common situation when working with hex data is that you have one piece of data that works and one that doesn't. The idea is to compare the two hex strings without pulling your hair out.

Here is a quick way to compare two hex data strings.

"works_bad"
0051000B919145557705F400F5AA44605040B8423F0DC0601AE02056A
0045C60C0
336362e3137362e3231322e3137342f62622f7377696e672
e676966000103497427732061204
24f59202120212021000101

"works_good"
0051000B919145557705F400F5AA440605040B8423F0DC0601AE02056
A0045C60C0336
362e3137362e3231322e3137342f62622f7377696e67
2e67696600010349742773206120626f7
95c215c215c21000101

Rather than checking the data with a pencil one byte at a time, we can use sed to break the data into chunks of 10 characters, one chunk per line, and then use the diff command to show where the first difference is.

echo 0051000B919145557705F400F5AA44605040B8423F0DC0601AE02056A\
0045C60C0\
336362e3137362e3231322e3137342f62622f7377696e672e6769\
66000103497427732061204\
24f59202120212021000101 | sed 's/........../&\n/g' > works_bad

echo 0051000B919145557705F400F5AA440605040B8423F0DC0601AE02056A\
0045C60C0336\
362e3137362e3231322e3137342f62622f7377696e672e67696\
600010349742773206120626f7\
95c215c215c21000101 | sed 's/........../&\n/g' > works_good
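
Using \n for a newline in the sed replacement works with GNU sed, but POSIX does not guarantee it, so other sed implementations may behave differently. Below is a minimal sketch of a more portable way to do the same chunking, assuming the standard fold utility is available. The name chunk_hex is a hypothetical helper made up for this example.

```shell
# chunk_hex (hypothetical helper): print a string in fixed-width chunks,
# one chunk per line. The width defaults to 10 characters.
chunk_hex() {
    printf '%s\n' "$1" | fold -w "${2:-10}"
}

# Example with the first 40 characters of the shorter string above.
# This prints four lines; the fourth is 605040B842.
chunk_hex 0051000B919145557705F400F5AA44605040B842
```

Redirecting the output of chunk_hex to a file for each string gives the same kind of input for diff as the sed commands do.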

[root@a1a ~]# diff --side-by-side works_bad works_good | grep --color -C500 '|'

0051000B91 0051000B91
9145557705 9145557705
F400F5AA44 F400F5AA44
605040B842 | 0605040B84
3F0DC0601A | 23F0DC0601
E02056A004 | AE02056A00
5C60C03363 | 45C60C0336
62e3137362 | 362e313736
e3231322e3 | 2e3231322e
137342f626 | 3137342f62
22f7377696 | 622f737769
e672e67696 | 6e672e6769
6000103497 | 6600010349
4277320612 | 7427732061
0424f59202 | 20626f795c
1202120210 | 215c215c21
00101 | 000101


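We can easily see that the first difference is in the fourth chunk (characters 31 to 40): 'works_bad' is missing a 0 at position 31. Every later chunk is flagged too, since the missing character shifts the rest of the string by one.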
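
The diff output narrows the problem down to a 10-character chunk. To get the exact character offset, the two strings can also be compared directly. This is only a sketch, assuming awk is available; first_diff is a hypothetical helper name, not a standard command.

```shell
# first_diff (hypothetical helper): print the 1-based position of the first
# character at which two strings differ. Prints nothing if they are identical.
first_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = (length(a) < length(b)) ? length(a) : length(b)
        for (i = 1; i <= n; i++) {
            if (substr(a, i, 1) != substr(b, i, 1)) { print i; exit }
        }
        # One string is a prefix of the other: they differ just past the end.
        if (length(a) != length(b)) print n + 1
    }'
}

# First 40 characters of each string above; prints 31.
first_diff 0051000B919145557705F400F5AA44605040B842 \
           0051000B919145557705F400F5AA440605040B84
```

Position 31 is the first character of the fourth chunk, which covers characters 31 to 40.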
