
Software Review: uvhd - file investigation utility

By Owen Townsend

'uvhd' is a binary file investigation utility. It displays the contents of any file in vertical hexadecimal format, and prompts for commands to browse, search, select, update, scan/replace, print, translate, etc. uvhd is an interactive utility with a command line interface and 18 help screens.

'uvhd' is Copyright(C) 1993-2008, UV Software Inc, and is distributed under GPLv3.

Download and Compile

You may download uvhd from http://www.uvsoftware.ca/libuvhd.htm. uvhd requires only the standard ANSI C libraries. Compile as follows:

 cc src/uvhd.c -o bin/uvhd

The above assumes you are logged in to your home directory on Linux or Unix, have set up sub-directories 'src' and 'bin', and stored the downloaded source (uvhd.c) in the 'src' subdir. We will also assume you have added $HOME/bin to your $PATH for the following tutorials.

uvhd Tutorials

This article will present 3 illustrations, using uvhd on 3 types of binary files, demonstrating file display, browsing, searching, selecting, and updating.

In this article, we will not cover the many other features and options which are documented in the reference manual http://www.uvsoftware.ca/uvhd.htm.

A1. Tutorial #1 - investigate an executable binary program
  - search the uvhd program itself (for 'version')
B1. Tutorial #2 - investigate /var/log/wtmp log file
  - logs events such as reboot, shutdown, logins(userids)
  - select records for specified userid, write separate file
C1. Tutorial #3 - investigate a typical mainframe file migrated to Unix/Linux
  - customer master file with Name, Address, and 24 packed decimal monthly sales
  - search and update 1 record at a time interactively
  - or 1 command to search all records replacing 1 pattern with a 2nd pattern

A1. Tutorial #1 - search executable file

If you have downloaded and compiled uvhd (as described above), you can do this tutorial right now. For our first binary file to investigate, let's use the compiled uvhd program itself. We will also specify options 'r256s3', which specify 'r'ecord size as 256 and 's'pacing as 3 (space after scale and between groups).


 uvhd bin/uvhd r256s3  <-- execute uvhd to display bin/uvhd with options r256s3
 ====================    - r256 Record-size (256 is the default if omitted)
                         - s3 Spaces between scale and 3 line groups

filename=/home/uvadm/bin/uvhd options=r256s3 records=813 filesize=104126 recsize=256 fsize%rsize(remainder)=62

                10        20        30        40        50        60
 r#  1 0123456789012345678901234567890123456789012345678901234567890123
     0 .ELF..............>.......@.....@.......H[..........@.8...@..... <-chars
       744400000000000000300000B040000040000000450000000000403000401010 <-zones
       F5C621100000000020E010000D000000000000008B100000000000808000D0A0 <-digits
    64 ........@.......@.@.....@.@.....................................
       00000000400000004040000040400000C0000000C00000000000000000000000
       6000500000000000000000000000000001000000010000008000000030004000
   128 ..........@.......@.............................................
       0000000000400000004000001000000010000000000000000000000000000000
       020000000200000002000000C0000000C0000000100000001000500000000000
   192 ..@.......@......9.......9........ ..............@.......@a.....
       0040000000400000E3000000E300000000200000000000000400000004600000
       0000000000000000491000004910000000000000100060000010000000100000

 null=next,r#=rec,s=search,u=update,x=rollback,p=print,i=iprint,w=write,e=count
 ,g=genseq#,c=chkseq#,t=translate(ta=Asc,te=Ebc,tu=Upr,tl=Lwr,tc=Chars,tp=Pers)
 ,R#=Recsize,h1=char,h2=hex,q=quit,?=help -->

uvhd displays data in 'vertical hexadecimal', 64 byte segments, in 3 line groups (characters, zones, and digits). For example, the 'E' in 'ELF' is shown with zone '4' directly above digit '5', which is x'45' in horizontal hexadecimal. Note that unprintable bytes are shown as periods on the 'character' line, but you can see their true values on the 'zone' and 'digit' lines. Of course, some bytes have zone/digit bits that just happen to coincide with a printable ASCII character, such as '@' x'40'.

Also note that the byte offset (zero relative displacement) is shown at the beginning of each 3 line group. For example '128' is the offset of the 1st byte in the 3rd group of 64.
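
To make the 3 line layout concrete, here is a minimal Python sketch (not uvhd's actual C code) that splits each byte into its zone (high nibble) and digit (low nibble). The function name 'vertical_hex' is hypothetical, and the rule for which bytes print as periods is an assumption that merely agrees with the displays in this article.

```python
def vertical_hex(data: bytes, width: int = 64):
    """Return (offset, chars, zones, digits) for each `width`-byte segment."""
    rows = []
    for offset in range(0, len(data), width):
        segment = data[offset:offset + width]
        # Assumption: bytes outside 0x20-0x7E are shown as periods,
        # which is consistent with the displays in this article.
        chars = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in segment)
        zones = "".join(f"{b >> 4:X}" for b in segment)     # high nibble
        digits = "".join(f"{b & 0x0F:X}" for b in segment)  # low nibble
        rows.append((offset, chars, zones, digits))
    return rows


# First 8 bytes of the ELF header from the display above.
for offset, chars, zones, digits in vertical_hex(bytes.fromhex("7F454C4602010100")):
    print(f"{offset:6d} {chars}")
    print(f"       {zones}")
    print(f"       {digits}")
```

Reading a byte means reading down a column: zone '7' over digit 'F' is x'7F', the unprintable first byte of every ELF file.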

You would also get a warning (only on the 1st record) if the filesize is not evenly divisible by the specified record size. I have not shown it here since it would not be relevant for program executable files, but would be important for fixed length data files (probably migrated from a mainframe to Unix/Linux).

A2. Search command

After the uvhd data display, you are prompted to enter a command. A brief command summary is displayed (null=next,r#=rec,s=search,u=update,...,?=help). You may enter '?' for 18 help screens (options, command formats, etc).

Let's use the 'search' command to search for 'version' (assuming the program contains that word).


 uvhd bin/uvhd r256    <-- execute uvhd with option 'r' recsize=256
 ==================        (r256 is the default if no options specified)

 ---> s 'version'      <-- 's'earch for 'version' anywhere in program
      ===========
                      10        20        30        40        50        60
 r#      276 0123456789012345678901234567890123456789012345678901234567890123
       70400 useful,.but WITHOUT ANY WARRANTY; without even the implied warra
             7766762067725454455244525455445532767667726766276626676666276776
             53565CC025407948F5401E9071221E49B07948F540565E048509D0C954071221
          64 nty of..MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.....
             6772660044544445444445526724454455244524254554454452555545420000
             E490F600D52381E4129C9490F20694E53306F20100124935C1200520F35E0000
         128 See the full description of the GNU General Public License at:.h
             5662766267662667676776662662766244524666766257666624666676267306
             3550485065CC0453329049FE0F6048507E5075E521C0052C930C935E35014A08
         192 ttp://www.gnu.org/licenses......uvhd version 20080807 - Copyrigh
             7773227772667267626666676720000077662767766623333333322246777666
             440AFF777E7E5EF27FC935E353E000005684065239FE0200808070D03F092978
                                                  *******
 found--> s 'version' <--at byte# 229 of record# 276
 rec#=276 rcount=406 rsize=256 fsize=104126 bin/uvhd

 ---> ss     <-- may enter 'ss' to repeat the last 's' search
      ===      - will find 'version' in 2 other records (not shown here)

 ---> 1      <-- could then reset to record# 1 and repeat 'ss' to find again
      ===
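
The 'found' message reports a record number and a byte number within that record. A rough Python sketch of that arithmetic follows. The helper 'find_in_records' is hypothetical, it searches the file as one byte string (uvhd itself may treat matches that span record boundaries differently), and the sample data is synthetic rather than the real bin/uvhd.

```python
def find_in_records(data: bytes, pattern: bytes, recsize: int = 256, start: int = 0):
    """Return (record#, byte# within record) of the first match, or None."""
    pos = data.find(pattern, start)
    if pos < 0:
        return None
    # Records are numbered from 1; bytes within a record are numbered from 0.
    return pos // recsize + 1, pos % recsize


# Synthetic stand-in for bin/uvhd: 'version' placed at file offset 70400+229.
sample = bytes(70400 + 229) + b"version" + bytes(100)
hit = find_in_records(sample, b"version")
print(hit)
```

With the pattern at file offset 70629, the result is record# 276, byte# 229, the same numbers uvhd reported above.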

B1. Tutorial #2 - /var/log/wtmp search/select

For our 2nd example, let's investigate /var/log/wtmp, a Unix/Linux system file that stores events such as reboot, shutdown, runlevel changes, LOGINs, and userids logging in. This is a binary file with a fixed record size of 384 bytes.


 uvhd /var/log/wtmp r384   <-- investigate wtmp (recsize=384)
 =======================
 uvhd filename=/var/log/wtmp
 options=r384 lastmod=2008081704 today=20080817143240 print=p1
 rec#=1 rcount=1343 filesize=515712 recsize=384 fsize%rsize(remainder)=0
                      10        20        30        40        50        60
 r#        1 0123456789012345678901234567890123456789012345678901234567890123
           0 ....05..~...............................~~..runlevel............
             0000330070000000000000000000000000000000770077666766000000000000
             10000500E0000000000000000000000000000000EE0025EC565C000000000000
          64 ............2.6.18-92.1.6.el5xen................................
             0000000000003232332332323266376600000000000000000000000000000000
             0000000000002E6E18D92E1E6E5C585E00000000000000000000000000000000
         128 ................................................................
             0000000000000000000000000000000000000000000000000000000000000000
             0000000000000000000000000000000000000000000000000000000000000000
         192 ................................................................
             0000000000000000000000000000000000000000000000000000000000000000
             0000000000000000000000000000000000000000000000000000000000000000
         256 ................................................................
             0000000000000000000000000000000000000000000000000000000000000000
             0000000000000000000000000000000000000000000000000000000000000000
         320 .......................H<.......................................
             000000000000000000008A843900000000000000000000000000000000000000
             00000000000000000000B928C0B0000000000000000000000000000000000000

  1. On the 1st record above, the event is a runlevel change (on a reboot). 'runlevel' is at 44(8), i.e., displacement 44 (0 relative), length 8. On other records this field might hold shutdown, LOGIN, a userid, etc.

  2. Note that the time is coded in binary in bytes 340(4b). It is the Unix epoch time, the number of seconds since Jan 1, 1970 (UTC). The 4 bytes x'8BA98248' are stored low-order byte first (so the value is x'4882A98B'), which is 2008/07/19_19:57:15 PDT. The starting displacement is 320+20=340 since the row starts at 320 and the x'8B' lines up under 20 on the scale.

  3. Unix/Linux systems provide the 'utmpdump' utility to display /var/run/utmp and /var/log/wtmp records in a readable format, converting the binary Unix times to dates and times. Try 'utmpdump /var/log/wtmp | more'.

  4. In the next section, we will use uvhd to select records for a desired userid and then use utmpdump to display the contents.
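
If you want to check the time in bytes 340-343 by hand, the following Python sketch decodes the 4 bytes read from the zone/digit lines above. It assumes the bytes are little-endian (true for the record shown, which came from an x86 Linux system) and that the author's local time zone was PDT (UTC-7), as in the utmpdump output later in this article. The function name 'decode_unix_time' is hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone


def decode_unix_time(raw: bytes) -> datetime:
    """Decode a 4-byte little-endian Unix time into a UTC datetime."""
    (seconds,) = struct.unpack("<I", raw)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Bytes 340-343 of the 1st wtmp record, read down the zone/digit columns.
utc = decode_unix_time(bytes.fromhex("8BA98248"))

# Assumption: local time was PDT, a fixed 7 hours behind UTC.
pdt = utc.astimezone(timezone(timedelta(hours=-7)))
print(utc.isoformat(), pdt.strftime("%Y/%m/%d_%H:%M:%S"))
```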

[ If your distro doesn't have this useful command, simply download this tarball, decompress it, and build the program with 'cc -o utmpdump utmpdump.c'. -- Ben ]

B2. Search /var/log/wtmp for userid uvadm


 uvhd /var/log/wtmp r384   <-- startup uvhd for /var/log/wtmp (recsize 384)
 =======================     - will display 1st record (same as above)
                             - not shown here to save space
 --> s 44(5),'uvadm'     <-- search for records with userid 'uvadm'
                           - displays 1st record found as follows:
 r#       74 0123456789012345678901234567890123456789012345678901234567890123
       28032 ........tty2............................2...uvadm...............
             0000800077730000000000000000000000000000300077666000000000000000
             70006F004492000000000000000000000000000020005614D000000000000000
          64 ................................................................
             0000000000000000000000000000000000000000000000000000000000000000
             0000000000000000000000000000000000000000000000000000000000000000
         128 ................................................................
             0000000000000000000000000000000000000000000000000000000000000000
             0000000000000000000000000000000000000000000000000000000000000000
         192 ................................................................
             0000000000000000000000000000000000000000000000000000000000000000
             0000000000000000000000000000000000000000000000000000000000000000
         256 ................................................................
             0000000000000000000000000000000000000000000000000000000000000000
             0000000000000000000000000000000000000000000000000000000000000000
         320 ....................A..H9.......................................
             000000000000000080004C843E00000000000000000000000000000000000000
             00000000000000006F0019489640000000000000000000000000000000000000

found--> s 44(5),'uvadm' <--at byte# 44 of record# 74

We could use 'ss' to repeat the search for the next matching record (as we did in tutorial #1 for 'version'), but now we will demo the select/write command.

B3. Select all records for userid 'uvadm'


 --> w9999 44(5),'uvadm'    <-- Write all records with 'uvadm' in bytes 44-48
     ===================        to a tmp/file
                      10        20        30        40        50        60
 r#     1340 0123456789012345678901234567890123456789012345678901234567890123
      514176 ........tty2............................2...uvadm...............
             0000800077730000000000000000000000000000300077666000000000000000
             70005F004492000000000000000000000000000020005614D000000000000000
                      ----- bytes 64-319 omitted to save space -----
         320 .......................H.8......................................
             0000000000000000000011A4E300000000000000000000000000000000000000
             0000000000000000000050888890000000000000000000000000000000000000

w9999 44(5),'uvadm' 30 written, tmp/wtmp_080817_151157W

  1. The 'w'rite command writes selected records to the 'tmp/' subdir within your current working directory, with a date/time stamp as shown above, and with 'W' suffix to identify as a Write command output.

  2. Note tmp/... is in your current working directory (NOT /tmp). If ./tmp is not present uvhd will create it.
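
For readers who like to see the logic spelled out, here is a small Python sketch of what a select on a fixed field amounts to: step through the file one record at a time and keep the records whose bytes at the given offset match. This is only an illustration, not how uvhd is implemented. The names 'select_records' and 'fake_wtmp_record' are hypothetical, and the test data is synthetic (zeros plus a userid at offset 44), not a real wtmp file.

```python
def select_records(data: bytes, recsize: int, offset: int, pattern: bytes):
    """Return every fixed-length record whose bytes at `offset` equal `pattern`."""
    end = offset + len(pattern)
    picked = []
    for start in range(0, len(data) - recsize + 1, recsize):
        record = data[start:start + recsize]
        if record[offset:end] == pattern:
            picked.append(record)
    return picked


def fake_wtmp_record(user: bytes, recsize: int = 384) -> bytes:
    """Hypothetical test record: zeros except for a userid at offset 44."""
    return (bytes(44) + user).ljust(recsize, b"\0")


wtmp = b"".join(fake_wtmp_record(u) for u in (b"runlevel", b"uvadm", b"LOGIN", b"uvadm"))
selected = select_records(wtmp, 384, 44, b"uvadm")
output = b"".join(selected)  # what would be written to the output file
print(len(selected), "selected,", len(output), "bytes")
```

Because the selected records are written back unchanged, the output is still a valid fixed-length file with the same record size.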

B4. Examine records selected by Write

We can now examine the selected records as follows:


 uvhd tmp/wtmp_080817_153211W r384  <-- examine selected records (user 'uvadm')
 =================================
 uvhd filename=/home/uvadm/tmp/wtmp_080817_153211W
 options=r384 lastmod=2008081715 today=20080817153306 print=p1
 rec#=1 rcount=30 filesize=11520 recsize=384 fsize%rsize(remainder)=0
                      10        20        30        40        50        60
 r#        1 0123456789012345678901234567890123456789012345678901234567890123
           0 ........tty2............................2...uvadm...............
             0000800077730000000000000000000000000000300077666000000000000000
             70006F004492000000000000000000000000000020005614D000000000000000
                      ----- bytes 64-319 omitted to save space -----
         320 ....................A..H9.......................................
             000000000000000080004C843E00000000000000000000000000000000000000
             00000000000000006F0019489640000000000000000000000000000000000000

We can now use 'utmpdump' to display the selected records in user friendly format with the binary times converted to a readable format.


 utmpdump tmp/wtmp_080817_153211W
 ================================
 [7] [03974] [2   ] [uvadm   ] [tty2   [Mon Jul 21 10:37:05 2008 PDT]
 [7] [03936] [2   ] [uvadm   ] [tty2   [Tue Jul 22 07:39:50 2008 PDT]
 [7] [03974] [2   ] [uvadm   ] [tty2   [Wed Jul 23 10:07:30 2008 PDT]
         - - - - - 24 records omitted to save space - - - - -
 [7] [03971] [2   ] [uvadm   ] [tty2   [Fri Aug 15 06:32:59 2008 PDT]
 [7] [03973] [2   ] [uvadm   ] [tty2   [Sat Aug 16 08:06:10 2008 PDT]
 [7] [03973] [2   ] [uvadm   ] [tty2   [Sun Aug 17 04:48:37 2008 PDT]

C1. Tutorial #3 - Customer Master with packed decimal fields

For our third example we will use the file 'custmas1', which you can download from http://www.uvsoftware.ca/custmas1. This is a mainframe-style customer name and address file that has been migrated to Unix/Linux. It has fixed length records of 256 bytes, with 24 * 5 byte packed decimal fields (monthly sales), and without linefeeds (which are required by the usual Unix/Linux editors). The field layout is as follows:

      000-005 - cust#
      010-034 - customer name
      035-059 - address
      060-075 - address
      077-078 - province
      080-089 - postal code
      090-101 - telephone#
      102-119 - contact name
      120-179 - this year monthly sales 12 * 5 byte packed decimal
      180-239 - last year monthly sales 12 * 5 byte packed decimal
      240-255 - unused

Download the custmas1 demo file from http://www.uvsoftware.ca/custmas1 into the data/ subdir in your homedir.


 uvhd data/custmas1 r256u  <-- execute uvhd on custmas1 with options r256u
 ========================    - option 'u' is required to allow Updates
                             - uvhd displays 1st record and prompts for commands

uvhd filename=/home/uvadm/data/custmas1 rec#=1 rcount=32 filesize=8192 recsize=256 fsize%rsize(remainder)=0

                      10        20        30        40        50        60
 r#        1 0123456789012345678901234567890123456789012345678901234567890123
           0 130140    EVERGREEN MOTORS LTD.    1815 BOWEN ROAD          NANA
             3333332222454545444244545524542222233332445442544422222222224444
             130140000056527255E0DF4F230C44E0000181502F75E02F140000000000E1E1
          64 IMO          BC V9S1H1    250-754-5531 LARRY WRENCH     ..4V|...
             4442222222222442535343222233323332333324455525544442222201357000
             9DF00000000002306931810000250D754D55310C12290725E38000000246C000
         128 .........W0....`........)X}..f3.....\.................4V}...f...
             0000000005300016000000002570063100095000000000000000013570016000
             0C0000C0270D0540C0000C0098D0263C0444C0000C0000C0000C0246D0056C00
         192 .E|...V}.......................f.....<........f.C 19950531
             0470005700000000880000000018000680001300000000694233333333222222
             35C0046D0000C0023C0000C0083C0056D0012C0000C0016D3019950531000000

 null=next,r#=rec,s=search,u=update,x=rollback,p=print,i=iprint,w=write,e=count
 ,g=genseq#,c=chkseq#,t=translate(ta=Asc,te=Ebc,tu=Upr,tl=Lwr,tc=Chars,tp=Pers)
 ,R#=Recsize,h1=char,h2=hex,q=quit,?=help -->

Note the 24 * 5 byte packed decimal fields from 120-239. The 1st field is x'001234567C', which is $12,345.67+. Packed fields can be identified by the sign x'_C'(+) or x'_D'(-) in the right hand nibble of the last byte of each field.
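
As a worked example of the packed decimal format just described, this Python sketch unpacks one 5 byte field: every nibble is a decimal digit except the last, which is the sign. The function name 'unpack_decimal' is hypothetical, the 2 implied decimal places are taken from the $12,345.67 example above, and only the x'C' and x'D' signs mentioned in this article are accepted.

```python
def unpack_decimal(field: bytes) -> int:
    """Unpack a packed decimal field into a signed integer (here: cents)."""
    nibbles = field.hex().upper()
    digits, sign = nibbles[:-1], nibbles[-1]
    if sign not in ("C", "D") or not digits.isdigit():
        raise ValueError(f"not a packed decimal field: x'{nibbles}'")
    value = int(digits)
    return -value if sign == "D" else value


# Bytes 120-124 of the 1st record (this year's 1st monthly sales field).
cents = unpack_decimal(bytes.fromhex("001234567C"))
print(f"{cents / 100:,.2f}")

# Bytes 135-139 of the same record carry a negative sign.
negative = unpack_decimal(bytes.fromhex("000257300D"))
print(f"{negative / 100:,.2f}")
```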

C2. custmas1 - Search/Update

For this tutorial, we will Search for incorrect province codes and Update them. The province code in the 1st record displayed above is 'BC' which is correct for British Columbia, but there are some records coded as 'AL' (Alabama), which should be corrected to 'AB' (Alberta).

We will specify the search field as '77(2)', offset 77 (0 relative) and length (2). If we did not have the record layout above, we could determine the offset by adding 64+13=77. i.e., 64 bytes in the 1st segment + 13 bytes into the 2nd segment. The 1st byte of the province code (BC, AL, AB, etc.) lines up under 13 on the scale preceding the record.
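
The offset arithmetic used throughout these tutorials is simple enough to verify with a few lines of Python. The helper names below are hypothetical; the numbers are taken from the displays in this article.

```python
def field_offset(row_offset: int, scale_column: int) -> int:
    """Offset within the record = offset of the 64-byte row + column on the scale."""
    return row_offset + scale_column


def record_start(rec_no: int, recsize: int) -> int:
    """File offset of the 1st byte of record `rec_no` (records numbered from 1)."""
    return (rec_no - 1) * recsize


def record_count(filesize: int, recsize: int):
    """Return (number of whole records, remainder bytes)."""
    return divmod(filesize, recsize)


print(field_offset(64, 13))       # province code in custmas1
print(field_offset(320, 20))      # binary time in wtmp
print(record_start(74, 384))      # 1st 'uvadm' record in wtmp
print(record_count(8192, 256))    # custmas1
print(record_count(515712, 384))  # wtmp
```

A non-zero remainder from the last function corresponds to the filesize warning mentioned in tutorial #1.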


 --> s 77(2),'AL'     <-- Search for record with 'AL' in bytes 77-78
     ============       - will display found record and prompt for commands
                      10        20        30        40        50        60
 r#       13 0123456789012345678901234567890123456789012345678901234567890123
        3072 201120    ALLTYPE RENTAL LTD.      BOX 1819                 DRAY
             3333332222444555425445442454222222244523333222222222222222224545
             20112000001CC4905025E41C0C44E0000002F801819000000000000000004219
 'AL'-->  64 TON VALLEY   AL T0E0M0    403-246-5274LARRY ZOLF        ........
             5442544445222442534343222233323332333344555254442222222200000000
             4FE061CC590001C04050D00000403D246D5274C12290AFC6000000000000C000
         128 ..........Fl...Il......................................vl..9q...
             0000000000460014600000000000000000000000000008900000000760037100
             0C0000C0086C0039C0000C0000C0000C0000C0000C0003C0000C0066C0091C00
         192 .4..................%.L.............I...........A 20010731
             1390000000000000000020400810000000004000000000004233333333222222
             24C0000C0000C0000C0054C0095C0000C0039C0000C0000C1020010731000000
      found--> s 77(2),'AL' <--at byte# 77 of record# 13
 rec#=13 rcount=32 rsize=256 fsize=8192 dat1/custmas1
 null=next,r#=rec,s=search,u=update,x=rollback,p=print,i=iprint,w=write,e=count
 ,g=genseq#,c=chkseq#,t=translate(ta=Asc,te=Ebc,tu=Upr,tl=Lwr,tc=Chars,tp=Pers)
 ,R#=Recsize,h1=char,h2=hex,q=quit,?=help -->

 --> u 77(2),'AB'    <-- Update bytes 77-78 with 'AB'
     ============      - re-displays record to confirm Update to 'AB'
                       - Updated record not shown here to save space

 --> ss              <-- repeat previous Search (double letters repeat commands)
     ===               - next AL record not shown here to save space

 --> uu              <-- repeat previous Update
     ===               - Updated AB record not shown here to save space

C3. custmas1 - Multi-Record Search/Update

There could be many records with incorrect province 'AL' to be corrected to 'AB' and yes, there is a faster way to perform multi record Search/Update. We will assume you have restored the original downloaded custmas1 demo file to your $HOME/data/custmas1.


 uvhd data/custmas1 r256u    <-- re-execute uvhd on restored custmas1
 ========================      - displays 1st record and prompts for command
                               - 1st record not shown here to save space

 --> u999 77(2),'AB',,'AL'   <-- Update 77-78 to 'AB', IF existing 'AL'
     =====================     - displays last record updated as follows:
                      10        20        30        40        50        60
 r#       27 0123456789012345678901234567890123456789012345678901234567890123
        6656 318833    TOP NOTCH CONSTRUCTION   BOX 308, STN J           CALG
             3333332222545244544244455554544422244523332255424222222222224444
             31883300004F00EF43803FE3425349FE0002F80308C034E0A0000000000031C7
          64 ARY          AB T2A4X6    403-385-2965HARRY SMIRNOFF    ..85\...
             4552222222222442534353222233323332333344555254454444222200335000
             12900000000001204214860000403D385D29658122903D92EF6600000085C000
         128 ................................................................
             0000000000000000000000000000000000000001100000000000000000000000
             0C0000C0000C0000C0000C0000C0000C0000C0007C0000C0000C0000C0000C00
         192 .....................p...............<..%P......C 20021130
             0000000000000000000017800000000000008300258000004233333333222222
             00C0000C0000C0000C0000C0000C0000C0027C0050C0000C3020021130000000
         EOF, 32 records read, 11 updated u999 77(2),'AB',,'AL'
 rec#=27 rcount=32 rsize=256 fsize=8192 tmp/custmas1
 null=next,r#=rec,s=search,u=update,x=rollback,p=print,i=iprint,w=write,e=count
 ,g=genseq#,c=chkseq#,t=translate(ta=Asc,te=Ebc,tu=Upr,tl=Lwr,tc=Chars,tp=Pers)
 ,R#=Recsize,h1=char,h2=hex,q=quit,?=help --> ** quit request - program ended **

Notes about Multi-Record Search/Update


 --> u999 77(2),'AB',,'AL'  <-- Update 77-78 to 'AB', IF existing 'AL'
     =====================    - displays last record updated (as shown above)

  1. u999 <-- means (search)/Update the next 999 records. Just 'u' with no count would update only the currently displayed record.

  2. u999 77(2),'AB' <-- this (1st 2 operands only) would update all records if operands 3,4 were not specified.

  3. ,,'AL' <-- operands 3 and 4 are the condition for the update specified by operands 1 and 2. Operand 3 (omitted here, hence ',,') defaults to operand 1, but you could specify a different field.

  4. All rules are documented in detail at http://www.uvsoftware.ca/uvhd.htm
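
To illustrate the conditional update described in these notes, here is a rough Python equivalent that works on a copy of the data in memory. It is a sketch only, not uvhd's implementation: it starts at the 1st record rather than the current one, it handles only the case where the condition field is the same as the update field (operand 3 omitted), and the names 'update_where' and 'fake_customer' and the test records are hypothetical.

```python
def update_where(data: bytes, recsize: int, offset: int,
                 new: bytes, old: bytes, limit: int = 999):
    """Replace `old` with `new` at `offset` in up to `limit` records.

    Returns (updated data, number of records updated).
    """
    if len(new) != len(old):
        raise ValueError("replacement must keep the field length unchanged")
    buf = bytearray(data)
    updated = 0
    for rec in range(min(limit, len(buf) // recsize)):
        start = rec * recsize + offset
        if buf[start:start + len(old)] == old:
            buf[start:start + len(new)] = new
            updated += 1
    return bytes(buf), updated


def fake_customer(province: bytes, recsize: int = 256) -> bytes:
    """Hypothetical test record: blanks except for a province code at 77-78."""
    return (b" " * 77 + province).ljust(recsize, b" ")


original = b"".join(fake_customer(p) for p in (b"AL", b"BC", b"AL"))
fixed, count = update_where(original, 256, 77, b"AB", b"AL")
print(count, "updated")
```

Insisting that the old and new values have the same length keeps every record at its fixed size, which is essential for files without linefeeds.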

Conclusions

I hope these examples give you some ideas on how you might use 'uvhd', and I welcome any feedback on what you use it for.

Thanks for reading this, and I hope you agree with most of my customers who say: "uvhd is our favorite utility".

If you find any bugs or have suggestions for improvements, please contact me (Owen Townsend, http://www.uvsoftware.ca).



Owen Townsend, UV Software Inc, 4667 Hoskins Rd
North Vancouver BC, V7K2R3 Canada
www.uvsoftware.ca
Tel: 604-980-5434 Fax: 604-980-5404

Owen has a science degree from Ontario Agricultural College (now University of Guelph), and taught high school science, physics, and chemistry. Owen then switched careers to work many years for Sperry-Univac (which merged with Burroughs in 1986 to create Unisys).

Owen is now the president of UV Software Inc, which was founded in 1993 to develop and market software for converting mainframes to Unix and Linux. For detailed descriptions of the JCL, COBOL, and DATA conversions, please see the web site at 'http://www.uvsoftware.ca'.

Since 1993, UV Software has supplied conversion software, training, and assistance to convert about 50 mainframes to Unix or Linux. Please see the customer list and some customer comments on the web site at 'http://www.uvsoftware.ca/uvintro.htm#G1'.

Owen enjoys jogging on the trails in Lynn Headwaters park in North Vancouver. He has jogged for over 30 years, but only started running 1/2 marathons in 2006. Since then he has run 6 and surprised himself by winning (in his age group), the 2007 Toronto International 1/2 marathon in 1 hour 52 minutes.

You can see a few photos of Lynn Valley, jogging, skiing, kayaking, etc at http://www.uvsoftware.ca/photos.htm. Owen has 3 children and enjoys the 5 grandchildren (ages 1 to 5).


Copyright © 2008, Owen Townsend. Released under the Open Publication License unless otherwise noted in the body of the article. Linux Gazette is not produced, sponsored, or endorsed by its prior host, SSC, Inc.

Published in Issue 154 of Linux Gazette, September 2008
