Parse Files for JPG Header

curlydog

Hi,
I've spent all afternoon searching for an answer to this, but got nowhere. I'm sure it's simple, but I must be phrasing my searches badly.

I'm trying to parse a large file which I know contains the raw data for numerous JPG files. I need to start by identifying the location of the header for each JPG image and then carve out the image.

The header in this case is as follows:
hxFF hxD8 hxFF hxE0 hx00 hx10 hx4A hx46 hx49 hx46

Any advice to get me going would be appreciated.
Jason
 
OK, just to show I am trying myself, here's some code I found on another thread. I've been adapting it to try and meet my needs.
VB.NET:
        Dim b As Integer = fs.ReadByte()
        While b <> -1
            If b = &HFF + &HD8 + &HFF Then
                Debug.Print(" hxFFD8FF found at " & SeekOrigin.Current)
                'fs.Seek(-1, SeekOrigin.Current)
            End If
            b = fs.ReadByte()
        End While

The problem seems to be that the code is written just to match one hex value pair, whereas I'm trying to find a number of consecutive hex pairs.

Am I heading in the right direction, or do I need a complete rethink?

Thanks
Jason
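
A note on the snippet above: `&HFF + &HD8 + &HFF` adds the three values together to give 726, while `ReadByte` only ever returns -1 or a value from 0 to 255, so the condition can never be true. `SeekOrigin.Current` is also an enum constant rather than the stream position; `fs.Position` gives the current offset. One way to match several consecutive bytes is to keep a sliding window of the last N bytes read and compare it with the signature after each byte. The sketch below assumes that approach. `FindSignatureOffsets` and the module name are made up for illustration, and nothing here is tuned for speed.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Linq

Module SignatureScan
    ' Returns the offset of every occurrence of signature in the stream.
    ' FindSignatureOffsets is a hypothetical helper name, not from this thread.
    Function FindSignatureOffsets(stream As Stream, signature As Byte()) As List(Of Long)
        If signature.Length = 0 Then Throw New ArgumentException("signature must not be empty")

        Dim offsets As New List(Of Long)
        Dim window(signature.Length - 1) As Byte ' the last N bytes read
        Dim bytesRead As Long = 0

        Dim b As Integer = stream.ReadByte()
        While b <> -1
            ' Shift the window left by one byte and append the new byte.
            Array.Copy(window, 1, window, 0, window.Length - 1)
            window(window.Length - 1) = CByte(b)
            bytesRead += 1

            ' Only compare once the window has been filled with real data.
            If bytesRead >= signature.Length AndAlso window.SequenceEqual(signature) Then
                offsets.Add(bytesRead - signature.Length)
            End If
            b = stream.ReadByte()
        End While
        Return offsets
    End Function
End Module
```

A sliding window is used instead of a simple "count of bytes matched so far" because this signature starts with hxFF and contains hxFF again at the third byte. A counter that resets to zero on a mismatch would miss a header in data such as FF D8 FF D8 FF E0.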
 
At my previous job we had files like this that contained an index at the beginning, giving each image's start position and length. Does yours have something like that?

Here's an example that should get you far enough to find the start of a JPEG. There's plenty left to do, such as reading ahead and checking for the byte sequence HxFF HxD9, which should mark the end of a JPEG file, writing to an output stream, and looping through to get the next file.

VB.NET:
        'Needs Imports System.IO, System.Linq and System.Text
        Dim filePath = "Some path to your file"
        Using fs = File.OpenRead(filePath)
            Dim blockStart As Long
            Dim buf = New Byte(3) {}

            fs.Read(buf, 0, 4)
            If buf.SequenceEqual(New Byte() {&Hff, &Hd8, &Hff, &He0}) Then
                blockStart = fs.Position
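                fs.Read(buf, 0, 2) 'Your hx00 & hx10 values: the big-endian segment length, which includes these two bytes
                'Widen to Integer before shifting: << on a Byte masks the shift count to 0-7, so "buf(0) << 8" would not shift at all
                Dim blockLength = (CInt(buf(0)) << 8) Or buf(1)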
                fs.Read(buf, 0, 4)

                'JFIF is Hx4a Hx46 Hx49 Hx46
                'EXIF is Hx45 Hx78 Hx69 Hx66
                'You could check buf.SequenceEqual again here but I think this is more readable.
                'After the JFIF EXIF check the next byte will be Hx00
                If Encoding.ASCII.GetString(buf, 0, 4) = "JFIF" AndAlso fs.ReadByte() = 0 Then
                    blockStart += blockLength
                    fs.Position = blockStart
                End If
            End If
        End Using
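
The remaining steps (finding the end marker, copying the image out, and moving on to the next one) could be sketched roughly as follows. This is only a sketch under some assumptions: the whole file fits in memory as a byte array, every image starts with hxFF hxD8 hxFF hxE0, and the first hxFF hxD9 after a header ends that image, which would not hold if an image carries an embedded JPEG thumbnail. `IndexOfBytes` and `CarveJpegs` are made-up helper names.

```vbnet
Imports System
Imports System.Collections.Generic

Module JpegCarver
    ' Returns the index of the first occurrence of pattern in data
    ' at or after start, or -1 if there is none.
    Function IndexOfBytes(data As Byte(), pattern As Byte(), start As Integer) As Integer
        For i As Integer = start To data.Length - pattern.Length
            Dim j As Integer = 0
            While j < pattern.Length AndAlso data(i + j) = pattern(j)
                j += 1
            End While
            If j = pattern.Length Then Return i
        Next
        Return -1
    End Function

    ' Returns a copy of each span that runs from a header to the next end marker.
    Function CarveJpegs(data As Byte()) As List(Of Byte())
        Dim header As Byte() = {&HFF, &HD8, &HFF, &HE0}
        Dim endMarker As Byte() = {&HFF, &HD9}
        Dim images As New List(Of Byte())

        Dim startPos As Integer = IndexOfBytes(data, header, 0)
        While startPos <> -1
            Dim endPos As Integer = IndexOfBytes(data, endMarker, startPos + header.Length)
            If endPos = -1 Then Exit While ' truncated image: no end marker found

            ' Copy everything from the header up to and including the end marker.
            Dim length As Integer = endPos + endMarker.Length - startPos
            Dim image(length - 1) As Byte
            Array.Copy(data, startPos, image, 0, length)
            images.Add(image)

            ' Continue searching after the image just carved.
            startPos = IndexOfBytes(data, header, endPos + endMarker.Length)
        End While
        Return images
    End Function
End Module
```

The input could come from `File.ReadAllBytes` and each result could be saved with `File.WriteAllBytes`. For a file too large to load in one go, the same search would need to run over a stream in chunks instead.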
 