# Tcl’s [binary scan] and the Go Challenge 1 _Edited to improve the code and for style after publication. Thanks to [Poor Yorick](https://wiki.tcl-lang.org/page/Poor+Yorick) and [rich123](https://github.com/rich123) for suggestions that helped improve the code and the comments._ In the programming language Tcl the command [`binary scan`](https://wiki.tcl-lang.org/page/binary+scan) decodes binary data according to a format string. For example, the command ```tcl binary scan $data {cu4 f1 a*} fourBytes aFloat remainder ``` will - Read four unsigned 8-bit integers (`cu4`) from the variable `data` and put them into the variable `fourBytes`; - Read one 32-bit floating point number (`f1`) and put it in `aFloat`; - Finally, put the remaining string (`a*`) in `remainder`. It will do all of this without altering `data` itself. I thought of this command when reading the [Go Challenge 1](https://web.archive.org/web/20160205125716/http://golang-challenge.com/go-challenge1/), a programming challenge that involves reverse engineering a binary file format and writing a program that extracts data from it. The command proved very useful in solving the challenge. The challenge is fun, so if you haven't solved it yourself I encourage you to go (no pun intended) and try it before you read my solution, lest it spoils some of the fun for you. ## Code [Original GitHub gist](https://gist.github.com/dbohdan/89b4f2d9ecd3a5b9c362). ### `decoder.tcl` ```tcl #! /usr/bin/env tclsh package require fileutil namespace eval ::decoder { # Allow us to use +, -, *, /, etc. as commands outside of the [expr] DSL. namespace path ::tcl::mathop } proc ::decoder::decode-file {filename} { # Read the entire file into memory as binary. [::fileutil::cat] runs the # command ::fileutil::cat and substitutes its return value in its place. # $filename substitutes the value of that variable. set data [::fileutil::cat -translation binary $filename] # Decode the header with [binary scan]. The curly braces around the format # string quote it. They are analogous to single quotes in the POSIX shell. # The components of the format string used to decode the file header have # the following meaning: # # a6 - a string of six characters # W - one 64-bit big-endian integer # A32 - a string of 32 characters with trailing blanks and nulls discarded # r1 - one little-endian 32-bit float # a* - the remainder of the string unchanged # # The first four values are assigned to elements in the array "header". The # remainder of the string is assigned to the variable "data", replacing its # previous value. binary scan $data {a6 W A32 r1 a*} \ header(magic) header(length) header(version) header(tempo) data # {} is an empty sting as well as an empty list. set tracks {} set charsRemaining [- $header(length) 36] while {$charsRemaining > 0} { # Assign each item of the list returned by [decode-track] to a variable. lassign [decode-track $data] newTrack charsDecoded data # Append $newTrack to the list tracks. lappend tracks $newTrack # Decrease charsRemaining by $charsDecoded. incr charsRemaining -$charsDecoded } # A list with the format of {key1 value1 key2 value2 ...} is also a valid # Tcl dictionary. We will use this later in [pretty-print]. [array get] # converts an array to a dictionary. return [list header [array get header] tracks $tracks] # Aside: Arrays are an older feature of Tcl. They allow a convenient syntax # shortcut of "name(key)" to access their elements, which we have used # above. However, unlike most things in the language they are *not* # immutable values. This makes it necessary to convert our headers array to # a dictionary before we return it. } proc ::decoder::decode-track {data} { set charsDecoded [string length $data] # Read the length of the track name. "cu" means an unsigned 8-bit integer. binary scan $data {cu1 cu3 cu1 a*} track(number) _ track(nameLength) data # Below we substitute $track(nameLength) in the format string to decode a # variable-length field. You don't need double quotes to perform # substitution in Tcl. binary scan $data a${track(nameLength)}cu16a* track(name) track(score) data incr charsDecoded -[string length $data] return [list [array get track] $charsDecoded $data] } proc ::decoder::pretty-print {data} { # The command [dict with dictionary code] assigns the values of the keys in # $dictionary to variables with the same names in the current scope # (procedure), runs the code then put the changes back into the dictionary. # E.g., # # set test {foo bar}; dict with test {puts $foo; set foo baz} # # outputs "bar" then changes the value of test to {foo baz}. Since we don't # want to change any of the values we will leave the code part empty. dict with data { # no code here } ;# $header is now a dictionary. dict with header {} puts "Saved with HW Version: $version" puts "Tempo: [format-float $tempo]" foreach track $tracks { dict with track {} # Alter score with a map... set score [string map {0 - 1 x} [join $score ""]] # ...and then with a regular expression. regsub {(....)(....)(....)(....)} $score {|\1|\2|\3|\4|} score puts "($number) $name\t$score" } } proc ::decoder::format-float {f} { # The first argument to the [if] command is an expression. To make doing # math more convenient expressions use the [expr] DSL. It is infix (meaning # that you write "$a + $b * $c" rather than "[+ $a [* $b $c]]") and has # functions that take arguments in parentheses. if {round($f) == $f} { return [format %.0f $f] } else { return [format %.1f $f] } } set decoded [::decoder::decode-file [lindex $argv 0]] ::decoder::pretty-print $decoded ``` ### `decoder_test.go.diff` ```diff decoder_test.go | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/decoder_test.go b/decoder_test.go index a3f0e19..92763a2 100644 --- a/decoder_test.go +++ b/decoder_test.go @@ -2,6 +2,7 @@ package drum import ( "fmt" + "os/exec" "path" "testing" ) @@ -61,7 +62,9 @@ Tempo: 999 } for _, exp := range tData { - decoded, err := DecodeFile(path.Join("fixtures", exp.path)) + output, err := exec.Command("tclsh", "decoder.tcl", + path.Join("fixtures", exp.path)).Output() + decoded := string(output) if err != nil { t.Fatalf("something went wrong decoding %s - %v", exp.path, err) } -- 2.1.0 ``` ## Page metadata URL: Published 2015-03-20, updated 2024-08-16. Tags: - binary data - data - Go - old blog - programming - Tcl Index: [Old blog](/old-blog).