Sat Aug 09 21:05:00 UTC 2008

Working with DBF files and Ruby - a simple dbf2csv example

Posted in Ruby at 09:05 PM by mohits

Just a very short note on using Ruby to process DBF files. A number of applications still use DBF as a simple database format (I’ve since moved on to SQLite3, more about that some other time). Most importantly, DBF is used as part of the Shapefile format.

The DBF Gem

The gem to use is called DBF, and it is available from http://dbf.rubyforge.org/dbf/. Install it with gem install:


command>gem install dbf
Bulk updating Gem source index for: http://gems.rubyforge.org
Install required dependency hoe? [Yn]  Y
Install required dependency rubyforge? [Yn]  Y
Successfully installed dbf-1.0.6
Successfully installed hoe-1.7.0
Successfully installed rubyforge-1.0.0
Installing ri documentation for dbf-1.0.6...
Installing ri documentation for hoe-1.7.0...
Installing ri documentation for rubyforge-1.0.0...
Installing RDoc documentation for dbf-1.0.6...
Installing RDoc documentation for hoe-1.7.0...
Installing RDoc documentation for rubyforge-1.0.0...

Simple Usage

As a simple example, I convert a DBF file to comma-separated values (CSV). In this example, we check the type of each field to see if it is non-numeric. Non-numeric fields are wrapped in quotes so that we don’t have to worry about commas in the text. If you are sure that none of the fields have commas, you could just output them without quotes, or you could use gsub! to replace any commas. Either way, this is just an example!


require 'rubygems'
require 'dbf'

#Load in the DBF table
table = DBF::Table.new(ARGV[0])

#wraps a field in quotes
def qt(field)
   "\"#{field}\""
end

count = 0

# Convert to CSV and output to stdout
table.records.each do |record|  #for all records
  csvr = []
  table.columns.each {|col|  #for each column in the record
    #Wrap non-numeric fields in quotes
    field = (col.type=='N')? record.attributes[col.name] : qt(record.attributes[col.name])
    csvr << field 
  }
  count +=1
  puts csvr.join(',')
end

# Report the input file name (ARGV[0]) and the number of records processed
puts "[#{ARGV[0]}]\t- processed #{count} records."
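
One thing the simple qt helper above does not handle is a double quote inside a field. A minimal sketch of an alternative, assuming you are happy to lean on Ruby's standard csv library rather than quoting by hand: CSV.generate_line quotes only the fields that need it and doubles any embedded quotes. The csv_line helper name and the sample values are made up for illustration and are not part of the DBF gem.

```ruby
require 'csv'

# Hypothetical helper: turn one row of values into a single CSV line.
# CSV.generate_line adds quotes where needed and escapes embedded quotes.
def csv_line(values)
  CSV.generate_line(values)
end

# Sample values standing in for one DBF record's fields
row = ['Main St, North', 42, 'the "old" road']
print csv_line(row)
```

In the loop above, this would replace the qt call and the join: collect the raw record.attributes[col.name] values into csvr and print csv_line(csvr) instead.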

When using DBF, you can usually access the attribute by doing something like record.x (if x is the name of the attribute that you want) – this works fine except in cases where the name of the attribute is something like type which will give you interesting results :) That’s why for a generic example, we’ve used record.attributes[col.name] which should work in all cases.
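
To make the clash concrete, here is a small sketch using a stand-in class. FakeRecord is hypothetical and is not how the DBF gem is implemented; it only mimics the idea of exposing attributes as methods. The example uses an attribute named class, which collides with a built-in method on every Ruby object, in the same way the post describes for type.

```ruby
# Hypothetical stand-in for a DBF record; NOT the gem's actual class.
class FakeRecord
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  # Expose attributes as methods. This is only reached for names
  # that Ruby does not already define on the object.
  def method_missing(name, *args)
    key = name.to_s
    return @attributes[key] if @attributes.key?(key)
    super
  end

  def respond_to_missing?(name, include_private = false)
    @attributes.key?(name.to_s) || super
  end
end

record = FakeRecord.new('name' => 'Main St', 'class' => 'road')

puts record.name                 # attribute value, via method_missing
puts record.class                # Ruby's built-in method wins, not the attribute
puts record.attributes['class']  # hash lookup always returns the attribute
```

The hash lookup never goes through method dispatch, which is why it is the safer choice when the column names are not known in advance.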
