Sep 4 2007

ruby: the scope of ARGV

So it turns out that ARGV has global scope. This is completely fine 99% of the time, but when you’re trying to do something funky this can bite you in the ass. Here’s an example of what made me scratch my head. Script I want to run:

puts 'arguments passed in:'
ARGV.each do |arg|
  puts arg
end

puts "\ndoing require\n"

require 'arg_fucker'

puts 'arguments after require:'
ARGV.each do |arg|
  puts arg
end

… and here’s arg_fucker.rb:

ARGV.delete_at(0)

When I run this script, as you’d probably expect, the first element of ARGV gets deleted:

>ruby test_runner.rb a b c d
arguments passed in:
a
b
c
d

doing require
arguments after require:
b
c
d

This behavior is fine, unless you’re someone like me who is trying to make modules into runnable scripts where the first argument is the method to be run. And you’re including one module inside another. <shoots-self/>
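One way to sidestep this, sketched below, is to never mutate ARGV at all and to only run the script part of a file when it is executed directly. The Greeter module, its methods, and the COMMANDS list are made up for illustration; the only real pieces are array destructuring and the `__FILE__ == $0` check.

```ruby
# Hypothetical module; the names here are for illustration only.
module Greeter
  # Commands that may be invoked from the command line.
  COMMANDS = %w[hello shout].freeze

  def self.hello(name = 'world')
    "hello #{name}"
  end

  def self.shout(name = 'world')
    hello(name).upcase
  end

  # Dispatches on the first argument without modifying the array passed in.
  def self.run(args)
    command, *rest = args   # destructuring copies, so ARGV is left alone
    return "usage: #{COMMANDS.join('|')} [name]" unless COMMANDS.include?(command)
    public_send(command, *rest)
  end
end

# Only act as a script when executed directly, not when required.
puts Greeter.run(ARGV) if __FILE__ == $0
```

Because run never deletes anything from ARGV, a file that requires this one still sees the full argument list afterwards.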


Aug 16 2007

example of using hpricot, rexml, and kml

I wrote a little Ruby script to parse some hot spring data and throw it into a KML file for display in Google Maps or Google Earth (here are the results). It doesn’t do anything too complex, but it makes for a decent example of using Hpricot to parse HTML and REXML to generate KML. Here’s the source if anyone would like to take a look:

#written by Benjamin Smith
#www.disjointthoughts.com
#benjamin.lee.smith@gmail.com

require 'hpricot'
require 'open-uri'
require 'rexml/document'
require 'erb'

#arg: url to parse from http://www.hotspringsenthusiast.com/TextLinks.asp
#doesn't work for Arkansas, Georgia, New York, North Carolina, South Dakota, Virginia
url = ARGV[0]

#parse name of state out of url
state = url[url.index('.com')+5..url.length-5]

#grab the html
doc = Hpricot(open(url))

#create xml for the kml file
xml = REXML::Document.new
kml = xml.add_element 'kml', {'xmlns' => 'http://earth.google.com/kml/2.1'}
kml_doc = kml.add_element 'Document'

#add the name of the state to the xml
(kml_doc.add_element 'name').text = "#{state} Hot Springs"

#add reference to me
(kml_doc.add_element 'description').text = 'created by Benjamin Smith http://www.disjointthoughts.com'
doc_folder = kml_doc.add_element 'Folder'

#add reference to the data source
(doc_folder.add_element 'name').text = 'http://www.hotspringsenthusiast.com/'
(doc_folder.add_element 'description').text = "data source #{url}"

#iterate over the rows in the table
doc.search('//tr').each do |tr|
  tds = tr.search('//td')
  if tds.first.inner_html != 'STATE'
    link = tds[3].search('//a').to_s
    lat = link[link.index('lat=')+4..link.index('&',link.index('lat='))-1]
    lon = link[link.index('lon=')+4..link.index('"',link.index('lon='))-1].gsub('E','')
    topo = link[link.index('href="')+6..link.index('"',link.index('href="')+7)-1] 

    placemark = doc_folder.add_element 'Placemark'
    (placemark.add_element('name')).text = tds[3].search('//a').inner_html.to_s.gsub("\r\n",'').squeeze.downcase
    (placemark.add_element('description')).text = REXML::CData.new('Temperature: '+tds[4].inner_html+'F/'+tds[5].inner_html+'C<br/><a href="'+topo+'">Topo</a>')
    point = placemark.add_element 'Point'
    (point.add_element 'coordinates').text = "#{lon.to_s},#{lat.to_s}"
  end
end

f = File.new("#{state.downcase}.kml",'w')
f.write('<?xml version="1.0" encoding="UTF-8"?>'+"\n")
xml.write(f,4)
f.close

#create html file to display kml in google map
erb = ERB.new(File.read('template_hot_springs.erb'))
f = File.new("#{state.downcase}_hot_springs.html",'w')
f.write(erb.result(binding))
f.close

If you’d like to run it on your own, you can download it here. You’ll also need this template file and this css file.
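The index arithmetic used to pull lat, lon, and the topo link out of each anchor tag could also be written with regular expressions. This is only a sketch: the sample link is invented, the extract_coordinates helper is hypothetical, and the real page's markup may differ from what is assumed here.

```ruby
# Hypothetical helper: pulls the pieces out of an anchor tag's HTML.
# Assumes lat and lon are plain decimal numbers in the query string.
def extract_coordinates(link)
  {
    lat:  link[/lat=(-?[\d.]+)/, 1],     # digits and dots after "lat="
    lon:  link[/lon=(-?[\d.]+)/, 1],     # a trailing "E" is simply not captured
    href: link[/href="([^"]*)"/, 1]      # everything between the href quotes
  }
end

# Made-up sample, shaped like what the script above expects.
sample = '<a href="http://example.com/map?lat=44.1234&lon=-121.5678E">Some Spring</a>'
coords = extract_coordinates(sample)
puts "#{coords[:lon]},#{coords[:lat]}"
```

String#[] with a regex and a capture index returns nil when nothing matches, which is easier to check for than an exception from index arithmetic on a missing substring.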


Jun 7 2007

rails: getting ip address from request

This is useful if you want to log the IP address of the person using your Rails app (except when the person is behind a proxy, in which case you get the proxy’s address; thanks James!):
request.env['REMOTE_ADDR']
Taken from this ruby thread.
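For the proxy case, here is a rough sketch that prefers the X-Forwarded-For header when it is present. The client_ip helper is hypothetical, and it assumes the header shows up in the env hash as HTTP_X_FORWARDED_FOR with the original client listed first. The header can be spoofed, so treat the result as a hint rather than proof. Rails also offers request.remote_ip, which handles this more carefully.

```ruby
# Hypothetical helper; in a controller you would pass in request.env.
def client_ip(env)
  forwarded = env['HTTP_X_FORWARDED_FOR']
  if forwarded && !forwarded.strip.empty?
    # The header is a comma-separated list; the first entry is the client.
    forwarded.split(',').first.strip
  else
    env['REMOTE_ADDR']
  end
end

puts client_ip('REMOTE_ADDR' => '10.0.0.1',
               'HTTP_X_FORWARDED_FOR' => '203.0.113.7, 10.0.0.1')
```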


Mar 21 2007

ruby: variable scope and passing blocks to methods

Here’s the latest bit of Ruby that had me scratching my head for a minute…

Say you have a method that takes a block of code, and this block of code is executed using yield inside of the method:

def foo
  yield
end

foo { puts "boobs"}

Now let’s say that block needs to access some variable defined inside of that method:

def foo
  n = 'boobs'
  yield
end

foo { puts "n: #{n}"}

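This doesn’t work; you end up with “undefined local variable or method `n' for main:Object (NameError)”. A block can only see the local variables in the scope where the block is written, not the ones inside the method that yields to it. But if n is defined right before the call to method foo, it works: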

def foo
  n = 'boobs'
  yield
end

n = 'titties'
foo { puts "n: #{n}"}

This, as expected, will output “n: titties”. So how can you use n from within the method inside of the block? Like so:

def foo
  n = 'boobs'
  yield n
end

n = 'titties'
foo { |n| puts "n: #{n}"}

The result of running this code is “n: boobs”. You give the variable n as an argument to yield, and add “|n|” to the beginning of the block. At times blocks still feel foreign to me, but I think I’m starting to get the hang of them.
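One caveat about reusing the name n for the block parameter: on Ruby 1.8, a block parameter with the same name as an outer local variable assigns to that outer variable, so the outer n would be overwritten by the call. From Ruby 1.9 on, block parameters are always local to the block. The sketch below (with neutral strings swapped in, and a seen variable added just to capture the value) shows the newer behavior.

```ruby
def foo
  n = 'inside'
  yield n
end

n = 'outside'
seen = nil

# On Ruby 1.9+ this |n| is a new variable that only exists inside the block.
foo { |n| seen = n }

puts "block saw: #{seen}"
puts "outer n afterwards: #{n}"
```

Picking a different name for the block parameter avoids the question entirely.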


Mar 2 2007

using negate in ruby regex

In Ruby regular expressions, the ^ character anchors a match to the beginning of a line (for a single-line string, that is the beginning of the string). So the following…

s = 'abcde'
s.match(/^a/)

…makes a match (match returns a MatchData object rather than true) because the string s begins with the character “a”. But if you want to match characters that are not “a”, how do you do it?

s.match(/[^a]/)

Once ^a is placed inside of the square brackets it becomes a character class where the ^ character acts as negation rather than as an anchor. As written, this matches the first character that is not “a”.
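A small sketch of the difference, using only String#match, String#scan and =~. The multiline string is made up to show that ^ anchors at the start of each line, while \A anchors only at the start of the whole string.

```ruby
s = 'abcde'

starts_with_a = !s.match(/^a/).nil?   # match returns MatchData or nil
first_non_a   = s.match(/[^a]/)[0]    # first character that is not "a"
all_non_a     = s.scan(/[^a]/)        # every character that is not "a"

multiline = "x\nabc"
line_anchor   = multiline =~ /^a/     # ^ also matches right after a newline
string_anchor = multiline =~ /\Aa/    # \A only matches at the very start

puts starts_with_a, first_non_a, all_non_a.inspect
puts line_anchor.inspect, string_anchor.inspect
```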

I’m sure most people who pay attention picked this up the first time reading through a Ruby book, but I don’t pay 100% attention all the time and this minor detail slipped by.


Jan 15 2007

file locking in ruby using flock

I was writing a bit of Ruby code that used flock to write to and read from a file using two different processes. I ran into some funny issues, so I’m going to try to explain the process I went through to reach my final working code.

First let’s say I have two Ruby processes: one is writing a small amount of text to a file while the other is reading from that file. For example, the process doing the writing looks like:

for i in 0..1000 do
  puts i.to_s
  file = File.new('shared_file','w')
  file.write("line1\nline2")
  file.close
end

…and the process reading the file looks like:

for i in 0..1000 do
  puts i.to_s
  file = File.new('shared_file','r')
  lines = file.readlines
  if(lines[0]!="line1\n")
    puts("got #{lines[0]} instead\n")
    exit
  end
  file.close
end

Unfortunately, when these two chunks of code run at the same time I end up with a message something like:

file_writer.rb:3:in `initialize': Text file busy - shared_file (Errno::ETXTBSY)

This message comes from the process doing the writing. So I think I can positively say that even though I’m only writing a small amount of text, the file needs to be locked in order for both processes to run correctly.

So the next step was to test out the flock File method using a couple of irb sessions. In irb I am able to lock a file by doing the following:

irb(main):001:0> (file = File.new('shared_file','w')).flock(File::LOCK_EX)
=> 0
irb(main):002:0>

When I try to open the same file for reading in the second irb session I get:

irb(main):001:0> (file = File.new('shared_file','r')).flock(File::LOCK_EX)

…this will hang here until I unlock the file from the first irb session. So using this approach to locking the file I can update my writing code:

for i in 0..1000 do
  puts i.to_s
  (file = File.new('shared_file','w')).flock(File::LOCK_EX)
  file.write("line1\nline2")
  file.flock(File::LOCK_UN)
  file.close
end

… and my reading code …

for i in 0..1000 do
  puts i.to_s
  (file = File.new('shared_file','r')).flock(File::LOCK_EX)
  lines = file.readlines
  if(lines[0]!="line1\n")
    puts("got #{lines[0]} instead\n")
    exit
  end
  file.flock(File::LOCK_UN)
  file.close
end

Now I would expect this to work, but it doesn’t behave quite as I’d expect. Sometimes the output of my reader is:

# ruby file_reader.rb
0
got  instead

When I get this output from my reader, the writer continues on its merry way until it terminates. Now this output is saying that instead of reading “line1” from the file, it read nothing or an empty string. This is weird since the writer is always writing the same thing and the file should always contain “line1\nline2”. Other times, once I start the reader, the writer spits out:

file_writer.rb:3:in `initialize': Text file busy - shared_file (Errno::ETXTBSY)

This output is also strange, since this is the sort of error I was getting before I added the file locking code. This would lead me to believe that the file is actually not getting locked even though it worked in irb.

So what’s going on? I’m not exactly sure, but I found this thread from someone else who seemed to be having a similar problem. The solution that was proposed included:

“Also, opening files for writing and truncating them may introduce subtle
problems - you cannot lock the file before it’s truncated, so you need to
try open it in read-write mode, then lock and then truncate it…”

So I updated my writing code:

for i in 0..1000 do
  puts i.to_s
  (file = File.new('shared_file','r+')).flock(File::LOCK_EX)
  file.truncate 0
  file.write("line1\nline2")
  file.flock(File::LOCK_UN)
  file.close
end

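And what do you know, it works. Opening the file using “r+” (read-write) instead of just “w” (write) fixes all of the problems that I was seeing previously. The quoted explanation fits the empty reads at least: “w” truncates the file the moment it is opened, before flock is ever called, so the reader can grab the lock and find an empty file, while “r+” leaves the contents alone until the lock is held. I’m not sure if this is the only or best solution to this problem, but this is the only one I could come up with while still using flock to lock the file.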
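The same idea can be wrapped up in a pair of helpers. This is a sketch rather than a drop-in fix: locked_write and locked_read are hypothetical names, the file is created if it is missing (which “r+” alone would not do), and the reader takes a shared lock (File::LOCK_SH) instead of an exclusive one so several readers do not block each other.

```ruby
require 'tmpdir'

# Hypothetical helper: replace the file's contents while holding an exclusive lock.
def locked_write(path, text)
  # RDWR | CREAT opens (or creates) the file without truncating it.
  File.open(path, File::RDWR | File::CREAT, 0644) do |file|
    file.flock(File::LOCK_EX)   # blocks until nobody else holds a lock
    file.truncate(0)            # safe now, because the lock is held
    file.rewind
    file.write(text)
    file.flush                  # push the data out before the lock goes away
  end                           # closing the file releases the lock
end

# Hypothetical helper: read all lines while holding a shared lock.
def locked_read(path)
  File.open(path, 'r') do |file|
    file.flock(File::LOCK_SH)
    file.readlines
  end
end

path = File.join(Dir.mktmpdir, 'shared_file')
locked_write(path, "line1\nline2")
lines = locked_read(path)
puts lines.inspect
```

Using the block form of File.open also means the file gets closed, and the lock released, even if something raises in the middle.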


Oct 26 2006

desert code camp

The second Desert Code Camp is coming up this weekend.

“Code Camp is a free, one-day event put on by the local Phoenix community to help promote software development in general. There is no right or wrong language, platform, or technology. If a topic relates in any way to the code that causes a machine to produce a desired result, it’s welcome here.”

The previous code camp was ok, but the sessions were too jammed together. There was no time to meet people and just talk. Everyone kept running from session to session trying to keep up. Some of the sessions were good, while others were a waste of my time.

I’ll be attending again. Hopefully they’ve tweaked the schedule enough so it’s not so jam-packed. There aren’t any sessions that are must-attends for me, so I’ll mostly be there to hang out and support the community. If nothing else, at least it’s a free lunch.

Edit: Apparently lunch isn’t free.


Aug 4 2006

Ruby in Steel

SapphireSteel Software claims that its Ruby IDE will have “…IntelliSense (‘code completion’) support in future releases of Steel.” I will be interested to see how well they can manage that one.

Ruby in Steel FAQ.


Jun 28 2006

missing gems/plugins

The designer on a project I’m working on decided to use the RedCloth Ruby gem. He installed the gem on his local machine and added “require ‘RedCloth’” to environment.rb. Now, since gems are stored in the Ruby installation directory by default, the source repository did not contain the RedCloth library.

Unfortunately I forgot to get the RedCloth gem. So when I updated my project source from SVN the project broke, with no useful output in development.log. I eventually figured out that I was missing the gem. So I downloaded the gem, then moved the whole RedCloth folder into the vendor/plugins directory of the Rails project.

The lack of error messages pointing me to the “require ‘RedCloth’” was annoying to say the least. Maybe the require statement should be moved someplace else? Someplace after the logger has been initialized maybe? I wonder if other errors/issues in environment.rb will make WEBrick die out too. Maybe you can start WEBrick with some advanced debugging for these kinds of situations? Hmm, I guess I should do a little research…
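One thing that might have saved me some time is wrapping the require so that a missing library produces an obvious message. This is just a sketch: require_or_explain is a made-up helper, and the suggestion it prints assumes the gem name matches the name being required, which is not always true.

```ruby
# Hypothetical helper: require a library, or say clearly what is missing.
def require_or_explain(name)
  require name
  true
rescue LoadError => e
  # warn writes to standard error, which does not depend on the logger being set up.
  warn "Could not load '#{name}' (#{e.message}). Try: gem install #{name}"
  false
end

require_or_explain('set')                      # part of the standard library
require_or_explain('surely_not_a_real_gem')    # prints the hint
```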