2009

[Geeky] RMagick and memory leaks

A recent post talked about creating a "Domain Specific Language" for image processing of ballots. I've made a lot of progress on that project and wanted to give an update.

One commenter (my pal Aleks) said to 'watch out for RMagick as it has major memory leaks.' I talked to him further and he recommended using MiniMagick instead. I investigated.

It turns out indeed that my RMagick app was gobbling up tons of memory.

Both RMagick and MiniMagick are Ruby bindings to ImageMagick, which is a comprehensive image processing library, written in C I believe. The differences between RMagick and MiniMagick are:

  1. RMagick uses the ImageMagick API and presents a comprehensive 'rubyfication' of it. This is useful, but you do find yourself jumping back and forth between the RMagick doc and the ImageMagick doc to see which methods in one correspond to which in the other. MiniMagick, on the other hand, is a very thin veneer over ImageMagick's command-line utilities. It generally uses Ruby's method_missing to turn a method call into the corresponding command-line option. This means that the ImageMagick doc is your primary source. The MiniMagick source itself is a single tiny (but sophisticated) Ruby script.

  2. RMagick creates in-memory ImageMagick ('malloc') objects and retains pointers to those inside of Ruby structures. Until those Ruby structures are garbage collected, the ImageMagick objects just hang around and eat memory, hence the memory leak reputation. Because, as far as Ruby is concerned, not that much memory has been used yet, garbage collection isn't triggered and your system memory footprint gets bigger and bigger.

  3. MiniMagick works only with files; it never creates the ImageMagick malloc objects and hence does not suffer from the memory leak. On the other hand, with a complex process like what I am working on, you are creating, saving and reading a lot of files, which slows things down.
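To make the 'thin veneer' idea concrete, here is a minimal sketch of the method_missing trick. It is not MiniMagick's actual source: the class name TinyMagick and the helper command_for are made up for illustration, and it assumes ImageMagick's mogrify utility is on your path.

```ruby
# Hypothetical sketch, NOT MiniMagick's real code: a tiny wrapper that turns
# unknown method calls into options for ImageMagick's `mogrify` utility.
class TinyMagick
  attr_reader :path

  def initialize(path)
    @path = path
  end

  # Build the argument list for one mogrify invocation.
  # For example, command_for(:resize, "50%") on "ballot.tif" gives
  # ["mogrify", "-resize", "50%", "ballot.tif"].
  def command_for(option, *args)
    ["mogrify", "-#{option}", *args.map(&:to_s), path]
  end

  # Any method this class does not define is treated as a mogrify option.
  # mogrify edits the file in place, so nothing is kept in Ruby's memory.
  def method_missing(name, *args)
    system(*command_for(name, *args))
    self
  end
end
```

With something like this, TinyMagick.new("ballot.tif").resize("50%") would shell out to mogrify and rewrite the file on disk, which is also why every step costs a round of file I/O.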

I did nearly a day of coding comparing the two, and while MiniMagick made the memory leaks go away, in the end it was slower for my purposes than RMagick, probably due to the files that were constantly being created, opened and saved. Note that after a few dozen complicated operations, the RMagick version got super slow and basically hung because of memory consumption.

That was easily solved with a little research. I found a post that explained how to address the apparent memory leaks in RMagick: I added a forced reclamation of the ImageMagick malloc object whenever I am done with one (calling destroy! on the Image is what does it) and the huge leak is gone. Not sure yet whether there are other ones, but for now, RMagick wins for me!
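For the curious, one way to make sure the reclamation always happens is to wrap it in a small helper. This is a sketch rather than the code from my app: with_image is a made-up name, and it only assumes the object it is handed responds to destroy!, which RMagick's Magick::Image does.

```ruby
# Hypothetical helper: run a block with an image, then free the image's
# native (ImageMagick-side) memory no matter how the block exits.
def with_image(image)
  yield image
ensure
  # Release the malloc'd memory right away instead of waiting for Ruby's
  # garbage collector, which cannot see how big the native object is.
  image.destroy!
end

# Assumed usage when the RMagick gem is installed (file name is made up):
#   require "rmagick"
#   width = with_image(Magick::Image.read("ballot.tif").first) do |img|
#     img.columns
#   end
```

The ensure clause means the image is destroyed even if the processing step raises, so a failed ballot does not leave its pixels hanging around.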

What is the ‘true’ result of an election?

With what's going on in Iran right now, many commentators are wondering whether the elections were fair. That is to say, do the reported results correspond to reality, to how people actually voted?

"In Washington, a State Department spokesman, Ian Kelly, said the United States is “deeply troubled” by the unrest in Iran and is concerned about allegations of ballot fraud. But he stopped short of condemning the Iran security forces for cracking down on demonstrators and said Washington does not know whether the allegations of fraud are, in fact, true." (from "Top Cleric Calls for Inquiry as Protesters Defy Ban in Iran")

I don't know of course, but I want to make a case that the notion of 'the true result' is a difficult one, in and of itself. This of course has applicability here as well, where we have elections being questioned in various parts of the country. Presumably we would like to know who 'really' won in Minnesota and so on.

Here's a thought experiment.

Let's assume that you can, with 100% confidence, collect all and only the cast ballots from an election in one place. Let's say there are a hundred boxes sitting on a pallet, with security guards and live web cams.

Ask yourself, is there an objective, theoretically correct tabulation of these ballots? Is there a 'right answer'? And so the job of the election process is to come up with a set of procedures and devices to determine that right answer?

Oddly, I say that the answer is "no". There's no 'right answer'. There's always subjectivity involved. Why? Well let's take some examples, from here in the US, but I say similar examples exist no matter how ballots are designed and votes are counted.

For example, in most states there is a requirement that a paper ballot be filled in by fully filling in the circles in front of the candidates that you choose, and that there are no other marks on the ballot. In fact the rule is that any stray marks on the ballot make it invalid and not counted.

In practice it happens often that a voter marks a circle part way, then crosses that out and clearly marks another candidate. Or that a voter, instead of filling the circle in front of their candidate, instead draws a big circle around the candidate's name. Or that they fill in a write-in candidate, but neglect to fill in the circle in front of the name. In each of these cases, they have 'technically' and 'legally' made their ballot invalid. Theoretically these ballots should not be counted.

Now here's where it gets interesting: in fact the purpose or goal of the law is to capture 'voter intent', that is, what did the voter mean? Even if they did not follow the instructions precisely, is it clear who they wanted to vote for?

This is but one of many ways in which, even if the law is clear, the adjudication of the ballot is subject to interpretation. There is no specific objective, theoretically correct tabulation of that ballot. Wow.

So when the vote is close, the losing candidate has a chance to argue about each voter's intent based on what they scribbled on the ballot. That's why recounts can go on and on and end up in the Supreme Court.

And that's why the question of what the 'true' count of an election is doesn't, in the final analysis, have a real objective answer. (And that's so even if you could guarantee, which you can't, that you have all and only the legally voted ballots in a particular election; that is another 'fact' that doesn't actually have an objective meaning.)

[disclaimer: I am not an elections expert, this is what I have come to understand from fairly extensive, but still non-expert, study of how elections work.]

Check out TheRentables.com

I was shown a promising new site called TheRentables, for people (I know at least 3) who are looking to rent an apartment. They say:

"The Rentables takes a fundamentally different approach to rental property listings to deliver the most relevant, accurate and comprehensive housing information to anyone at any time." (from About TheRentables)

I looked in Brooklyn, New York, and there were no listings (they are in the very early stages), but in Boston there were some listings. The site looks useful. Now the trick is to get landlords to enter their information.

Technorati Tags: realestate, therentables, startups

Election day in Benton, New Hampshire

My post about Barack Obama's election was getting a bit long, so I thought I'd break it up and get some more blog miles out of the story.

First of all, check out this election map from Google. A few things to notice. First of all, the outcome shows (as I said in the previous post) that Obama got a mere 9 votes fewer than McCain.

Given the comment that 'this is McCain country' we were pretty surprised and really gratified to see this.

There were other signs though. One of the poll workers offered to hold our sign while we went inside to get off our feet for a few minutes, saying "I'd be proud to hold Barack Obama's sign."

She also said how exciting it was to see some Obama presence at little old Benton -- this was the very first time either major candidate had a visibility presence in Benton on Election day.

Cool!

The other thing to notice is (and this is a traditional theme of mine) the miracle of technology. Within hours of the count by hand by people in Benton (see previous post), and a phone call that they presumably made to the New Hampshire secretary of state, the result appears on this web page, broken out by town and county, in color no less. That's transparency!

Here are two more snapshots of the day. In the first one you can see the Benton Town Office.

The reason that the sign that Chris is holding has all kinds of random additional names on it is that we got the very last sign that the campaign office in Plymouth had.

I wonder what would have happened if we had arrived 5 minutes later!

The second snapshot is of the Barack Obama sign in wild apples that Chris constructed right next to the ramp into the building.

During a pleasant chat with the town moderator we learned a few things.

First of all, that no Barack Obama (or other) signs were allowed within 10 feet. We said we'd be glad to move or remove it if he said so (obviously we didn't want to get into trouble 🙂).

Later on, describing the town moderator's duties at the election, he explained that he was in charge.

And for example he could kick out anyone he felt was disrupting or being inappropriate in the polling place. Ouch!

Still, he never asked us to remove the Barack Obapple sign.

Originally posted on Nov 08, 2008. Reprinted courtesy of ReRuns plug-in.

Just when I was thinking about stopping my NetFlix subscription

TiVo keeps on getting cooler and cooler:

"Netflix said it will begin testing a service Thursday that lets users with TiVo's latest DVR models access movies and television shows from an online library of 12,000 Netflix titles. The service will be available at no additional charge to subscribers of Netflix's DVD rental service, as long as the Netflix customers are on rental plans that cost at least $8.99 a month." (from WSJ)

Odd, and sad for Netflix, that somehow in my mind I am giving psychological credit for this new benefit to TiVo.

Odd also that essentially this says I am willing to spend an EXTRA $9 on my monthly TiVo bill to get at this huge collection of movies. Wait, am I, really???

Originally posted on Oct 31, 2008. Reprinted courtesy of ReRuns plug-in.

[GEEKY] Check out Ruby Best Practices book (not yet out ! :)

As you know from a previous post, I've been working on a "domain specific language" for election ballot processing. In my search for information I got a pointer to a book called: Ruby Best Practices. It's not out yet, but it looks like it will be excellent.

You can get a sample chapter (which contained lots of information relevant to my domain specific language work) here.

In it you will "[… snip] look at a favorite topic for budding Rubyists. I’m going to share the secrets behind building flexible interfaces that can be used for domain-specific applications." Hmm. Does that make me a budding Rubyist? I thought I had already budded 🙂

Anyway, I really got a lot out of the sample chapter and look forward to when the book is out.

Technorati Tags: DSL, Ruby

Exploding head scene

An amusing article in Newsweek about Peter Arnell, self-styled renaissance man, inventor of the "Peapod", "a mix of Darth Vader, a bullet train, and a Citroën deux chevaux":

"[snip…] With no air conditioning and a top speed of 25 miles per hour, the $12,500 Peapod is basically a fancy golf cart. Arnell hopes people will buy them for doing errands around town. He wants to call customers 'peaple' and has designed a line of accessories: pens, flashlights, T shirts, baseball caps, shopping carts, picnic baskets, yoga bags, gardening sets. He's even designed fragrance inserts that create an aromatherapy experience while you drive. 'I would argue this business could be hundreds and hundreds of millions of dollars,' he says. His counterpart at the meeting, a veteran Chrysler engineer, just nods and says, 'Uh huh.'" (from Newsweek, "Mad Man")

And the promised exploding head scene from the same article:

"[… snip] if I stay much longer I fear that my head might explode. Either that or I'll burst out laughing. After I leave it occurs to me that the way to understand Peter Arnell is to think of everything he does as a kind of high-stakes performance art. Not just the commercials and advertisements, but everything—the meetings, the memos, the celebrity phone calls, the crazy brainstorming genius shtick. When it works, it works. Who knows why? You can study it, but you can't explain it. So Peter Arnell seduced PepsiCo into forking over millions of dollars, and gave them a memo about perimeter oscillations and the gravitational pull of a soda-pop can. Is that nuts? Probably." (from Newsweek, "Mad Man")

I don't know… I thought it was an interesting/funny article.

[GEEKY] A DSL for Image Analysis

I have been working quite a lot on Election Reform over the last few weeks, at least from the technology side.

To be honest there is just so much I could be blogging about in this narrow specialized space that my cup overfloweth, but that has also been an impediment: I haven't known where to start. There's so much background and new learning (for me anyway) that it's been daunting.

Herewith the start of my attempts to further document what I am up to.

One task I've taken on is prototyping a "post election audit" system (more on this soon.) Basically at the heart of that beast is a bit of code to analyze an image of a ballot and figure out what the vote was.

For now my programming language of choice is Ruby, although image processing with Ruby may still turn out to be impractical. I've been studying up on the task, reading books (see Practical Algorithms for Image Analysis, for example) and studying techniques and image processing code libraries that seem appropriate.

Two of the biggies I have come across are RMagick/ImageMagick and OpenCV. Both have a lot of history and dynamic communities. I don't know yet which is the best one to use. The investigation continues.

But one idea I have started to implement which is quite fruitful on many levels is a "Domain Specific Language" for Image Analysis. There is a lot of literature on creating DSLs, and in particular DSLs hosted on Ruby. They are easy to do and in this particular domain, add a lot to my productivity and ability to frame and comprehend what the heck I am doing.

I won't go into hairy technical detail here but I would be glad to share my approach and my code with anyone who asks. Here's what one of my earliest test programs looks like as written in this home-brew DSL:

`
open_image :one, "432Leon200dpibw431.tif"
open_image :target, "target2.tif"
open_image :t3, "target3.tif"

binarize :one
binarize :target

find_similar_regions :one, :target, :points
print :points

relativize_points :points, :outpoints
print :outpoints

deskew :one
write_image :one, "432Deskewed.tif"
find_first_nonwhite_row :one, :nonwhite_row
print :nonwhite_row
`

See how it talks in very high level primitives about image processing? Also see how the choice between OpenCV and RMagick is totally hidden? I can change my mind later and not break anything. Is it kind of readable?

I will build out this DSL just in the direction and to the extent needed for my particular task, ballot analysis. But you can see that it can go pretty far. How'd'you like it?
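For anyone wondering how a DSL like this can be hosted on Ruby, here is a rough sketch of the general shape. It is not my actual code: ImageScript, FakeBackend and the backend method names (open, binarize, write) are invented for illustration, and the backend is a stand-in so the sketch runs without RMagick or OpenCV installed.

```ruby
# Hypothetical DSL host: commands are plain methods, and images live in a
# registry keyed by symbol (:one, :target, ...).
class ImageScript
  def initialize(backend)
    @backend = backend # does the real work; could wrap RMagick or OpenCV
    @slots = {}        # symbol => image or computed value
  end

  # Evaluate a block of DSL commands with this object as self.
  def run(&script)
    instance_eval(&script)
    self
  end

  # Look up whatever is currently stored under a name.
  def [](name)
    @slots[name]
  end

  def open_image(name, path)
    @slots[name] = @backend.open(path)
  end

  def binarize(name)
    @slots[name] = @backend.binarize(@slots[name])
  end

  def write_image(name, path)
    @backend.write(@slots[name], path)
  end
end

# Stand-in backend (an assumption for this sketch): it tracks what was asked
# of it instead of touching any pixels.
class FakeBackend
  attr_reader :written

  def initialize
    @written = []
  end

  def open(path)
    { path: path, binarized: false }
  end

  def binarize(image)
    image.merge(binarized: true)
  end

  def write(_image, path)
    @written << path
  end
end

script = ImageScript.new(FakeBackend.new).run do
  open_image :one, "432Leon200dpibw431.tif"
  binarize :one
  write_image :one, "432Binarized.tif"
end
```

Because the script only ever talks to the backend through those few methods, swapping one imaging library for another would mean writing a new backend class and leaving the scripts alone.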

Technorati Tags: dsl, ruby, opencv, imagemagick, ballots