Category: Code

Last Subversion repository decommissioned – at last!

Posted by – 01/10/2015

Today I completed the git migration of the last Subversion repository we still had online at Propus.

Nothing different from what we had done for all the others, but this was an old repository, mostly holding unmaintained code and organized in the “project-as-subdirectory” style… Good old days…

Two scripts, which I am sure people can find elsewhere on the Internet, are worth pasting here:

First, I detected 3 revisions committed by the root user. The first two matched the creation of the first two projects (AKA directories), so I mapped all three to the same person (who had this bad habit of logging in as root). It turned out I was wrong about the third revision: it was the beginning of a new project, but by a different person. Since the authors file maps each Subversion user to exactly one git author, “root” could only be one person.

So, after the migration, I just used filter-branch to fix that particular commit. Trivial, but worth a gist:
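
The gist itself is not reproduced here. As a rough sketch of the kind of fix described above (not the original script), an --env-filter can rewrite the author of a single commit. The function name fix_author, the FIX_* variables, and the name/e-mail values are all hypothetical placeholders:

```shell
#!/usr/bin/env bash
# fix_author COMMIT NAME EMAIL
# Rewrites author and committer of exactly one commit (hypothetical helper).
# Run it inside the migrated repository. History after COMMIT gets new ids.
fix_author() {
    local target name email
    target=$(git rev-parse "$1") || return 1
    name=$2
    email=$3
    # filter-branch evaluates the env-filter once per commit, with
    # GIT_COMMIT set to the original id of the commit being rewritten.
    FILTER_BRANCH_SQUELCH_WARNING=1 \
    FIX_TARGET="$target" FIX_NAME="$name" FIX_EMAIL="$email" \
    git filter-branch -f --env-filter '
        if [ "$GIT_COMMIT" = "$FIX_TARGET" ]; then
            export GIT_AUTHOR_NAME="$FIX_NAME" GIT_AUTHOR_EMAIL="$FIX_EMAIL"
            export GIT_COMMITTER_NAME="$FIX_NAME" GIT_COMMITTER_EMAIL="$FIX_EMAIL"
        fi
    ' -- --all
}
```

Usage would look like `fix_author <commit-id> "Real Name" real.name@example.invalid`. Since every descendant commit is rewritten, this is best done before anyone clones the new repository.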

Lastly, there are some projects (mapped as subdirectories in the new git repo) that we’d like to split from the “archived” repository to keep using or adding to them. They will become active projects again. So we cloned the whole repo again, and filter-branch came to the rescue:
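
The original script is not shown here either. A minimal sketch of such a split, assuming the standard --subdirectory-filter option of git filter-branch, could look like the following. The function name split_project and its arguments are hypothetical:

```shell
#!/usr/bin/env bash
# split_project SRC_REPO SUBDIR DEST
# Clones SRC_REPO into DEST and rewrites history so that SUBDIR becomes
# the root of the new repository (hypothetical helper).
split_project() {
    local src=$1 subdir=$2 dest=$3
    git clone -q --no-hardlinks "$src" "$dest" || return 1
    (
        cd "$dest" || exit 1
        # Detach from the archive so only the local branch is rewritten.
        git remote remove origin || exit 1
        # --prune-empty drops commits that never touched SUBDIR.
        FILTER_BRANCH_SQUELCH_WARNING=1 \
        git filter-branch --prune-empty --subdirectory-filter "$subdir" -- --all
    )
}
```

Only the branch checked out by the clone is carried over in this sketch. Other branches or tags would have to be fetched before removing the remote.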

Since that was the last Subversion repository still alive, I’d like to end this post with a thank-you for all the years of good service! Time to move on…

Such a big directory…

Posted by – 17/09/2014

There’s no easy way to list files in a 10-million-file directory. Our beloved find and ls will almost never do it without requiring your whole week. I know it’s a bad design to have such a beast in place, but hey… we don’t always control the design of things our customers like to deploy. And some customers seem to like to defy logic sometimes.

Anyway, having to work with such a thing requires some special plumbing. I usually code something quick using getdents in C, but I always forget to keep it around and end up coding everything again every time. I was about to upload it to GitHub this time, but I found something already there. So this is just a heads up if there’s anyone out there needing this.

Simple CGI MJPEG Streamer in Bash

Posted by – 30/06/2014

This is mainly just for personal reference; I’m posting it here since it may help someone else. I am sending images captured by surveillance IP cameras via FTP to a hosting account. This is for convenience, since any cheap surveillance IP camera has this facility. So, how to watch in near-real-time without having to refresh the browser or use some JavaScript trickery? Also, how to watch using common IP camera mobile clients?

Easy enough: just create a MJPEG stream and use a CGI-capable HTTP server for the job. Some quick hacking (and inotify abuse) gave me the following Bash script:
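
The script itself is not reproduced here. As a hedged sketch of the technique (not the original code), a CGI script can emit a multipart/x-mixed-replace response and push each new JPEG as one part. The boundary string, the WATCH_DIR default, and the function names are assumptions, and the loop relies on inotifywait from inotify-tools:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a CGI MJPEG streamer.
BOUNDARY="frame"                           # arbitrary multipart boundary
WATCH_DIR="${WATCH_DIR:-/var/ftp/camera}"  # hypothetical FTP upload directory

# Print the CGI header that opens the multipart stream.
stream_header() {
    printf 'Content-Type: multipart/x-mixed-replace; boundary=%s\r\n\r\n' "$BOUNDARY"
}

# Print one JPEG file as a single part of the stream.
emit_frame() {
    local file=$1 size
    size=$(( $(wc -c < "$file") )) || return 1
    printf -- '--%s\r\nContent-Type: image/jpeg\r\nContent-Length: %d\r\n\r\n' \
        "$BOUNDARY" "$size"
    cat "$file"
    printf '\r\n'
}

# Wait for completed uploads and send each new image to the client.
stream_forever() {
    stream_header
    inotifywait -q -m -e close_write --format '%w%f' "$WATCH_DIR" |
    while IFS= read -r file; do
        case $file in
            *.jpg|*.jpeg) emit_frame "$file" || break ;;
        esac
    done
}

# Only start streaming when actually invoked by a CGI-capable server.
if [ -n "${GATEWAY_INTERFACE:-}" ]; then
    stream_forever
fi
```

Watching for close_write rather than create is a design choice in this sketch: it avoids sending a frame while the FTP upload is still in progress.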

Now, I just have to figure out how to kill the process afterwards 🙂

Goodbye Jorte, hello ownCloud

Posted by – 30/05/2014

After I migrated from Palm Treo to Android phones, I was very displeased with the Calendar application. I don’t use Google Apps (my phone is AOSP), which only leaves me with (argh!) Exchange for online-synced calendar. Besides, the first incarnations of the Calendar application in AOSP were barely usable.

So I just went for a standalone free (as in beer) Calendar application named Jorte, which is really good. Recently, I’ve been investigating ownCloud and CalDAV, and decided to review the state of the current AOSP Calendar application. To my surprise, it has evolved a lot (to the point of being usable), but still has no CalDAV sync.

So I found DAVdroid in the F-Droid repository, which is an interesting application that can register a CalDAV account that is usable by AOSP Calendar. So, now, I am able to use ownCloud calendaring, the ownCloud CalDAV server and my phone, and, since free-as-in-speech software is much better than free-as-in-beer, I decided to ditch Jorte.

But Jorte doesn’t have an option to export its data to an iCalendar file (why make things easy, right?). All it spits out is a CSV file as a backup. (As a side note, Jorte seems to intentionally not provide an iCalendar export option… as the ‘rrule’ field they use in the CSV follows the same rules the iCalendar standard dictates, they might be using it internally.) Since this is pretty trivial stuff, I just coded a Ruby script to do the job. I released it to my GitHub repo, just in case anyone else finds it useful.

Timeout a process in Bash > v4

Posted by – 19/04/2012

Just for reference, this is really useful:

( cmdpid=$BASHPID; (sleep 10; kill $cmdpid) & exec some_command )

Update Apr 20, 2012 @ 16:54: As pointed out in a comment by Timo Juhani Lindfors, if “some_command” exits early and the interval is long, another process can reuse its process ID and get killed once the sleep runs out. Does anybody know a better way of doing that without using timeout from coreutils (better yet: using just bash)?
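
One possible answer, offered as a sketch rather than a definitive fix: run the command in the background, start a watcher that kills it after the interval, and cancel the watcher as soon as the command finishes. The function name run_with_timeout is hypothetical. This narrows the PID-reuse window to the instant between the command being reaped and the watcher being killed, but does not eliminate it completely:

```shell
#!/usr/bin/env bash
# run_with_timeout SECONDS COMMAND [ARGS...]
# Returns the command's exit status, or 143 (128 + SIGTERM) on timeout.
run_with_timeout() {
    local secs=$1 cmdpid watcher status
    shift
    "$@" &
    cmdpid=$!
    # Watcher: kill the command if it is still running after SECONDS.
    # Output is discarded so the watcher never holds the caller's pipes open.
    ( sleep "$secs"; kill "$cmdpid" ) >/dev/null 2>&1 &
    watcher=$!
    wait "$cmdpid"
    status=$?
    # The command is done: cancel the watcher so it cannot hit a reused PID.
    # (Its inner sleep may linger until it expires, but it kills nothing.)
    kill "$watcher" 2>/dev/null
    wait "$watcher" 2>/dev/null
    return "$status"
}
```

For example, `run_with_timeout 10 some_command` behaves like the one-liner above, except that the pending kill is cancelled when some_command exits early.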

Migrating from Mephisto to WordPress

Posted by – 12/03/2012

Just as I promised yesterday, I pushed a new git repo with my fork of the tool I used to migrate my old Mephisto blog to this new WordPress one.

I forked because the tool did not work the first time. First of all, I was missing the uuidtools gem, and installing it would be a pain inside the jail system I used to run my blog. Too much trouble just to get a UUID we can get by other means… so I just added an environment variable, UUIDGEN, that anyone can use to point to a tool to do the job. I know this has performance implications, but I am not talking about ten thousand entries…

Then, I found out that, for some odd reason I still have to understand, WordPress was cutting my articles every time it read an “à” character. I could study the subject, but I just added a #gsub in the mephisto-to-wxr code and moved on. I was about to remove it from the repo, but I left it there since it could help other people. Also, since there might be other similar occurrences, leaving it there serves as a heads up.

Also, I added support for Categories and Tags to mephisto-to-wxr, which seemed to have only limited support (I translated Mephisto Sections into WordPress Categories).

All other activities were just clean-up. The tool generated a .WXR with all the articles and comments from my Mephisto blog. All I had to do was import it using the WordPress import tool.

Swiss Tournament in Ruby

Posted by – 03/02/2010

Being a chess player (not a very good one), I’ve always been intrigued by Swiss Tournaments. They are so practical, and ensure that even a lousy player like myself can play the same number of rounds as any other player. That’s being inclusive!

I’ve played some knock-out tournaments (to me it meant being kicked off in the second or third round), and, given their nature, not-so-good players tend not to attend these tournaments (since their fun will, almost surely, end before long).

Well, to solve a very similar problem, but not in any game championship, a co-worker suggested we could use a Swiss Tournament system. I liked the idea, but not being sure it could really solve the problem, I had to quickly implement something to test our data with… so Ruby to the rescue!

In no time we were up and running, and apart from minor issues that were fixed along the way, I guess it’s a pretty good implementation. You can check out the code to get a feel for it. Of course, it doesn’t follow any rules from any Chess or Go association (Wikipedia, after all, was my guide here), but it serves our goal. Being “proof-of-concept” code, feel free to improve it (just tell me about it, will you?).

Looking for a new programming language to learn

Posted by – 17/11/2009

I know it has been a long time since my last post. I am sorry about that, but life has its complications every now and then (as you know)… Well, on to the article.

Recently I had to reimplement in C a prefork server I wrote in Ruby for an internal project at Propus. Not that the Ruby version wasn’t enough (after all, despite being in Ruby, I was using Unix plumbing, much in the fashion Ryan tells us about in the now famous “I like Unicorn because it’s Unix” article)… The problem is that, at one of our clients, the only version of Ruby available was 1.8.1.

Yeah… I know… But we were not allowed to upgrade and, although it didn’t seem at first, the same server presented a nasty memory leak in 1.8.1 that was not present in 1.8.7 and 1.9.1. I still don’t know where the problem is… I suspect some of the C-to-Ruby glues around TCP sockets might be blamed, but after a couple of days trying to figure it out, I decided it was easier just to reimplement it using C.

It actually took less than a day to get the C version going… nothing fancy and, apart from memory footprint, just the same functionality and about the same speed as the Ruby version. But it was enough to remind me I really don’t like all the scaffolding one has to raise in order to make something useful in C. It’s not just a matter of SLOC (of course, the C version was more than 3 times longer than the Ruby one)… I am talking about all the manual memory management, pointer operations and the disgusting experience of dealing with strings in C. I know some people are addicted to that sort of thing like heroin, but to me it just slows development.

This experience made me think about learning a second compiled programming language. I do some Perl, a lot of Python and (of course) most of my work in Ruby, but those are all interpreted languages. For compiled languages I always resorted to C… So I am officially looking for a language to learn.

So far, the best candidates are OCaml (I got a little excited about JoCaml a few months ago, now I might get serious about it), Haskell, Lisp, Objective-C, Ada, and Vala. Of these, I’ve been reading a lot about OCaml… It seems a fine and expressive language, with decent foundations, object-oriented extension, broad standard library and (with JoCaml) concurrency… Also it might give me the proper excuse to finally wrap my mind around a functional language!

People keep pointing me to Java and Erlang… Well… for using Java I would much prefer using JRuby. Erlang, OTOH, has a weird syntax (at least to me) and it seems much of what makes it great will, eventually, be part of Ruby (or already is, using libraries). Either that or I’ll just wait for Reia to be ready. Besides, neither can be compiled to native code (ok, that argument can be stretched both ways, so just ignore it).

So, what do you think? Any advice?

Ruby versus Python

Posted by – 26/09/2009

This is not another rant to praise one at the expense of the other (and everybody knows I love Ruby, so it would not be impartial), but sometimes people seem to live in another world and do things for the wrong reasons.

I just read this blog post by Kanwei Li in which he gives 2 or 3 reasons he ditched Ruby in favor of Python. First of all, both are great languages and, although I favor Ruby, I use Python for some projects and they are not all that different. Of course, everyone is free to choose which language one favors, but Kanwei seems to be “ditching” Ruby out of not knowing much about it, or out of preferring one style over the other…

His first “reason” is that in Python white space matters. I used to think this is just a matter of style, but every now and then mandatory alignment hurts me (just try to put together a code generator and you’ll notice it). Although my code is always correctly aligned, I like that it’s done so because I want it that way, and not because some language demands it. Rants and more rants have been written about Python’s mandatory alignment (or other languages’ lack of it), and I am not going through all of it… I just don’t think it’s a good reason to ditch Ruby…

Next, he makes a big deal out of Ruby’s ternary if. As written by him, he prefers

if len(a) > 0:
        v = a[0]
        a = a[1:]
else:
        v = None

over Ruby’s ternary if

v = a.empty? ? nil : a.shift

Hey! Come on… Ruby’s ternary if is not mandatory… It was copied from C just as syntactic sugar. You can do without it, just as in Python:

if ! a.empty?
    v = a.shift
else
    v = nil
end

Better yet, you can use the value returned by if as the value of v:

v = if ! a.empty? then a.shift else nil end

How beautiful is that!

Python lacked a ternary if for a long time, and when it finally acquired one via PEP 308 its syntax was made different from every other language! Although I don’t think that is a problem, some people might think it would be better not to reinvent the wheel.

Next, Kanwei goes over a famous “problem” of Ruby: the lack of a sum method for Array. I admit it’s strange, but it is completely coherent: Ruby’s Arrays are ordered collections of objects and not mathematical arrays. How do you sum objects that are not numbers? Many different people will have many different answers to that, so Ruby leaves this decision to the programmer and provides basic methods to deal with collections of anything (that can be used to apply sum to numbers, if wished). So, in Ruby you have to use Array#inject to perform a sum:

[1,2,3].inject(0) { |sum, value| sum + value }

Array#inject (actually Enumerable#inject) was borrowed from Smalltalk and allows you to loop through an array, building up an “accumulator value” as you go. When it’s done, the final value of this accumulator is returned. Very useful for combining array elements, whether by summing them, building up a pretty display string, whatever. In the example above, I am initializing the accumulator with 0.

If you use Array for mathematical operations and you want your arrays to work that way, you can always add a sum method to the Array class:

class Array
    def sum
        self.inject(0) {|sum, value| sum + value}
    end
end

Maybe it would be better if you just use Arrays as containers (as they were intended to be used) and implement that sum inside your own class… I completely agree with Reg Braithwaite here.

Kanwei also mentions that Python is faster than Ruby. That is true, but was “more true” some time ago. First of all, Python is older and has had more time to improve its speed. Ruby, OTOH, has only just acquired a good VM, and improvements to it can finally run in parallel with improvements to the language itself, so I expect this to be less true with every release. Python is already not getting much faster between releases, unlike Ruby (the differences between 1.8.7 and 1.9.1 are really impressive!). IMHO this is not a good reason to choose one instead of the other: if you really need speed, go for C 🙂

Now this is something I find interesting that Kanwei has mentioned: “Python is more production ready”. He argues that Google is using it, so it must be good. Well… I cannot argue against that: Google is really using Python. But IBM, Oracle, EA, Cisco, Siemens, etc. are using Ruby… so that is just a matter of preferring one company or another. Both are production ready… I agree, though, that Ruby 1.9.1 has many differences from 1.8.7, and that this may be seen as some inconsistency, but Python has also changed a lot since its 2.0 version, for that matter. And the changes to Ruby brought many benefits… I think they are worth it.

At last, Kanwei compares Python and Ruby docstrings. Here I also have to agree with him: Ruby docstrings suck. Actually that’s why everybody uses rdoc instead (and that is much more powerful than Python’s docstrings). Again, I don’t think that is reason enough to ditch Ruby (actually, the existence of rdoc, rubygems & friends should bring people to Ruby instead), but that is a matter of personal taste.

Surely, Kanwei’s reasons were easy to argue against. There are areas where Python shines much more than Ruby (and vice-versa), but those Kanwei mentioned are not among them.

I think both languages are powerful enough, and both are way better than Perl or PHP, so either one you choose would be fine. Better if you don’t have to choose and use both ;). If you have to, OTOH, pay more attention to how you feel while coding in each one, and not to cheap reasons such as those above. If you are a programmer, what matters most is that you’ll spend a lot of time coding in any given language… let that be something pleasant then.

Code testing coverage

Posted by – 10/09/2009

I like building tests for my code. That is not an old habit, it’s just something I’ve been developing over recent months, or the last few years. No, I am not doing TDD (although that doesn’t sound like a bad idea): I just build tests after I code as a safeguard, to be sure I haven’t broken anything. I suspect there are more programmers like myself than those using tests as part of a TDD (BDD, SDD, etc) approach, but that is just an opinion.

Well, I just recently became fond of code coverage estimates and tools, and rcov is such a nice tool that sometimes I find myself building tests just to “please” it. I also suspect there are at least a bunch of people who do the same. Here are the results of the test coverage of one of my projects:

spectra@rohan:~/work/xmpp4r-observable$ rake rcov
(in /home/spectra/work/xmpp4r-observable)
rm -r coverage
Loaded suite /usr/bin/rcov
Finished in 70.995814 seconds.
24 tests, 97 assertions, 0 failures, 0 errors
|                  File                              | Lines |  LOC  |  COV   |
|lib/xmpp4r-observable.rb                            |   648 |   414 |  61.4% |
|lib/thread_store.rb                                 |    58 |    39 |  87.2% |
|lib/observable_thing.rb                             |   187 |   118 |  91.5% |
|Total                                               |   893 |   571 |  69.4% |
69.4%   3 file(s)   893 Lines   571 LOC

Sure it’s tempting to get more of lib/xmpp4r-observable.rb covered, isn’t it?

Introducing XMPP4R-Observable

Posted by – 08/09/2009

Just a few days ago I gave a talk at FISL10 about using XMPP PubSub with Ruby, and about a fork of a popular library to which I added the rudiments of PubSub. In that same talk I listed a series of problems with that approach and talked about a roadmap for the future…

It turns out I ended up convinced that I cannot use PubSub on the XMPP side of the library and a form of periodic polling on the Ruby side. So I decided to replace the library I had forked with an Observable version, preserving the good things of XMPP4R-Simple. I called the result XMPP4R-Observable, and I have just published it on GitHub.

A good part of the code is covered by tests (and I “stole” some of the tests from XMPP4R-Simple itself)… I intend to cover the rest over time (contributions are welcome). For now, I called this first release version 0.5.1 and added a .gemspec to generate a .gem automatically… However, GitHub has not published the .gem yet… Once it does, installing it should be as simple as:

bash# gem sources -a
bash# gem install spectra-xmpp4r-observable

Please report any errors. Happy hacking.

Update 2009-09-13 10:29:00: I have just confirmed that the .gem was published by GitHub.

Update 2009-10-10 20:21:00: The XMPP4R-Observable .gem will be maintained on GemCutter, starting today.

Fighting Memory Leaks

Posted by – 17/08/2009

I love to code in Ruby… I’ve stated that many times. It’s a simple, intuitive, unobtrusive language, good for quick prototyping, and it has a lot of good libraries around. There’s one annoyance, though, that every now and then pops up and makes my life miserable. For the last 10 days I’ve been fighting (and losing) a battle against a memory leak… And this is its story:

I’ve told you that my company has some internal projects using XMPP. I’ve even published a fork of XMPP4R-Simple and presented it at FISL10. XMPP4R, which is the library underneath it, has been causing me problems (at least I think it is guilty). This project has a lot of parsing and message/event exchange between XMPP agents. Everything is fine with the test suite and development tests went ok… but when I got it into production, it lasted a day before even the simplest test ended up eating all the memory in the server.

Of course… my first thought was that our parser was the guilty one. It’s common knowledge that implementing parsers is not so intuitive in Ruby and one has to pay attention to some glitches not to end up with a memory leak (Why has commented about it before). I rewrote the entire parser and the same behaviour emerged… I even began considering using Treetop to stop worrying about parser construction (I hoped Treetop would do it The Right Way(tm)), but then I decided it was worth a little research on the issue.

First, I tried to recompile Ruby 1.8.7 (I was using the one packaged for Debian). Same behaviour. Then I tried Ruby 1.9.1 (I already had it compiled, but I recompiled it anyway 😉 ). Same behaviour. I would not like to try JRuby: even if it fixes the memory leak, it will not be available in all the production environments we expect…

This brought me to the important conclusion that there are no really good tools for memory profiling Ruby programs. There are a lot of attempts and some even give you an idea of what’s going on, but nothing that points to the really obvious “perpetrator” of a memory leak… Also, many people have dealt with this before and it seems to be a pain for everyone. The problem just grows with the complexity of your program and the number of dependencies. To break it down for my case, I designed a Test Case that would require only basic stuff and do the minimal thing (it’s really nothing big and a bit hackish, but it spits out a lot of info on what is going on)…

I ended up convinced that either XMPP4R or REXML (which comes with Ruby) is leaking memory (maybe both). Even if REXML is the faulty library, I suspect the XMPP4R guys would like to know about it and would help me narrow it down. On their home page they even list some desired information for bug reports:

  • Contact information
  • XMPP4R version or Git SHA1 commit
  • Use Jabber::debug = true but remove any sensitive information
  • Describe your environment (Ruby version, OS, server software)
  • Cool hackers send test cases

All obvious stuff, and since I’ve already built a test case

Sending to Subversion something that only exists in git

Posted by – 17/07/2009

I don’t really know why some people think I am a git expert… Guys: just because I keep a few projects on GitHub doesn’t mean I have become an expert. Every week I get some email asking me something about git… Most of them I can answer, since it is basic stuff (or I point to some documentation and that’s it), but yesterday a rather odd question came in: how do you send to Subversion something that, so far, only exists in git?

This is an interesting one… Until now I had not needed this: I was only using git to maintain projects that had already started in the company’s Subversion… After a bit of research, and adapting it to the way we work at Propus, here is my proposal:

bash$ cd /caminho/para/o/projetoX
bash$ svn mkdir https://servidor.svn/projetoX -m "Importando do Git"
bash$ svn mkdir https://servidor.svn/projetoX/trunk -m "Importando do Git"
bash$ git checkout -b svn
bash$ git svn init https://servidor.svn/projetoX -s
bash$ git svn fetch
bash$ git rebase trunk
(here git occasionally gets "lost" and some conflict is generated. In the
projects where this happened to me, a "git add conflicting-file"
followed by a "git rebase --continue" was enough).
bash$ git svn dcommit

With this you have a branch called svn that mirrors what is in Subversion. From then on, just keep maintaining the code in master (or in any branch you like), merge it into the svn branch and push it up with a git svn dcommit.

find | while read var; do something “$var”; done

Posted by – 25/06/2009

This one goes to the folks who script a lot in bash. It’s the thousandth time I have had to repeat this command to someone (on the thousand-and-first I give up and put it on the blog for reference ;-)).

People get stressed over file names with spaces, or trying to use xargs with more than one command. In a while loop you can put whatever set of commands you want to run on the variable in question:

bash$ find ~/photos | while read foto; do mogrify -resize 800x "$foto"; done

Simple and efficient.
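
One caveat: a plain read strips leading and trailing whitespace, interprets backslashes, and cannot cope with newlines in file names. A more defensive variant of the same loop, sketched here with a throwaway demo directory (the directory and file names are made up for illustration), uses NUL-separated names:

```shell
#!/usr/bin/env bash
# Hypothetical demo directory; replace it with a real path such as ~/photos.
dir=$(mktemp -d)
touch "$dir/plain.jpg" "$dir/with space.jpg" "$dir/  leading spaces.jpg"

count=0
# find -print0 ends each name with a NUL byte; read -d '' splits on NUL,
# IFS= keeps surrounding whitespace and -r keeps backslashes literal.
# Process substitution keeps the loop in the current shell, so $count
# is still visible after the loop ends.
while IFS= read -r -d '' foto; do
    printf 'would process: %s\n' "$foto"
    count=$((count + 1))
done < <(find "$dir" -type f -print0)

echo "processed $count files"
rm -rf "$dir"
```

In real use, the printf line would be replaced by whatever commands should run on "$foto".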

rsync logs with restricted ssh

Posted by – 15/04/2009

SSH is really the Swiss Army pocket knife of sysadmin tools. When I needed to periodically synchronize log files from an old server (old as in customer-would-never-update-it-or-install-anything-new), I built a simple and secure solution using rsync and ssh. This is what I did:

(I will call “remote” the system where the logs I want to retrieve are, and “local” the system where I want them to be copied to.) First I created an account with a restricted shell (ideally this should be a system account, but we’ll get there!):

remote# adduser --ingroup nogroup --shell /bin/rbash rlogs

Then locally, I created a new, password-less ssh key pair, copying it to my remote system:

local$ ssh-keygen
>>> When asked where to save it, I chose a different name, like .ssh/rlogs
local$ ssh-copy-id -i .ssh/ rlogs@remote
>>> You can delete the password of user rlogs, so it, effectively,
>>> cannot log-in with it (almost like a system user).
remote# passwd -d rlogs

Now you should be able to run password-less rsync already (note that I use -e option to point to a different key):

local$ mkdir logs
local$ rsync -av -e "ssh -i $HOME/.ssh/rlogs" rlogs@remote:"logs/" logs/
receiving file list ... done

But even with a restricted shell, I wanted to allow even fewer things to happen. That’s what the command= directive is for… It will only allow that command to be run in a session started by that key. Since rsync translates a lot of its command-line options, I ran it again with a dirty ps-in-a-loop on the remote host, just to see what running rsync locally causes remotely:

remote$ while true; do ps wp $(pgrep rsync); sleep 1; done
local$ rsync -av -e "ssh -i $HOME/.ssh/rlogs" rlogs@remote:"logs/" logs/
>>> in the remote loop you should be able to get the command:
 6183 ?        Ss     0:00 /usr/bin/rsync --server --sender -vlogDtpre.i . logs/

Here comes the authorized_keys magic. At the remote host I edited .ssh/authorized_keys to add a command= line with what I found out in my dirty loop. Also, I added a couple of directives to restrict it even further (they are pretty self-explanatory):

rlogs@remote$ cat .ssh/authorized_keys
command="rsync --server --sender -vlogDtpre.i . logs/",no-port-forwarding,no-agent-forwarding,no-X11-forwarding,no-pty ssh-rsa (...) myuser@local

Now everything is set. I just added the rsync command to the local crontab and it’s done.