Ryan Tomayko has a post on how Ruby recapitulates AWK (or to be more biologically accurate, how it carries vestigial traits which reveal its evolutionary lineage from AWK down through Perl).
He gives an example of how curl, Ruby (run in its AWK-style -ne mode), and sort can be chained together to compute word counts for Swift's A Modest Proposal:
    curl -s http://www.gutenberg.org/files/1080/1080.txt |
    ruby -ne '
      BEGIN { $words = Hash.new(0) }
      $_.split(/[^a-zA-Z]+/).each { |word| $words[word.downcase] += 1 }
      END {
        $words.each { |word, i| printf "%3d %s\n", i, word }
      }
    ' |
    sort -rn
Back in the day I was an enthusiastic user of AWK. I was happy to discover that JEQL can be handily used for similar kinds of text processing, when equipped with suitable string handling and RegEx functions. Here's the word count functionality in JEQL (using a source for the text that is more bot-friendly than Project Gutenberg):
    TextReader t file: "http://www.victorianweb.org/previctorian/swift/modest.html";

    t = select String.toLowerCase(splitValue) word
        from t split by RegEx.splitByMatch(line, "[a-zA-Z]+" );

    Print select word, count(*) cnt
        from t
        group by word
        order by cnt desc;
AWK had a bit of a rep for being somewhat write-only. To my SQL-attuned eyes the JEQL version is more understandable.
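For comparison, here is a rough sketch of the same three steps the JEQL query performs (split by match, group by word, order by count descending), written as a small standalone Ruby script instead of a shell pipeline. The word_counts helper and the sample string are hypothetical, made up for illustration, and the alphabetical tie-break is an assumption added only to make the output deterministic; neither the Ruby one-liner nor the JEQL query specifies one.

```ruby
# Count words roughly the way the JEQL query does: pull out runs of
# letters, lowercase them, group by word, order by count descending.
# NOTE: word_counts is an illustrative helper, not part of either original.
def word_counts(text)
  counts = Hash.new(0)
  # scan yields each match of the pattern, so no empty tokens are produced
  text.scan(/[a-zA-Z]+/) { |word| counts[word.downcase] += 1 }
  # Descending count; alphabetical tie-break is an assumption for stable output
  counts.sort_by { |word, cnt| [-cnt, word] }
end

# Made-up sample text standing in for the downloaded document
sample = "A modest proposal. A Proposal, modest and A plan."
word_counts(sample).each { |word, cnt| printf "%3d %s\n", cnt, word }
```

Matching on runs of letters, as the JEQL version does, avoids the empty-string token that splitting on non-letters can produce when a line begins with punctuation or whitespace.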
Really nice blog. I really appreciate your concern about this topic, and I want to share something about frequency distribution tables: in statistics, a frequency distribution is an arrangement of the values that one or more variables take in a sample.
Martin,
Inspirational work as always! JEQL looks very interesting, particularly the implementation of table-based programming. Thank you for describing it.
If JEQL is ever released as open source, I'd be over the moon and start to use it immediately. So, please do sing out if that ever looks possible.
regards,
-Frank
Martin and All,
Just a quick follow-up. This document:
http://foss4g-na.org/wp-content/uploads/2012/03/JEQL_Language_for_Spatial_Processing_2012.pdf
indicates that JEQL will be open source "soon". So that is very encouraging.
regards,
-Frank
http://tsusiatsoftware.net/jeql/main.html
Frank,
Glad you like the look of JEQL. I am definitely intending to open-source it, and I'll try to make that happen ASAP.