semr
Version 0.2.0
What
Semr is the gateway drug framework for supporting natural language processing in your application. Its goal is to follow the 80/20 rule: 80% of what you want to express in a DSL should be possible in a way that is familiar to how developers normally solve problems. (Note: there are other, more flexible solutions, such as treetop, but they come with a higher learning curve.)
Installing
sudo gem install semr
The semr gem uses the oniguruma library to leverage more mature regular expression features than what Ruby supports. This library is part of the Ruby 1.9 build, but it needs to be installed if you are running semr on 1.8.*. For more info on the gem: http://oniguruma.rubyforge.org/
sudo gem install oniguruma
The basics
There are two constructs when defining a grammar in semr: a phrase and a concept.
Phrases
Phrases define the structure of the statements supported by the DSL and which concepts are to be matched. When a user’s statement matches a phrase, the concepts are extracted, processed as defined by the concept (see normalizers) and then passed to the block defined by the phrase for execution. Phrases are tried in the order they are defined, and each statement is matched to only one phrase. Semr splits a user’s input on newlines, which means each line represents a statement that will be executed by one of the phrases if it matches.
phrase 'match this :some_concept here' do |concept|
  # when the user types a statement that matches the phrase,
  # the concept, :some_concept, is extracted and passed to this block
end
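To make the matching rules above concrete, here is a minimal sketch in plain Ruby of how a phrase template could be turned into a regular expression and tried in definition order, one input line at a time. It does not use semr and is not semr's actual implementation; CONCEPTS, PHRASES, compile_phrase and run are hypothetical names made up for this illustration.

```ruby
# Hypothetical sketch, NOT semr's implementation: compile a phrase template
# into an anchored regular expression using a table of concept patterns.
CONCEPTS = {
  :greeting => /(hi|goodbye|hello)/,
  :number   => /([0-9]+)/
}

def compile_phrase(template)
  # Split on :concept placeholders; the capture group keeps them in the result.
  parts = template.split(/(:\w+)/).map do |part|
    if part.start_with?(':')
      CONCEPTS.fetch(part[1..-1].to_sym).source  # concept => its pattern
    else
      Regexp.escape(part)                        # literal text of the phrase
    end
  end
  Regexp.new('\A' + parts.join + '\z')
end

# Phrases are tried in definition order; the first one that matches wins.
PHRASES = [
  [compile_phrase('say :greeting :number times'),
   lambda { |greeting, number| "#{greeting} x#{number}" }],
  [compile_phrase('say :greeting'),
   lambda { |greeting| greeting }]
]

# Each line of input is one statement; unmatched statements yield nil.
def run(input)
  input.split("\n").map do |line|
    statement = line.strip
    regex, block = PHRASES.find { |(candidate, _)| candidate =~ statement }
    next nil unless regex
    block.call(*regex.match(statement).captures)
  end
end

run("say hello 6 times\nsay hi\nsay what")
# => ["hello x6", "hi", nil]
```

The extracted values arrive in the block in the order the concepts appear in the phrase, which mirrors the block arguments in the semr examples.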
Concepts
Concepts are the meaningful variations in the DSL that we want to perform certain actions on. Examples of concepts are any word that changes, a number, multiple words in quotes, or an ActiveRecord model class. After predefining the concepts, you express the phrases that the concepts will appear in.
# Concepts are just named regular expressions
concept :number, /([0-9]*)/

# any_number is a more explicit way to express a number
# (see: expressions in rdoc for more)
concept :number, any_number

# normalizers handle the concepts before passing to
# the phrase block (see: normalizers in rdoc for more)
concept :number, any_number, :normalize => as_fixnum

# matches any word
concept :a_word, /(\w+)/

# matches only hi, goodbye, hello
concept :greeting, words('hi', 'goodbye', 'hello')

# matches any string of characters in quotes i.e. 'some words'
concept :greeting, words_in_quotes

# built-in support for Rails applications: allows users
# to directly reference ActiveRecord models
concept :model, all_models
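The idea of a concept as a named regular expression plus an optional normalizer can be sketched in plain Ruby as follows. This is an illustration under assumptions, not semr code: the Concept struct, its extract method and the AS_INTEGER lambda (standing in for the role of as_fixnum) are hypothetical names.

```ruby
# Hypothetical stand-in for a concept definition: a name, a pattern with one
# capture group, and an optional normalizer applied to the captured text.
Concept = Struct.new(:name, :pattern, :normalizer) do
  def extract(text)
    match = pattern.match(text)
    return nil unless match
    value = match[1]
    normalizer ? normalizer.call(value) : value
  end
end

# Plays the role of a normalizer such as as_fixnum (assumed behavior).
AS_INTEGER = lambda { |text| Integer(text, 10) }

NUMBER   = Concept.new(:number,   /([0-9]+)/,                 AS_INTEGER)
GREETING = Concept.new(:greeting, /\b(hi|goodbye|hello)\b/,   nil)
QUOTED   = Concept.new(:title,    /'([^']*)'/,                nil)

NUMBER.extract('say hello 6 times')                # => 6
GREETING.extract('say hello 6 times')              # => "hello"
QUOTED.extract("with title 'Growth in China.'")    # => "Growth in China."
```

The sketch uses `[0-9]+` rather than `[0-9]*` because, when searched on its own like this, a `*` pattern can match the empty string at the start of the text before it ever reaches a digit.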
Context
After processing all of the user’s statements, semr returns a hash as the result: the context. This hash is accessible to the phrases during execution and is used to store the output of each phrase’s execution.
phrase 'match this :some_concept here' do |concept|
  context[:value] = concept
end
...
context = language.parse('match this thing here')
context[:value] #=> 'thing'
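One way to picture the context is as a single hash shared by every phrase block during a parse and handed back at the end. The sketch below shows that pattern in plain Ruby. TinyLanguage is a hypothetical class written for this illustration; it takes ready-made regular expressions instead of phrase templates and makes no claim about how semr implements the same behavior.

```ruby
# Hypothetical sketch: phrase blocks share one hash and the caller gets it back.
class TinyLanguage
  attr_reader :context

  def initialize
    @handlers = []
    @context = {}
  end

  # Register a pattern and the block to run when a statement matches it.
  def phrase(pattern, &block)
    @handlers << [pattern, block]
  end

  # Run every line of input against the handlers and return the context.
  def parse(input)
    @context = {}
    input.each_line do |line|
      statement = line.strip
      pattern, block = @handlers.find { |(candidate, _)| candidate =~ statement }
      # instance_exec evaluates the block with self set to this object,
      # so the block can call `context` directly.
      instance_exec(*pattern.match(statement).captures, &block) if pattern
    end
    @context
  end
end

language = TinyLanguage.new
language.phrase(/\Amatch this (\w+) here\z/) do |concept|
  context[:values] ||= []
  context[:values] << concept
end

result = language.parse("match this thing here\nmatch this other here")
result[:values]  # => ["thing", "other"]
```

Because both statements write into the same hash, results accumulate across lines, which is the same pattern the more complicated example uses with `context[action] ||= []`.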
Demonstration of usage
Here is a basic example that lets the user express how often to greet someone. Valid greetings are goodbye, hi and hello.
require 'rubygems'
require 'semr'

language = Semr::Language.create do
  # also accepts a path to a file instead of a block
  concept :number, any_number, :normalize => as_fixnum
  concept :greeting, words('hi', 'goodbye', 'hello')

  phrase 'say :greeting :number times' do |greeting, number|
    number.to_i.times { puts greeting }
  end
end

language.parse('say hello 6 times')
# hello
# hello
# hello
# hello
# hello
# hello

language.parse('say goodbye 2 times')
# goodbye
# goodbye
A more complicated example below allows the user to express finding our domain concepts (i.e. models) in relatively non-technical terms:
require 'rubygems'
require 'semr'

language = Semr::Language.create do
  concept :action, word('feature', 'show', 'highlight')
  concept :model, word(*all_models.with_synonyms.and.with_plurals.to_a), :normalize => as_class
  # (the :attribute and :criteria concepts are not shown in this example)

  phrase ':action :model <where><with><from> :attribute <is> :criteria' do |action, model, attribute, criteria|
    context[action] ||= []
    context[action] << model.find(:first, :conditions => {attribute.to_s => criteria})
  end
end

context = language.parse('feature people where age is 30')
context['feature'] #=> an array of Person models where age is 30

context = language.parse('show events from city bangalore')
context['show'] #=> an array of events in bangalore

context = language.parse("Highlight articles with title 'Growth in China.'")
context['highlight'] #=> an array of articles
Forum
http://groups.google.com/group/semr
How to submit patches
Read the 8 steps for fixing other people’s code. You can fetch the source from:
git clone git://github.com/mdeiters/semr.git
License
This code is free to use under the terms of the MIT license.
Contact
Comments are welcome. Send an email to Matthew Deiters or email the semr group via the forum.
14th November 2008