- Finite State Machines in Clojure core.logic
- lx in core.logic #2: Jumps, Flexible Transitions and Parsing
Today, we're gonna look at finite state transducers, which are commonly used to model and implement translation. While sounding fancy and powerful, they are straightforward extensions of finite automata.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
(ns fst | |
(:refer-clojure :exclude [==]) | |
(:use [clojure.core.logic])) | |
;; A finite state transducer is essentially a translator between | |
;; two tapes of symbols. It is normally a translator from an input | |
;; tape to an output tape, but since we are using core.logic, | |
;; we hope to relax this restriction :). | |
;; The main idea is that every transition accepts two symbols | |
;; (one from each tape). We will implement a simple pluralizer | |
;; for most English words. | |
;; WARNING: Maths ahead, skip at your leisure | |
;; | |
;; Formally, a finite state transducer is a tuple | |
;; T = (Q, Sigma, Gamma, I, F, delta) | |
;; | |
;; where | |
;; Q is the state space | Q = {0, std, es, s, 1} | |
;; Sigma is the input alphabet | Sigma = {a-z} and :jump | |
;; Gamma is the output alphabet | Gamma = {a-z} and :jump | |
;; I are the starting states | {0} | |
;; F are the accepting states | {1} | |
;; delta is the transition function | |
;; delta(q, a, b, qto) signifies that it is possible | |
;; to transition from state q to state qto by consuming | |
;; a from the first tape and b from the second | |
;; We will implement the following rules: | |
;; -h, -s, -o |--> -(h|s|o)es | |
;; -y |--> -ies | |
;; - |--> -s | |
;; Our transducer looks as follows: | |
;; | |
;; +---(any <x> not in {s,h,o,y}):<x>----(std) | |
;; | | | |
;; | #:s | |
;; | s:s | | |
;; | h:h v | |
;; +------> (0) --o:o--> (es) --#:e--> (s) --#:s--> (1) | |
;; | | y:i | |
;; +-<x>:<x>-+ | |
;; | |
;; Notation | |
;; | |
;; <a>:<b> - <a> consumed from first tape and <b> from second | |
;; # - jump | |
;; (<x>) - state | |
(defrel start q) | |
(fact start 0) | |
(defrel accepting q) | |
(fact accepting 1) | |
;; Transition table. Note we haven't included | |
;; the transitions that accept many symbols, since | |
;; we do not want to enumerate the input alphabets. | |
(defrel transition* from a b to) | |
(facts transition* [[0 'y 'i 'es] | |
[0 's 's 'es] | |
[0 'h 'h 'es] | |
[0 'o 'o 'es] | |
['es :jump 'e 's] | |
['s :jump 's 1] | |
['std :jump 's 1]]) | |
;; Dynamic extension of the transition table | |
;; to our full transition relation. This includes | |
;; the transitions from 0 to 0 and 0 to std. | |
(defn transition [from a b to] | |
(conde | |
((transition* from a b to)) | |
((!= a :jump) | |
(== a b) | |
(== from 0) | |
(conde ((== to 0)) | |
((== to 'std) | |
(!= a 's) | |
(!= a 'h) | |
(!= a 'o) | |
(!= a 'y)))))) | |
;; Translation *relation* | |
(defn translate | |
([tape1 tape2] | |
(fresh [q0] | |
(start q0) | |
(translate q0 tape1 tape2))) | |
([q tape1 tape2] | |
(matcha [tape1 tape2] | |
(['() '()] | |
(accepting q)) | |
([[t . ape1] [c . ape2]] | |
; This seems unnecessary going forward, but it makes | |
; sure we get no jumps in the input if we | |
; "translate" backwards, i.e., look for the singular | |
(!= t :jump) | |
(fresh [qto] | |
(transition q t c qto) | |
(translate qto ape1 ape2))) | |
([_ [t . ape2]] | |
(fresh [qto] | |
(transition q :jump t qto) | |
(translate qto tape1 ape2)))))) | |
(run* [q] (translate '(p a s s) '(p a s s e s))) | |
;; => (_.0) | |
;; | |
;; Checking pluralizations | |
(run* [q] (translate '(p i r a t e) q)) | |
;; => ((p i r a t e s)) | |
;; | |
;; Running the pluralizer | |
(run* [q] (translate '(d a i s y) q)) | |
;; => ((d a i s i e s)) | |
(run* [q] (translate '(h e r o) q)) | |
;; => ((h e r o e s)) | |
(run* [q] (translate q '(v a r s))) | |
;; => ((v a r)) | |
;; | |
;; Getting the singular | |
(run* [q] (translate q '(p a s s e s))) | |
;; => ((p a s s) (p a s s e)) | |
;; | |
;; Unfortunately, the transducer doesn't really know English, | |
;; but at least it got the right answer | |
(run 10 [q] (fresh [a b] (translate a b) (== q [a b]))) | |
;; => ([(h) (h e s)] | |
;; [(o) (o e s)] | |
;; [(s) (s e s)] | |
;; [(_.0 h) (_.0 h e s)] | |
;; [(_.0) (_.0 s)] | |
;; [(y) (i e s)] | |
;; [(_.0 o) (_.0 o e s)] | |
;; [(_.0 s) (_.0 s e s)] | |
;; [(_.0 _.1 h) (_.0 _.1 h e s)] | |
;; [(_.0 _.1) (_.0 _.1 s)]) |
No comments:
Post a Comment