Input methods using transducers

Rémi Zajac, Malek Boualem, Jan Amtrup, « Specification and implementation of input methods using finite-state transducers », 14th International Unicode Conference, Boston, Massachusetts, March 22-25, 1999.


Most input method frameworks proposed so far try to bridge the gap between an input method implementation, the text editor and the input events. In this paper, we propose a declarative approach instead: input methods are specified as extended finite-state transducers. We implemented Salsa, a finite-state model in Java that handles Unicode strings and events. A transducer specifying an input method is defined classically as a finite-state transducer (FST) in which each transition between two states carries the left-projection of the transduction (input), the right-projection (output) and an event which sends back a value. For example: transition from state i to state j with input « a », event Null, output « a ». The transition is traversed when the FST receives a character string corresponding to the left-projection. The associated event is then fired and, if it is handled successfully, the FST sends the string on the right-projection as output. The Salsa language provides a textual syntax for specifying transducers. This language allows transitions to be factorized, so that simple FSTs with few states but many transitions can be written economically. The language also allows the specification of classes of input (character classes or string classes). By default, strings are Unicode strings: the text specifying the transducer is encoded in UTF-8. However, any character set can be used for input or output by specifying characters with hexadecimal codes. Salsa comes with a compiler, which reads a Salsa specification and builds a runtime representation of the transducer, and a runtime module, which is accessed through the Salsa runtime API. The class implementing the InputMethodListener interface bridges the gap between the transducer and the outside world. This class is parameterized by a Salsa transducer which defines the input method.
Other applications of Salsa include the specification and implementation of context analysis, and the implementation of codeset converters.
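
The transition model described above (input, event, output) can be illustrated with a minimal Java sketch. It is not the Salsa API or the Salsa specification language: the class TransducerSketch, its methods add, feed and state, the Event interface and the toy accent method are all hypothetical names chosen for this illustration. The sketch also assumes that a failed event leaves the current state unchanged and produces no output, which the abstract does not state. The bridge to InputMethodListener is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: every name here is hypothetical, not part of Salsa.
final class TransducerSketch {

    /** Event attached to a transition; returns true when handled successfully. */
    interface Event {
        boolean fire(String input);
    }

    /** Counterpart of the "Null" event: does nothing and always succeeds. */
    static final Event NULL_EVENT = input -> true;

    /** One transition: source and target state, input, event, output. */
    private static final class Transition {
        final int from;
        final int to;
        final String input;   // left-projection
        final Event event;
        final String output;  // right-projection

        Transition(int from, int to, String input, Event event, String output) {
            this.from = from;
            this.to = to;
            this.input = input;
            this.event = event;
            this.output = output;
        }
    }

    private final List<Transition> transitions = new ArrayList<>();
    private int state = 0; // state 0 is the start state

    /** Adds a transition and returns this transducer so calls can be chained. */
    TransducerSketch add(int from, int to, String input, Event event, String output) {
        transitions.add(new Transition(from, to, input, event, output));
        return this;
    }

    /**
     * Feeds one input string. The first transition leaving the current state
     * whose input matches is selected and its event is fired. On success the
     * transducer moves to the target state and returns the output string.
     * Returns null when no transition matches or when the event fails
     * (assumption: the state is left unchanged in both cases).
     */
    String feed(String input) {
        for (Transition t : transitions) {
            if (t.from == state && t.input.equals(input)) {
                if (!t.event.fire(input)) {
                    return null;
                }
                state = t.to;
                return t.output;
            }
        }
        return null;
    }

    int state() {
        return state;
    }

    /** Toy input method: "a" followed by "'" yields U+00E1 (a with acute). */
    static TransducerSketch accentDemo() {
        return new TransducerSketch()
                .add(0, 1, "a", NULL_EVENT, "")        // buffer the base letter
                .add(1, 0, "'", NULL_EVENT, "\u00e1")  // compose the accented letter
                .add(1, 0, "a", NULL_EVENT, "aa");     // no accent: emit both letters
    }
}
```

In this sketch, feeding "a" and then "'" to the transducer returned by accentDemo() yields the empty string followed by the single character U+00E1. A real specification would be written in the Salsa language and compiled, rather than built by hand as here.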

Malek Boualem