Monday, December 1, 2008

Prototypes: Encoding OO-style inheritance in Haskell

In this post I will sketch an encoding for OO-style inheritance in Haskell, and show how this is used to in Yi to write code that can be customized.

This can also serve as an introduction to the concepts defined in module Data.Prototype (currently found in Yi sources)

Inheritance

Inheritance can create structures which are difficult to understand. Since a given method call can call dispatch to a number of methods at run-time, tracking what is going on might be tricky. Sometimes however, inheritance is exactly the construct we need.

Imagine you have the following piece of code:

a :: A
a = fa b c

b :: B
b = fb a c

c :: C
c = fc a b

That is, a, b and c are values defined in terms of each other.

You would like users to be able to customize a’s value. However, if the change actually occurs in the definition of c, you don’t want them to copy-paste the whole set of definitions. It would be preferable to amend only the definition for c and reuse the rest. Unfortunately, a’s value is closed, so this is not possible.

This situation seems to cry for inheritance. In an object oriented language, the solution is obvious: make a, b and c methods of a class. The user can then inherit it and override the definition of c.

In Yi, color themes have a similar structure: specific styles are defined in terms of base styles. If a user changes a base style, the change should be reflected automatically in all the styles that derive from it. As in the toy example above, we do not want the user to redefine everything from the ground up.

So, what can we do, since Haskell lacks inheritance?

Encoding prototypes

All is not lost! Pierce (TAPL, paragraph 18.10) has taught us that inheritance can be encoded as open recursion. The trick is to make the reference to the self object explicit. We can do so in Haskell by putting the definitions in a record and a lambda.

data Proto = Proto {a :: A, b :: A, c :: C}
proto = \self -> Proto {
  a = fa (b self) (c self),
  b = fb (a self) (c self),
  c = fc (a self) (b self)
 }

We can retrieve our original definitions by taking the fix-point:

abc = fix proto

Of course, this works only because Haskell is lazy (and because the original definition did not introduce an infinite recursion in the first place). If the fields of the record are marked strict, this ceases to work.

Given that definition, it is easy to customize the value as follows:

customizedProto = \self -> proto self {
   c = userFunction (a self) (b self)
 }

customizedABC = fix customizedProto

The Data.Prototype module generalizes this example, and defines types and functions to corresponding to the prototype and inheritance abstractions.

Conclusion

Yi is intended to be highly customizable. In many instances, we can use compositional abstractions to provide customization. In some other instances, we prefer to provide a prototype that user can patch.

Despite Haskell lacking inheritance, we see that the basic concepts of lambda expression and lazy evaluation can be combined to provide a very lightweight encoding for prototypes, and we take advantage of this in Yi.

Thursday, November 20, 2008

Incremental Parsing in Yi

In this post I will motivate Yi’s incremental parsing library and describe the main ideas behind it.

Why another parsing library?

Why bothering developing another parsing framework while there exist plenty already?

First, since we want to parse many languages, in many flavors, we want to be able to reuse pieces of grammars. Since we are using Haskell, the easiest way to achieve this is to through a parser-combinator library.

Second, we want to give timely feedback to users. Therefore, the parser has to be efficient. In particular, responsivity should not depend on the length of file. On the other hand, the input file will change incrementally (as the user edits it), and the parser should take advantage of this. It should reuse previous results to speed up the parse.

Third, the parser must gracefully handle errors in the input: while files are being edited, they will inevitably contain syntactically incorrect programs at several moments.

No parsing framework combining these characteristics exists.

Approach

Hughes and Swierstra have shown how online parsers can be constructed. An online parser takes advantage of lazy evaluation: the input is analyzed only as far as needed to produce the part of the result that is forced.

Effectively, this solves part of the incremental parsing problem: since the user can see only one page of text at a time, the editor will only force so much of the result tree, and only the corresponding part of the input will be analyzed, making the parser response time independent of the length of the input.

This does not completely solve the problem though. If the user edits the end of the file, the whole input will have to be analyzed at each key-press.

Caching results

The proposed solution is to cache intermediate parsing results. For given positions in the input (say every half-page), we will store a partially evaluated state of the parsing automaton. Whenever the input is modified, the new parsing result will be computed by using the most relevant cached state, and applying the new input to it. The cached states that became invalidated will also be recomputed on the basis of the most relevant state.

Of course, the cached states will only be computed lazily, so that no cost is paid for cached states that will be discarded.

Conclusion

The strategy sketched above has several advantages:

  • The design is relatively straightforward, and adds only a hundred lines of code compared to the polish parsers of Hughes and Swierstra.

  • There is no start-up cost. A non-online approach would need to parse the whole file the first time it is loaded. In our approach loading is instantaneous, and parsing proceeds as the user scrolls down the file.

  • The caching strategy is independent of the underlying parsing automaton. We only require it to accept partial inputs.

Its notable that the design has a strong functional flavor:

  • We never update a parse tree (no references, no zipper) and still achieve incremental parsing.

  • We take advantage of lazy evaluation in a cool way.

The main drawback is that the user code must use the parse tree lazily, and there is no way to enforce this in any current implementation of Haskell.

Wednesday, October 8, 2008

Yi 0.5.0 Release Notes

Yi

Yi is a text editor written and extensible in Haskell. The long-term goal of the Yi project is to provide the editor of choice for Haskell programmers.

Yi is not a finished product. This is a beta-quality release. However, Yi has become a lot more usable than previously, so if you feel like testing it, now might be a good time to take a look.

Installation

Using cabal install:

cabal install yi–0.5.0.1

If you want unix console support, pass the -fvty option to cabal install.

Features

  • A purely functional editor core
  • Key-bindings written as parsers of the input
  • Emacs, Vim and partial Cua emulations provided by default
  • Unix Console front-end (Gtk2Hs frontend is not supported in this release)
  • Static configuration (XMonad style) for fast load
  • Haskell support:
    • Lexical highlighting
    • Layout-aware parenthesis-matching
    • Auto-indentation
    • Call cabal build within the editor

Credits

This release is brought to you by:

  • Allan Clark
  • Corey O’Connor
  • Gustav Munkby
  • Gwern Branwen
  • Jean-Philippe Bernardy
  • Jeff Wheeler
  • Nicolas Pouillard
  • Thomas Schilling
  • Tristan Allwood

and all the contributors to the previous versions.

Saturday, October 4, 2008

I gave a Yi demo at the Haskell Symposium, and the video (recorded “guerrilla style”) is available! It should eventually appear on the ACM Digital Library too, hopefully in better quality, but don’t hold your breath.

The first part demonstrates the Haskell support capabilities, while the second one shows how Yi can be configured and extended.

The demo abstract can be found here.

Thursday, October 2, 2008

In this post I’ll give a walkthrough to a simple Yi configuration.

First, note that Yi has no special purpose configuration language. Yi provides building blocks, that the user can combine to create their own editor. This means that the configuration file is a top-level Haskell script , defining a main function.

import Yi
import Yi.Keymap.Emacs as Emacs
import Yi.String (modifyLines)

main :: IO ()
main = yi $ defaultConfig {
  defaultKm = 
     -- take the default Emacs keymap...
     Emacs.keymap <|> 
     -- ... and bind the function to 'Ctrl->'
      (ctrl (char '>') ?>>! increaseIndent)
 }


-- | Increase the indentation of the selection    
increaseIndent :: BufferM ()
increaseIndent = do
  r <- getSelectRegionB 
  r' <- unitWiseRegion Line r 
     -- extend the region to full lines
  modifyRegionB (modifyLines (' ':)) r'
     -- prepend each line with a space

In the above example, the user has defined a new BufferM action, increaseIndent, using the library of functions available in Yi. Then, he has created a new key-binding for it. Using the disjunction operator (<|>), this binding has been merged with the emacs emulation keymap.

This is a very simple example, but it shows how powerful the approach is: there is no limit to what functions the user can create. It is also trivial to use any Haskell package and make its function available in the editor. This is an obvious advantage over the traditional approach, where only a special purpose language is available in the configuration file.

Another advantage to this configuration style is is purely declarative falvour. The user defines the behaviour of the editor “from the ground up”. This can be constrasted to emacs (and lisp) style, where the configuration is a series of modifications of the state.

Finally, I shall say that this configuration model is not unique to Yi: it has already been used with success in the XMonad window manager.

Tuesday, September 9, 2008

Yi's top-level structure

In this post I’ll give a very high-level overview of Yi’s structure. This the first part of a series that should eventually constitute a guide for Yi hacking… but let’s get going!

Yi code can be categorized into four parts:

  • Actions, which are operations having some effect on the editor state. This can be opening or saving a file, or moving the cursor in the current buffer.

  • Keymaps, governing how user input maps to actions. Yi comes with keymaps for emacs and vi emulation (and users are encouraged to write their own keymaps). Keymaps are very much like parser-combinators, but they produce a stream of actions instead of a parse-tree.

  • UIs, which are responsible for rendering the editor state, and getting the stream of input events from the user. Yi comes with console, gtk and experimental cocoa UI.

  • Glue code, to tie the knot between Keymaps and UI.

The structure described above is very flexible: there is very low coupling between components. Indeed, one can easily swap out one component for another in the same category. For example, the user can choose between the different UIs and key-bindings at runtime.

The “actions” part makes up most of the editor code. This part is structured around a stack of three monadic DSLs.

  • BufferM: A monad for all buffer-local operations, like insertion and deletion of text, and annotation of buffer contents. It can be understood as a monad that encapsulates the state of one buffer.

  • EditorM: A monad for editor-level operations, e.g., opening and closing windows and buffers. Operations involving more than one buffer are handled at this level too.

  • YiM: A monad for IO-level operations. There, one can operate on files, processes, etc. This is the only level where IO can be performed.

All these parts are easily composable, and this makes convenient to extend (and configure) the editor. In the next post we’ll see in more detail how to do so.

Tuesday, September 2, 2008

Adding Vim-like tabs to Yi

Intro


This post walks through the process and thinking behind the initial implementation of a tabbar in Yi.This was originally written about a month ago and discusses this patch. This can be followed like a tutorial but a copy of the Yi repository right before the patch is required. This command:

Will construct the required Yi checkout. The remainder of this post is written assuming that checkout is the current state of Yi.

The goal is to add a tabbed view of the editor much like that in Vim 7.0. For those unfamiliar with Vim, just think of the tabs in Firefox. Yi already has support for multiple views on the text. These views currently work much like Vim and Emacs' multiple window support: The screen can be divided into a set of rectangular views, called windows, where each view presents a region of a text buffer to the user. The tabbed view is an extension of this feature. Each tab will contain a different set of windows and only one tab will be visible at a time.

Adding Tabs to the Editor State


In Yi, the state of the editor is described by the Editor data structure in the module Yi.Editor. This contains a property that describes the set of windows currently presented by the editor.

One requirement of tab support is that each tab contains a different set of windows. In order to take advantage of the zipper properties of the window set data structure (More on that later) the same abstract data structure[1] will be used to represent a set of tabs that each contain a set of windows.

Which brings us to the first change: Replacing the "windows" property of the Editor data structure with a "tabs" property. The "tabs" property is a set, where each element of the set is a set of windows.

hunk ./Yi/Editor.hs 49
- ,windows :: WindowSet Window
+ ,tabs :: WindowSet (WindowSet Window)

This change will break a lot of code. Any code that manipulated the window set of the editor will now fail to compile. I'm going to work under the assumption that all the existing code, under the context of an editor with tabs, is only manipulating the window set of the current tab. This way I can use a property of Haskell's records to make the old code compatible with the new structure.

Haskell's records can, for the most part[2], be though of as tuples combined with auto-generated field selector equations (AKA labels). These equations have the type "[RecordType] -> [FieldType]" and have the same name as the field. Which means the "windows" field selector for the original Editor data structure had the type "Editor -> WindowSet" and the name "windows" An equation with the same signature can be manually added. For existing code that uses the "windows" selector this new function will provide compatibility with the new implementation:

hunk ./Yi/Editor.hs 63
+windows :: Editor -> WindowSet Window
+windows editor =
+ WS.current $ tabs editor
+

(The above refactoring pattern has proved useful in other projects as well. I find it is a handy
pattern to use when needing to incrementally extend a Haskell data type.)

There are three additional areas that need to be updated before Yi will once again compile: The windows accessor; Construction of an Editor instance; Update of the windows on buffer deletion.

1. The windows accessor.


The windows accessor before tabs meant "Provide an accessor for the window set." The new equation will mean "Provide an accessor for the window set of the current tab."

There are many properties in Yi's state that are presented through an accessor interface. The idea behind the accessor interface is to simplify getting and setting a specific property of the editor in a stateful way. When using a non-pure language this is something that is trivial to do: "foo.bar = 1; print foo.bar;" For a pure language, like Haskell, things are different. The Yi/Accessors.hs module provides equations to abstract away the details for users of an API.

For developers, which is the part we are playing in this post, we have to worry about the details. A property that has a Yi.Accessor interface is composed of two parts: A getter equation and a setter equation.

The getter equation is the same as the "windows" equation above. Done!

The setter equation needs to extract out the window set of the current tab. Apply a modifier to the window set and update the editor's current tab with the modified window set.

hunk ./Yi/Editor.hs 103
-windowsA = Accessor windows (\f e -> e {windows = f (windows e)})
+windowsA = Accessor windows modifyWindows $
+ where modifyWindows f e =
+ let ws = WS.current (tabs e)
+ ws' = f ws
+ tabs' = (tabs e) { WS.current = ws' }
+ in e {tabs = tabs' }

2. Construction of an Editor instance.


The Editor instance was constructed with a window set containing a single window to a scratch buffer. Now an Editor instance will be constructed with a single tab that contains a window set containing a single window to a scratch buffer.

The last two sentences were written in verbose prose to make a point: If X was the text "a window set containing a single window to a scratch buffer." Then the first sentence contained just X. While the second contained X prefixed with "a single tab that contains ". The code that implements this change will follow the same pattern:

hunk ./Yi/Editor.hs 127
- ,windows = WS.new win
+ ,tabs = WS.new $ WS.new win


The equation was "WS.new win" Which is the implementation of X in the paragraph above. The new equation is "WS.new $ WS.new win" Which is the implementation of X prefixed with the implementation of "a single tab that contains "

3. Update of the windows on buffer deletion.


Resolving this is straight forward: "pickOther" is an equation that replaces references in a Window to the buffer being deleted with references to some other buffer. The original implementation used map to apply this equation to each Window in a set of windows. The new implementation, essentially, uses map to apply the original equation the set of windows contained in each tab.

hunk ./Yi/Editor.hs 190
- windows = fmap pickOther (windows e)
+ tabs = fmap (fmap pickOther) (tabs e)


At this point Yi will once again compile and run. No features will have been gained. Which is fine. Validating the change of implementation introduced no regressions in functionality is important. This is easier to do before the new features are implemented.

Assuming no regressions are found let's move on... (Nice assumption eh?)

Adding Vim Keymap Support


The Vim keymap is going to be modified to support tabs before the functions that actually manipulate the tabs will be added. This provides a good opportunity to develop the interfaces for these functions before considering the actual implementations. Once the potential interfaces are decided then the actual implementation details will be examined.

In Vim, my usual tab workflow consists of three actions:

  1. Using ":tabnew [path | buffer]" to create a new tab.

  2. Using "gt" to move to the next tab.

  3. 3. Using "gT" to move to previous tab.


We'll examine these in turn.

Adding ":tabnew [path | buffer]"


The optional argument to :tabnew will be ignored for now. The implementation of this will be a pattern match on "tabnew" that maps to the editor action "newTabE"

hunk ./Yi/Keymap/Vim.hs 909
+ fn "tabnew" = withEditor newTabE

Adding "gt" and "gT"


The other two actions, #2 and #3, are very similar: Both are actions, next tab and previous tab respectively, applied on the recognition of a sequence of characters input in command mode.

"pString" recognizes a sequence of characters. " >>! " declares that satisfying the left hand side implies the action on the right hand side. Using these to extend the cmd_eval part of the Vim command mode keymap is pretty straight forward:

hunk ./Yi/Keymap/Vim.hs 312
- choice
- ([c ?>>! action i | (c,action) <- singleCmdFM ] ++
+ choice $
+ [c ?>>! action i | (c,action) <- singleCmdFM ] ++
hunk ./Yi/Keymap/Vim.hs 315
- [char 'r' ?>> textChar >>= write . writeN . replicate i])
+ [char 'r' ?>> textChar >>= write . writeN . replicate i
+ ,pString "gt" >>! nextTabE
+ ,pString "gT" >>! previousTabE]
+

At this point Yi will fail to compile. Good. That is what is expected for implementing code that uses an API without actual implementing the API. Otherwise something is seriously amiss! :-)

Editor Tab API


Rewind back to where the abstract data type WindowSet* was mentioned to be a zipper In this case, the WindowSet is both a list and an iterator into this list. The iterator into the list represents the "current" element being viewed. For a UI application this abstraction is very useful. Among other reasons: Persistence of the data structure is maintained; And the "current" iterator is never invalidated.

newTabE will add a new tab containing a window set of a single window to the editor. As a window in Yi needs to view a buffer, I felt it was reasonable to create a window that views the current buffer the user is editing. The implementation of newTabE looks very much like what you'd expect if the tab set was represented by a list: The new tab is added to the list. However, as the tab set is a zipper the WS.add equation has the additional effect of changing the current tab to the newly added element.

nextTabE and previousTabE change the current tab in the zipper. nextTabE goes forward in a round robin fashion while previousTabE goes backward. Both can be thought of only changing the current tab iterator in the zipper and nothing more.

hunk ./Yi/Editor.hs 448
+-- | Creates a new tab containing a window that views the current buffer.
+newTabE :: EditorM ()
+newTabE = do
+ bk <- getBuffer
+ k <- newRef
+ let win = Window False bk 0 k
+ modify $ \e -> e { tabs = WS.add (WS.new win) (tabs e) }
+
+-- | Moves to the next tab in the round robin set of tabs
+nextTabE :: EditorM ()
+nextTabE = do
+ modify $ \e -> e { tabs = WS.forward (tabs e) }
+
+-- | Moves to the previous tab in the round robin set of tabs
+previousTabE :: EditorM ()
+previousTabE = do
+ modify $ \e -> e { tabs = WS.backward (tabs e) }

Compile and run. Now Yi has a very basic tabbed UI.

Footnotes



  1. The WindowSet data structure is more general than it's name implies. The generic name is "cursor" but that's confusing in the context of an editor so the name "Round Robin Set" probably makes more sense.

  2. I'm ignoring labelled updates and data types with multiple constructors. Probably some other details too. Ah well!