MagicHaskeller: A Search-based Inductive Functional Programming System
Susumu Katayama, University of Miyazaki
ÆüËܸì(Japanese)
MagicHaskeller on the Web
Overview
See publications at my top page:) Also, here is my presentation used at the 25th ToPS seminar (held at IPL, University of Tokyo).
(NB: this presentation is rather old, and lacks in survey on recent progress in the extended Summers-like approach.)
The name is abbreviation of "automagic Haskell programmer", but it may not be adequate because currently it supports just a subset of Haskell, notably without type classes.
Since May 2006 It has two forms: stand alone program including a cheap interpreter, and new library package which is supposed to be used with some recent versions of Glasgow Haskell Compiler interactive (GHCi) (because it uses Template Haskell).
What's new
Jan. 9, 2013 Version 0.9.1 is released as a library (in addition to the web service). The same version is used as the backend server of the current web service, so please check it out if you are interested in its source files and/or if you want to make a new service.
May 23, 2012 As you have already seen, the first alpha version of the web interface is out! Please let me know (instead of exploiting it) if you ever found a security hole....
Mar. 2, 2012 Version 0.8.6.3 fixes a bug in MagicHaskeller.ProgGenSF, which is used by MagicHaskeller.LibTH.exploit. (Well..., actually..., the bug caused segmentation fault!)
Dec. 18, 2011 Version 0.8.6.2 fixes some bugs in the analytical modules. Also, MagicHaskeller.LibTH.exploit is more useful and interesting now. Try something like
Prelude MagicHaskeller.LibTH> exploit $ \f -> f "abc" == "abcba"
Dec. 13, 2011 NOTICE: module MagicHaskeller.RunAnalytical is supposed to be run under GHCi, and it does not work in the a.out form. Also, it is only tested on Linux, and does not work with Windows.
Apr. 27, 2011 Version 0.8.6.1 fixes a bug related to dealing with background knowledge when doing analytical generate-and-test. Also added some additional functions.
Apr. 8, 2011 Version 0.8.6 includes some new modules for analytical synthesis. See MagicHaskeller-RunAnalytical.html.
Nov. 28, 2009 Version 0.8.5-1 fixes some minor problems, and user-defined datatypes have become usable again even with recent compiler versions.
July 14, 2009 Version 0.8.5.
- Open source
- Options rearranged. See the description for Options in the library documentation.
- Changed the module hierarchy.
- ConstrL and ConstrLSF are now constrL option with ProgGen and ProgGenSF
- Introduced ProgGenXF, that is an improved version of ProgGenSF
- Some minor fixes and improvements
June 10, 2008 Version 0.8.4 enables some options to fine tune program generation. Also I mined some old optimizations used until 0.7.5 --- see the library documentation.
Apr. 11, 2008 Version 0.8.3-1 (hopefully) fixes a bug in package dependency.
Apr. 7, 2008 Version 0.8.3 is out.
Sep. 3, 2007 Version 0.8.2 is out.
- merged 0.8.* and 0.7.*, using a different program generator for each. (See the library documentation.)
- fixed segfault related to coarbitrary instance of functions.
- worked around the change in Data.Typeable.tyConString. (User-defined types are usable again.)
The full changelog (generated by darcs changes) since May 2006 is here.
Download
The newest package is registered to the Hackage DB, so if you have the cabal command,
cabal install MagicHaskeller
should work, though you may want to specify --flags=GHCAPI for the full installation.
However, if it fails, try unpacking and then installing the package, namely,
cabal unpack MagicHaskeller-0.8.6.3
cd MagicHaskeller-0.8.6.3
cabal install
Source distribution
Since Version 0.8.5 I have decided to make its source available (in the BSD3 style license).
source tarball for Version 0.9.1
Recently I noticed that my office will never be finished cleaning up. Neither will my source. So the source is released almost as is, with some unnecessary comments, maybe in Japanese.
Binary distributions
NB: because I think this tool is still experimental, the older library distributions that are released in the binary form will be installed onto the current directory by default. Therefore, while you are out of the directory your copy of the library will not work.
Linux i686
Linux EM64T (known to work on RHEL4)
I do not know who wants this release, but I have such a box at hand.
Windows (known to work on XP)
Also download the sample component library for the stand alone program, or that for the library package.
How to Play
Source tarball
The standard method for installation should work, i.e.,
tar xvfz MagicHaskeller-0.8.6.1.tar.gz
cd MagicHaskeller-0.8.6.1
runhaskell Setup.hs configure --user
runhaskell Setup.hs haddock
runhaskell Setup.hs build
runhaskell Setup.hs install
(Instead of typing the four lines starting with runhaskell, you can just say
cabal install
if you have the cabal command.)
And now you have package MagicHaskeller-0.8.6.1 available. Leave the directory and type
ghci -fth -package MagicHaskeller
to launch ghci and try the following (or whatever you think of):
:m +MagicHaskeller.LibTH
initialize
printAll $ \f -> f "abc" == "aabbcc"
exploit $ \f -> f "abc" == "aabbcc" -- This uses a richer library, and thus the resulting programs are more readable.
-- The following will find the recursive form definition of the fibonacci function from its closed-form solution.
init075 -- The problem is tough, so you should be prepared to resort whatever measure you can take!
printAll $ \f -> all (\n -> (f :: Int->Int) n == let phi = (1 + sqrt 5)/2 in round ((phi^n - (1-phi)^n) / sqrt 5) ) [0..9]
-- You may also try analytically-generate-and-test synthesis, if you are using GHC 6.10.* or 6.12.*.
:m MagicHaskeller.RunAnalytical
quickStart [d| f [] = []; f [a] = [a,a] |] noBKQ $ \f -> f "abc" == "aabbcc"
Also, you may want to see the haddock-generated documentation (but the export list is subject to change).
With the library distributions user-defined types can be used. TryTreeExample.hs as an example. Also, this paper may be interesting, although it describes a little older version.
(Actually, if the type of everything, etc. uses type constructors which never appear in the primitive component set, that can cause an error. This is a fixable bug, but please wait until I fix it.)
Library package
Make sure you have either GHC 6.8.2 or GHC 6.6.1 installed. Try the following:
mkdir foo
cd foo
wget http://nautilus.cs.miyazaki-u.ac.jp/~skata/magichaskeller(version)-ghc(ghc version).tgz # pick up a filename from above links
wget http://nautilus.cs.miyazaki-u.ac.jp/~skata/LibTH.hs
tar xvzf magichaskeller(blah).tgz
./register.sh
ghc --make -O -fth -package MagicHaskeller -fglasgow-exts LibTH.hs
ghci -fth -package MagicHaskeller -fglasgow-exts LibTH.hs
(Or alternatively, download, edit, and execute either Quickstart.sh (for Linux) or Quickstart.bat (for Windows) for the same effect. You may still want to execute the ghc and ghci lines by hand on Windows.)
Then, in GHCi try the following (or whatever you think of):
initialize
printAny (\f -> f "abc" == "aabbcc")
-- The following will find the recursive form definition of the fibonacci function from its closed-form solution.
init075 -- The problem is tough, so you should be prepared to resort whatever measure you can take!
printAny (\f -> all (\n -> (f :: Int->Int) n == let phi = (1 + sqrt 5)/2 in round ((phi^n - (1-phi)^n) / sqrt 5) ) [0..9])
Also, you may want to see the haddock-generated documentation (but the export list is subject to change).
With the library distributions user-defined types can be used. TryTreeExample.hs as an example. Also, this paper may be interesting, although it describes a little older version.
(Actually, if the type of everything, etc. uses type constructors which never appear in the primitive component set, that can cause an error. This is a fixable bug, but please wait until I fix it.)
After having fun for a while, you can unregister it by
./unregister.sh
before leaving the directory.
Stand alone program
Download the executable and the sample component library. Invoke the executable and type in a predicate (a function returning Bool), e.g.
\f -> f "abc" == "abccba"
The system infers its type and searches for an expression that satisfies the predicate. In the above case it should print
(\ a -> list_para a (\ b -> b) (\ b c d e -> cons b (d (cons b e))) nil)
This is a valid Haskell expression, and so you can use it within a Haskell interpreter after loading the component library. For example, if the Hugs interpreter is installed in your computer, by typing
hugs LibDeb.hs
(\ a -> list_para a (\ b -> b) (\ b c d e -> cons b (d (cons b e))) nil) "abcd"
in your shell, you get
"abcddcba"
(Actually there is no need to run an external interpreter since 0.7.1.)
In the above case the request type is inferred, and thus is a monomorphic type. If you like to limit the search to a more polymorphic type, you can provide the type, like
\f -> f "abc" == "abccba" ;; [a] -> [a]
(Note the semicolons. You know the actual type of the predicate we want is ([a]->[a])->Bool)
Tips
- Start with a small number of examples, and if you get a wrong answer add other examples. For example, if you type
\f -> f 3 "abcde" == Just 'c'
the answer will be
(\ a b -> hd (tl (tl b)))
(note that in the sample component library hd returns a Maybe value because I prefer total functions) and thus evaluation of
(\ a b -> hd (tl (tl b))) 4 "abcde"
is also
Just 'c'
which is incorrect if you intended to get `the 4th element of "abcde"'.
Based on this you can add an example, like
\f -> f 3 "abcde" == Just 'c' && (f 4 "abcde" == Just 'd')
(the parentheses are required because the precedence is currently not dealt with correctly, as written below.)
You may also need to add the Nothing case.
- Comment out unnecessary combinators from the component library if you want quicker results. (In fact, I usually use the library only with Nat-related and List-related components.) Currently expressions like hd nil or tl (cons blah), i.e. hd/tl nil/(cons _ _), are not suppressed, causing unnecessary bloat in the search space.
- Try :set -q for further speed up.
-
In order to narrow the search space, functions like undefined or error, which return a single type variable that does not appear in the arguments, should not appear in the component library (unless they are bound in some contexts, though type classes are not yet implemented).
Also, an argument type that is a single type variable, like a's in S :: (a->b->c) -> (a->b) -> a -> c, K :: b -> a -> b, I* :: (a->b)->a->b, etc. can be harmful. In general, primitive combinators should not (and need not, I believe) be included.
- Because the system generates all the expressions including those with non-linear recursions, you should note that there exist some expressions which take extraordinarily long time. Imagine a function that takes an integer n and increments 0 for 2^(2^n) times. Another example that has smaller program size is blah in
blah :: Int -> Int -> Int
blah 0 i = i
blah k i = foo (blah (k-1)) i
foo h 0 = 2
foo h s = h (f h (s-1))
Giving small values in your examples is a good policy, especially when not using the --timeout option (in the stand alone program) or when using unsetTimeout (in the library).
Commands in the stand alone program
- (a unary predicate expression) [;; (type)] invokes a search, as explained in the previous section.
- (other kinds of expression) just evaluates the expression using the internal (poor) interpreter. For example, try typing (\x -> 3+x) 5
- :set (options) sets options. Available options can be viewed by using a wrong option such as :set -?. Those options can be given also in the command line.
- :load (filename) loads the (filename) as the component library. When you start the system :load LibDeb is silently executed.
- :reload reloads the current component library. Useful when you edited the file, and when you want to clear the memo table to see the exact computation time of the next synthesis you try.
- :monitor [(expression)] sets what value to monitor. By typing :monitor \f -> f you set to see the expressions tried, but my recommendation is to use :set -v -+-. You can turn this off by just typing :monitor without an expression. (This is the default.)
This command can be used to do a very lightweight random testing to estimate how many of the tried expressions are actually equivalent (and redundant). For example, if you want to compare how many expressions are tried and how many different expressions are tried (where "different" means difference as a value or a function, returning a different value for some input) when synthesizing take function, you can make a file infile with the contents
:load LibDeb
:set -q
:monitor \f -> (f 2 "23443", f 0 "234jk", f 2 "k", f 0 "")
\f -> f 3 "dljka" == "dlj"
:quit
and do the following in your shell:
./tvn < infile > outfile
wc outfile
sort outfile | uniq | wc
- :quit
Limitations
- Synthesis of "Hello World!" program requires the "Hello World!" string in the component library.
The system is not supposed to do the stupid search for the exact combination of letters.
-
It only finds expressions typeable within Hindley-Milner.
-
Functions in containers like [a->b], (a,b->c), etc. were prohibited just for efficiency reasons in old distributions (though such prohibition turned out to be no use). Thus e.g. catamorphisms had to be written in the curried form.
-
Component library functions which return a type variable are assumed to be consumers, and if its first argument is not a function it is assumed to be the strict argument, where constant values cannot appear. This is also for efficiency reasons.
Limitations in the interpreter of the stand alone program
There are a lot of limitations in the interpreter, some of which can be guessed from the sample component library.
- There are no lets, no does, no list comprehensions. Also, case expressions are assumed to be for lists, where the first alternative must be for [] and the second be for (?:?)
- The precedence of operators is not correctly inferred --- without parentheses all the operators are left associative.
- The bindings in the component library must be foo = \a b -> form, not foo a b = form.
- and lots of others.... Please use the library package instead of the stand alone program if you do not like them.