Darcsden improvements and Darcs sprint

This post is intended to be a short summary of my SoC project, as well as my recent trip to Darcs sprint.

Introduction

I am finishing up this post on the train back from the Autumn 2015 Darcs sprint. Today (Sept 20, Sun) was a very fun day full of darcs chatting and coding. By the end of the day we’ve heard a number of presentations

  • Ganesh described his work on "stash" command for darcs (naming subject to change!). It involves some refactoring of the rebase code. I hope we would hear more from him on that, because the internal workings are actually quite interesting — I believe it’s the first time singleton types and DataKinds are used in the darcs codebase;
  • Florent Becker gave a presentation about Pijul and the theory behind it — A Categorical Theory of Patches by Samuel Mimram and Cinzia Di Giusto, see arXiv:1311.3903;
  • Vinh Dang talked about his improvements on the darcs wiki (it’s about time to organize the website), his goal was to make it more accessible to the newcomers;
  • Yours truly gave a small presentation, outline of which you will find below:

Looking back

I have spent this summer hacking on DarcsDen as part of the Google Summer of Code program.

My basic goal was to create a "local" version of darcsden. It was not a trivial task to install darcsden (and probably installation is still not very easy!). It uses a third-party software like Redis and CouchDB. During my coding process I modifed darcsden such that it now can be a good choice for local (or lightweight single user) darcs UI. The local darcsden version can be used without any databases, tracking the repositories in the local file system. This way darcsden can be used by a developer on her local computer, like darcsum, (for working with/comparing repositories) as well as a replacement for darcsweb/cgit — a single user web front for darcs repositories.

Besides that a user of a local version can use darcsden’s interactive UI for recording new patches, as well as a command-line tool den for a quick way of browsing the repositories.

Installing darcsden-local is currently not as easy as I want to it be, but I hope that soon you will be able to install it just by running cabal install darcsden or brew install darcsden. As for now, one could do the following:

  1. darcs get --lazy http://hub.darcs.net/co-dan/darcsden-local
  2. cabal install . or stack install

This should install the darcsden binary and all the related css/js files. You can start darcsden by running darcsden --local. If you open your web browser you should see a list of repositories in the current directory.

However, you might have repositories scattered all over the place, and scanning your whole system for darcs repositories is just inefficient. For this purposes darcs keeps a list of repositories in a file inside your ~/.darcs directory. You can manage that list either by hand, or using the command-line den tool:

  • den $PATH — add $PATH to the list of repositories in ~/.darcs/darcsden_repos (if it’s not already present there), start darcsden server in the background and launch the browser pointing to $PATH;
  • den — the same as den .;
  • den --add $PATH — add $PATH to the list of repositories in ~/.darcs/darcsden_repos;
  • den --remove $PATH — remove $PATH from the list of repositories in ~/.darcs/darcsden_repos.

In order to further customize darcsden, one can tweak the configuration file located at ~/.darcs/darcsden.conf. Apart from the usual darcsden settings one may pay attention to the following variables:

  • homeDir (default .), points to the "root" directory with repositories. If the list file ~/.darcs/darcsden_repos is not present darcsden will recursively search repositories in that directory
  • unLocal, pwLocal: the username and the password of the "local" user

The user/password credentials are required for editing the repositories and recording new patches. However, the den binary should automatically pick them up from the config file and log you in.

Once you are logged in, and you have unrecorded changes in the repository, you can use darcsden UI to record a new patch.

DarcsDen record

DarcsDen record

Below you can see an example of recording and merging patches from a branch.

DarcsDen merge

DarcsDen merge

Darsden allows you to create forks/branches of your repositories, and it keeps track of the patch dependencies in your branches.

More "internal" changes:

  • Instead of having to specify some parts of the configuration in DarcsDen.Settings, darcsden now uses runtime flags: –hub for using hub-specific modifications, –local for using the local backend and no flag for default behaviour
  • The flag actually choose what are called instances — something that a bit less fine-grained than settings. Instances allow you to pick backend, overwrite settings, modify the looks of the front page.
  • HTTP-testing using wreq. The previous test suite used selenium and it got bit-rotten. The wreq-based is easier to run and perhaps slightly easier to maintain.
  • HTTP auth, which is used as part of the local instance; the den tool utilizes it to log the user in automatically.
  • Support for repositories inside directories and nested repositories.
  • All the backend code that is used for handling repositories and meta-data on the file system.
  • Functionality for downloading zipped dist archives of darcs repositories.
  • Assorted mini-fixes

What now?

During the sprint I hacked together some code for viewing suspended patches along the regular ones. The next step would be to have a similar interface for managing the suspended patches.

We have also discussed the possibility of adding rewrite rules implementing short-cut fusion for the directed types in Darcs. In order to see if it’s really worth it we would have to bring back to life the benchmarking suite (or at least check on it!).

It was a really exciting weekend for me and I was delighted to meet some of my IRC friends. As it turns out, it is a small world and despite being from different parts of it we have a bunch of common IRL friends, professors. As the French would (probably not) say, très bien. The next darcs sprint will probably be in January, and probably in Europe, again.

Darcs internals, part 1: typesafe directed datastructures

I am editing this post from IRILL, where the darcs sprint is taking place

One of the things that I particularly like about the Darcs codebase is that you can see that the developers were not shy about using intermediate-slash-advanced Haskell/GHC features to help achieving type safety. You can see GADTs, associated types, phantom types, existential types actively used.

In this post I would like to discuss the representation of patches and the use of type witnesses in darcs.

A word about patches and contexts

This post is intended for people interested in patch theory and its implementation in darcs. A passing familiarity with patches, contexts, inverses, etc is useful, but not required, as we restate some basic definitions in this section.

A primitive patch is a basic unit in patch theory. It constitutes an atomic change in the repository. The definition of a primitive patch may vary, but usually the following changes are considered primitive patches:

  • Removing a file from the repository
  • Adding file to the repository
  • Changing a line n in a file in the repository
  • Identity patch, i.e. an empty/non-existing change that does not modify the state of the repository at all

Every primitive patch has a pre-context and a post-context. Roughly, you can think of a pre-context as the full state of the repository before the change was made, and of the post-context as the full state of the repository after the change was applied. We write (x)-A->(y) for a patch A with a pre-context x and a post-context y.

If a primitive patch A has a pre-context a, a post-context o, and a primitive patch B has a pre-context o, a post-context b, then we can combine two patches to obtain a sequential patch AB with the pre-context a and the post-context b.

Every primitive patch (x)-A->(y) has an inverse (y)-A^{-1}->(x), such that (x)-A->(y)-A^{-1}->(x) is equivalent to the identity patch (x)-1->(x).

In the next sections we will see how those things are implemented in darcs.

Primitive patches and witnesses

A primitive patch, which constitutes a single fine grained change, can be represented as a (G)ADT:

data Prim where
    Move :: FileName -> FileName -> Prim
    RmFile :: FileName -> Prim
    AddFile :: FileName -> Prim
    Hunk :: FileName -> Int -> ByteString -> ByteString -> Prim
    RmDir :: FileName -> Prim
    …

We can represent complex patches as sequences of primitive patches:

data Patch where
    NilP :: Patch
    PrimP :: Prim -> Patch
    SeqP :: Patch -> Patch -> Patch

This seems reasonable enough. But if we implement our patch theory this way we seem to be missing something — patches have (pre- and post-) contexts. Having contexs allows us to enforce patch composition on the level of type system. Consider the following definition, which uses phantom types as type witnesses for contexts.

data Prim x y where
    Move :: FileName -> FileName -> Prim x y
    RmFile :: FileName -> Prim x y
    AddFile :: FileName -> Prim x y
    Hunk :: FileName -> Int -> ByteString -> ByteString -> Prim x y
    RmDir :: FileName -> Prim x y
    …

data Patch x y where
    NilP :: Patch x x
    PrimP :: Prim x y -> Patch x y
    SeqP :: Patch x y -> Patch y z -> Patch x z

We call the types with witnesses representing pre- and post-contexts directed types. Intuitively, the directed type D x y has a “direction” from x to y, written as (x)->(y). The Prim datatype looks pretty much like the type actually used in Darcs. The Patch datatype, however, is completely artificial. We will see in the next sections how Darcs really models complex patches.

Directed programming

FL & RL

Two particularly useful directed types used in darcs are directed lists: forwards lists of the type FL a x y and reverse lists RL a x y. Forward lists “go” from the head to the tail; reverse lists “go” from the tail to the head. The lists are polymorphic over a just like regular lists.

data FL a x y where
    NilFL :: FL a x x    {- The empty list stays at (x) -}
    (:>:) :: a x y -> FL a y z -> FL a x z
infixr 5 :>:

For myself, I visualise forward lists like this:

(x) —a—> (y) —b—> (z) ——Nil——> (z)
a :: Patch x y
b :: Patch y z
(a :>: b :>: NilFL) :: FL Patch x z

The reverse lists are “going” from tail to head1

data RL a x y where
    NilRL :: RL a x x
    (:<:) :: RL a x y -> a y z -> RL a x z
infixl 5 :<:

(Mnemonically, the the head is always “greater” than the tail)

The reason I used the word “go” inside quotation marks so far is the following. Reverse lists and forward lists represent the same concept: a sequence of patches (or a sequence of directed things for that matter). They only differ in the associativity of the elements. Forward lists associate to the right, but reverse lists associate to the left.

p = Move "foo" "bar"

-- [[(x) --Nil--> (x) -p-> (y)] -id-> (y)] -p^{-1}-> (x)
example :: RL Patch x x
example = NilRL :<: p :<: Identity :<: inverse p

-- (x) -p-> [(y) -p^{-1}-> (x) --Nil--> (x)]
example2 :: FL Patch x x
example2 = p :>: inverse p :>: NilFL

The right-associated/reverse lists provide easy access to the last element of the sequence; the left-associated/forward lists provide easy access to the first element of the sequence. Therefore, if we view a repository as a directed sequence of patches, right-associated lists are useful for operations that work on the “latest” patches in the repository (such as record/unrecord), and left-associated lists are useful for commands that scan the repository from the beginning (like clone).

We can reassociate the lists easily, and verify that the two representations are isomoprhic:2

reverseRL :: RL a wX wY -> FL a wX wY
reverseRL = r NilFL
-- the type signature of r basically gives us an invariant
-- wZ is slowly "decreasing" reaching the point where
-- wZ = wX; at that point the first argument is of type FL a wX wY
  where r :: FL a wZ wY -> RL a wX wZ -> FL a wX wY
        r a NilRL = a
        r a (xs :<: x) = r (x :>: a) xs

For example,

-- [[(x) --Nil--> (x) -p-> (y1)] -q-> (y2)] -r-> (z)

Turns into

-- (x) -p-> [(y1) -q-> (y2) -r-> [(z) --Nil--> (z)]]

Exercise: write a function reverseFL :: FL a wX wY -> RL a wX wY

We can write a lot of directed analogues of familiar list functions. For example, here is a directed append:

infixl 5 +<+
(+<+) :: RL a wX wY -> RL a wY wZ -> RL a wX wZ
xs +<+ NilRL = xs
xs +<+ (ys :<: y) = (xs +<+ ys) :<: y

Exercise: write directed append for forward lists: (+>+) :: FL a wX wY -> FL a wY wZ -> FL a wX wZ

Type witnesses

So we wrote a bunch of standard list functions for directed lists; what about some of the other functions? Can we, for example, implement filter for directed lists. Can it look like filterFL :: (a wX wY -> Bool) -> FL a wX wY? Well, we can try writing

filterFL :: (a wX wY -> Bool) -> FL a wX wY
filterFL p NilFL = NilFL
filterFL p (x :>: xs) | p x = filterFL p xs
                      | otherwise = x :>: filterFL p xs

However, under closer scrutiny we realize that it does not typecheck! In the second clause of filterFL we have the following information:

x :: a x y, xs :: FL a y z
filterFL xs :: FL a y z

Thus, in the first case (in which p x holds) we try to return something of the type FL a wY wZ, when FL a wX wZ was expected. It is clear that generally we can do this only if x :: a wX wX, i.e. wY = wX. But a simple predicate of the type p :: a wX wZ -> Bool won’t tell us anything about that. We need an additional type witness in our system telling us that if p x holds, then x :: a wX wX. For that purpose we introdue the EqCheck datatype.

data EqCheck wX wY where
    IsEq :: EqCheck wX wX
    NotEq :: EqCheck wX wY

then the type of a predicate would be

type Pred a = forall wX wY. a wX wY -> EqCheck wX wY

If (p x) = IsEq, then the typechecker will know that x :: a wX wX. We can then finally write

filterFL :: Pred a -> FL a wX wY -> FL a wX wY
filterFL p NilFL = NilFL
filterFL p (x :>: xs) | IsEq <- p x = filterFL p xs
                      | otherwise   = x :>: filterFL p xs

EqCheck is used this way in the darcs source code to e.g., filter our internal patches. Sometimes darcs stores information — like suspended patches — in the so called internal patches. Every patch type implements the internal patch checker (code slightly adapted):

-- |Provides a hook for flagging whether a patch is "internal" to the repo
-- and therefore shouldn't be referred to externally, e.g. by inclusion in tags.
-- Note that despite the name, every patch type has to implement it, but for
-- normal (non-internal) types the default implementation is fine.
-- Currently only used for rebase internal patches.
class MaybeInternal p where
    -- | @maybe (const NotEq) (fmap isInternal patchInternalChecker) p@
    -- returns 'IsEq' if @p@ is internal, and 'NotEq' otherwise.
    -- The two-level structure is purely for efficiency: 'Nothing' and 'Just (InternalChecker (const NotEq))' are
    -- semantically identical, but 'Nothing' allows clients to avoid traversing an entire list.
    -- The patch type is passed as an 'FL' because that's how the internals of named patches are stored.
    patchInternalChecker :: Maybe (forall wX wY . FL p wX wY -> EqCheck wX wY)
    patchInternalChecker = Nothing

When the user runs darcs tag in the repository, darcs creates a dummy patch that explicitly depends on all the previous patches — apart from the internal ones of course. Thus, the tag command uses the following function (slightly adapted):

filterNonInternal :: MaybeInternal p => PatchSet p wX wY -> PatchSet p wX wY
filterNonInternal =
  case patchInternalChecker of
     Nothing -> id
     Just f -> \l -> PatchSet (filterRL (f . patchcontents . hopefully) (unPatchSet l))

where a PatchSet is the list of PatchInfoAnd patches — patches together with the meta-information.

It is worth noting that EqCheck x y is isomorphic Maybe (x :~: y), but the propositional equality datatype has only been added to base since 4.7.0.0. In the future versions darcs will probably switch to using Data.Type.Equality.

Conclusion

We’ve briefly touched upon patch representation in darcs and talked about directed types and directed programming.

A good if a bit outdated reference is Jason Dagit’s master thesis (specifically the bits from chapter 4). The wiki is currently lacking in material, but I hope to improve the situation eventually.

Next time we will probably discuss either directed pairs and their use in darcs, or sealed datatypes, or both.


  1. Essentially, a reverse list is a directed version of a snoc-list.

  2. Before darcs 2.10.1 the right-associated lists had a slightly different datatype, and for the old lists those functions would “reverse” the list. Hence the names of the functions.

Darcs rebase by example

Darcs is a patch-centric version control system. In Darcs, there is no “correct” linear history of a repository – rather, there is a poset of patches. That means that most of the time you are pushing and pulling changes you can cherry-pick patches without a problem. However, in some cases you cannot perform a pull (or some other operation on the repository) smoothly. Sometimes it is necessary to rewrite the “history” – i.e. modify a patch that is a dependency of one or more other patches. For those cases darcs rebase comes in handy. To put it in the words of the implementor “Rebase is a workaround for cases where commutation doesn’t do enough”.

A repository can change it’s state from rebase-in-progress back to normal if there are no suspended patches left. However, be aware that you cannot unsuspend a patch1 if you have unrecorded changes in the repository. In light of this, I suggest recording a temporary patch with current changes

darcs record -am DRAFT

You can suspend that patch at the beginning of your rebase process and apply it at the end.

General overview of rebase

darcs rebase is an operation (or, rather, a family of operations) that allows one to make changes “deep down” in the repository history. One of the crucial things that allows for rebase to work is the fact that since darcs 2.10 patches can be suspended. When one performs any of the darcs rebase commands, the repository moves to a special rebase-in-progress state. In this state repository contains a pristine, a set of patches, a working copy, and — in addition to all the usual stuff — a set of suspended patches. Suspended patches are not active in the repository — that is, they are not applied.

Let’s go over the rebase subcommands

rebase log/rebase changes

This is simple: list the suspended patches

rebase suspend

Moves selected patches into the suspended state. Once the patch is suspended it is no longer active in the repository.

Note: once you suspend a patch, it changes its identity. That means that even if you suspend a patch and unsuspend it immediately, you will get a different repository that you have started with. Let this be a good reason (one of a couple!) for doing rebase on a separate branch.

> cat file
test
123
> darcs rebase suspend
	patch 64523bc4622fad02a4bdb9261887628b7997ebdd
Author: Daniil Frumin 
Date:   Thu Jul 23 18:49:30 MSK 2015
  * 123
Shall I suspend this patch? (1/5)  [ynW…], or ? for more options: y
patch cc54d7cf4b9e3d13a24ce0b1b77b76581d98d75d
Author: Daniil Frumin 
Date:   Thu Jul 23 18:43:53 MSK 2015
  * Test
Shall I suspend this patch? (2/5)  [ynW…], or ? for more options: d
Rebase in progress: 1 suspended patches
> darcs rebase log
patch 64523bc4622fad02a4bdb9261887628b7997ebdd
Author: Daniil Frumin 
Date:   Thu Jul 23 18:49:30 MSK 2015
  * 123
Shall I view this patch? (1/?) [yN…], or ? for more options: y
[123
Daniil Frumin **20150723154930
 Ignore-this: 43e09e6503ac74688e74441dc29bce25
] hunk ./file 2
+123
Rebase in progress: 1 suspended patches
> cat file
test

rebase unsuspend

Does the opposite of suspend: applies a suspended patch to the repository and changes its state to normal.

> darcs rebase unsuspend                                                                                       
patch 64523bc4622fad02a4bdb9261887628b7997ebdd
Author: Daniil Frumin 
Date:   Thu Jul 23 18:49:30 MSK 2015
  * 123
Shall I unsuspend this patch? (1/1)  [ynW…], or ? for more options: y
Do you want to unsuspend these patches? [Yglqk…], or ? for more options: y
Rebase finished!

rebase apply

Rebase apply takes a patch bundle and tries to apply all the patches in the bundle to the current repository. If a patch from the bundle conflicts with a local patch, then the local patch gets suspended. You will thus have a chance to resolve the conflict by amending your conflicting patches, at a price of.. well, changing the identity of your local patches.

rebase pull

Sort of like rebase apply, but instead of a patch bundle it obtains the patches from a remote repository.

Specifically, rebase pull applies all the remote patches, one-by-one, suspending any local patches that conflict. We will see more of rebase pull in the second example.

Example 1: suspending local changes

Imagine the following situation: at point A you add a configuration file to your repository, then you record a patch B that updates the settings in the configuration file. After that you make some more records before you realize that you’ve included by accident your private password in patch A! You want to get rid of it in your entire history, but you can’t just unrecord A, because B depends on A, and possibly some other patches depend on B.

The contents of the configuration file after patch A:

port = 6667
host = irc.freenode.net
password = awesomevcs 

Patch B, diff:

@@ -1,3 +1,4 @@
-port = 6667
+port = 6697
+usessl = True
host = irc.freenode.net
password = awesomevcs

You cannot just amend patch A, because the patch B depends on A:

> darcs amend                                                                                                         
patch 1925d640f1f3180cb5b9e64260c1b5f374fce4ca
Author: Daniil Frumin 
Date:   Tue Jul 21 13:23:07 MSK 2015
  * B

Shall I amend this patch? [yNjk…], or ? for more options: n

Skipping depended-upon patch:
patch 22d7c8da83141f8b1f80bdd3eff02064d4f45c6b
Author: Daniil Frumin 
Date:   Tue Jul 21 13:22:24 MSK 2015
  * A

Cancelling amend since no patch was selected.

What we will have to do is temporarily suspend patch B, amend patch A, and then unsuspend B.

> darcs rebase suspend
patch 1925d640f1f3180cb5b9e64260c1b5f374fce4ca
Author: Daniil Frumin 
Date:   Tue Jul 21 13:23:07 MSK 2015
  * B

Shall I suspend this patch? (1/2)  [ynW…], or ? for more options: y
patch 22d7c8da83141f8b1f80bdd3eff02064d4f45c6b
Author: Daniil Frumin 
Date:   Tue Jul 21 13:22:24 MSK 2015
  * A

Shall I suspend this patch? (2/2)  [ynW…], or ? for more options: d
Rebase in progress: 1 suspended patches

At this point, the state of our repository is the following: there is one (active) patch A, and one suspended patch B.

> darcs rebase changes -a
patch 4c5d45230dc146932b21964aea938e2a978523eb
Author: Daniil Frumin 
Date:   Tue Jul 21 13:28:58 MSK 2015
  * B

Rebase in progress: 1 suspended patches
> darcs changes -a
patch 21f56dfb425e4c49787bae5db4f8869e96787fb2
Author: Daniil Frumin 
Date:   Tue Jul 21 13:28:49 MSK 2015
  * A

Rebase in progress: 1 suspended patches
> cat config
port = 6667
host = irc.freenode.net
password = awesomevcs
> $EDITOR config # remove the password bit
> darcs amend
patch 22d7c8da83141f8b1f80bdd3eff02064d4f45c6b
Author: Daniil Frumin 
Date:   Tue Jul 21 13:22:24 MSK 2015
  * A

Shall I amend this patch? [yNjk…], or ? for more options: y
hunk ./config 3
-password = awesomevcs
Shall I record this change? (1/1)  [ynW…], or ? for more options: y
Do you want to record these changes? [Yglqk…], or ? for more options: y
Finished amending patch:
patch 21f56dfb425e4c49787bae5db4f8869e96787fb2
Author: Daniil Frumin 
Date:   Tue Jul 21 13:28:49 MSK 2015
  * A

Rebase in progress: 1 suspended patches

Now that we’ve removed the password from the history, we can safely unsuspend patch B (in this particular situation we actually know that applying B to the current state of the repository won’t be a problem, because B does not conflict with our modified A)

> darcs rebase unsuspend
patch 1925d640f1f3180cb5b9e64260c1b5f374fce4ca
Author: Daniil Frumin 
Date:   Tue Jul 21 13:23:07 MSK 2015
  * B

Shall I unsuspend this patch? (1/1)  [ynW…], or ? for more  options: y
Do you want to unsuspend these patches? [Yglqk…], or ? for more options: y
Rebase finished!

And that’s done!

> cat config
port = 6697
usessl = True
host = irc.freenode.net

You may use this rebase strategy for removing sensitive information from the repository, for removing that 1GB binary .iso that you added to your repository by accident, or for combining two patches into one deep down in the patchset.

Example 2: developing against a changing upstream – rebase pull

Imagine you have a fork R’ of a repository R that you are working on. You are implementing a feature that involves a couple of commits. During your work you record a commit L1 that refractors some common datum from modules A.hs and B.hs. You proceed with your work recording a patch L2. At this point you realise that after you forked R, the upstream recorded two more patches U1 and U2, messing with the common datum in A.hs. If you just pull U1 into your fork R’, you will have a conflict, that you will have to resolve by recording another patch on top.

      S
     / \
    /   \
   L1   U1

Note: if you run darcs rebase pull in R’, then the only patches that will be suspended are the ones which are already in R’. Because suspended patches gain new identity, make sure that you do not have other people’s conflicting patches present in R’.

The way to solve this would be to first do darcs rebase pull, which would suspend the conflicting patches, and then start unsuspending the patches one by one, making sure that you fix any conflicts that may arise after each unsuspend.

Consider a concrete example with two repositories rep1 and rep1_0.

rep1_0 > darcs changes 
patch ebaccd5c36667b7e3ee6a49d25ef262f0c7edf2b
Author: Daniil Frumin 
Date:   Mon Jul 27 20:56:25 MSK 2015
  * commit2

patch a7e0d92a53b0523d0224ef8ffae4362adf854485
Author: Daniil Frumin 
Date:   Mon Jul 27 20:56:25 MSK 2015
  * commit1
	rep1_0 > darcs diff —from-patch=commit2
patch ebaccd5c36667b7e3ee6a49d25ef262f0c7edf2b
Author: Daniil Frumin 
Date:   Mon Jul 27 20:56:25 MSK 2015
  * commit2
diff -rN -u old-rep1_0/dir1/file2 new-rep1_0/dir1/file2
— old-rep1_0/dir1/file2	1970-01-01 03:00:00.000000000 +0300
    +++ new-rep1_0/dir1/file2	2015-07-28 12:25:54.000000000 +0300
@@ -0,0 +1 @@
+double whatsup
	rep1_0 > cd ../rep1
rep1 > darcs changes
patch e3df0e23a3915910a81eb8181d7b3669e8f270a9
Author: Daniil Frumin 
Date:   Tue Jul 28 12:27:55 MSK 2015
  * commit2’

patch a7e0d92a53b0523d0224ef8ffae4362adf854485
Author: Daniil Frumin 
Date:   Mon Jul 27 20:56:25 MSK 2015
  * commit1
rep1 > darcs diff —from-patch=“commit2’”
patch e3df0e23a3915910a81eb8181d7b3669e8f270a9
Author: Daniil Frumin 
Date:   Tue Jul 28 12:27:55 MSK 2015
  * commit2’
diff -rN -u old-rep1/dir1/file2 new-rep1/dir1/file2
— old-rep1/dir1/file2	1970-01-01 03:00:00.000000000 +0300
+++ new-rep1/dir1/file2	2015-07-28 12:28:39.000000000 +0300
@@ -0,0 +1 @@
+touch file2
\ No newline at end of file
diff -rN -u old-rep1/file1 new-rep1/file1 — old-rep1/file1	2015-07-28 12:28:39.000000000 +0300 +++ new-rep1/file1	2015-07-28 12:28:39.000000000 +0300
@@ -1 +1 @@
-whatsup
\ No newline at end of file
+double whatsup
\ No newline at end of file

The patch commit2 from rep1_0 conflicts with commit2’ from rep1.

rep1 > darcs rebase pull ../rep1_0
patch ebaccd5c36667b7e3ee6a49d25ef262f0c7edf2b
Author: Daniil Frumin 
Date:   Mon Jul 27 20:56:25 MSK 2015
  * commit2
Shall I pull this patch? (1/1)  [ynW…], or ? for more options: y
Do you want to pull these patches? [Yglqk…], or ? for more options: y
The following local patches are in conflict:
patch e3df0e23a3915910a81eb8181d7b3669e8f270a9
Author: Daniil Frumin 
Date:   Tue Jul 28 12:27:55 MSK 2015
  * commit2’
Shall I suspend this patch? (1/1)  [ynW…], or ? for more options: y
Do you want to suspend these patches? [Yglqk…], or ? for more options: y
Finished pulling.
Rebase in progress: 1 suspended patches

Now we have one patch — commit2’ — in the suspended state. We want to resolve the conflict by amending commit2’. We will do that by unsuspending it and manually editing out the conflicting lines. This will also make it depend on commit2.

rep1 > darcs rebase unsuspend
patch e3df0e23a3915910a81eb8181d7b3669e8f270a9
Author: Daniil Frumin 
Date:   Tue Jul 28 12:27:55 MSK 2015
  * commit2’
Shall I unsuspend this patch? (1/1)  [ynW…], or ? for more options: y
Do you want to unsuspend these patches? [Yglqk…], or ? for more options: d
We have conflicts in the following files:
./dir1/file2

Rebase finished!
rep1 > cat dir1/file2
v v v v v v v

=============
double whatsup
*************
touch file2
^ ^ ^ ^ ^ ^ ^
rep1 > $EDITOR dir1/file2
rep1 > darcs amend -a
patch 40b3b4123c78dba6a6797feb619572072654a9cd
Author: Daniil Frumin 
Date:   Tue Jul 28 12:32:56 MSK 2015
  * commit2’
Shall I amend this patch? [yNjk…], or ? for more options: y
Finished amending patch:
patch c35867259f187c1bc30310f1cacb34c1bb2cce41
Author: Daniil Frumin 
Date:   Tue Jul 28 12:34:30 MSK 2015
  * commit2’
rep1 > darcs mark-conflicts
No conflicts to mark.

Another repository saved from conflicting patches, yay!


  1. See this discussion for details

Darcs 2.10.1 (Mac OSX build)

Darcs 2.10.1 has been released!

Citing the official release notes

> The darcs team is pleased to announce the release of darcs 2.10.1 !
> ..
> 
> # What's new in 2.10.1 (since 2.10.0) #
> 
> - generalized doFastZip for darcsden support
> - support terminfo 0.4, network 2.6, zlib 0.6, quickcheck 2.8 and
> attoparsec 0.13
> - errorDoc now prints a stack trace (if profiling was enabled) (Ben Franksen)
> - beautified error messages for command line and default files (Ben Franksen)
> - fixed the following bugs:
>       - issue2449: test harness/shelly: need to handle
> mis-encoded/binary data (Ganesh Sittampalam)
>       - issue2423: diff only respecting --diff-command when a diff.exe
> is present (Alain91)
>       - issue2447: get contents of deleted file (Ben Franksen)
>       - issue2307: add information about 'darcs help manpage' and
> 'darcs help markdown' (Dan Frumin)
>       - issue2461: darcs log --repo=remoterepo creates and populates
> _darcs (Ben Franksen)
>       - issue2459: cloning remote repo fails to use packs if cache is
> on a different partition (Ben Franksen)
> 
> # Feedback #
> 
> If you have an issue with darcs 2.10.0, you can report it on
> http://bugs.darcs.net/ . You can also report bugs by email to
> bugs at darcs.net, or come to #darcs on irc.freenode.net.

I’ve updated Mac OS to version 2.10.1. You can install it with

brew install http://darcs.covariant.me/darcs.rb

HTTP Basic auth in Snap

Recently, I’ve implemented HTTP Basic auth for darcsden and wrote a simple wreq test for it. In this post I would like to outline the main technical details.

Server side

Transient storage

A lot of darcsden code is (especially the parts that are closer to the users’ web browser — handlers, pages, so on) is written around sessions. Sessions are stored in a special storage — implemented by the DarcsDen.Backend.Transient, but if we abstract away from the details we have a Session datatype. Authorization and authentication information is handled by sessions using functions setUser :: (BTIO bt) => Maybe User -> Session -> Snap Session, notice :: (BTIO bt) => String -> Session -> Snap () (display a message to the user) and others. The BTIO bt part is just a synonym for

type BTIO bt = (BackendTransient bt, ?backendTransient :: bt, MonadIO (BackendTransientM bt))

Which basically says that we are operating with a transient backend that supports all of necessary operations, and we can also do IO in it. Right now there are only two transient backends (read: two ways of storing sessions): Redis and in-process memory.

Running sessions

If we have a piece of Snap code that we want to “sessionify” we use the following interface:

withSession :: (BTIO bt) => (Session -> Snap ()) -> Snap ()

What this does is it basically checks for a cookie — in case it is present it grabs the session information from the storage (in accordance with the cookie); if the cookie is not present it creates a new session and stores it in a cookie.

If we have a page of a type Session -> Snap (), we might want to give user an option to do HTTP authentication on that page. We introduce another function

withBasicAuth :: (BP bp, BTIO bt) 
              => (Session -> Snap ()) 
              -> (Session -> Snap ())
withBasicAuth act s = do
    rq  do
          rawHeader <- maybe throwChallenge return $ getHeader “Authorization” rq
          let (Just (usr,pw)) = getCredentials rawHeader
          c  errorPage “Unknown user”
            Just u -> if checkPassword (fromBS pw) u
                         then doLogin (fromBS usr)
                         else errorPage “Bad password”
      _ -> act s

So, what is going on in here? First of all, we check if the “login” parameter is set to “true”. If it does, we try to get the “Authorization” header, de-encode it, and check whether the credentials are good.

throwChallenge :: Snap a
throwChallenge = do
    modifyResponse $ (setResponseStatus 401 “Unauthorized”) .
                     (setHeader “WWW-Authenticate” “Basic realm=darcsdenrealm”)
    getResponse >>= finishWith

If the response header is present, it is of a form Basic x, where x is a base64 encoded string user:password. We can extract the credentials from the header like this:

import qualified Data.ByteString          as B
import qualified Data.ByteString.Base64   as B
…
getCredentials :: B.ByteString  — ^ Header
               -> Maybe (B.ByteString, B.ByteString) — ^ Possibly (username, password)
getCredentials header =
    if (isInfixOf “Basic “ header)
      then fmap extract (hush (B.decode (B.drop 6 header)))
      else Nothing
  where
    extract cred = case (B.breakByte (c2w ‘:’) cred) of
                     (usr, pw) -> (usr, safeTail pw)

On the client

The tests that I am currently writing for darcsden are all of the same form: I use wreq to do requests to the darcsden server, then, if necessary, I run taggy to extract information from the webpage, and compare it to the “canonical” information.

As it turns out, doing HTTP Basic Auth is very easy with wreq! First of all, we define a function for doing a GET request that will do some exception handling for us:

getSafeOpts :: Options -> String -> IO (Either Status (Response ByteString))
getSafeOpts opts url = fmap Right (getWith opts url) `catch` hndlr
  where
    hndlr (StatusCodeException s _ _) = return (Left s)
    hndlr e = throwIO e

This way, we won’t get a runtime exception when accessing a non-existing page or getting a server error. Doing a GET request with HTTP Basic Auth is now very easy:

getWithAuth :: (String, String) -> String -> IO (Either Status (Response ByteString))
getWithAuth (username,pw) url = getSafeOpts opts url
  where
    opts = defaults & auth ?~ basicAuth (toBS username) (toBS pw)
                    & param “login” .~ [“true”]

In that snippet we use lenses to set up an auth header and an HTTP parameter (?login=true).

Finally, after obtaining a Response ByteString, we can parse it with taggy-lens:

parsed :: Fold (Response L.ByteString) Text.Taggy.Node
parsed = responseBody 
       . to (decodeUtf8With lenientDecode)
       . html

We can then play with it in GHCi

*Main> Right r  r ^.. parsed
[NodeElement (Element {eltName = “DOCTYPE”, eltAttrs = fromList [(“html”,””)], eltChildren = [NodeElement (Element {eltName = “html”, eltAttrs = fromList [], eltChildren = [NodeElement (Element {eltName = “head”, eltAttrs = fromList [], eltChildren = [NodeElement (Element {eltName = “title”, eltAttrs = fromList [], eltChildren = [NodeContent “localhost”]}),NodeElement (Element {eltName = “link”, eltAttrs = fromList [(“href”,”http://localhost:8900/public/images/favicon.ico”),…

If we want to check that we are indeed logged in correctly, we should look for the “log out” button. Taggy does all the heavy lifting for us, we just have to write down a lens (more precisely, a fold) to “extract” a logout button from the page

logoutButton :: HasElement e => Fold e Text
logoutButton = allAttributed (ix “class” . only “logout”)
             . allNamed (only “a”)
             . contents

logoutButton searches for <div class=“logout”> and returns the text in the link inside the div. There might be many such links inside the node, hence we use a fold.

*Main> r ^.. parsed . logoutButton
[“log out”]

In this case, since we only care if such link is present, we can use a preview

*Main> r ^? parsed . logoutButton
Just “log out”

Conclusion

Well, that’s about it for now. I regret taking way too much time writing this update, and I hope to deliver another one soon. Meanwhile, some information regarding the darcsden SoC project can be found on the wiki.

Hoogle inside the sandbox

Introduction

This is my first post from the (hopefuly fruitful!) series of blog posts as part of my Haskell SoC project. I will spend a great chunk of my summer hacking away on DarcsDen; in addition, I will document my hardships and successes here. You can follow my progress on my DarcsHub.

This particular post will be about my working environment.

The problem

Hoogle is an amazing tool that usually needs no introduction. Understandably, the online version at haskell.org indexes only so many packages. This means that if I want to use hoogle to search for functions and values in packages like darcs and darcsden, I will have to set up a local copy.

Cabal sandboxing is a relatively recent feature of the Cabal package manager, but I don’t think it is reasonable in this day to install from the source (let alone develop) a Haskell package without using sandboxing.

The problem seems to be that the mentioned tools do not play well together out of the box, and some amount of magic is required. In this note I sketch the solution, on which I’ve eventually arrived after a couple of tries.

Using hoogle inside a Cabal sandbox

The presumed setup: a user is working on a package X using the cabal sandboxes. The source code is located in the directory X and the path to the cabal sandbox is X/.cabal-sandbox.

Step 1: Install hoogle inside the sandbox. This is simply a matter of running cabal install hoogle inside X. If you want to have a standard database alongside the database for your packages in development, now is the time to do .cabal-sandbox/bin/hoogle data.

Step 2: Generate haddocks for the packages Y,Z you want to use with hoogle. In my case, I wanted to generate haddocks for darcs and darcsden. This is just a matter of running cabal haddock --hoogle in the correct directory.

Step 3: Convert haddocks to .hoo files. Run the following commands in X/:

.cabal-sandbox/bin/hoogle convert /path/to/packageY/dist/doc/html/*/*.txt

You should see something like

Converting /path/to/packageY/dist/doc/html/Y/Y.txt
Converting Y... done

after which the file Y.hoo appears in /path/to/packageY/dist/doc/html/Y/

Step 4: Moving and combining databases. The hoogle database should be stored in .cabal-sandbox/share/*/hoogle-*/databases. Create such directory, if it’s not present already. Then copy the ‘default’ database to that folder:

cp .cabal-sandbox/hoogle/databases/default.hoo .cabal-sandbox/share/*/hoogle-*/databases

Finally, you can combine your Y.hoo with the default database.

.cabal-sandbox/bin/hoogle combine /path/to/packageY/dist/doc/html/*/*.hoo .cabal-sandbox/share/*/hoogle-*/databases/default.hoo
mv default.hoo .cabal-sandbox/share/*/hoogle-*/databases/default.hoo

And you are done! You can test your installation

$ .cabal-sandbox/bin/hoogle rOwner
DarcsDen.State.Repo rOwner :: Simple Lens (Repository bp) String

For additional usability, consider adding .cabal-sandbox/bin to your $PATH.

ANN: Hastache version 0.6.1

Announcing: hastache 0.6.1

Happy holidays, everyone!

I would like to announce a new version of the Hastache library, version 0.6.1. Some interesting and useful changes, as well as improvements and bugfixes are included in the release. See below for an extended changelog.

Hastache is a Haskell implementation of the mustache templating system.

Quick start

cabal update
cabal install hastache

A simple example:

import Text.Hastache 
import Text.Hastache.Context 
import qualified Data.Text.Lazy.IO as TL

main = hastacheStr defaultConfig (encodeStr template) (mkStrContext context)
    >>= TL.putStrLn

template = "Hello, {{name}}!\n\nYou have {{unread}} unread messages." 

context "name" = MuVariable "Haskell"
context "unread" = MuVariable (100 :: Int)

Read Mustache documentation for template syntax; consult README for more details.

Whats’s new in 0.6.1?

Most of the new features in this release deal with generic contexts.

Context merging

composeCtx is a left-leaning composition of contexts. Given contexts c1 and c2, the behaviour of (c1 <> c2) x is following: if c1 x produces ‘MuNothing’, then the result is c2 x. Otherwise the result is c1 x. Even if c1 x is ‘MuNothing’, the monadic effects of c1 are still to take place.

Generic contexts for more datatypes

The mkGenericContext function now supports additional datatypes like Maybe (with Nothing being an empty/null value) and Either.

Context modifiers and renaming

The new mkGenericContext' is a generalized version of mkGenericContext and it takes two addition arguments. The first one, of type (String -> String) is simply a renaming function, similar to fieldLabelModifier of aeson. To see this feature in action, consider the following example:

{-# LANGUAGE DeriveDataTypeable #-}

import Text.Hastache 
import Text.Hastache.Context 

import qualified Data.Text.Lazy as TL 
import qualified Data.Text.Lazy.IO as TL 

import Data.Data (Data, Typeable)
import Data.Decimal
import Data.Generics.Aliases (extQ)

data Test = Test {f :: Int}
     deriving (Data, Typeable)

val1 :: Test
val1 = Test 1

val2 :: Test
val2 = Test 2

r "f" = "foo"
r x   = x

example :: Test -> IO TL.Text
example v = hastacheStr defaultConfig
                        (encodeStr template)
                        (mkGenericContext' r defaultExt v)

template = "An integer: {{foo}}"

main = do
  example val1 >>= TL.putStrLn
  example val2 >>= TL.putStrLn

In the example we use the renaming function r to rename a field “f” to “foo”.

The second additional argument is a query extension, of type Ext:

type Ext = forall b. (Data b, Typeable b) => b -> String

A query extension is a way of turning arbitrary datatypes into strings. This might come in very handy, if you want to generate mustache contexts from records/datatypes that contain non-primitive datatypes (from non-base modules) that you want to display. Before 0.6.1, if you had a record that contained, for example, a Decimal field, and you wanted to convert it to a context and access that field, you were simply out of luck. With this release you can basically extend the mkGenericContext' function to support any datatypes you want! Once again, I believe an example is worth a thousand words, so let us consider a slightly modified version of the example above:

{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneDeriving #-}

-- Custom extension function for types that are not supported out of
-- the box in generic contexts
import Text.Hastache 
import Text.Hastache.Context 

import qualified Data.Text.Lazy as TL 
import qualified Data.Text.Lazy.IO as TL 

import Data.Data (Data, Typeable)
import Data.Decimal
import Data.Generics.Aliases (extQ)

data DecimalOrInf = Inf | Dec Decimal deriving (Data, Typeable)
deriving instance Data Decimal
data Test = Test {n::Int, m::DecimalOrInf} deriving (Data, Typeable)


val1 :: Test
val1 = Test 1 (Dec $ Decimal 3 1500)

val2 :: Test
val2 = Test 2 Inf

query :: Ext
query = defaultExt `extQ` f
  where f Inf = "+inf"
        f (Dec i) = show i

r "m" = "moo"
r x   = x

example :: Test -> IO TL.Text
example v = hastacheStr defaultConfig
                        (encodeStr template)
                        (mkGenericContext' r query v)

template = concat [ 
     "An int: {{n}}\n",
     "{{#moo.Dec}}A decimal number: {{moo.Dec}}{{/moo.Dec}}",
     "{{#moo.Inf}}An infinity: {{moo.Inf}}{{/moo.Inf}}"
     ] 

main = do
  example val1 >>= TL.putStrLn
  example val2 >>= TL.putStrLn

As you can see, the query extensions are combined using the extQ function from Data.Generics, and the “unit” of this whole thing is defaultExt function.

Links

Acknowledgments

This release would not have been possible without Tobias Florek, Edsko de Vries, Janne Hellsten, @clinty on Github, Stefan Kersten, Herbert Valerio Riedel, and other people who were submitting issues, patches, and requests.

ANN: hastache 0.6

Announcing: hastache 0.6

Hastache is a Haskell implementation of the mustache templating system.

Quick start

cabal update
cabal install hastache

A simple example:

import Text.Hastache 
import Text.Hastache.Context 
import qualified Data.Text.Lazy.IO as TL

main = hastacheStr defaultConfig (encodeStr template) (mkStrContext context)
    >>= TL.putStrLn

template = "Hello, {{name}}!\n\nYou have {{unread}} unread messages." 

context "name" = MuVariable "Haskell"
context "unread" = MuVariable (100 :: Int)

Read Mustache documentation for template syntax; consult README for more details.

Whats’s new in 0.6?

The interface of the library has been switched from ByteString to (lazy) Text. That means, for example, that the type of hastcheStr function is now the following:

hastacheStr
:: MonadIO m => MuConfig m	-> Text	-> MuContext m -> m Text

The generic context generation (mkGenericContext) now supports functions of the types Text -> Text and Text -> m Text, as well as types with multiple constructors. That is, given a Haskell datastructure

data A = A { str :: String }
       | B { num :: Int }

it is possible to write a template like this:

{{#A}}
A : {{str}}
{{/A}}
{{#B}}
B : {{num}}
{{/B}}

Please take a look at the multiple constructors example if you are interested.

Additionally, a couple of new MuVar instances has been added.

Acknowledgments

Special thanks to Mark Lentczner, Alexander Voikov and others who reported issues, tested the library and provided feedback.