Author : Kenta Inoue
Introduction
In functional programming, which is characterized by not creating side effects, managing "state," which is a side effect itself, may seem difficult at first glance. However, by using the State monad, it is possible to easily manage state while maintaining purity. In this article, we will look at how to do this and how to think about it using the pure functional language Haskell.
Examples of condition management
For example, let us write a program to find the number of times a function is called.
Let us consider a function that finds the greatest common divisor by Euclid's reciprocal division method.
In Java
Euclid's reciprocal divisor in Java can be written, for example, as follows.
public static int euclid(int n, int m) {
if (m == 0) {
return n;
} else {
return euclid(m,(n % m));
}
}
This function euclid
is not affected by external variables and is a pure function that does not
rewrite external states. If you want to find the number of times this euclid
function is called,
you can use the external variable counter
to count, for example
static int counter = 0;
public static int euclid(int n, int m) {
counter++;
if (m == 0) {
return n;
} else {
return euclid(m,(n % m));
}
}
The variable counter
is a state variable, so to speak, and is modified within the Euclid function.
In other words, counting is performed using side effects. However, this kind of implementation is not allowed in
pure functional languages, where (almost) all functions do not have side-effects. So how should it be written?
For functional languages
First, let's see the implementation of the Euclid function in the pure functional language Haskell.
euclid :: Int -> Int -> Int
euclid n m =
if m == 0
then n
else euclid m $ n `mod` m
Although it is not possible to change the external state with a pure function, there is a technique to rewrite the external variable in a pseudo manner by passing that external variable as a function argument and changing the variable passed.
In the above example, in addition to the conventional arguments n
and m
, we add a count
variable c::Counter
that counts the number of calls.
type Counter = Int
euclidCount :: Int -> Int -> Counter -> (Int,Counter)
euclidCount n m c =
if m == 0
then (n,c + 1)
else euclidCount m (n `mod` m) $ c + 1
Then, by adding the count value to the return value, we obtain the value of the counter after the change.
Let's use this euclidCount
to find the number of function calls.
First, let's count the final number of calls when the Euclid function is used for variables n1
and
m1
and then also for n2
and m2
.
n1 :: Int
n1 = 15
m1 :: Int
m1 = 10
n2 :: Int
n2 = 24
m2 :: Int
m2 = 15
main :: IO ()
main =
let
(result1,counter1) = euclidCount n1 m1 0
(result2,counter2) = euclidCount n2 m2 counter1
counter = counter2
in
do
print result1 -- 5
print result2 -- 3
print counter -- 8
You can implement it by writing it like this, but you will need variables like counter1
and
counter2
every time. Actually, this problem can be solved using the State monad.
Rewrite using the State
monad.
In the previous euclid
example, the original function type Int -> Int -> Int
was
changed to Int -> Int -> Counter -> (Int,Counter)
. This is more generally equivalent to
A_0 -> ... -> A_n -> B
for a function of type s of an external variable
A_0 -> ... -> A_n -> s -> (B,s)
. This last part of creating the type s -> (B,s)
from
the type B
is common, and in fact there is a typeclass State
that can express this.
newtype State s a = State {runState :: s -> (a,s)}
{- The actual implementation is
type State s = StateT s Identity
but for the sake of simplicity, we will use this definition. -}
state :: (s -> (a,s)) -> State s a
state = State
{- Note that this is also the original definition of
state :: (Monad m) => (s -> (a,s)) -> StateT s m a
state f = StateT (return . f)
-}
Let's write the euclidCount
mentioned earlier with this State
.
euclidSCount :: Int -> Int -> State Counter Int
euclidSCount n m = state $ euclidCount n m
It can be written like Java code, taking advantage of the fact that State is a monad.
count :: State Counter ()
count = state $ \c -> ((), c + 1)
euclidS :: Int -> Int -> State Counter Int
euclidS n m = do
count
if m == 0
then return n
else euclidS m $ n `mod` m
Written in this way, count
can be described as if it were a procedure for changing an external
variable.
Using this, the main
statement described earlier becomes the following.
euclidIO :: State Counter (IO ())
euclidIO = do
result1 <- euclidS n1 m1
result2 <- euclidS n2 m2
return $ do
print result1
print result2
main :: IO ()
main = do
p
print counter
where
(p,counter) = runState euclidIO 0
As in the previous example, it is no longer necessary to write variables to represent intermediate states, but the State monad and the IO monad which performs input and output, are in separate layers, and separate monad processing will be required for each.
In fact, this can be combined into one monad.
Using StateT
.
First, consider what is called a monad transformer. This is a type transformer that takes a monad and returns a new monad.
For example, the monad transformer t
takes a monad m
and returns a new monad t
m
.
Let's look at the actual definition.
class (forall m. Monad m => Monad (t m)) => MonadTrans t where
lift :: Monad m => m a -> t m a
lift
is called a lifting function and is responsible for bringing the original m
monad
process to the t
m
world, as in the definition.
With this in mind, we will look at StateT
, one of the monad converters.
newtype StateT s m a = StateT {runStateT :: s -> m (a,s)}
instance MonadTrans (StateT s) where
lift m = StateT $ \c -> do
a <- m
return (a,s)
The previous State
monad was a
type that returns s -> (a,s)
for type
a
for any type s
. StateT
returns s -> m (a,s)
, which is
actually the definition of the State
monad with m
as the identity monad
Identity
.
Euclid Function Examples
Rewrite the Euclid function defined earlier with one using StateT
. In fact, State s
was
precisely StateT s Identity
, so we generalize the function not only with Identity
but also with any monad m
.
count2 :: Monad m => StateT Counter m ()
count2 = state $ \c -> ((),c + 1)
-- count2 = StateT $ \c -> return ((),c + 1)
euclidS2 :: Monad t => Int -> Int -> StateT Counter t Int
euclidS2 n m = do
count2
if m == 0
then return n
else euclidS2 m $ n `mod` m
The only change is the introduction of a type variable for each. This is a generalization of count
and euclidS
from earlier, respectively, so rewriting the definitions in the previous example to
these will work the same way.
main
statement using this.
euclidIO2 :: StateT Counter IO ()
euclidIO2 = do
result1 <- euclidS2 n1 m1
result2 <- euclidS2 n2 m2
lift $ print result1
lift $ print result2
main :: IO ()
main = do
(_,counter) <- runStateT euclidIO2 0
print counter
In the euclid2
section, it is now possible to process the same monad that was separated from the
monad processing earlier. Specifically, the IO
monad processing is converted to
StateT Counter IO
monad processing using the lift
function.
Note that euclidIO
in the previous example was effectively of type
Counter -> (IO (),Counter)
, while euclidIO2
is effectively of type
Counter -> IO ((), Counter)
. In other words, the return value of the state is now in
IO
and cannot be taken out as it is. If you have trouble with this, you can use the example above.
Summary
Describes how to think about and handle state in a pure functional language.