Unfolding Paginated APIs in Haskell
Cursor-based pagination is a common strategy for breaking up query results into smaller chunks. Each chunk contains a portion of the results, and a reference to the last item in the chunk.
For example, the following request will get 2 accounts after the
cursor x
GET /accounts?page[size]=2&page[after]=x
We can express this idea with a Haskell data type
type Cursor = Int
data Paginated a
= Initial
| Chunk { _content :: [a] , _cursor :: Cursor }
| Final [a]
deriving Show
Let’s also create a dummy endpoint.
-- | Returns a paginated list of integers from 1 to 15
endpoint :: Maybe Int -- page size
-> Maybe Cursor -- page after
-> Maybe ([Int], Maybe Cursor)
=
endpoint size after let size' = maybe 6 id size
= maybe 0 id after
after' = take size' [after'+1..15]
result = headMaybe (drop (size'-1) result)
newAfter' in if null result then Nothing else Just (result, newAfter')
headMaybe :: [a] -> Maybe a
:_xs) = Just x
headMaybe (x= Nothing headMaybe []
Requests to our endpoint look like
request :: Paginated Int -> Paginated Int
Initial =
request case endpoint Nothing Nothing of
Just (result, Just newCursor) -> Chunk result newCursor
Just (result, Nothing) -> Final result
Nothing -> Final []
@(Chunk acc cursor) =
request ccase endpoint Nothing (Just cursor) of
Just (result, Just newCursor) -> Chunk result newCursor
Just (result, Nothing) -> Final result
Nothing -> Final []
@(Final _acc) = e request e
We are pattern matching on the input Paginated Int
to determine the
behaviour of request
. Inital
ly, we query the endpoint to obtain a page.
This can either be Chunk
or Final
.
If we have a Chunk
, we can obtain another Chunk
, or a Final
.
Note that we are not concatenating the pages in this function.
We are simply returning the next state from our current state.
run :: IO ()
= print (take 4 (iterate request Initial)) run
*Main> run
[ Initial,
Chunk {_content = [1,2,3,4,5,6], _cursor = 6},
Chunk {_content = [7,8,9,10,11,12], _cursor = 12},
Final [13,14,15]
]
It works! Notice how this list is generated from an initial value. The next
value is obtained by applying a function to the previous value. This is
precisely what iterate
does.
Unfoldr
We can also use recursion to generate the full response.
This function will take an initial value and return a list of chunks
(represented as lists of Int
).
recursiveRequest :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor))
-> Maybe Cursor
-> [[Int]]
= case f c of
recursiveRequest f c Just (r, Just c') -> r : recursiveRequest f (Just c')
Just (r, Nothing) -> [r]
Nothing -> []
Let’s make this function generic. Replace the [Int]
s with a
s
and the Maybe Cursor
s with b
s.
recursiveRequest :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor))
-> Maybe Cursor
-> [[Int]]
genericRequest :: (b -> Maybe ( a , b ))
-> b
-> [ a ]
= case f b of
genericRequest f b Just (a, b') -> a : genericRequest f b'
Nothing -> []
The first argument is a function that will return an a
and a new b
.
The second argument is a b
. We apply b
to the function, collect the generated a
and use the new b
to call the function again.
Let’s query the endpoint with this function.
-- this program does not terminate
run :: IO ()
= print (genericRequest (endpoint Nothing) Nothing) run
The main problem here is the type of b
.
In this context, it is a Maybe Cursor
which is not enough to encode
being in the start, middle or end of the recursion. We need a more
expressive type for b
.
data State a = Start | Next a | End
Recall the type of our endpoint. When Nothing
is applied to this function,
it returns a function of type Maybe Cursor -> Maybe ([Int], Maybe Cursor)
.
*Main> :t endpoint
endpoint :: Maybe Int -> Maybe Cursor -> Maybe ([Int], Maybe Cursor)
*Main> :t endpoint Nothing
Nothing :: Maybe Cursor -> Maybe ([Int], Maybe Cursor) endpoint
Naively using this function as the first argument of genericRequest
will result in an infinite
loop. We need to wrap it with another function that is able to keep track of our state.
wrapped :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor))
-> State Cursor
-> Maybe ([Int], State Cursor)
Start = case f Nothing of
wrapped f Just (r, Just b') -> Just (r, Next b')
Just (r, Nothing) -> Just (r, End)
Nothing -> Nothing
Next b) = case f (Just b) of
wrapped f (Just (r, Just b') -> Just (r, Next b')
Just (r, Nothing) -> Just (r, End)
Nothing -> Nothing
End = Nothing wrapped f
Huh, this looks a lot like our initial request
.
Let’s take a step back and understand what this function is really doing.
If the endpoint returns a cursor Just b'
, we want to encode this state as Next b'
.
Otherwise, there are no more pages and we encode the state as End
.
If we define a mapping between Maybe
and State
, we can write wrapped
succintly
by updating the second value of the returned tuple.
maybeToState :: Maybe a -> State a
Just x) = Next x
maybeToState (Nothing = End
maybeToState
wrapped :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor)) -> State Cursor -> Maybe ([Int], State Cursor)
Start = fmap (\(a, b') -> (a, maybeToState b')) (f Nothing)
wrapped f Next b) = fmap (\(a, b') -> (a, maybeToState b')) (f (Just b))
wrapped f (End = Nothing wrapped f
But wait, there’s more! We can make this function generic, and use Control.Arrow.second
instead of the lambda. In summary, we have
genericRequest :: (b -> Maybe (a, b)) -> b -> [a]
= case f b of
genericRequest f b Just (a, b') -> a : genericRequest f b'
Nothing -> []
genericWrapped :: (Maybe b -> Maybe (a, Maybe b)) -> State b -> Maybe (a, State b)
Start = fmap (second maybeToState) (f Nothing)
genericWrapped f Next b) = fmap (second maybeToState) (f (Just b))
genericWrapped f (End = Nothing
genericWrapped f
maybeToState :: Maybe a -> State a
Just x) = Next x
maybeToState (Nothing = End
maybeToState
run :: IO ()
= print (genericRequest (genericWrapped (endpoint Nothing)) Start) run
*Main> run
1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15]] [[
Again, the idea here is that we are generating a list of things from a value.
The type of genericRequest
seems suspicious. Let’s Hoogle it.
It turns out we have stumbled upon unfoldr
.
while
foldr
reduces a list to a summary value,unfoldr
builds a list from a seed value
We can get rid of genericRequest
altogether and use unfoldr
directly
run :: IO ()
= print (unfoldr (genericWrapped (endpoint Nothing)) Start) run
*Main> run
1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15]] [[