Leo Orpilla III
01 Oct 2020 | post

Unfolding Paginated APIs in Haskell


Cursor-based pagination is a common strategy for breaking up query results into smaller chunks. Each chunk contains a portion of the results, and a reference to the last item in the chunk.

For example, the following request will get 2 accounts after the cursor x

GET /accounts?page[size]=2&page[after]=x

We can express this idea with a Haskell data type

type Cursor = Int

data Paginated a 
  = Initial
  | Chunk { _content :: [a] , _cursor  :: Cursor }
  | Final [a]
  deriving Show

Let’s also create a dummy endpoint.

-- | Returns a paginated list of integers from 1 to 15
endpoint
  :: Maybe Int     -- page size 
  -> Maybe Cursor  -- page after
  -> Maybe ([Int], Maybe Cursor)
endpoint size after =
  let size' = maybe 6 id size
      after' = maybe 0 id after
      result = take size' [after'+1..15]
      newAfter' = headMaybe (drop (size'-1) result)
   in if null result then Nothing else Just (result, newAfter')
      
headMaybe :: [a] -> Maybe a
headMaybe (x:_xs) = Just x
headMaybe [] = Nothing

Requests to our endpoint look like

request :: Paginated Int -> Paginated Int
request Initial =
  case endpoint Nothing Nothing of
    Just (result, Just newCursor) -> Chunk result newCursor
    Just (result, Nothing) -> Final result
    Nothing -> Final []
request c@(Chunk acc cursor) =
  case endpoint Nothing (Just cursor) of
    Just (result, Just newCursor) -> Chunk result newCursor
    Just (result, Nothing) -> Final result
    Nothing -> Final []
request e@(Final _acc) = e

We are pattern matching on the input Paginated Int to determine the behaviour of request. Initally, we query the endpoint to obtain a page. This can either be Chunk or Final.

If we have a Chunk, we can obtain another Chunk, or a Final. Note that we are not concatenating the pages in this function. We are simply returning the next state from our current state.

run :: IO ()
run = print (take 4 (iterate request Initial))
*Main> run
[ 
  Initial,
  Chunk {_content = [1,2,3,4,5,6], _cursor = 6},
  Chunk {_content = [7,8,9,10,11,12], _cursor = 12},
  Final [13,14,15]
]

It works! Notice how this list is generated from an initial value. The next value is obtained by applying a function to the previous value. This is precisely what iterate does.

Unfoldr

We can also use recursion to generate the full response. This function will take an initial value and return a list of chunks (represented as lists of Int).

recursiveRequest :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor))
                  -> Maybe Cursor
                  -> [[Int]]
recursiveRequest f c = case f c of
  Just (r, Just c') -> r : recursiveRequest f (Just c')
  Just (r, Nothing) -> [r]
  Nothing -> []

Let’s make this function generic. Replace the [Int]s with as and the Maybe Cursors with bs.

recursiveRequest :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor)) 
                 -> Maybe Cursor 
                 -> [[Int]]
genericRequest   :: (b            -> Maybe ( a   , b           )) 
                 -> b            
                 -> [  a  ]
genericRequest f b = case f b of
  Just (a, b') -> a : genericRequest f b'
  Nothing -> []

The first argument is a function that will return an a and a new b. The second argument is a b. We apply b to the function, collect the generated a and use the new b to call the function again.

Let’s query the endpoint with this function.

-- this program does not terminate
run :: IO ()
run = print (genericRequest (endpoint Nothing) Nothing)

The main problem here is the type of b. In this context, it is a Maybe Cursor which is not enough to encode being in the start, middle or end of the recursion. We need a more expressive type for b.

data State a = Start | Next a | End

Recall the type of our endpoint. When Nothing is applied to this function, it returns a function of type Maybe Cursor -> Maybe ([Int], Maybe Cursor).

*Main> :t endpoint
endpoint :: Maybe Int -> Maybe Cursor -> Maybe ([Int], Maybe Cursor)
*Main> :t endpoint Nothing
endpoint Nothing :: Maybe Cursor -> Maybe ([Int], Maybe Cursor)

Naively using this function as the first argument of genericRequest will result in an infinite loop. We need to wrap it with another function that is able to keep track of our state.

wrapped :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor)) 
        ->  State Cursor 
        -> Maybe ([Int], State Cursor)
wrapped f Start = case f Nothing of
  Just (r, Just b') -> Just (r, Next b')
  Just (r, Nothing) -> Just (r, End)
  Nothing -> Nothing
wrapped f (Next b) = case f (Just b) of
  Just (r, Just b') -> Just (r, Next b')
  Just (r, Nothing) -> Just (r, End)
  Nothing -> Nothing
wrapped f End = Nothing

Huh, this looks a lot like our initial request. Let’s take a step back and understand what this function is really doing.

If the endpoint returns a cursor Just b', we want to encode this state as Next b'. Otherwise, there are no more pages and we encode the state as End.

If we define a mapping between Maybe and State, we can write wrapped succintly by updating the second value of the returned tuple.

maybeToState :: Maybe a -> State a
maybeToState (Just x) = Next x
maybeToState Nothing = End

wrapped :: (Maybe Cursor -> Maybe ([Int], Maybe Cursor)) ->  State Cursor -> Maybe ([Int], State Cursor)
wrapped f Start = fmap (\(a, b') -> (a, maybeToState b')) (f Nothing)
wrapped f (Next b) = fmap (\(a, b') -> (a, maybeToState b')) (f (Just b))
wrapped f End = Nothing

But wait, there’s more! We can make this function generic, and use Control.Arrow.second instead of the lambda. In summary, we have

genericRequest :: (b -> Maybe (a, b)) -> b -> [a]
genericRequest f b = case f b of
  Just (a, b') -> a : genericRequest f b'
  Nothing -> []

genericWrapped :: (Maybe b -> Maybe (a, Maybe b)) ->  State b -> Maybe (a, State b)
genericWrapped f Start = fmap (second maybeToState) (f Nothing)
genericWrapped f (Next b) = fmap (second maybeToState) (f (Just b))
genericWrapped f End = Nothing

maybeToState :: Maybe a -> State a
maybeToState (Just x) = Next x
maybeToState Nothing = End

run :: IO ()
run = print (genericRequest (genericWrapped (endpoint Nothing)) Start)
*Main> run
[[1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15]]

Again, the idea here is that we are generating a list of things from a value. The type of genericRequest seems suspicious. Let’s Hoogle it.

something something tool support

It turns out we have stumbled upon unfoldr.

while foldr reduces a list to a summary value, unfoldr builds a list from a seed value

We can get rid of genericRequest altogether and use unfoldr directly

run :: IO ()
run = print (unfoldr (genericWrapped (endpoint Nothing)) Start)
*Main> run
[[1,2,3,4,5,6],[7,8,9,10,11,12],[13,14,15]]