3
0 Comments

SPL assists MongoDB: Grouping adjacent documents

The data of a collection (named states) in the MongoDB database is as follows:

[
  {order: 1, state: 'one'},
  {order: 2, state: 'one'},
  {order: 3, state: 'one'},
  {order: 4, state: 'two'},
  {order: 5, state: 'two'},
  {order: 6, state: 'one'},
  {order: 7, state: 'two'},
  {order: 8, state: 'three'},
  {order: 9, state: 'three'}
]

Requirement: Group by adjacent states.
The expected results are as follows:

[
  [
    {order: 1, state: 'one'},
    {order: 2, state: 'one'},
    {order: 3, state: 'one'}
  ],
  [
    {order: 4, state: 'two'},
    {order: 5, state: 'two'}
  ],
  [
    {order: 6, state: 'one'}
  ],
  [
    {order: 7, state: 'two'}
  ],
  [
    {order: 8, state: 'three'},
    {order: 9, state: 'three'}
  ]
]

In MongoDB, it is difficult to directly use query statements to implement the function of grouping adjacent states, because MongoDB’s query language itself does not support this complex grouping logic. This requirement can be indirectly implemented by combining an Aggregation Pipeline with some custom logic.

Here is a way to write it:

db.states.aggregate([
  {
    $sort: { order: 1 }
  },
  {
    $group: {
      _id: null,
      documents: { $push: "$$ROOT" }
    }
  },
  {
    $project: {
      grouped: {
        $reduce: {
          input: "$documents",
          initialValue: {
            previousState: null,
            groups: []
          },
          in: {
            $let: {
              vars: {
                currentState: "$$this.state",
                lastGroup: { $arrayElemAt: ["$$value.groups", -1] }
              },
              in: {
                $cond: [
                  { $eq: ["$$value.previousState", "$$this.state"] },
                  {
                    previousState: "$$this.state",
                    groups: {
                      $concatArrays: [
                        { $slice: ["$$value.groups", { $subtract: [{ $size: "$$value.groups" }, 1] }] },
                        [
                          {
                            $concatArrays: [
                              "$$lastGroup",
                              ["$$this"]
                            ]
                          }
                        ]
                      ]
                    }
                  },
                  {
                    previousState: "$$this.state",
                    groups: {
                      $concatArrays: [
                        "$$value.groups",
                        [["$$this"]]
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  },
  {
    $project: {
      _id: 0,
      grouped: "$grouped.groups"
    }
  },
  {
    $unwind: "$grouped"
  },
  {
    $project: {
      _id: 0,
      documents: "$grouped"
    }
  },
  {
    $group: {
      _id: null,
      result: { $push: "$documents" }
    }
  },
  {
    $project: {
      _id: 0,
      result: 1
    }
  }
])

The running result is:

{
  result: [
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a427'),
        order: 1,
        state: 'one'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a428'),
        order: 2,
        state: 'one'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a429'),
        order: 3,
        state: 'one'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42a'),
        order: 4,
        state: 'two'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a42b'),
        order: 5,
        state: 'two'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42c'),
        order: 6,
        state: 'one'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42d'),
        order: 7,
        state: 'two'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42e'),
        order: 8,
        state: 'three'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a42f'),
        order: 9,
        state: 'three'
      }
    ]
  ]
}

SPL provides the function of grouping by adjacent records, and the code is concise and clear:
SPL code

The first three lines are readings, and the calculation code has only one line of A4. A5 returns the results to the json format, and the output results are identical to the original requirements.


esProc SPL is FREE to download: esProc SPL FREE Download

on March 27, 2025
Trending on Indie Hackers
From building client websites to launching my own SaaS — and why I stopped trusting GA4! User Avatar 41 comments I built a tool that turns CSV exports into shareable dashboards User Avatar 40 comments The “Open → Do → Close” rule changed how I build tools User Avatar 32 comments $0 to $10K MRR in 12 Months: 3 Things That Actually Moved the Needle for My Design Agency User Avatar 31 comments I lost €50K to non-paying clients... so I built an AI contract tool. Now at 300 users, 0 MRR. User Avatar 26 comments Everyone is Using AI for Vibe Coding, but What You Really Need is Vibe UX User Avatar 23 comments