3
0 Comments

SPL assists MongoDB: Grouping adjacent documents

The data of a collection (named states) in the MongoDB database is as follows:

[
  {order: 1, state: 'one'},
  {order: 2, state: 'one'},
  {order: 3, state: 'one'},
  {order: 4, state: 'two'},
  {order: 5, state: 'two'},
  {order: 6, state: 'one'},
  {order: 7, state: 'two'},
  {order: 8, state: 'three'},
  {order: 9, state: 'three'}
]

Requirement: Group by adjacent states.
The expected results are as follows:

[
  [
    {order: 1, state: 'one'},
    {order: 2, state: 'one'},
    {order: 3, state: 'one'}
  ],
  [
    {order: 4, state: 'two'},
    {order: 5, state: 'two'}
  ],
  [
    {order: 6, state: 'one'}
  ],
  [
    {order: 7, state: 'two'}
  ],
  [
    {order: 8, state: 'three'},
    {order: 9, state: 'three'}
  ]
]

In MongoDB, it is difficult to directly use query statements to implement the function of grouping adjacent states, because MongoDB’s query language itself does not support this complex grouping logic. This requirement can be indirectly implemented by combining an Aggregation Pipeline with some custom logic.

Here is a way to write it:

db.states.aggregate([
  {
    $sort: { order: 1 }
  },
  {
    $group: {
      _id: null,
      documents: { $push: "$$ROOT" }
    }
  },
  {
    $project: {
      grouped: {
        $reduce: {
          input: "$documents",
          initialValue: {
            previousState: null,
            groups: []
          },
          in: {
            $let: {
              vars: {
                currentState: "$$this.state",
                lastGroup: { $arrayElemAt: ["$$value.groups", -1] }
              },
              in: {
                $cond: [
                  { $eq: ["$$value.previousState", "$$this.state"] },
                  {
                    previousState: "$$this.state",
                    groups: {
                      $concatArrays: [
                        { $slice: ["$$value.groups", { $subtract: [{ $size: "$$value.groups" }, 1] }] },
                        [
                          {
                            $concatArrays: [
                              "$$lastGroup",
                              ["$$this"]
                            ]
                          }
                        ]
                      ]
                    }
                  },
                  {
                    previousState: "$$this.state",
                    groups: {
                      $concatArrays: [
                        "$$value.groups",
                        [["$$this"]]
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      }
    }
  },
  {
    $project: {
      _id: 0,
      grouped: "$grouped.groups"
    }
  },
  {
    $unwind: "$grouped"
  },
  {
    $project: {
      _id: 0,
      documents: "$grouped"
    }
  },
  {
    $group: {
      _id: null,
      result: { $push: "$documents" }
    }
  },
  {
    $project: {
      _id: 0,
      result: 1
    }
  }
])

The running result is:

{
  result: [
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a427'),
        order: 1,
        state: 'one'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a428'),
        order: 2,
        state: 'one'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a429'),
        order: 3,
        state: 'one'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42a'),
        order: 4,
        state: 'two'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a42b'),
        order: 5,
        state: 'two'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42c'),
        order: 6,
        state: 'one'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42d'),
        order: 7,
        state: 'two'
      }
    ],
    [
      {
        _id: ObjectId('67c65e846d497a00cd02a42e'),
        order: 8,
        state: 'three'
      },
      {
        _id: ObjectId('67c65e846d497a00cd02a42f'),
        order: 9,
        state: 'three'
      }
    ]
  ]
}

SPL provides the function of grouping by adjacent records, and the code is concise and clear:
SPL code

The first three lines are readings, and the calculation code has only one line of A4. A5 returns the results to the json format, and the output results are identical to the original requirements.


esProc SPL is FREE to download: esProc SPL FREE Download

on March 27, 2025
Trending on Indie Hackers
From Ideas to a Content Factory: The Rise of SuperMaker AI User Avatar 27 comments Why Early-Stage Founders Should Consider Skipping Prior Art Searches for Their Patent Applications User Avatar 21 comments Codenhack Beta — Full Access + Referral User Avatar 17 comments I built eSIMKitStore — helping travelers stay online with instant QR-based eSIMs 🌍 User Avatar 15 comments Building something...? User Avatar 12 comments Do Patents Really Help Startups Raise Funding? Evidence from the U.S. and Europe User Avatar 11 comments