Intersection of arrays

January 19, 2019
I try to bend the internet to my will.
Wizard 11 posts
Followers: 9 people
0

Intersection of arrays

I try to bend the internet to my will.
Wizard 11 posts
Followers: 9 people
January 19, 2019

If you have two arrays of values and you want to know which values are common to both then you are after the intersection of the two sets. You may have seen this written as A ∩ B.

There isn’t a native CFML method to do this for you, but fortunately there is one in Java which we can take advantage of. It’s retainAll. The Java docs describe the purpose of this method as:

Retains only the elements in this list that are contained in the specified collection (optional operation). In other words, removes from this list all of its elements that are not contained in the specified collection.

Here’s an example of how to use it:

function intersection(required array a, required array b) {
    a.retainAll(b);
    return a;
}

set1 = [0,1,2,3,4,5,6,7,8,9,10,11];
set2 = [0,2,4,6,8,10,12,14,16,18,20,22];

common = intersection(set1, set2);
writeDump(common); // output: [0,2,4,6,8,10]

One thing to note here is that retainAll mutates the original array, it does not return a new array (it actually returns a boolean). This is fine in the above example as ColdFusion by default, passes arrays by value. So the array inside the function is a copy of the passed in array. However, in ColdFusion 2016 an option to enable arrays to be passed by reference was added using the application wide this.passArrayByReference = true; setting. Let me demonstrate:

common = intersection(set1, set2);
writeDump(common); // output: [0,2,4,6,8,10]

// default ACF behaviour is to pass arrays by value, 
// so set1 is not changed
writeDump(set1); // output: [0,1,2,3,4,5,6,7,8,9,10,11]

When we enable passArrayByReference then run the exact same code works slightly differently.

common = intersection(set1, set2);
writeDump(common); // output: [0,2,4,6,8,10]

// when behaviour is pass arrays by reference, 
// set1 is changed
writeDump(set1); // output: [0,2,4,6,8,10]

This could easily be an unexpected side effect, so it’s a good idea to protect ourselves from this. Yet again we can use a java method to reduce the amount of code required to copy the array – clone().

So now we have the following code which does not mutate the original array.

function intersection(required array a, required array b) {
    var result = a.clone();
    result.retainAll(b);
    return result;
}

set1 = [0,1,2,3,4,5,6,7,8,9,10,11];
set2 = [0,2,4,6,8,10,12,14,16,18,20,22];

common = intersection(set1, set2);
writeDump(common); // output: [0,2,4,6,8,10]
// original array is not mutated
writeDump(set1); // output: [0,1,2,3,4,5,6,7,8,9,10,11]

So now we have a handy function which we can use to get the intersection of two arrays. But what if we had 3 arrays? We could just call the method twice like so:

set1 = [0,1,2,3,4,5,6,7,8,9,10,11];
set2 = [0,2,4,6,8,10,12,14,16,18,20,22];
set3 = [0,3,6,9,12,15,18,21,24,27,30,33];

common = intersection(set1, set2);
common = intersection(common, set3);
writeDump(common); // output [0,6]

What if we had multiple arrays? We could pass in an array of arrays and iterate through. Something like:

function intersection(required array sets) {
    var setCount = sets.len();
    var result = sets[1].clone();
    var i = 1;
    while (i < setCount) {
        i++;
        result.retainAll(sets[i]);
    }
    return result;
}

data = [
    [0,1,2,3,4,5,6,7,8,9,10,11],
    [0,2,4,6,8,10,12,14,16,18,20,22],
    [0,3,6,9,12,15,18,21,24,27,30,33]
];

common = intersection_n(data);
writeDump(common); // [0,6]

Now we can find the intersection of multiple arrays. However, I’m not that fond of having to pass in an array of arrays. It doesn’t make the code very easy to read. Could we do better?
Being able to call the intersection function with 2 or more arguments would be quite nice.

function intersection(required array a, required array b /*, ...rest*/) {
    var setCount = arrayLen(arguments);
    var result = arguments[1].clone();
    var i = 1;
    while (i < setCount) {
        i++;
        result.retainAll(arguments[i]);
    }
    return result;
}

Now we can do:

set1 = [0,1,2,3,4,5,6,7,8,9,10,11];
set2 = [0,2,4,6,8,10,12,14,16,18,20,22];
set3 = [0,3,6,9,12,15,18,21,24,27,30,33];

common = intersection(set1, set2, set3);
writeDump(common); // output: [0,6]

common = intersection(set1, set2);
writeDump(common); // output: [0,2,4,6,8,10]

You may or maynot like this approach, but I think it makes the code readable – when I see intersection(set1, set2, set3) I know exactly what is happening.
To make the function easier to understand I’ve added a comment of /*, ...rest*/ after the arguments. This comment has no impact on how the code behaves,
it is there purely for readability and is in the style of JavaScript’s rest parameter syntax.

Comments (0)
Add your comment