Intersection and diff of two structs with a bit of refactoring thrown in.

I recently blogged about finding the union of two structs. Following on from that, I wanted to look at finding the intersection (keys common to both structs) and the diff (keys that exist only in the first struct).

So without further ado, here’s how to find the intersection:

    function intersection( a, b ) {
        return a.reduce(function(accumulator, key, value) {
            if (b.keyExists(key)) {
                accumulator[key] = value;
            }
            return accumulator;
        }, {});
    }

    set1 = {a:1,"b":2};
    set2 = {"B":20,c:3};

    // intersection - note only checks keys!
    writeDump(intersection(set1,set2)); // output {b:2}

If you’ve not used `structReduce` before (or its member-function version, `mystruct.reduce`, as I do in the example), then it can be a bit hard to see what is happening. Essentially, it iterates over each key/value pair in the struct and calls the anonymous function each time, building up a result to return.
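To see the accumulator at work in isolation, here is a minimal sketch that sums the values of a struct. The `prices` struct and its keys are made up for illustration, and I'm assuming the code runs inside a `cfscript` block, like the other examples in this post.

```cfml
// Hypothetical data, only here to have something to reduce over.
prices = { apples: 3, pears: 5, plums: 2 };

// The callback receives the running result plus the current key and value.
// Whatever it returns becomes the accumulator for the next pair.
total = prices.reduce( function( accumulator, key, value ) {
    return accumulator + value;
}, 0 ); // 0 is the starting value of the accumulator

writeOutput( total ); // 10
```

The second argument (`0` here, `{}` in the intersection example) is the starting value, so the type of the result is entirely up to you.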

`set1` contains `{"b":2}` and `set2` contains `{"B":20}`. CFML treats upper and lowercase keys the same, so that’s why we get the match. What the intersection does not check is the values – we are just interested in the keys. We can only have one value per key, so I’ve chosen to use the value from `set1`.
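The case-insensitive matching is easy to confirm on its own. This is just a quick sketch using the same `"B"` key as `set2` above:

```cfml
set2 = { "B": 20 };

// Both lookups find the same entry, whatever the case of the key we ask for.
writeDump( set2.keyExists( "b" ) );
writeDump( set2.keyExists( "B" ) );

// Bracket access is case-insensitive too.
writeOutput( set2[ "b" ] ); // 20
```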

One thing we should check is whether our original `set1` and `set2` have been mutated. The answer is no – they are not changed. Now you may be thinking that CFML passes structs by reference – and you’d be right – so why aren’t they mutated? One of the benefits of using `structReduce` this way is that it builds up and returns a new struct (the accumulator, which starts out as the empty `{}` we pass in) and never writes to the original struct.
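To make the contrast concrete, here is a small sketch. The `addFlag` function and the `original` and `source` structs are hypothetical names I've made up for the demonstration; the first half shows a struct being changed through a function argument, the second shows `reduce` leaving its source alone.

```cfml
// Hypothetical helper that writes to its argument.
// Structs are passed by reference, so the caller sees the change.
function addFlag( s ) {
    s.flag = true;
}

original = { a: 1 };
addFlag( original );
writeOutput( structCount( original ) ); // 2

// reduce only ever writes to the fresh {} passed in as the starting value.
source = { a: 1, b: 2 };
scaled = source.reduce( function( accumulator, key, value ) {
    accumulator[ key ] = value * 10;
    return accumulator;
}, {} );

writeOutput( source.a ); // 1
writeOutput( scaled.a ); // 10
```

Note that the protection comes from how the callback is written: if it assigned into the source struct directly, the original would still be mutated.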

Finding the keys that only exist in `set1` can be done with something like this:

    function diff( a, b ) {
        return a.reduce(function(accumulator, key, value) {
            if (!b.keyExists(key)) {
                accumulator[key] = value;
            }
            return accumulator;
        }, {});
    }

    set1 = {a:1,"b":2};
    set2 = {"B":20,c:3};

    // diff - keys in set1 not in set2
    writeDump(diff(set1,set2)); // output {a:1}

Running this works as expected: we just get back `{a:1}`.
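One thing worth spelling out is that the order of the arguments matters: `diff` only reports keys from its first argument. Here is a sketch that runs it both ways (the function is repeated so the snippet stands on its own, and `onlyInSet1` and `onlyInSet2` are just names I've picked):

```cfml
// Same diff as above, repeated so this snippet runs by itself.
function diff( a, b ) {
    return a.reduce( function( accumulator, key, value ) {
        if ( !b.keyExists( key ) ) {
            accumulator[ key ] = value;
        }
        return accumulator;
    }, {} );
}

set1 = { a: 1, "b": 2 };
set2 = { "B": 20, c: 3 };

onlyInSet1 = diff( set1, set2 ); // one entry: key a, value 1
onlyInSet2 = diff( set2, set1 ); // one entry: key c, value 3

writeDump( onlyInSet1 );
writeDump( onlyInSet2 );
```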

Looking a bit closer, the code for `diff` and `intersection` is very similar – in fact, the only difference is one extra character (the `!`). That seems like unwanted duplication. Time to refactor!

Before refactoring, I know I have a working implementation, so I’m not touching anything without tests (usually I’d have started off with a test before writing any code – forgive me, reader!). Here’s a very simple test suite. Normally I’d use a proper testing framework like TestBox, but I’m keeping it simple for this blog post.

    function assert( expected, actual ) {
        if (expected == actual) {
            writeoutput('PASS<br>');
        } else {
            writeoutput('FAIL - expected #expected# got #actual#<br>');
        }
    }

    set1 = {a:1,"b":2};
    set2 = {"B":20,c:3};

    // intersection - note only checks keys!
    writeOutput("Intersection Tests...<br>");
    result=intersection( set1, set2 );
    assert( 1, structCount(result) );
    assert( 2, result.b );

    // diff - keys in set1 not in set2
    writeOutput("Diff Tests...<br>");
    result=diff( set1, set2 );
    assert( 1, structCount(result) );
    assert( 1, result.a );

    writeOutput("Set1 Mutation Tests...<br>");
    assert( 2, structCount(set1) );
    assert( 1, set1.a );
    assert( 2, set1.b );

    writeOutput("Set2 Mutation Tests...<br>");
    assert( 2, structCount(set2) );
    assert( 20, set2.b );
    assert( 3, set2.c );

The first step in refactoring is to extract the part that decides whether a key is kept when it is found in the other struct, or kept when it is not – `(b.keyExists(key))` vs `(!b.keyExists(key))` – into new functions.

    function include(b, key) {
        return b.keyExists(key);
    }

and the inverse of that is:

    function exclude(b, key) {
        return !include(b, key);
    }

Now to use these functions in the `diff` and `intersection` functions.

    function intersection( a, b ) {
        return a.reduce(function(accumulator, key, value) {
            if (include(b, key)) {
                accumulator[key] = value;
            }
            return accumulator;
        }, {});
    }

    function diff( a, b ) {
        return a.reduce(function(accumulator, key, value) {
            if (exclude(b, key)) {
                accumulator[key] = value;
            }
            return accumulator;
        }, {});
    }

The tests still pass, so on to the next refactoring step. I want to extract the common code and be able to pass in either the `include` or the `exclude` function. In CFML, functions are first-class citizens, so we can pass the function in as an argument.
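If passing a function as an argument is new to you, this stripped-down sketch shows the idea without any structs involved. Both `applyTwice` and `addOne` are hypothetical names invented for this example.

```cfml
// Hypothetical helper: takes a function and a value, and calls the function twice.
function applyTwice( fn, value ) {
    return fn( fn( value ) );
}

// Hypothetical function to pass in.
function addOne( n ) {
    return n + 1;
}

// Note we pass addOne itself (no parentheses), not the result of calling it.
writeOutput( applyTwice( addOne, 5 ) ); // 7
```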

    function reducer( condition ) {
        return function( a, b ) {
            return a.reduce(function(accumulator, key, value) {
                if (condition(b, key)) {
                    accumulator[key] = value;
                }
                return accumulator;
            }, {});
        };
    }

I’ve combined the previous `intersection` and `diff` functions into one function called `reducer`, which accepts a function as the `condition` argument. The `reducer` function returns a new function. I can then call the `reducer` function, passing in the condition I want to use, and assign the returned function to a variable.

    intersection = reducer(include);
    diff = reducer(exclude);
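The part that can feel like magic is that the returned function still has access to `condition` after `reducer` has finished running. That is a closure, and a smaller sketch of the same pattern may help. The names `multiplier`, `timesTwo` and `timesThree` are made up for illustration.

```cfml
// Hypothetical factory: returns a new function that remembers `factor`.
function multiplier( factor ) {
    return function( n ) {
        return n * factor;
    };
}

// Each call produces an independent function with its own captured factor.
timesTwo = multiplier( 2 );
timesThree = multiplier( 3 );

writeOutput( timesTwo( 5 ) );   // 10
writeOutput( timesThree( 5 ) ); // 15
```

In the same way, `reducer(include)` and `reducer(exclude)` return two separate functions, each holding on to its own `condition`.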

The tests still pass, and we have no code duplication. For completeness here it is all together (without the testing code):

    function include(b, key) {
        return b.keyExists(key);
    }

    function exclude(b, key) {
        return !include(b, key);
    }

    function reducer(condition) {
        return function( a, b ) {
            return a.reduce(function(accumulator, key, value) {
                if (condition(b, key)) {
                    accumulator[key] = value;
                }
                return accumulator;
            }, {});
        };
    }

    intersection = reducer(include);
    diff = reducer(exclude);

Now you may find this code confusing to read, but even if you choose not to write code like this hopefully you’ve found it interesting!
