A look at unsynchronised arrays in CFML

March 3, 2019

In the ColdFusion 2016 release, the ability to create unsynchronised arrays was added. Adobe’s ColdFusion 2016 Performance Whitepaper claims a significant speed increase when you use them.

I thought I’d have a dig into this.

First up, how do you create them?

You create an unsynchronised array with arrayNew, passing false as the second argument (the default is true). So for a one-dimensional unsynchronised array you’d write this:

a = arrayNew(1, false);

I think the syntax is a bit clunky, and we’d largely left arrayNew behind, but hey-ho.

I knocked up the following to have a look at the datatype:

function objectType(o) {
    return getMetaData(o).getCanonicalName();
}

unsynchronisedArray = arrayNew(1, false);
writeDump("unsynchronisedArray => " & objectType(unsynchronisedArray));

synchronisedArray = arrayNew(1, true);
writeDump("synchronisedArray => " & objectType(synchronisedArray));

legacyArray = arrayNew(1);
writeDump("legacyArray => " & objectType(legacyArray));

modernArray = [];
writeDump("modernArray => " & objectType(modernArray));

clonedArray = unsynchronisedArray[:];
writeDump("clonedArray => " & objectType(clonedArray));

Running the above on ColdFusion 2018 Update 2 we get:

unsynchronisedArray => coldfusion.runtime.FastArray
synchronisedArray   => coldfusion.runtime.Array
legacyArray         => coldfusion.runtime.Array
modernArray         => coldfusion.runtime.Array
clonedArray         => coldfusion.runtime.FastArray 

From this we can see that there is a new datatype, coldfusion.runtime.FastArray, and that the previous ways of creating an array all return a coldfusion.runtime.Array. Although I’ve never stopped to think about it, that means we’ve been using synchronised arrays all along.

I was pleased to see that cloning the unsynchronisedArray preserves the datatype.
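
Since the class name is the only visible difference so far, a small helper can tell the two kinds of array apart at runtime. This is only a minimal sketch: isUnsynchronised is a made-up name, and it assumes the class names shown in the output above (ColdFusion 2018 Update 2), which may differ on other versions or engines.

```cfml
// Hypothetical helper, not a built-in function.
// Assumes unsynchronised arrays report coldfusion.runtime.FastArray,
// as in the output above.
function isUnsynchronised(required array a) {
    return getMetaData(a).getCanonicalName() == "coldfusion.runtime.FastArray";
}

writeDump(isUnsynchronised(arrayNew(1, false)));
writeDump(isUnsynchronised([]));
```

Going by the output above, the first dump should report true and the second false.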

Now that we have a FastArray – what use is it?

My first thought was: does it get passed around as a ‘traditional’ array does? In ColdFusion, by default, arrays are passed by value, not by reference (you can override this behaviour in your application by using this.passArrayByReference).

function mutateArray(array a) {
    a.append(2);
    return a;
}

before = arrayNew(1, false);
before.append(1);
after = mutateArray(before);

writeDump(before);
writeDump(after);

Sure enough, the before and after arrays are different (only after contains the appended 2), so the array is being passed by value, just as we’re used to with ‘traditional’ arrays.
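
For contrast, the override mentioned above lives in Application.cfc. A minimal sketch, assuming a made-up application name. Whether it behaves identically for unsynchronised arrays is not something the test above covers.

```cfml
// Application.cfc (sketch; the application name is a placeholder)
component {
    this.name = "arrayByRefDemo";

    // With this set to true, arrays are passed to functions by reference,
    // so mutateArray() would be expected to change the caller's array as well.
    this.passArrayByReference = true;
}
```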

Next up, I thought I’d go for a simple speed test.

iterations = 100000;
function doSomething(boolean isSynchronized) {
    var result = arrayNew(1, isSynchronized);
    var i = 0;
    while (i++ < iterations) {
        result.append(i);
    }
    return result;
}
function testAppendSpeed(type) {
    var s = getTickCount();
    var result = doSomething(type);
    var e = getTickCount();
    return {
        "first": result[1],
        "last": result[iterations], 
        "ms": e-s
    };
}

writeDump(testAppendSpeed(true));
writeDump(testAppendSpeed(false));

I ran this a few times on 2016 Update 8 and here’s the overall average benchmark in milliseconds.

Synchronised:   83.9ms
Unsynchronised: 75.3ms

I then ran it the same number of times on 2018 Update 2 and here’s the overall average benchmark in milliseconds.

Synchronised:   82.0ms
Unsynchronised: 88.3ms

From that it looks like the Unsynchronised array is faster in ColdFusion 2016 Update 8, but slower on ColdFusion 2018 Update 2. Here’s the code on cffiddle:

https://cffiddle.org/app/file?filepath=4c188709-ff5f-4985-8890-95667e16ea54/4fa02d04-b7f7-4f4a-bc85-634ea2eea621/103fd830-06db-4c46-8f15-8ef1c12135ad.cfm
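
Running the test by hand and averaging the results works, but it can also be scripted. The following is a rough sketch of one way to do that, not the code behind the figures above: averageMs and appendTo are made-up names, and the run and warm-up counts are arbitrary.

```cfml
// Hypothetical helper: average the elapsed milliseconds of a callback over several runs.
function averageMs(required function work, numeric runs = 10, numeric warmups = 2) {
    var total = 0;
    var r = 0;
    // Warm-up runs are discarded so one-off start-up costs do not skew the average.
    for (r = 1; r <= warmups; r++) {
        work();
    }
    for (r = 1; r <= runs; r++) {
        var started = getTickCount();
        work();
        total += getTickCount() - started;
    }
    return total / runs;
}

// Same idea as doSomething() above, but with the element count passed in.
function appendTo(boolean isSynchronized, numeric count = 100000) {
    var result = arrayNew(1, isSynchronized);
    for (var i = 1; i <= count; i++) {
        result.append(i);
    }
    return result;
}

writeDump({
    "synchronisedMs": averageMs(function() { appendTo(true); }),
    "unsynchronisedMs": averageMs(function() { appendTo(false); })
});
```

getTickCount() only has millisecond resolution, so for very fast operations it helps to time a larger batch of work per run.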

Let’s give sorting a go to see how that compares.

iterations = 100000;
function buildArray(boolean isSynchronized) {
    var result = arrayNew(1, isSynchronized);
    var i = 0;
    while (i++ < iterations) {
        result.append(i);
    }
    return result;
}
function testSortSpeed(type) {
    var array = buildArray(type);
    var s = getTickCount();
    arraySort(array, "numeric", "desc");
    var e = getTickCount();
    return {
        "first": array[1],
        "last": array[iterations], 
        "ms": e-s
    };
}

writeDump(label="synchronized", var=testSortSpeed(true));
writeDump(label="unsynchronized", var=testSortSpeed(false));

I ran this a few times on 2016 Update 8 and here’s the overall average benchmark in milliseconds.

Synchronised:   4.2ms
Unsynchronised: 5.7ms

I then ran it the same number of times on 2018 Update 2 and here’s the overall average benchmark in milliseconds.

Synchronised:   10.5ms
Unsynchronised: 9.5ms

From those results it seems that the synchronised array is actually marginally faster on 2016 and marginally slower on 2018.

So what is the difference between the two types of arrays?

I had a look at the class hierarchy (on 2018 Update 2) to see if they were different Java datatypes. Here’s the unsynchronised ‘fast’ array:

SELF             coldfusion.runtime.FastArray
PARENT           java.util.ArrayList
GRANDPARENT      java.util.AbstractList
GREATGRANDPARENT java.util.AbstractCollection

and the synchronised ‘traditional’ array hierarchy is:

SELF             coldfusion.runtime.Array
PARENT           coldfusion.runtime.FastArray
GRANDPARENT      java.util.ArrayList
GREATGRANDPARENT java.util.AbstractList

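So that was interesting, but doesn’t really give up many clues: both are ultimately a java.util.ArrayList, and the only difference at the Java level is that coldfusion.runtime.Array extends coldfusion.runtime.FastArray.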
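
The hierarchy listings above can be reproduced by walking up the superclass chain. This is a minimal sketch rather than the exact code used for the listings: classHierarchy is a made-up name, and it relies on getMetaData() returning a java.lang.Class for arrays, as the objectType() function earlier does.

```cfml
// Hypothetical helper: collect the class names from the object's class up to the root.
function classHierarchy(o) {
    var names = [];
    var cls = getMetaData(o);
    // getSuperclass() returns null once the top of the chain is reached.
    while (!isNull(cls)) {
        names.append(cls.getName());
        cls = cls.getSuperclass();
    }
    return names;
}

writeDump(label="unsynchronised", var=classHierarchy(arrayNew(1, false)));
writeDump(label="synchronised", var=classHierarchy(arrayNew(1, true)));
```

Unlike the four-level listings above, this keeps going until it runs out of superclasses.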

So, we’ll have to go with what the docs say:

In ColdFusion 11, and previous versions, arrays were always thread-safe.

But thread safety comes at a cost of synchronizing the array object. In a synchronized array, two or more threads cannot access the array at the same time. The second thread has to wait until the first thread completes its job, resulting in significant performance deterioration.

In ColdFusion (2016 release), you can use an unsynchronized array and let multiple threads access the same array object simultaneously.

You can use unsynchronized arrays in situations where you need not worry about multiple threads accessing the same array. In other words, if you are defining an unsynchronized array in a thread safe object, like a UDF or a CF-Closure, you can use an unsynchronized array.

While a normal array is slower but thread safe, an unsynchronized array is faster by more than 90%.

https://helpx.adobe.com/coldfusion/cfml-reference/coldfusion-functions/functions-a-b/arraynew.html

From that it sounds like we can use unsynchronised arrays anywhere the array can’t be accessed by more than one thread at the same time, since simultaneous access is where race conditions come from. So don’t use them in shared scopes, or in a component stored in the application scope which mutates an array used for state.
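
To make the risk concrete, here is a hedged sketch of several threads appending to one shared unsynchronised array. The thread names and counts are made up, and the outcome is not deterministic: on some runs every element may arrive, on others some may go missing or a thread may stop with an error.

```cfml
threadCount = 4;
appendsPerThread = 25000;

// Change false to true to compare against a synchronised array.
shared = arrayNew(1, false);

threadNames = [];
for (t = 1; t <= threadCount; t++) {
    threadNames.append("appender#t#");
    thread name="appender#t#" action="run" {
        // i is unscoped, so it is local to each thread.
        for (i = 1; i <= variables.appendsPerThread; i++) {
            variables.shared.append(i);
        }
    }
}

// Wait for every thread to finish before counting.
thread action="join" name="#arrayToList(threadNames)#";

writeDump({
    "expected": threadCount * appendsPerThread,
    "actual": shared.len()
});
```

If actual comes out lower than expected, it is also worth checking the status of each thread in the cfthread scope (for example cfthread.appender1.status), since a thread that hit an error stops appending.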

The Bottom line

From my rudimentary tests, it’s unlikely that the speed of the ‘traditional’ arrays is going to be a performance bottleneck in your application in the first place. I’m creating arrays with 100,000 elements in about a tenth of a second in my example, so not exactly sluggish.

From the speed gains I’m seeing (and I may be completely missing how you should use them to get the 90% speed increase the docs refer to), I think it’s best to write code using synchronised (traditional) arrays until you have a performance issue with mutating arrays, and then look at whether you can use an unsynchronised (fast) array instead.
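
If you do get to that point, the case the docs describe as safe is an array that never leaves the function that builds it. A minimal sketch, with squares as a made-up example function:

```cfml
// Hypothetical example: the unsynchronised array is var-scoped, so each call
// gets its own array and no other thread can reach it while it is being built.
function squares(required numeric count) {
    var result = arrayNew(1, false);
    for (var i = 1; i <= count; i++) {
        result.append(i * i);
    }
    return result;
}

writeDump(squares(5)); // [1, 4, 9, 16, 25]
```

Once the returned array is stored somewhere shared, such as the application scope, the same caveats about concurrent mutation apply again.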

Comments (8)
2019-03-05 15:25:12

UPDATE: I’ll clarify that I’m not testing ACF2018 and ACF2016 on the exact same setup, so do not look at the speed differences between the two engine versions. I posted the timings to show the difference in speed between Synchronised and Unsynchronised arrays on the same version of ColdFusion.

2019-03-04 14:39:20

Thanks for digging into this, John (aka Yorik). But I do think you’ve missed an opportunity. As mentioned in the doc quote, the feature is about concurrent access to a given array in multiple threads at the same time.

Your speed tests are all taking place in a single template, thus a single thread. As such, you’re testing more the implication of using the feature without leveraging its benefit. There’s some value to that, sure (to assess the cost, in single-thread processing).

But there would seem far more value in showing multiple requests (or cfthread threads) running the code concurrently. And especially editing content in the array, rather than just reading it (though timing of reading alone would be valuable, too).

Finally, as for your observations of speed differences between cf2016 and cf2018, it would be valuable to know if these were on the same machine,  and with identical admin settings (and identical application cfc/cfm processing–best if they both have a blank application file to ensure no unexpected inheritance of another).

All good fodder for another post, if it interests you. Hope that’s helpful. And I’m sure I speak for most in saying we appreciate the many posts you make.

2019-03-04 19:14:42
> Charlie Arehart's comment

Hi Charlie Arehart,

It was my intention to dig into running it via runAsync and to see how race conditions could occur, but when I saw that the difference in speed wasn’t going to be like day-and-night then I went off into a tangent of trying 2016 vs 2018 and sorting etc to see what that produced.

I’ll see if I can get around to writing a follow up – and I write nowhere near as much as you do! 

2019-03-04 20:40:29
> aliaspooryorik's comment

Thanks for that, John. As for trying things out with the runasync feature (new in 2018), while that would be interesting to many, I would propose that since the feature was added in 2016, you may do well to stick with testing things via cfthread (or just some means of firing off multiple requests).

That way, those NOT on 2018 could see the benefit–and that could benefit those even on CF11 and earlier, who might then try out your code to see how things perform for them, with the old array type, to compare to your tests in CF2016 or 2018.

Of course, there can be other explanations for differences between one CF version and another. Indeed, I had asked if you might clarify for us if there were any that might explain the differences you saw between 2016 and 2018. I will assume you meant that you would try to address that in any follow-up.

And as for how you “write nowhere near as much as” I do, I think that was a compliment. 🙂 If you mean I post a lot of blog entries or answers, sure. But some think I wrote “too much” in those, and in replies like this. I write like I would speak…which may not be any better an explanation! 🙂

Cheers.

2019-03-04 22:24:43
> Charlie Arehart's comment

Hi Charlie,

All valid points.

The tests between ACF2016 and ACF2018 aren’t on the same setup – I wasn’t trying to compare speed between ACF2016 and ACF2018, rather the speed of synchronised vs unsynchronised arrays on the same engine. I was trying to find a good use case where an unsynchronised array was a no-brainer, but I just can’t get a big improvement whatever I try!

I have tests which use cfthread that show how you can end up with missing array elements (so reasons NOT to use unsynchronised array) but my tests don’t show a noticeable improvement in performance, so not quite sure what to blog as my goal of showing where unsynchronised arrays are beneficial just isn’t revealing itself to me! At the moment I can’t really see where I’d use an unsynchronised array, unless I wanted to save a few milliseconds.

It was meant as a compliment, Charlie – your posts are a valuable resource to the CF community!

2019-03-05 00:11:03
> aliaspooryorik's comment

Thanks for that, especially for the kind regards in your last paragraph.

As for your having trouble showing the performance difference (in a given version) between the two types, well, again that was really the main point of my first comment.

I think you’d see the impact more clearly (if there is one) by running tests that modify and/or access such arrays (or each type) in requests running in multiple concurrent threads. As the docs read, that’s when the performance hit is noticeable: “The second thread has to wait until the first thread completes its job, resulting in significant performance deterioration.”

So you want to show it being used in that way. Perhaps you could also show how the unsynched array would then suffer reliability problems (due to race conditions). But I know your main focus was to find the speed difference: I think you’ll only notice it when using the two approaches in a more multi-threaded way.

(And I realize that you may wonder still, why should I use an unsynched one then, if I know I can’t use it in a multithreaded situation. But the question will be “how does it perform when multiple threads are indeed only reading the shared array?” Perhaps think of it more as a read-only flag to speed up such processing, when using multiple threads.)

That said, I have not used it myself, so I can’t say any more than these thoughts so far. I will add, though, that since the concept (of synch’ed/unsync’ed arrays) is in fact a generic one, you could look to other resources that discuss it, for more insights. Here’s one:

https://www.geeksforgeeks.org/synchronization-arraylist-java/

And it says that by default Java’s equivalent ARE unsynched. There’s no date on that article to see when it was written and what JVM it refers to, but we could reasonably assume that had not changed. Perhaps it’s that Adobe inherited from Macromedia (and even Allaire) CF’s default to have them be synched, and they wanted to offer this means to unsynch them, for those occasions where it would speed things up. I know it’s the “where” that you are seeking. 🙂

Finally, just to be clear, when your last point addressed the speed differences reported between CF2016 and 2018, I’ll just say that it wasn’t stated in the original post that they were on different machines, right? So we couldn’t have known. 🙂

And while you may not have intended for your observed differences between them to be used as a comparison BETWEEN them, it’s only natural that folks would infer that, right? 🙂

I suppose now that you have clarified those things, that may suffice. But I would propose that editing the post to make them more clear could help many readers. Your call.

I know it’s an annoying characteristic of this portal that if we (as writers) edit our posts, the post is then taken OFFLINE while awaiting moderation. That’s just shocking to me–I would understand if it left the previous version online while the new one awaited moderation–and I have complained of it many times to Adobe.

2019-03-05 15:21:19
> Charlie Arehart's comment

Yes, having the post taken offline is very annoying. I’ve also just discovered you have to login just to *view* these replies – what the heck is all that about! I think I’ll post an update in the comments.

I’ve tested read-only and there’s no difference. As I understand it, it’s the mutation that causes the issues. Synchronised needs to lock it for updates, which adds a bit of overhead, but it seems to be minimal at best. I’ll keep digging and hopefully the fog will clear and I’ll uncover something worth posting!

2019-03-05 16:53:05
> aliaspooryorik's comment

Thx, on all points.
