Input validation to avoid XSS

August 4, 2019

Bardnet

Summary

How to perform input validation to avoid XSS?

I recently had some code reviewed for security issues. The report read: “In application code, untrusted user data is displayed in the user’s browser without input validation and with deprecated output encoding.”

How can input validation look? Is it an option to remove invalid or unwanted HTML with a library like JSOUP from a string before it is entered into a database?

What methods do you use? Why is input validation important when output validation already takes place?

ColdFusion

discussion

language

modern cfml

security

Comments (8)

JS_Webtrax

2020-09-05 00:06:51

[Wrong Place again and no delete…]

JS_Webtrax

2020-09-05 00:03:55

[Meant as a reply to the comments from Charlie]

Thank you, Charlie, for all the support that you still provide to the ColdFusion community, at a time when reasonably up-to-date resources are getting harder to find for those of us maintaining or still developing ColdFusion applications.

Is XSS much of a threat in an environment where the ColdFusion application is a SaaS product where all content requires user authentication and is isolated to each separate business? All dynamic user input is saved to a SQL database before any other user sees it. Even if a user wanted to introduce some XSS, they would just be sabotaging the application for other users within their organization that have access to that data.

The application I have inherited is on CF10 and is slated for an eventual rewrite to another language. Our client recently had a consultant do an audit and of course XSS came up. They demonstrated what happens if a user types in some HTML or JavaScript in the place of a customer name as an example. Since the application currently just displays the dynamic data, it of course proved their point of what could happen. As unlikely as it seems to me that a real user would want to do that when everything they do is tracked, it is nonetheless a “big” issue for the client now.

I am trying to minimize what changes need to be made to satisfy the client’s concerns and want to be sure I am not missing something. Our data is currently clean and I would prefer to focus on keeping it that way.

Looking over all the various solutions and other information out there, I am concerned that I am missing something, because from my limited understanding it seems to me that ALL XSS protection solutions should be focused on INPUT validation. For instance, why would a site or application store any data that has XSS issues to start with and then, on the display side, continually have to strip it out? It also seems that it would be far easier to control the few places where stored data can come from than it would be to make sure every place the data might possibly be accessed has protection in place. If there is concern about existing data before implementing validation, shouldn’t tools be used to clean out the existing data?

What would be ideal for my needs would be a function that could be used to verify that a string input does not contain any HTML/JavaScript/etc. when I know it never should, as in the case of Customer Name. Even an attribute for cfqueryparam that would go beyond just the cfsqltype of varchar and add a restriction to plain text or something similar would be nice for several reasons.

The Canonicalize, getSafeHTML, use of AntiSamy files, and so on appear to be focused on fields where some HTML is acceptable. I am not very familiar with them. Could I be missing how they might be used for validation that no HTML is present?

I have thought of various options for a custom validation function. Taking the value and comparing it to the value after applying EncodeForHTML, to see if the two still matched, seemed simple enough at first. However, many of the characters that it encodes would be valid in the original input and would cause the comparison to fail: Bob’s Burgers would come back with its apostrophe replaced by an HTML entity and would look like it was not a match. I don’t want to save encoded data either, because then I would have to address issues like searches, PDF reports, emails, and other places where the value is not just displayed as HTML. Regular expression pattern matching, keyword checks, and other options come to mind and might be the route we have to take. Or we might just have to go through all the code and handle it on the display end.

Before I head down that track I just wanted to confirm if there was a solution I am missing or if someone else may have already created a solution where all I want to do is validate that a string input does not have any content that would affect the DOM before we ever even store it. We could even trigger some sort of data quarantine or review, even terminate the user session if the solution was solid and not prone to false positives.

Maybe I am just a dreamer that needs some schooling. Any help or opinions would be appreciated.

Charlie Arehart

2020-09-16 13:36:16

In reply to JS_Webtrax’s comment

Sorry that I missed this when you posted last week.

So first, your confusion and doubt are understandable. Security is a complicated topic, and just when you may feel you’ve buttoned something up, along comes some hack that makes you vulnerable again. Sadly, one must become expert in it to truly be protected, but most folks won’t be able to, or won’t bother.

And just as some people can get by with just a lock on the doors and windows of their home, others may need (in increasing order) alarms, cameras, security fences, guards at the gate, guards at the door, or guards outside their bedroom door. It’s all about needs and costs.

And that leads to your question about whether to worry about XSS on an internal site. Sadly, the answer is yes, typically. Sure, you say it’s risky for a bad guy to try, as their actions could be detected…but bad guys will take such risks.

It seems your bigger concern is having to change a lot of code, right? And thus you focus on “where” to put such protection: upon input? Upon storage to the DB? Upon retrieval from the DB? Upon display to the user? And it can seem like overkill to be redundant.

But again, some people will opt for at least SOME redundancy depending on their needs. For instance, protecting only on input/storage won’t help if there was already bad guy code in the DB. Protecting only on display could allow in bad data that might later be displayed by code without output protection. Each org needs to decide what makes sense for their needs, resources, risk in case of exposure, etc. There is no one-size-fits-all strategy.

That said, as you seem to want to lean more toward protecting input (whether to pass the scan, or because you do worry about possible attacks), I had mentioned options for that, including isSafeHTML, getSafeHTML, and canonicalize.

And I pointed to resources with more info. One should at least assess those to learn from the experience of others, to weigh options, etc. One resource may cover something another does not, or differently.

Indeed, there’s yet another option that may suit your needs which I neglected to mention: a web application firewall (or WAF) would be another way to sanitize/protect against bad guy input (and lots more). And this requires no change to each page accepting input. The input is caught BEFORE such code runs, whether the WAF is implemented at the CF application level, the web server level, the network level, or even as a SaaS solution. I list several options, free or commercial, at each level, in a category of my cf411 page:

https://www.cf411.com/protection

Hope you may find a solution among those or other options you may find and consider.

Charlie Arehart

2019-08-06 13:39:50

XSS is of course a problem that has affected web apps, including CF-based ones, for decades. But yep, sometimes we may only become sensitive to the issues (and seek solutions) once we get hit with an attack or (perhaps better) have some security scan help us see we may have vulnerabilities.

And CF has had various solutions for addressing it in various ways, whether for input validation or output protection, for about as long. The oldest were far less capable than those added in CF10 and later. When you say your report mentions use of “deprecated output encoding”, that would be related to any output protection you may be using. Perhaps your code uses the older approaches, and you may just need to update to the more modern ones.

Let’s look at options for output protection first, as those help with dealing with bad content that may already be in your database, etc. These functions encode any HTML, JavaScript, or other “unsafe” content in a given string so that the browser renders it as text instead of interpreting it. The oldest “solution” may be the HTMLEditFormat function, but it was quite limited.

CF10 then added the encodeForHTML function, along with several other encodeFor* functions (encodeForJavaScript, encodeForURL, encodeForCSS, and more). These all do MUCH more than the old HTMLEditFormat (which handled only a few “worrisome” characters), being based on the OWASP ESAPI project. (For the sake of completeness, even later updates of CF8 and 9 had offered the ESAPI library built into CF, and some folks showed then how to use that to do such encoding even before the new CF functions were added.)

CF2016 then added member functions for those encoding functions (so that rather than wrapping the string with a call to the function, you could append the function to the variable holding the string, which suits some coding styles better). CF2016 also added the encodefor attribute in tags like cfoutput and as an arg for functions like writeoutput, when wrapping each output item seems tedious. There are pros and cons to all these solutions, and you can google to find the docs for each, but I offer below some resources that discuss them and CF/XSS protection in general.
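As a rough sketch of the output-encoding options just described: the variable name and sample value below are made up for illustration, and the member-function and encodefor forms assume CF2016 or later.

```cfml
<cfscript>
// Hypothetical untrusted value, e.g. a customer name read from the database
customerName = '<script>alert("xss")</script>';

// CF10+: wrap the value in the encoder that matches the output context
writeOutput(encodeForHTML(customerName));

// CF2016+: the same encoder called as a member function on the string
writeOutput(customerName.encodeForHTML());
</cfscript>

<!--- CF2016+: encode every expression inside this cfoutput block for HTML --->
<cfoutput encodefor="html">#customerName#</cfoutput>

<!--- Choose the encoder by context: an HTML attribute vs. a JavaScript string --->
<cfoutput>
    <input type="text" name="customer" value="#encodeForHTMLAttribute(customerName)#">
    <script>var customer = '#encodeForJavaScript(customerName)#';</script>
</cfoutput>
```

The point of the separate functions is that the encoding has to match where the value lands in the page. A value that is safe inside an HTML element is not necessarily safe inside an attribute or a script block.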

As for validating the input (which is indeed just as important), there are also various solutions. First and most simplistically, there has long been (since CF7) an isValid function, which can at least validate something against any of a number of specific expected patterns of input (such as whether it is numeric, a date, a zipcode, an email address, etc.). But that won’t help with checking whether an input text field has “unsafe” content.
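For illustration only, a minimal isValid check might look like the following. The field names, sample values, and the regex pattern are assumptions and not something from the original question. The pattern merely rejects angle brackets, so it is a coarse plain-text check and not a complete XSS filter.

```cfml
<cfscript>
// Hypothetical values; in a real page these would come from the form scope
email = "bob@example.com";
quantity = "12";
customerName = "Bob's Burgers";

// Type checks for inputs that have a well-defined format
inputOk = isValid("email", email) && isValid("integer", quantity);

// Assumed allowlist-style pattern: reject any value containing < or >
inputOk = inputOk && isValid("regex", customerName, "^[^<>]*$");

if (!inputOk) {
    // Reject the request rather than trying to repair the value
    throw(type="InvalidInput", message="One or more fields failed validation.");
}
</cfscript>
```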

Instead, for real validation/sanitization of input text such as for XSS and related vulnerabilities, look to the isSafeHTML and getSafeHTML functions, added in CF11, which check and sanitize input using an AntiSamy policy file (either CF’s default found in cfusion\lib\antisamy-basic.xml, or one you create and can specify at the code or application level). Consider also the canonicalize function (added in CF10), to help remove encodings from a string before it is validated.
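A minimal sketch of how these three functions might be combined, assuming CF11 or later and the default AntiSamy policy. The sample input and variable names are hypothetical.

```cfml
<cfscript>
// Hypothetical rich-text input where some HTML is acceptable
comment = '<b>Nice</b> <script>alert(1)</script>';

// Reduce any encoded input to its simplest form before checking it.
// The two true flags restrict multiple and mixed encoding; how a violation
// is reported (exception or empty result) depends on the CF version and on
// the optional throwOnError argument.
decoded = canonicalize(comment, true, true);

if (isSafeHTML(decoded)) {
    // Nothing in the input violates the policy, so keep it as is
    cleaned = decoded;
} else {
    // Remove whatever the policy disallows and keep the rest
    cleaned = getSafeHTML(decoded);
}
</cfscript>
```

Both isSafeHTML and getSafeHTML also accept an optional policy file argument, which is where a stricter custom policy could be supplied. Whether to clean the input or reject it outright when isSafeHTML returns false is a design choice for each application.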

There are many resources discussing these things and XSS with regard to CF. First I list a couple of seminal ones, then ones addressing changes in each release, in descending order of recency:

Finally, there are also tools to help you find and fix such problems in CFML (both output protection and input validation). First was the CF Enterprise Security Code Analyzer (built into CFBuilder 2016 and above, working with the Enterprise edition only of CF 2016 and above).

More recent is Pete Freitag’s Fixinator (https://fixinator.app) tool which (though commercial) works with any edition of CF and does not require using any editor/IDE. I highly recommend folks implement either one of these tools to secure their code.

Long answer to what may seem a “simple question”, but then it’s a rather complex problem, again with various solutions and which have evolved over the years. Hope that was helpful. (And I should probably create a blog post out of this, though I will wait to see if any comments may suggest additions or enhancements.)

Charlie Arehart

2019-08-06 14:01:56

In reply to Charlie Arehart’s comment

Bernhard, in case you (or anyone else following this post and comments) may have gotten an email of (or have already read) my initial comment, please note that I have just updated it since originally posting it about 15 mins ago.

When I first read your post and replied, I focused on what is for most folks the most common first step for protecting against XSS, by controlling the display of potentially XSS-injected output.

Then I remembered you were also asking about validation of such input. So I have added a bit more about that. Again, see the list of resources, as some (and the tools) do go into both facets of the problem.

Bardnet

2019-08-07 21:59:19

In reply to Charlie Arehart’s comment

Hi Charlie, thank you so much for your answer. It is much more extensive than I had hoped for. It will take me some time to read and understand everything.

I accept the fact that my knowledge is not very recent. I have received results of security tests of ColdFusion projects in the past. They never really gave the impression that the testers (or the software they use to test the application) knew CF. It came as a surprise to read such a specific point in the report.

I split the issue into two questions for a simple reason. Regarding the securing of the output, the solution was clear: use the newer functions (7 years old) instead of the older ones.

Regarding the validation of the input I’m open to suggestions. You advise to use isValid. This is definitely useful to get meaningful data, not only XSS-safe data.

Thank you again,

Bernhard

Charlie Arehart

2019-08-07 22:57:06

In reply to Bardnet’s comment

Understood, and happy to help. But don’t stop at isValid. 🙂 That is the minimal solution. Be sure to check out isSafeHTML and getSafeHTML for more (and for their ability to be extended).

JS_Webtrax

2020-09-05 00:10:19

In reply to Charlie Arehart’s comment

Charlie, please see comment on original post. I did not realize that when looking at the replies area of the forum you also have to click again on Reply to save as a reply. Also, I am not sure if you would receive a comment left on the original post or not so I am testing leaving a reply again.
