Is there a validating XML parser for flex/actionscript? The XML class verifies that it is well formed XML, but not that it follows the rules of the DTD. Java has a validating XML parser, but is there one for flex/actionscript?
R – Validating XML parser for Flex/actionscript
actionscript-3apache-flexparsingvalidationxml
Related Solutions
I think you are attacking it from the wrong angle by trying to encode all posted data.
Note that a "<
" could also come from other outside sources, like a database field, a configuration, a file, a feed and so on.
Furthermore, "<
" is not inherently dangerous. It's only dangerous in a specific context: when writing strings that haven't been encoded to HTML output (because of XSS).
In other contexts different sub-strings are dangerous, for example, if you write an user-provided URL into a link, the sub-string "javascript:
" may be dangerous. The single quote character on the other hand is dangerous when interpolating strings in SQL queries, but perfectly safe if it is a part of a name submitted from a form or read from a database field.
The bottom line is: you can't filter random input for dangerous characters, because any character may be dangerous under the right circumstances. You should encode at the point where some specific characters may become dangerous because they cross into a different sub-language where they have special meaning. When you write a string to HTML, you should encode characters that have special meaning in HTML, using Server.HtmlEncode. If you pass a string to a dynamic SQL statement, you should encode different characters (or better, let the framework do it for you by using prepared statements or the like)..
When you are sure you HTML-encode everywhere you pass strings to HTML, then set ValidateRequest="false"
in the <%@ Page ... %>
directive in your .aspx
file(s).
In .NET 4 you may need to do a little more. Sometimes it's necessary to also add <httpRuntime requestValidationMode="2.0" />
to web.config (reference).
The fully RFC 822 compliant regex is inefficient and obscure because of its length. Fortunately, RFC 822 was superseded twice and the current specification for email addresses is RFC 5322. RFC 5322 leads to a regex that can be understood if studied for a few minutes and is efficient enough for actual use.
One RFC 5322 compliant regex can be found at the top of the page at http://emailregex.com/ but uses the IP address pattern that is floating around the internet with a bug that allows 00
for any of the unsigned byte decimal values in a dot-delimited address, which is illegal. The rest of it appears to be consistent with the RFC 5322 grammar and passes several tests using grep -Po
, including cases domain names, IP addresses, bad ones, and account names with and without quotes.
Correcting the 00
bug in the IP pattern, we obtain a working and fairly fast regex. (Scrape the rendered version, not the markdown, for actual code.)
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
or:
(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])
Here is diagram of finite state machine for above regexp which is more clear than regexp itself
The more sophisticated patterns in Perl and PCRE (regex library used e.g. in PHP) can correctly parse RFC 5322 without a hitch. Python and C# can do that too, but they use a different syntax from those first two. However, if you are forced to use one of the many less powerful pattern-matching languages, then it’s best to use a real parser.
It's also important to understand that validating it per the RFC tells you absolutely nothing about whether that address actually exists at the supplied domain, or whether the person entering the address is its true owner. People sign others up to mailing lists this way all the time. Fixing that requires a fancier kind of validation that involves sending that address a message that includes a confirmation token meant to be entered on the same web page as was the address.
Confirmation tokens are the only way to know you got the address of the person entering it. This is why most mailing lists now use that mechanism to confirm sign-ups. After all, anybody can put down president@whitehouse.gov
, and that will even parse as legal, but it isn't likely to be the person at the other end.
For PHP, you should not use the pattern given in Validate an E-Mail Address with PHP, the Right Way from which I quote:
There is some danger that common usage and widespread sloppy coding will establish a de facto standard for e-mail addresses that is more restrictive than the recorded formal standard.
That is no better than all the other non-RFC patterns. It isn’t even smart enough to handle even RFC 822, let alone RFC 5322. This one, however, is.
If you want to get fancy and pedantic, implement a complete state engine. A regular expression can only act as a rudimentary filter. The problem with regular expressions is that telling someone that their perfectly valid e-mail address is invalid (a false positive) because your regular expression can't handle it is just rude and impolite from the user's perspective. A state engine for the purpose can both validate and even correct e-mail addresses that would otherwise be considered invalid as it disassembles the e-mail address according to each RFC. This allows for a potentially more pleasing experience, like
The specified e-mail address 'myemail@address,com' is invalid. Did you mean 'myemail@address.com'?
See also Validating Email Addresses, including the comments. Or Comparing E-mail Address Validating Regular Expressions.
Related Topic
- Python – How to parse XML and count instances of a particular node attribute
- Xml – What does in XML mean
- Java – How to read XML using XPath in Java
- Php – How to parse and process HTML/XML in PHP
- Actionscript – How to pass “Null” (a real surname!) to a SOAP web service in ActionScript 3
- Xml – No grammar constraints (DTD or XML schema) detected for the document
Best Answer
Ok, good news, bad news time.
First the bad news: Unfortunately, no. Actionscript does not support any form of DTD validation. It also does not support any form of XSL/XSLT validation or transformation. There are projects out there which will eventually make some of this possible (XPath-AS3 for one), but for now you are stuck out of luck.
But, there is good news (sort of): First, most servers support both. This means that you could accomplish the same thing in Flash using a round-trip to the server. It is, perhaps, less than ideal, especially when dealing with large amounts of information, but it will guarantee consistent results.
Second, JavaScript supports both XSL and DTD functionality This means, if absolutely necessary, you have the ability to us ExternalInterface to force the transformations for you (and you could even adapt the JavaScript/HTML abilities on Adobe Air to do the same). This, of course, means that you will need to do even more cross-browser testing.
Sorry there is no better news.