There are a few possibilities (the last one would be the easiest and most sensible, in my opinion, unless this is meant to be a long-term, reusable piece of code):
Use a web-testing framework
They are meant to do this sort of stuff, so obviously they do it well. But I think they're a bit heavyweight for what you want to do. For instance, Adel recommended Selenium, which is a great testing tool but a freaking monster to get started with (and it will fire up browsers, unless you use the new WebDriver-based API with a browser-less driver like HtmlUnit).
So, that's why I'd recommend, if you go down this route, to just use something like HtmlUnit (which you could invoke from a Java program, or from any other JVM-based language: Groovy, Scala, Clojure...). But I'd still regard this as relatively heavy.
Use a general-purpose scripting language
Python, Perl and a herd of others will allow you to write this from scratch quickly, or to reuse an existing library to implement your own HTTP client to send your POST requests.
If this thing is going to be maintained in the future, I'd go with Python. If it's going to be quick and dirty, Perl is a hacker's best friend (and CPAN its favorite sledgehammer).
Use bare shell scripting and something like curl
Go even more minimalistic: bare shell scripting to process your inputs and format your data, some curl invocations to POST to the server, and voila!
If you're on Windows, PowerShell will be your friend.
There are hundreds of other ways to do this; I've only mentioned the ones that come to mind and that I'd use myself.
If that was my task, I'd probably write an ugly command line that:
I'd only use a more advanced testing framework if:
- I'm likely to be asked to do this again for different data formats,
- I'm likely to be asked to do this again for different datasources and target DBs,
- I'm likely to be asked to do this often.
In that case, a more engineered approach does make some sense, for maintainability and extensibility.
In all cases, remember to back up that script and pass it along, to document it (a README would do) and to implement a usage message. If they have one, version it in their SCM.
Note: another reason why taking the web-form submission approach might be better than the "direct to SQL" approach is that the server receiving the form might be doing extra checks you aren't aware of at this time.
Not saying it's the case, but maybe security wasn't the only reason.
Good luck with the job.
EDIT: just noticed you tagged this as "PHP". I don't really see why, as that would imply the code runs server-side (well, you could use PHP for any kind of scripting, but why do this to yourself?)
Three main reasons I can think of:
- Parent Scope Access
- Privacy
- Reduction of names defined in higher scopes
Parent Scope Access: Inline function definitions allow the inline code to have access to variables defined in parent scopes. This can be very useful for many things and can reduce the amount or complexity of code if done properly.
If you put the code in a function defined outside of this scope and then call the code, you would then have to pass any parent state that it wanted to access to the function.
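To make the difference concrete, here is a minimal sketch. The names (`filterInline`, `filterStandalone`, `isAtLeast`) are made up for illustration. The inline callback reads `minimum` and `rejected` directly from the enclosing function, while the standalone helper has to be handed both.

```javascript
// Inline version: the callback uses `minimum` and `rejected`
// straight from the enclosing function's scope.
function filterInline(values, minimum) {
  const rejected = [];
  const kept = values.filter(function (value) {
    if (value >= minimum) return true;
    rejected.push(value);
    return false;
  });
  return { kept: kept, rejected: rejected };
}

// Standalone version: the same logic lives outside that scope,
// so every piece of parent state must be passed in explicitly.
function isAtLeast(value, minimum, rejected) {
  if (value >= minimum) return true;
  rejected.push(value);
  return false;
}

function filterStandalone(values, minimum) {
  const rejected = [];
  const kept = values.filter(function (value) {
    return isAtLeast(value, minimum, rejected);
  });
  return { kept: kept, rejected: rejected };
}
```

Both versions return the same result; the second just has a longer parameter list that grows with every bit of parent state the helper needs.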
Privacy: Code inside an inline anonymous definition is more private and cannot be called by other code.
Reduction of names defined in higher scopes: This is most important when operating in the global scope, but an inline anonymous declaration keeps you from having to define a new symbol in the current scope. Since Javascript doesn't natively require the use of namespaces, it is wise to avoid defining any more global symbols than minimally required.
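A small sketch of the last two points together, using hypothetical names (`defaults`, `withDefaults`, `appConfig`): the helper and its data stay private to the anonymous function, and only one name is deliberately left behind for the rest of the program.

```javascript
// Everything declared inside this anonymous function is local to it.
(function () {
  const defaults = { retries: 3, verbose: false };

  function withDefaults(overrides) {
    // Copy the defaults, then let the overrides win.
    return Object.assign({}, defaults, overrides);
  }

  // The single name exposed on purpose.
  globalThis.appConfig = withDefaults({ verbose: true });
})();
```

After this runs, `appConfig` is available, but neither `defaults` nor `withDefaults` exists outside the wrapper, so no other code can call or overwrite them.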
Editorial: It does seem to have become a cultural thing in Javascript where declaring something anonymously inline is somehow considered "better" than defining a function and calling it even when parent scope access is not used. I suspect this was initially because of the global namespace pollution problem in Javascript, then perhaps because of privacy issues. But it has now turned into somewhat of a cultural thing and you can see it expressed in lots of public bodies of code (like the ones you mention).
In languages like C++, most would probably consider it a less-than-ideal practice to have one giant function that extends across many pages/screens. Of course, C++ has namespacing built in, doesn't provide parent scope access and has privacy features, so how code is split up there can be motivated entirely by readability/maintainability, whereas Javascript has to use the way code is expressed to achieve privacy and parent scope access. So, JS just appears to have been motivated in a different direction, and it's become somewhat of a cultural thing within the language, even when the things that motivated that direction aren't needed in a specific case.
Best Answer
First, the "self-calling functions" aren't actually self-calling. I know that's what people in the Javascript community call them, but it's really misleading; the functions never reference themselves, and in fact a lot of the time there's no way to call that particular function more than once. If they were actually self-calling functions, then recursive functions would have been the best way to name them.
What you really have are (usually) immediately executed lambdas whose results are stored in a variable in order to limit the scope of their internals. "IEL" isn't as catchy as "self-calling functions", so I guess that's why the real name never caught on.
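For instance, a minimal sketch of that pattern (the names `nextId` and `last` are only illustrative): the anonymous function runs once, on the spot, and what gets stored in the variable is its result, not the function itself.

```javascript
// The outer anonymous function is executed immediately and exactly once.
// `nextId` holds what it returned: the inner function.
const nextId = (function () {
  let last = 0; // internal state, unreachable from outside

  return function () {
    last += 1;
    return last;
  };
})();
```

Calling `nextId()` repeatedly yields 1, 2, 3 and so on, but there is no handle left on the outer function, so it can never be run a second time, and `last` can't be touched from anywhere else.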
The thing is though, that's entirely too low-level; it's an implementation detail that nobody cares about (it's like saying "here be for loops"). Generally, when you're using those immediate-execution functions, you're using them because you're making some sort of a module, which needs its own namespace.
If that's the case, then instead of saying "self-executing functions", you should say "this script contains modules that do <stuff>". Otherwise, you should figure out what you're trying to do with the functions, and say that's what's in your script.
Now, the reason why you use modules in Javascript is because otherwise everything goes into the global scope. Those other functions you're writing, that aren't going to be inside the modules (or whatever you decide they are), are going to end up there. So use that - "this script file contains both modules that do <stuff> and global functions that do <more stuff>".
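As a rough sketch of a script file that would fit that description (the `temperature` module and `describeTemperature` function are invented for the example): one module built with an immediately executed function, plus one ordinary top-level function that uses it.

```javascript
// Module: a single name, `temperature`, acting as a namespace.
const temperature = (function () {
  // Internal helper, not part of the module's public surface.
  function round(value) {
    return Math.round(value * 10) / 10;
  }

  function toFahrenheit(celsius) {
    return round(celsius * 9 / 5 + 32);
  }

  function toCelsius(fahrenheit) {
    return round((fahrenheit - 32) * 5 / 9);
  }

  return { toFahrenheit: toFahrenheit, toCelsius: toCelsius };
})();

// Top-level function: in a plain browser script this would sit in the
// global scope next to `temperature`.
function describeTemperature(celsius) {
  return celsius + " C is " + temperature.toFahrenheit(celsius) + " F";
}
```

Documented your way, that file would read: "this script contains a `temperature` module that converts between units, and a global function that formats a reading".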