Javascript – Scripting language for filling out web form

javascript, perl, PHP, python, scripting

I have a job as an intern at a technology company, and I was given the unfortunate job of performing some data entry into our web management system. The information entered into the web form is stored in a MySQL DB. Upon receiving the data I realized I would have to submit this online form about 1000 different times, each submission consisting of about 10 different text fields / check boxes. (In other words, it would be completely mind-numbing and a ridiculous waste of time and resources, or so I thought…)

Having used databases a good bit prior to this, my immediate reaction was to just write a short MySQL script to bulk import all of the data, especially since it was already presented to me in an Excel spreadsheet ready to go. I thought it may have been some sort of test, since it seemed too obvious. I wrote the script, which consisted of about 10 lines of code, but was then informed I couldn't be trusted with MySQL admin privileges to run said script. So my next thought was to write a script to just enter the information through the web form (which will take ten times longer, but it's what I have to do).

Being unfamiliar with scripting of this nature (it seems like I would need something similar to a bot, but the good kind), I was unsure of how to proceed. Is there a preferred language to use to enter the data I have into the web form I do have access to? I'm not looking for this to be done for me by any means, just a nice point in the right direction as far as what scripting language to use and how to pair that with the data I have that needs to be entered.

Thanks for the help/ valuable input!

EDIT:

Is there a way to perform this using Perl without having access to place any files on the server?

Would I be able to run some JavaScript loops to pull the data out of a .csv or just a .txt format with line delimiters and insert it into the web form?

Best Answer

There are a few possibilities (the last one would be the easiest and most sensible, in my opinion, unless this is meant to be a long-term and reusable piece of code):

  • Use a web-testing framework

    They are meant to do this sort of stuff, so obviously they do it well. But I think they're a bit heavyweight for what you want to do. For instance, Adel recommended Selenium, which is a great testing tool but a freaking monster to get started with (and it will fire up browsers, unless you use the new WebDriver-based API with a browser-less driver like HtmlUnit).

    So, that's why I'd recommend, if you go down this route, to just use something like HtmlUnit (which you could invoke from a Java program, or from any other JVM-based language: Groovy, Scala, Clojure...). But I'd still regard this as relatively heavy.

  • Use a general-purpose scripting language

    Python, Perl and a herd of others will allow you to write this from scratch quickly, or to reuse an existing library to implement your own HTTP client to send your POST requests.

    If this thing is going to be maintained in the future, I'd go with Python. If it's going to be quick and dirty, Perl is a hacker's best friend (and CPAN its favorite sledgehammer).

  • Use bare shell scripting and something like curl

    Go even more minimalistic: bare shell-scripting to process your inputs and format your data, some curl invocations to POST to the server, and voila!

    If you're on Windows, PowerShell will be your friend.
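
To give an idea of how little code the scripting-language route needs, here is a minimal sketch in Python using only the standard library. The URL and the field names (`name`, `active`) are made up for illustration; the real ones come from the form's `action` attribute and the `name` attributes of its inputs. It also assumes the form accepts a plain urlencoded POST. If the management system requires a login, you'd additionally need to send the session cookie, which isn't shown here.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace with the real form's action URL.
FORM_URL = "https://example.com/admin/items/new"


def build_request(fields, url=FORM_URL):
    """Build a POST request carrying `fields` as a urlencoded form body."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def submit(fields, url=FORM_URL):
    """Send one form submission and return the HTTP status code."""
    with urllib.request.urlopen(build_request(fields, url), timeout=30) as resp:
        return resp.status


if __name__ == "__main__":
    # Inspect the request without sending anything over the network.
    req = build_request({"name": "Widget A", "active": "on"})
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Building the request separately from sending it lets you eyeball what would be posted before you let the script loose on 1000 rows.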


There are hundreds of other ways to do this; I've only mentioned the ones that come to mind and that I'd use.

If that was my task, I'd probably write an ugly command line that:

  • if it's a simple spreadsheet (not something where you'd need to cherry pick cell values):

    1. converts the .xls to .csv,
    2. pipes it through whatever suits you to transform the data,
    3. pipes the result to a curl command to POST.
  • if it's a complex spreadsheet (where you'd need to cherry pick cell values):

    • scripts the data extraction bit,
    • invokes curl or uses the scripting language's built-in libs to POST.
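
For the simple-spreadsheet case, the "transform" step could look roughly like the Python sketch below. It assumes the spreadsheet has already been exported to CSV with a header row. The column names, form field names, and the set of values treated as "checked" are all hypothetical and need to be adapted to the real data and form.

```python
import csv
import io

# Hypothetical mapping from spreadsheet columns to form field names.
COLUMN_TO_FIELD = {"Name": "name", "Quantity": "qty", "Active": "active"}
# Hypothetical: which form fields are check boxes.
CHECKBOX_FIELDS = {"active"}
# Hypothetical: spreadsheet values that mean "checked".
TRUTHY = {"yes", "y", "true", "1", "x"}


def row_to_fields(row):
    """Turn one CSV row (a dict) into the fields of one form submission."""
    fields = {}
    for column, field in COLUMN_TO_FIELD.items():
        value = (row.get(column) or "").strip()
        if field in CHECKBOX_FIELDS:
            # Browsers leave unchecked check boxes out of the submission,
            # and send "on" for a checked box that has no value attribute.
            if value.lower() in TRUTHY:
                fields[field] = "on"
        else:
            fields[field] = value
    return fields


def read_payloads(csv_file):
    """Read every data row of an open CSV file into a list of payloads."""
    return [row_to_fields(row) for row in csv.DictReader(csv_file)]


if __name__ == "__main__":
    sample = io.StringIO("Name,Quantity,Active\nWidget A,3,yes\nWidget B,7,no\n")
    for payload in read_payloads(sample):
        print(payload)
```

Each resulting dict is one form submission, ready to be urlencoded and POSTed by curl or by the language's own HTTP library.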

I'd only use a more advanced testing framework if:

  • I'm likely to be asked to do this again for different data formats,
  • I'm likely to be asked to do this again for different data sources and target DBs,
  • I'm likely to be asked to do this often.

In that case, a more engineered approach does make some sense, for maintainability and extensibility.


In all cases, remember to back up that script and pass it along, to document it (a README would do), and to implement a usage message. If they have one, version it in their SCM.


Note: another reason why taking the web-form submission approach might be better than the "direct to SQL" approach is that the server receiving the form might be doing extra checks you aren't aware of at this time.

Not saying it's the case, but maybe security wasn't the only reason.
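
If the server does run such checks, a script should not assume every submission went through. A rough Python sketch of the bookkeeping follows. The error markers are pure guesses: look at what the real form returns on a bad submission and match on that instead.

```python
def looks_rejected(status, body, error_markers=("error", "invalid")):
    """Return True when a response suggests the server refused the submission.

    `error_markers` are hypothetical substrings; adapt them to whatever the
    real form prints when validation fails.
    """
    if not 200 <= status < 400:
        return True
    lowered = body.lower()
    return any(marker in lowered for marker in error_markers)


def split_results(results):
    """Partition (row_number, status, body) tuples into accepted and rejected rows."""
    accepted, rejected = [], []
    for row_number, status, body in results:
        (rejected if looks_rejected(status, body) else accepted).append(row_number)
    return accepted, rejected


if __name__ == "__main__":
    demo = [(1, 200, "Saved"), (2, 200, "Invalid quantity"), (3, 500, "")]
    print(split_results(demo))
```

Keeping the rejected row numbers makes it easy to fix those rows in the spreadsheet and rerun only them.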

Good luck with the job.


EDIT: just noticed you tagged this as "PHP". I don't really see why, as that would imply the code runs server-side (well, you could use PHP for any kind of scripting, but why do this to yourself?).
