Code Generation – Arguments For and Against in C++ and SQL

Tags: c++, code generation, sql

I'm in a position where we've got some brittle code that constructs SQL-like queries via text concatenation, with parameters for inputs. The data source that it queries is fast and scalable but lacks tool support. Over time, additional entities and properties have been added to the data source; some need changing and some have made others obsolete, so the queries need changing too.

I can see that this will happen again in a few months and then again.

To reduce the errors introduced into the text queries, I suggested writing each query in a separate file (e.g. a .sql file) and running a code generator that reads the schema from the data source and generates a code wrapper around the SQL-like query. The wrapper would be easy to regenerate at any time and would cause compile errors in any out-of-date client code.
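To make the proposal concrete, here is a minimal sketch of what such a generated wrapper might look like in C++. Everything in it is an assumption for illustration: the file name find_orders.sql, the `Order` entity, the `customer_id` and `status` properties, the `:name` placeholder syntax, and the `FindOrdersQuery` struct are all hypothetical, and the real data source's binding mechanism is not shown.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical generator output for a file "find_orders.sql" containing:
//   SELECT id, total FROM Order
//   WHERE customer_id = :customer_id AND status = :status
// Regenerated after every schema change; never edited by hand.
struct FindOrdersQuery {
    // One typed member per query parameter.
    long long customer_id;
    std::string status;

    // The query text, copied verbatim from the .sql file.
    static const char* text() {
        return "SELECT id, total FROM Order "
               "WHERE customer_id = :customer_id AND status = :status";
    }

    // Name/value pairs to pass to whatever binding call the data source offers.
    std::vector<std::pair<std::string, std::string>> parameters() const {
        return {
            {"customer_id", std::to_string(customer_id)},
            {"status", status},
        };
    }
};
```

Client code would write something like `FindOrdersQuery q{42, "open"};`. If a later schema change removed or retyped `status`, the regenerated struct would no longer accept that initializer, and the stale call site would fail to compile instead of failing at run time.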

This idea was met with some skepticism and resistance, even when I offered to fund the development myself.

What are the reasons against doing this? And, for balance, what are the reasons to go ahead and do it?

(I already saw this post with a couple of answers, but it's not comprehensive.)

Best Answer

Reasons for:

Lots of boilerplate code can be generated (getters/setters, toString(), clear)
Automated solution is less likely to miss schema changes if you're reading the schema to generate the code. In a large set of tables/POJOs this can prevent bugs.
Ability to generate API/schema documentation from your code.
Time saving in the future maintenance of your code base because you can generate new items quickly.
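As a rough illustration of the first two points, the core of a generator can be little more than a loop over the schema. The sketch below assumes the schema has already been read into the hypothetical `Entity` and `Column` structs and that type names have already been mapped to C++ types; how the schema is actually fetched depends on the data source and is not shown.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory description of one entity, as it might be read
// from the data source's schema.
struct Column {
    std::string name;
    std::string cpp_type;  // already mapped from the source's own type names
};

struct Entity {
    std::string name;
    std::vector<Column> columns;
};

// Emits a plain struct with one member per column.
std::string generate_struct(const Entity& entity) {
    std::ostringstream out;
    out << "struct " << entity.name << " {\n";
    for (const Column& column : entity.columns) {
        out << "    " << column.cpp_type << " " << column.name << ";\n";
    }
    out << "};\n";
    return out.str();
}
```

Because the member list comes straight from the schema, a newly added column appears in the generated struct on the next run without anyone having to remember it.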

Reasons against:

Takes time to write a code generator (and you still have to write code for the requirements). My counterargument is that the time saved in maintenance will make up for it.
Secondary set of code outside of your requirements to maintain.
You can never cover every case in your generator (so you end up with something that allows you to inject custom code)
If the project is small (fewer than 25 tables), it may be overkill and the time savings may not be as great as expected.

EDIT: I have written 3 different code generators for projects and the greatest factor in deciding whether to do it was the size of the project. I did it for a smaller project and it wasn't as effective as the generator for maintaining the larger projects (maybe that's obvious, but I thought I would throw it out there). If the project is small to medium, I would lean toward not using one.

Related Solutions

Formatting SQL Based on an AST

I had a chance to look at your actual code. It looks like you're most of the way there but this "tree of tokens" concept is getting in the way. It turns out you probably don't need it at all (specifically, I'm referring to the stuff in the Expressions/ folder).

For example, in Join, you have a method called GetDeclarationExpression that returns an IExpressionItem that is a tree of tokens. What I'm suggesting is that you can simplify this and instead directly return a String which is the actual SQL generated from the join. In concrete terms, something like the following:

    String IJoinItem.GetDeclarationExpression(CommandOptions options)
    {
        // [ "(" ] <Left> <Combiner> <Right> [ "ON" <Filter> ] [ ")" ]
        StringBuilder expression = new StringBuilder();
        if (WrapInParentheses ?? options.WrapJoinsInParentheses)
        {
            expression.Append("(");
        }
        expression.Append(_leftHand.GetDeclarationExpression(options));
        expression.Append(" ");
        expression.Append(GetJoinNameExpression(options));
        expression.Append(" ");
        expression.Append(_rightHand.GetDeclarationExpression(options));
        expression.Append(" ");
        expression.Append(GetOnExpression(options));
        if (WrapInParentheses ?? options.WrapJoinsInParentheses)
        {
            expression.Append(")");
        }
        return expression.ToString();
    }

Of course, all other methods used during the AST traversal would also need to be modified to return String instead of IExpressionItem as needed. Each method that is called during the tree traversal would be responsible for generating the actual SQL string, which will all eventually get concatenated together.
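The same shape can be sketched in C++, the language of the original question. This is a minimal illustration and not the asker's code: `SqlNode`, `Table`, `Join`, and `to_sql` are hypothetical names, and the ON clause is held as plain text to keep the example short.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical minimal AST: every node renders itself straight to SQL text.
struct SqlNode {
    virtual ~SqlNode() = default;
    virtual std::string to_sql() const = 0;
};

struct Table : SqlNode {
    std::string name;
    explicit Table(std::string n) : name(std::move(n)) {}
    std::string to_sql() const override { return name; }
};

struct Join : SqlNode {
    std::unique_ptr<SqlNode> left;
    std::unique_ptr<SqlNode> right;
    std::string kind;       // e.g. "INNER JOIN"
    std::string on_filter;  // empty when there is no ON clause
    bool wrap_in_parentheses = false;

    Join(std::unique_ptr<SqlNode> l, std::string k,
         std::unique_ptr<SqlNode> r, std::string on)
        : left(std::move(l)), right(std::move(r)),
          kind(std::move(k)), on_filter(std::move(on)) {}

    // [ "(" ] <Left> <Combiner> <Right> [ "ON" <Filter> ] [ ")" ]
    std::string to_sql() const override {
        std::string sql = left->to_sql() + " " + kind + " " + right->to_sql();
        if (!on_filter.empty()) {
            sql += " ON " + on_filter;
        }
        return wrap_in_parentheses ? "(" + sql + ")" : sql;
    }
};
```

Each node only knows how to render itself and asks its children for their text, so nesting a parenthesized join inside another join needs no extra token-tree machinery.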

C# – What would the general design for an XSD to C# class converter look like

What would the general design of such an application look like?

Leaving aside trivialities like UI, you'll want to have at least three major parts:

The "core" app, that translates the XML object into a C# string.
The portion that locates and reads the XSD file as XML, and passes each object over to the core app to process.
The portion that writes out the C# strings.

You'll want #1 to be recursive, since XSD elements can be of types that themselves contain further elements and types.
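The recursion in the core part might look roughly like the sketch below, written in C++ to match the original question. It assumes the XSD has already been parsed into a hypothetical `SchemaElement` tree (the parsing itself is not shown), and it emits deliberately simplified C#-style text: a nested class for every complex element and a public field for every simple one.

```cpp
#include <string>
#include <vector>

// Hypothetical, already-parsed view of an XSD element: either a leaf with a
// simple type, or a complex type with child elements.
struct SchemaElement {
    std::string name;
    std::string simple_type;              // used only when children is empty
    std::vector<SchemaElement> children;  // non-empty for complex types
};

// Recursively emits a class for every complex element (nested inside its
// parent) and a public field for every simple element.
std::string emit(const SchemaElement& element, int depth = 0) {
    const std::string indent(depth * 4, ' ');
    if (element.children.empty()) {
        return indent + "public " + element.simple_type + " " + element.name + ";\n";
    }
    std::string code = indent + "public class " + element.name + "\n" + indent + "{\n";
    for (const SchemaElement& child : element.children) {
        code += emit(child, depth + 1);  // children may themselves be complex
    }
    code += indent + "}\n";
    return code;
}
```

The function calls itself once per child, so arbitrarily deep nesting in the schema falls out of the same few lines.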

I assume that I will need to use one of the XML parsers in the .NET framework, but which one?

I'd load the whole XSD as a single System.Xml.XmlDocument, but I'm from the web and learned XML via the DOM. It's also the older model, and something like SelectSingleNode is probably a better starting point than performance-optimized interfaces like XmlReader or non-Microsoft XML parsers.

And what will the resulting data structure look like? Will it be an expression tree?

Why wouldn't it be a namespace of structs or classes?

From there, what happens? Do I generate code by concatenating strings, or are there better, more sophisticated ways to do it?

There are no better ways to output strings from a program than concatenating strings. There is just syntactic sugar to make the code easier to write. But you want to use a System.Text.StringBuilder here, instead of mucking about with code like the following:

    string line = "public class " + xmlDoc.LocalName + " {\n";
    line += "etc, etc.";
