C# Design – Best Way to Design Batch Job Processing

c# · design · erp · visual-studio

I'm working on a portion of an ERP system where I need to process data in a way that's similar to a series of batch jobs, and I'm struggling with deciding the best program architecture to use. I'm asking here because I know I'm ignorant of a lot of state-of-the-art programming methods, and I'm worried about designing my software in an archaic way without realizing it.

My system needs to process batches of data and have the processing jobs be able to be scheduled at regular intervals, and also to be run on demand. Some of the processing also relies on outside web services.

Right now my architecture is one single Visual Studio C# solution (with several projects inside it). The solution produces a program with an interface for running jobs on demand and for configuring a schedule for the jobs to run automatically. That single program also contains all of the batch processing code.

Within the program, each "batch job" is started by calling a method on my main controller class, such as public async Task BillOrders(), which calls the appropriate method on a high-level service class, which in turn calls lower-level services, and so on. The classes themselves are all pretty well separated in terms of the single responsibility principle. The main controller invokes each batch job method asynchronously and handles monitoring, error reporting, throttling, and so on.
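For illustration, a minimal sketch of that dispatch pattern might look like the following (the service type and method names are hypothetical placeholders, not from my actual system):

    using System.Threading.Tasks;

    public class BatchJobController
    {
        private readonly IBillingService _billingService;

        public BatchJobController(IBillingService billingService)
        {
            _billingService = billingService;
        }

        // One public entry point per batch job; the controller awaits it
        // so it can monitor progress and report errors centrally.
        public async Task BillOrders()
        {
            await _billingService.BillPendingOrdersAsync();
        }
    }

    // Hypothetical high-level service the controller delegates to.
    public interface IBillingService
    {
        Task BillPendingOrdersAsync();
    }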

This all works, but I'm worried that this isn't the way I should be doing it.
Specifically, my concerns are:

  1. Since I have several separate batch-process-type functions all in the same program, I'm worried that a crash or bug in the program could cause all the batch processes to stop working. It also seems bad to have one single program that does a bunch of different things.

  2. Even though each batch process is related to one area of my ERP, I'm worried that having a single codebase with different functions in it will end up creating something monolithic and difficult for several developers to work on different aspects of simultaneously.

One of the main reasons I've done this as a single program (and a single Visual Studio solution) is that my Entity Framework context and service classes are shared between the batch jobs. If I split the program into several smaller programs, each one would duplicate a lot of the same classes and the EF context, and any bugfix or extension to a service in one program would need to be copied to the others. My EF context also maps several dozen tables configured via the Fluent API.

I considered making a series of microservices, but there are so many cross-cutting concerns that each microservice would need much of the same EF context and service-class code described above, thereby defeating the goal of using microservices.

Is there a commonly accepted architecture for this sort of thing? Is what I'm doing OK, or am I programming as if it's 15 years ago?

Best Answer

Since I have several separate batch-process-type functions all in the same program, I'm worried that a crash or bug in the program could cause all the batch processes to stop working.

Are you running multiple jobs at the same time in the same process? In that case there is some risk that one error might terminate other jobs, but this only happens with errors that damage the whole process, which is fairly rare. Examples are out-of-memory conditions, stack overflows, or someone killing the process. Otherwise, you can wrap each job's processing code in a try/catch to isolate the jobs from one another. This is what web applications do, and normally it works very well.
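As a sketch, that isolation can be as simple as a wrapper that catches anything a job throws (the helper below is a hypothetical illustration, not code from your system):

    using System;
    using System.Threading.Tasks;

    public static class JobRunner
    {
        // Runs one job and contains its failures. Only process-killing
        // errors (out of memory, stack overflow, external termination)
        // can escape; ordinary exceptions are logged and swallowed.
        public static async Task RunIsolated(string jobName, Func<Task> job)
        {
            try
            {
                await job();
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine($"Job '{jobName}' failed: {ex}");
            }
        }
    }

Jobs launched together, for example via Task.WhenAll over several RunIsolated calls, then fail independently of one another.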

You could also run each job as a separate OS process. Then the jobs are totally isolated. You can reuse the same EXE file for this: add an integer or string command-line argument that tells the EXE which of the many possible jobs it is supposed to launch.

Reusing the same EXE file saves you from creating many C# projects and the code-management overhead that would come with them.
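A sketch of that single-EXE approach (the job names and entry points here are illustrative assumptions):

    using System;
    using System.Threading.Tasks;

    public static class Program
    {
        // The first command-line argument picks the job, so a scheduler
        // can launch "BatchRunner.exe bill-orders" and
        // "BatchRunner.exe sync-inventory" as separate, fully isolated
        // OS processes that share one executable.
        public static async Task<int> Main(string[] args)
        {
            string job = args.Length > 0 ? args[0] : "";
            switch (job)
            {
                case "bill-orders":
                    await BillOrders();
                    return 0;
                case "sync-inventory":
                    await SyncInventory();
                    return 0;
                default:
                    Console.Error.WriteLine($"Unknown job: '{job}'");
                    return 1;
            }
        }

        // Placeholders standing in for the real job entry points.
        private static Task BillOrders() => Task.CompletedTask;
        private static Task SyncInventory() => Task.CompletedTask;
    }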

It also seems bad to have one single program that does a bunch of different things.

What is your concern? Maybe you are worried that this goes against the single responsibility principle. That principle is about the same code doing multiple things, and as I understand it, your jobs are fairly isolated from one another. Also consider that a typical web application does hundreds or thousands of different things, and that isn't an issue. The SRP is just a principle, not a hard rule. Do what works for you.

Even though each batch process is related to one area of my ERP, I'm worried that having a single codebase with different functions in it will end up creating something monolithic and difficult for several developers to work on different aspects of simultaneously.

If you are concerned that the codebase might become unwieldy, I'd argue that splitting the code into multiple projects would not gain you anything. It's still the same code, the same dependencies, and the same architecture, just in a different place.

Entanglement is never solved just by moving code to a different place (additional projects or services). You must make architectural changes.

So whether you build one EXE/microservice per job or make each job merely its own C# class is not much of an architectural difference. Splitting just adds a lot of development complexity.

Since your jobs seem fairly isolated from one another, I'm not concerned about putting all of them in the same project.

It also sounds like you want to share some infrastructure such as Entity Framework models. That's also a good reason to keep things together.
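For instance, a single shared context (sketched here with EF Core for concreteness; ErpContext and Order are hypothetical names) means every job gets the same Fluent API mappings for free:

    using Microsoft.EntityFrameworkCore;

    // One shared context: every batch job reuses the same Fluent API
    // configuration instead of duplicating it across programs.
    public class ErpContext : DbContext
    {
        public DbSet<Order> Orders => Set<Order>();

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {
            modelBuilder.Entity<Order>(order =>
            {
                order.ToTable("ORDERS");
                order.HasKey(o => o.Id);
            });
        }
    }

    public class Order
    {
        public int Id { get; set; }
    }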
