Python Data Structures – Should a Data Class Manipulate Its Input?

data structures, python

Background

I'm a scientist trying to incorporate better software development practices in my work. In my niche field there's a standard model everyone uses that accepts a whopping 56 arguments. I'm attempting to create a public toolset that helps create and organize all the necessary inputs. The motivation is twofold: (1) it's really difficult to keep track of 56 variables before calling the model, and (2) I can provide improved error handling to help track down a problem before the model is called (if you give the model a bad input, it just says "there was a bad input…", which is frustrating to say the least).

To do this, I'm thinking I should make classes that act like structs: that is, they accept inputs and validate them, but don't do anything with these inputs. Once one of the inputs is created, it never needs to be modified and will rarely need to be accessed when creating the other inputs. I've organized all of the inputs into categories, and it seems sensible to me that each category should be its own class (correct me if I'm wrong, but that's not the meat of the question).

Question

This model needs 4 different angular inputs to run. Any user of this model will almost never have all 4 of these inputs; rather, they'll almost always have something that can easily be manipulated into the expected inputs. With this in mind, I think it would be best to design this class to do absolutely no manipulation, and simply add functions that do the manipulations and return instances of the class. I think this would best adhere to the single-responsibility principle (SRP). Is this a sensible idea, or am I being overly pedantic about things? Here is a sketch of my idea:

import numpy as np

class Angles:
    def __init__(self, angle0, angle1, angle2, angle3):
        self.angle0 = _Angle0(angle0)   # _Angle0 will validate the input 
        self.angle1 = _Angle1(angle1)
        ...

def convert(angle0, angle1, angle2, angle3):
    foo0 = np.cos(angle0)
    foo1 = np.cos(angle1)
    return Angles(foo0, foo1, angle2, angle3)

I guess it just feels wrong to have a class that does nothing but validate and store values, whereas the functions do all of the actual work of creating the inputs that I need. Can anyone comment on whether this is sensible or if there's an obviously superior way to design this basic code?

Best Answer

"it just feels wrong to have a class that does nothing but validate and store values"

For me, it feels perfectly fine, as long as you can come up with a meaningful name for the class, one which lets you think of the group of parameters "as a whole" without always having to remember what's inside. I don't know your domain, but Angles sounds pretty meaningless to me; maybe it is only that inexpressive for the sake of this example and not in the real code. The same holds for convert: give the function a name which clearly expresses what kind of conversion it does, from what input to what output. And the comment after _Angle0 indicates that it should have a name like _validateAngle0, for example.
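To make the "validate and store, nothing else" idea concrete, here is a minimal sketch. Everything domain-specific in it is a made-up placeholder, since the question doesn't say what the angles are: the class name ViewingGeometry, the field names incidence and emission, and the 0 to 180 degree range are assumptions for illustration only.

```python
import math
from dataclasses import dataclass


def _validate_angle_degrees(name, value, low=0.0, high=180.0):
    """Raise a ValueError that names the offending input.

    The [low, high] range is an assumed placeholder, not the model's real limits.
    """
    if not math.isfinite(value) or not low <= value <= high:
        raise ValueError(
            f"{name} must be between {low} and {high} degrees, got {value!r}"
        )


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ViewingGeometry:  # hypothetical name standing in for "Angles"
    incidence: float  # hypothetical field name
    emission: float   # hypothetical field name

    def __post_init__(self):
        # Validate only; the values are stored exactly as they were passed in.
        _validate_angle_degrees("incidence", self.incidence)
        _validate_angle_degrees("emission", self.emission)
```

With something like this, a bad value fails at construction time with a message that names the input, rather than later inside the model with a generic "bad input" error.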

Additionally, consider implementing the conversion functions as class methods of Angles. It is probably not possible to implement them as different Angles constructors, since they cannot be distinguished just by their signatures, so class methods should be the canonical choice here.
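A minimal sketch of the class-method approach, reusing the question's Angles class and the manipulation from its convert function. The method name from_raw_angles is a hypothetical placeholder (a real name should say what is being converted), validation is left out for brevity, and math.cos stands in for np.cos so the snippet has no third-party dependency.

```python
import math


class Angles:
    """Stores the four model inputs as given (validation omitted in this sketch)."""

    def __init__(self, angle0, angle1, angle2, angle3):
        self.angle0 = angle0
        self.angle1 = angle1
        self.angle2 = angle2
        self.angle3 = angle3

    @classmethod
    def from_raw_angles(cls, angle0, angle1, angle2, angle3):
        """Hypothetical alternate constructor mirroring the question's convert()."""
        # Same manipulation as convert(): cosine of the first two inputs
        # (taken to be in radians); the last two are passed through unchanged.
        return cls(math.cos(angle0), math.cos(angle1), angle2, angle3)


# Both call styles produce an Angles instance; the class method's name
# documents which kind of input the caller is supplying.
direct = Angles(1.0, -1.0, 0.2, 0.3)
converted = Angles.from_raw_angles(0.0, math.pi, 0.2, 0.3)
```

Because the alternate constructor has the same four-argument signature as __init__, the distinct method name is what tells the two apart, and __init__ itself still does no manipulation.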

Note that if you cannot come up with a better name for this group of four angles, that might be a sign that you chose the wrong abstraction.
