Object-Oriented Programming – Memory Organization of Objects and Classes in Assembly

assembly, object-oriented

How are objects organized in memory?

For instance, I know that a function is a piece of code in memory that expects its parameters via the stack and/or the registers and manages its own stack frame.

But objects are a much more complicated structure. How are they organized?
Does each object have "links" to its methods, and does it pass its own address to those methods?

It would be great to see a good explanation of this topic.

UPD. I made the question more precise; I'm mainly interested in statically typed languages.

Best Answer

If there is no dynamic dispatch (polymorphism), "methods" are just sugary functions, perhaps with an implicit additional parameter. Accordingly, instances of classes with no polymorphic behavior are essentially C structs for the purpose of code generation.
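A minimal C++ sketch of that point, assuming a hypothetical Counter class: the member function and the hand-written free function Counter_add do the same work, and the object carries no hidden data beyond its fields. The names Counter, CounterData and Counter_add are illustrative only; real compilers mangle names and choose calling conventions differently.

```cpp
#include <cassert>

// A class with no virtual functions: in memory it is just its fields.
class Counter {
public:
    explicit Counter(int start) : value(start) {}
    int add(int n) { value += n; return value; }  // "method" with implicit this
private:
    int value;
};

// Roughly what the method amounts to after compilation: an ordinary
// function that receives the instance address as an explicit parameter.
struct CounterData { int value; };
int Counter_add(CounterData *self, int n) {
    self->value += n;
    return self->value;
}

int demo_static_dispatch() {
    // No per-object bookkeeping is added for the method.
    static_assert(sizeof(Counter) == sizeof(CounterData),
                  "non-polymorphic class has struct layout");
    Counter c(10);
    CounterData d{10};
    int via_method = c.add(5);             // compiler passes &c implicitly
    int via_function = Counter_add(&d, 5); // same thing written by hand
    assert(via_method == via_function);
    return via_method;
}
```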

For classical dynamic dispatch in a static type system, there is basically one predominant strategy: vtables. Every instance gets one additional pointer that refers to (a limited representation of) its type, most importantly the vtable: an array of function pointers, one per method. Since the full set of methods for every type (in the inheritance chain) is known at compile time, one can assign consecutive indices (0..N-1 for N methods) to the methods and invoke a method by looking up its function pointer in the vtable using this index (again passing the instance reference as an additional parameter).
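The vtable scheme can be written out by hand. The following is a sketch under simplifying assumptions (every method has the same signature, single inheritance only), with hypothetical names Shape, Square, Rect and invoke; it is not the layout any particular compiler guarantees.

```cpp
#include <cassert>

struct Shape;
using Method = int (*)(const Shape *self);  // every slot takes the instance

// Slot indices are fixed at compile time.
enum Slot { AREA = 0, PERIMETER = 1 };

// The extra per-instance pointer: it refers to the type's method table.
struct Shape { const Method *vtable; };

// "Derived" types start with the base part, so a Shape* can point at them.
struct Square { Shape base; int side; };
struct Rect   { Shape base; int w, h; };

int square_area(const Shape *s) {
    auto *q = reinterpret_cast<const Square *>(s);
    return q->side * q->side;
}
int square_perimeter(const Shape *s) {
    return 4 * reinterpret_cast<const Square *>(s)->side;
}
int rect_area(const Shape *s) {
    auto *r = reinterpret_cast<const Rect *>(s);
    return r->w * r->h;
}
int rect_perimeter(const Shape *s) {
    auto *r = reinterpret_cast<const Rect *>(s);
    return 2 * (r->w + r->h);
}

// One table per type, shared by all instances of that type.
const Method square_vtable[] = { square_area, square_perimeter };
const Method rect_vtable[]   = { rect_area, rect_perimeter };

// A virtual call: load the table, index it, pass the instance along.
int invoke(const Shape *s, Slot slot) { return s->vtable[slot](s); }

int demo_vtable() {
    Square sq{{square_vtable}, 3};
    Rect   rc{{rect_vtable}, 2, 5};
    assert(invoke(&sq.base, PERIMETER) == 12);
    assert(invoke(&rc.base, PERIMETER) == 14);
    return invoke(&sq.base, AREA) + invoke(&rc.base, AREA);  // 9 + 10
}
```

The caller of invoke never needs to know the concrete type: the same index selects a different function depending on which table the instance points to.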

For more dynamic class-based languages, typically classes themselves are first-class objects and each object instead has a reference to its class object. The class object, in turn, owns the methods in some language-dependent manner (in Ruby, methods are a core part of the object model, in Python they're just function objects with tiny wrappers around them). The classes typically store references to their superclass(es) as well, and delegate the search for inherited methods to those classes to aid metaprogramming which adds and alters methods.
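To make the class-object model concrete, here is a rough C++ sketch of name-based method lookup that walks the superclass chain. It is only an approximation of what Ruby or Python do; Class, Object, send and the "doesNotUnderstand" fallback are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>

struct Object;
using Method = std::function<std::string(Object &)>;

// A class is itself a run-time object that owns a mutable method table.
struct Class {
    std::string name;
    const Class *superclass;  // nullptr at the root of the hierarchy
    std::unordered_map<std::string, Method> methods;

    // Search this class first, then delegate to the superclasses.
    const Method *lookup(const std::string &selector) const {
        for (const Class *c = this; c != nullptr; c = c->superclass) {
            auto it = c->methods.find(selector);
            if (it != c->methods.end()) return &it->second;
        }
        return nullptr;
    }
};

// Each instance only stores a reference to its class object.
struct Object { const Class *cls; };

std::string send(Object &obj, const std::string &selector) {
    const Method *m = obj.cls->lookup(selector);
    return m ? (*m)(obj) : "doesNotUnderstand: " + selector;
}

std::string demo_dynamic_lookup() {
    Class animal{"Animal", nullptr, {}};
    animal.methods["speak"] = [](Object &) { return std::string("..."); };
    animal.methods["kind"]  = [](Object &o) { return o.cls->name; };

    Class dog{"Dog", &animal, {}};
    dog.methods["speak"] = [](Object &) { return std::string("Woof"); };

    Object rex{&dog};
    assert(send(rex, "legs") == "doesNotUnderstand: legs");

    // Metaprogramming: a method added to the superclass at run time is
    // visible to existing instances, because lookup happens per call.
    animal.methods["legs"] = [](Object &) { return std::string("4"); };

    return send(rex, "speak") + "/" + send(rex, "kind") + "/" + send(rex, "legs");
}
```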

There are many other systems that aren't based on classes, but they differ significantly, so I'll only pick out one interesting design alternative: When you can add new (sets of) methods to all types at will anywhere in the program (e.g. type classes in Haskell and traits in Rust), the full set of methods isn't known while compiling. To resolve this, one creates a vtable per trait and passes them around when the trait implementation is required. That is, code like this:

void needs_a_trait(SomeTrait &x) { x.method2(1); }
ConcreteType x = ...;
needs_a_trait(x);

is compiled down to this:

functionpointer SomeTrait_ConcreteType_vtable[] = { &method1, &method2, ... };
void needs_a_trait(void *x, functionpointer vtable[]) { vtable[1](x, 1); }
ConcreteType x = ...;
needs_a_trait(&x, SomeTrait_ConcreteType_vtable);

This also means the vtable information isn't embedded in the object. If you want references to an "instance of a trait" that behave correctly when, for example, they are stored in data structures that contain many different types, you can create a fat pointer (instance_pointer, trait_vtable). This is actually a generalization of the above strategy.
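A fat pointer of this kind can be sketched in C++ as a pair of plain pointers. The trait Loud, the types Dog and Radio, and all function names below are hypothetical; the point is only that the objects themselves stay free of any vtable pointer while a heterogeneous collection still dispatches correctly.

```cpp
#include <cassert>
#include <vector>

// The "trait": one vtable layout, implemented separately per type.
struct LoudVTable { int (*volume)(const void *self); };

// Plain data types with no embedded type information.
struct Dog   { int barks; };
struct Radio { int level; bool on; };

int dog_volume(const void *self) {
    return static_cast<const Dog *>(self)->barks * 10;
}
int radio_volume(const void *self) {
    auto *r = static_cast<const Radio *>(self);
    return r->on ? r->level : 0;
}

// One vtable per (trait, type) pair.
const LoudVTable Loud_for_Dog{dog_volume};
const LoudVTable Loud_for_Radio{radio_volume};

// The fat pointer: (instance_pointer, trait_vtable).
struct LoudRef {
    const void *instance;
    const LoudVTable *vtable;
};

int total_volume(const std::vector<LoudRef> &items) {
    int sum = 0;
    for (const LoudRef &it : items) sum += it.vtable->volume(it.instance);
    return sum;
}

int demo_fat_pointers() {
    static_assert(sizeof(Dog) == sizeof(int), "no vtable pointer inside Dog");
    Dog d{3};
    Radio r{7, true};
    // Different concrete types behind one uniform element type.
    std::vector<LoudRef> items{{&d, &Loud_for_Dog}, {&r, &Loud_for_Radio}};
    return total_volume(items);  // 30 + 7
}
```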

Related Solutions

Java – OOP Objects, nested objects, and DAO’s

Should I be nesting one object as part of another or should I be creating more specific objects for different usages.

Follow OO principles first, which means "more specific objects." In doing so your classes may or may not "line up" with your database schema, but do not let the DB schema trump good OO design.
The Single Responsibility Principle will help guide you in what classes to build. SRP means, for example, that a Song is a song. It is not an artist, it is not a list of subscribers, so it should not have artist or subscriber stuff in it. Only song stuff.
The above means that you will have lots of small, independent, fully functional things: classes. By "fully functional" I mean that if a Song is a name, date, and id, then that's what's in it. Period. The fact that a certain artist sings that song does not inherently, fundamentally define what a Song is. This means other classes are needed to model relationships such as "an artist's song repertoire."
Small functional classes lead to good DAOs and flexibility for your user interface.

Option 1 means wasting a lot of time/resources grabbing information I may not need for that page but easier to manage. Option 2 is messy and requires keeping track of which object is what but faster and far less db calls

You are falling victim to premature optimization. How can you know up front that option 2 will have "fewer DB calls"? What does that mean anyway?
This is simply the wrong way to think about your domain classes/model. This is why you end up duplicating your DB schema:

Class SongWithArtist {
  private $song; //Basic Song object
  private $artist; //Basic Artist object
}

When what you should have is something describing the real world:

Class ArtistPortfolio {
    private Artist $theArtist;
    private List<Song> $portfolio;  // list of songs (s)he sings 
}

Class ArtistSubscribers {
    private Artist $theArtist;
    private List<User> $subscribers;  // list of people who like this artist
}

// And it would probably make sense to combine the above 2 classes:

Class ArtistProfile {
    // an Artist object, not just an id. We're doing OBJECT oriented programming.
    private Artist $theArtist;

    private List<Song> $portfolio;  // list of Song objects 
    private List<User> $subscribers; // list of User objects
}

// and if you need a list of profiles...
Class ArtistProfiles {
    private List<ArtistProfile> $profiles; // a list of type ArtistProfile

    public ArtistProfile GetProfileByArtist (Artist thisArtist){}
    public ArtistProfile GetProfileByName (string name) {}
    public ArtistProfile GetProfileById (string id) {}
}

// I'd say an ArtistProfile could be part of an Artist...

Class Artist {
    private $id;
    private $name;
    private ArtistProfile $profile; // this is composition.
}


// In lieu of the above, a DAO-oriented Artist composition...
// Just go with the refactoring flow!
Class Profile {
    private $id;
    private $name;
    private $birthdate;
}

Class Repertoire {
    // a Profile object, not just an id. We're doing OBJECT oriented programming.
    private Profile $theArtist;

    private List<Song> $portfolio;  // list of Song objects 
    private List<User> $subscribers; // list of User objects
}

Class Artist {
    private Profile $me;
    private Repertoire $myStuff; 
}

Too Many DB Calls!

Not so. You can instantiate an Artist object without populating $myStuff and, in turn, defer populating the $subscribers and $portfolio lists until they are needed. This is called lazy loading.
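As a rough illustration of lazy loading, here is a small C++ sketch (the answer's own snippets are PHP/Java-style pseudocode, so this is a translation of the idea, not of that code). The function load_songs_for_artist is a hypothetical stand-in for a DAO call, and g_song_queries only exists to count how often it runs.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct Song { std::string title; };

int g_song_queries = 0;  // counts simulated database round trips

// Hypothetical DAO call; a real one would query the database.
std::vector<Song> load_songs_for_artist(int artist_id) {
    ++g_song_queries;
    return {Song{"First song of artist " + std::to_string(artist_id)},
            Song{"Second song"}};
}

class Artist {
public:
    Artist(int id, std::string name) : id_(id), name_(std::move(name)) {}

    const std::string &name() const { return name_; }  // never hits the DB

    // The portfolio is fetched on first access and cached afterwards.
    const std::vector<Song> &portfolio() {
        if (!loaded_) {
            portfolio_ = load_songs_for_artist(id_);
            loaded_ = true;
        }
        return portfolio_;
    }

private:
    int id_;
    std::string name_;
    bool loaded_ = false;
    std::vector<Song> portfolio_;
};

int demo_lazy_loading() {
    g_song_queries = 0;
    Artist artist(7, "Example Artist");
    assert(artist.name() == "Example Artist");
    assert(g_song_queries == 0);            // constructing cost no query
    assert(artist.portfolio().size() == 2); // first access loads
    assert(artist.portfolio().size() == 2); // second access uses the cache
    return g_song_queries;
}
```

A page that only shows the artist's name never pays for the song query, which addresses the "too many DB calls" worry without collapsing the classes together.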

Memory addresses and Assembly

Without any further coordination, at least one writer plus one reader can result in a classic race condition.

There are a number of factors involved.

If there is only one memory location involved (a byte or an aligned word), it is possible for two threads, one writer and one reader, accessing the same location to communicate effectively. (Alignment is usually important in the context of the processor's memory model, because unaligned data acts like two or more independent memory locations.)

However, keeping within these limitations alone does not allow a generous or rich interaction between two threads.

Once more than one memory location or more than one writer is involved, explicit synchronization is almost certainly required.

There are various processor instructions that facilitate synchronization.

One set works like an atomic read-modify-write and allows multiple writers to, among other things, increment a counter without losing any counts. These are sometimes implemented as compare-and-swap instructions. There are a number of variations, including the paired instructions load-linked and store-conditional.
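The lost-update-free counter can be sketched with C++'s std::atomic, whose compare_exchange_weak is typically compiled to a compare-and-swap or a load-linked/store-conditional pair depending on the target. The helper names cas_increment and count_with_threads are made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Increment built from compare-and-swap: retry until no other writer
// changed the counter between our read and our write.
void cas_increment(std::atomic<int> &counter) {
    int seen = counter.load();
    // On failure, compare_exchange_weak refreshes `seen` with the current
    // value, so the next attempt uses up-to-date data.
    while (!counter.compare_exchange_weak(seen, seen + 1)) {
    }
}

// Several writers hammer the same counter; no increments are lost.
int count_with_threads(int threads, int per_thread) {
    std::atomic<int> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (int i = 0; i < per_thread; ++i) cas_increment(counter);
        });
    }
    for (std::thread &th : pool) th.join();
    return counter.load();
}
```

With a plain int and `++` instead, the final value would usually fall short of threads * per_thread, because two writers can read the same old value and both store old + 1.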

There are also memory barrier instructions that tell the processor something about when and how to flush individual processor caches to common main memory.

These primitives can be used to build larger locks. Most operating systems will provide some rich thread synchronization capabilities that are in some way built on these hardware primitives.
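As one example of building a lock from such a primitive, here is a minimal spinlock sketch on top of std::atomic_flag (an atomic test-and-set). SpinLock and guarded_sum are hypothetical names; production locks normally also back off or ask the operating system to put waiting threads to sleep.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SpinLock {
public:
    void lock() {
        // test_and_set atomically sets the flag and returns its old value;
        // an old value of true means somebody else holds the lock.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// The lock protects an ordinary, non-atomic variable.
long guarded_sum(int threads, int per_thread) {
    SpinLock lock;
    long total = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&lock, &total, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                lock.lock();
                ++total;  // critical section: one thread at a time
                lock.unlock();
            }
        });
    }
    for (std::thread &th : pool) th.join();
    return total;
}
```

The acquire/release orderings on the flag double as the memory barriers mentioned above: writes made inside the critical section are visible to the next thread that acquires the lock.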

Programming languages and operating systems expose these hardware primitives through locks, synchronized methods and blocks, and volatile variables.

Transactions and/or transactional memory are another very interesting feature with some underlying hardware support, but that support is still very new.
