Architecture – How to Maintain User Avatars in NoSQL Databases


The issue at hand is dealing with user avatars that come from social media accounts. Users sign up for our app with a social media account, and we simply save the URL of the user's avatar in our database.

We're now running into an issue because, in a NoSQL database, we create many copies of the same data, such as the user's avatar URL. There is a single source of truth for user information, which is the user record in our database, so it's easy enough to update this record and detect a change in the avatar URL.

The question is: what is the best way to propagate this change to all the other places, e.g. comments posted by the user? This wouldn't be an issue in a relational database, but we're using a few NoSQL databases and user avatar URLs are saved in multiple places.

I can think of four ways to handle this and wanted to see how others have handled this scenario:

  1. We save the user's avatar URL in all related entities such as comments. When the avatar URL changes, a back-end process updates the URL in all related objects. I think this approach would give the best read performance because we use whatever URL comes with the data, with no need for any kind of real-time processing. However, it can be really costly in terms of maintenance, e.g. updating hundreds or even thousands of entities every time an avatar changes.

  2. Another way is not to save the user's avatar URL in any entity other than his/her primary record. We then grab the user's avatar URL on first request and cache it, say in Redis. As we process requests in the application, we pull avatar URLs from the cache and attach them to the appropriate entities on the fly. I see two major problems with this approach. First, it adds a small delay to every request. For example, if a user requests a video with hundreds of comments under it, I'd have to pull the avatars for all of those users from my cache, and some of them may not be cached yet, so I'd have to make a database call to get their avatars and cache them for future use. The second issue is the heavy use of the cache, which I'm not against at all, but in this scenario it feels like I'm using the most expensive resource to handle such a trivial requirement.

  3. A third way is to get the user's avatar, say from his/her social login, and store the file locally. From that point on, we have full control over that file, and its URL will never change unless we change it. I see three issues with this approach. The first is that I don't think there's a way to download social media profile images, as the providers don't give a link to a file. The second is legality: I'm not sure we'd have the right to download and store a file that belongs to LinkedIn, Facebook, Google, etc. The third is that when the user changes his/her avatar in the social media account, the avatar in our app won't update automatically. He/she would have to handle that manually. Not the end of the world, but not a great user experience either.

  4. Use the Gravatar service. I only know some basic information about this service, so I can't say much about it. My primary concern is that we'd be using yet another third-party resource on top of a third-party resource, which will only complicate things further. Also, I'm not exactly sure how Gravatar works, but what if the user doesn't have a Gravatar account?
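
To make option 2 concrete, here is a rough sketch of the lookup I have in mind. It is only an illustration: a plain dict stands in for Redis, and `load_avatars_from_db` is a hypothetical function that returns a mapping of user id to avatar URL for the ids it is given.

```python
def get_avatar_urls(user_ids, cache, load_avatars_from_db):
    """Resolve avatar URLs for many users, reading the cache first."""
    urls = {}
    missing = []
    # dict.fromkeys removes duplicate ids while keeping their order.
    for user_id in dict.fromkeys(user_ids):
        if user_id in cache:
            urls[user_id] = cache[user_id]
        else:
            missing.append(user_id)
    if missing:
        # One database call for every id that was not cached.
        loaded = load_avatars_from_db(missing)
        cache.update(loaded)  # keep them for future requests
        urls.update(loaded)
    return urls
```

Even with the misses batched into a single database call, every page view still pays for one cache round trip per request, which is the delay I'm worried about.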

I'd like to see how others handled this scenario. Thanks.

P.S. Based on a few posts I read here and there, I was under the impression that avatar URLs didn't change because social media providers would generate a URL for the user's avatar which is not the file name. So even if the user changed their avatar on their social media account, the URL wouldn't change. I now know for a fact that this assumption is not correct and that avatar URLs coming from social media providers do change. It's happened to me with LinkedIn. A completely different URL is now my avatar URL. Funny thing is that I have not even updated my profile image with them. Somehow, LinkedIn decided to return a different URL for my profile image.

Best Answer

One of the possible solutions would be to add a route such as https://example.com/user/<id>/avatar which would redirect the browser to the actual avatar.

For instance, if the real avatar is stored at https://linkedin.com/avatars/40bd001563085fc35165, your website will only store this URI once, in the user's document, associated with user 123. Everywhere in the user interface, i.e. in all entities such as comments, the avatar will be implemented like this:

<img src="https://example.com/user/123/avatar" alt="..." />

During an HTTP request to https://example.com/user/123/avatar, the server will load the stored URI and respond with:

HTTP/1.1 302 Found
Location: https://linkedin.com/avatars/40bd001563085fc35165

which would effectively force the browser to show the correct image.
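
A minimal sketch of such a route, using only Python's standard library, could look like the following. The `AVATARS` dict is a hypothetical stand-in for the users collection, and `resolve_avatar`, `AvatarHandler` and `serve` are illustrative names rather than part of any framework.

```python
import re
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the users collection: user id -> stored avatar URI.
AVATARS = {"123": "https://linkedin.com/avatars/40bd001563085fc35165"}

AVATAR_ROUTE = re.compile(r"^/user/([^/]+)/avatar$")

def resolve_avatar(path):
    """Map a request path to (status code, response headers)."""
    match = AVATAR_ROUTE.match(path)
    target = AVATARS.get(match.group(1)) if match else None
    if target is None:
        return 404, {}
    # 302 (temporary), so browsers ask again after the avatar changes.
    return 302, {"Location": target}

class AvatarHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers = resolve_avatar(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()

def serve(port=8000):
    """Run the sketch locally; blocks until interrupted."""
    HTTPServer(("127.0.0.1", port), AvatarHandler).serve_forever()
```

Calling `serve()` and requesting `/user/123/avatar` should produce the redirect shown above; an unknown user id or any other path gets a 404.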

Notes:

  • In terms of performance, there shouldn't be too many issues. The request is relatively fast to process, and uses only marginal bandwidth (unlike serving images yourself).

  • It is essential to use HTTP 302 and not HTTP 301. A 301 is a permanent redirect that browsers are allowed to cache; with it, people viewing a user who changed avatars will sometimes continue to see the old avatar, possibly for a long time.

  • Proper client-side caching can be implemented to prevent the browser from requesting the same avatar over and over (usually, when changing an avatar, one wouldn't be surprised to still see the old one for several minutes on some sites).
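
As a rough illustration of the caching note, the redirect response can carry an explicit `Cache-Control` lifetime so the browser reuses it for a short while instead of hitting the route on every page view. The function name and the five-minute default below are assumptions chosen for the example, not recommended values.

```python
def avatar_redirect_headers(target, max_age_seconds=300):
    """Build the headers for a 302 avatar redirect that browsers may briefly reuse."""
    return {
        "Location": target,
        # A 302 is only cached when the response says so explicitly.
        "Cache-Control": f"max-age={max_age_seconds}",
    }

headers = avatar_redirect_headers("https://linkedin.com/avatars/40bd001563085fc35165")
```

The trade-off is the one described above: a longer lifetime means fewer requests, but a changed avatar takes correspondingly longer to show up.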


Note that if you're experiencing this difficulty with the avatar, you'll probably have the same issue with other pieces of information as well, due to improper normalization/denormalization. While some NoSQL databases encourage you to duplicate data in order to make queries faster and guarantee data consistency within a document, this comes at a cost of not being able to easily change the data scattered all over your database. Therefore:

  • Make sure you understand when to duplicate data and where to reference a single piece of data from other documents.

  • When applicable, rely on techniques such as the one I presented here, which make it possible to store a piece of data once, while not referencing it directly in other documents. For instance, on Stack Exchange, the zone which displays a user shows not only the user's name and avatar, but also the badges, the geographical location, the link to the website and a signature. Since those data are not crucial for the website, they can be queried through AJAX after the page is loaded, meaning that a question/answer doesn't need to contain this information or link to it in any way.
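
A small sketch of that second point, under assumed data shapes: comments hold only a user id, the rendered comment contains a placeholder, and a separate endpoint returns the user details as JSON for a script to fill in after the page loads. `USERS`, `COMMENTS`, `render_comment` and `user_card` are hypothetical names, and the field values are made up for illustration.

```python
import json
from html import escape

# Hypothetical data: profile fields live only in the user document.
USERS = {
    "123": {
        "name": "Example User",
        "avatar": "https://example.com/user/123/avatar",
        "location": "Somewhere",
    },
}
# Comments reference the user by id and copy nothing else.
COMMENTS = [{"id": 1, "user_id": "123", "text": "First comment"}]

def render_comment(comment):
    """Render a comment with a placeholder that a script can fill in later."""
    return '<div class="comment" data-user-id="{}">{}</div>'.format(
        escape(comment["user_id"]), escape(comment["text"])
    )

def user_card(user_id):
    """Body of the AJAX response for one user, or None if the user is unknown."""
    user = USERS.get(user_id)
    if user is None:
        return None
    return json.dumps(user)
```

Because no profile field is copied into the comment, changing the avatar or the location touches exactly one document.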
