Overview
Instagram is one of the most popular social media apps, allowing users to share their photos and videos with other users. Users can like and comment on other users’ photos and videos, follow each other, and view a news feed that contains updates from the users they follow.
Functional Requirements
Some of the functional requirements of the Instagram design are
- Users can post a status update along with an image or a video
- A timeline will be generated for a user which will contain the newsfeed for that user
- Users will be able to like or comment on a post
- One level of comment nesting is allowed
- Users should be able to follow/unfollow each other
Non-Functional Requirements
Some of the Non-Functional requirements of the system are
- In systems such as Instagram, people view photos and news feeds far more than they upload. So it is a read-heavy system, and we need to design it so that read latency is minimal. The read-write ratio will be around 80:20
- The system should be highly available and able to serve 500 million users
- The system should be durable. Any image, video, post uploaded to the system must always persist unless deleted by the user himself
- The system needs to be eventually consistent. This means that once a user uploads a photo, it will become visible in the timelines of their followers within some time
User APIs
Below are the APIs that will be needed
- Create a POST with a photo or a video
- Comment on a POST
- Comment on a Comment itself
- Like a Post
- Like a Comment
- Fetch the timeline
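The APIs above can be sketched as a REST surface. The paths and HTTP methods below are illustrative assumptions, not Instagram's real API:

```python
# Hypothetical REST surface for the APIs listed above.
# Paths are assumptions made for illustration.

API_ROUTES = {
    "create_post":        ("POST", "/v1/posts"),  # body: title, description, media id
    "comment_on_post":    ("POST", "/v1/posts/{post_id}/comments"),
    "comment_on_comment": ("POST", "/v1/comments/{comment_id}/replies"),
    "like_post":          ("POST", "/v1/posts/{post_id}/likes"),
    "like_comment":       ("POST", "/v1/comments/{comment_id}/likes"),
    "fetch_timeline":     ("GET",  "/v1/users/{user_id}/timeline"),
}

def route_for(action, **ids):
    """Return (HTTP method, concrete path) for a named action."""
    method, template = API_ROUTES[action]
    return method, template.format(**ids)
```

For example, `route_for("like_post", post_id=42)` resolves to `("POST", "/v1/posts/42/likes")`.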
Data Storage
For data storage, we have below things to store
- Image and Video
- Post
- Comments and Likes
- Followers
- News Feed
What database to use
- For Instagram, we do not have strict ACID requirements, and the data volume will be very large, so we need a NoSQL database
- The system will be read-heavy.
We can use Cassandra: it is a NoSQL database that can store large amounts of data and handle a high number of reads as well as writes.
Now let’s see the data model for storing each of these elements.
How photos and videos will be uploaded
For storing images and videos we need cheap storage which could be a file system. For that, we can use Amazon S3 or HDFS. We can further use a CDN to cache the images and videos.
How Likes and Comments are going to be stored
First of all, let’s list all the requirements with respect to likes and comments
- A post can have any number of likes
- A post can have any number of comments
- You can like a post as well as a comment
- One level of comment nesting is allowed
To keep things simple we will have
- Two tables for likes. One for post_like and the other for comment_like
- One table for posts, and
- One table for comments.
Below are all the tables.
post table
Below will be the fields in the Post Table. This table will be partitioned on user_id.
- post_id
- title
- description
- tags – This field will be a hash
- thumbnail
- user_id
- created
- updated
- image_id
This table will be sharded on user_id so that we are able to access all posts of a user from a single shard.

comment table
Below will be the fields in the Comment Table. This table will be partitioned on post_id so that all comments related to a post are in a single shard.
- comment_id
- comment – This will be a text field
- post_id
- user_id
- created
- updated
- parent_id – This will take care of nesting of comments
This table should be sharded on post_id so that we are able to fetch all comments belonging to a post from a single shard.
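Since only one level of nesting is allowed, reconstructing the comment thread for a post is a single grouping pass over the rows fetched from this shard. A minimal sketch, with field names following the table above:

```python
# Group a post's comments into top-level comments and their single level of
# replies. parent_id is None for a top-level comment, otherwise it is the
# comment_id of the top-level comment being replied to.

def build_comment_thread(comments):
    top_level = {}
    for c in comments:
        if c["parent_id"] is None:
            top_level[c["comment_id"]] = {**c, "replies": []}
    for c in comments:
        if c["parent_id"] is not None:
            # one level of nesting: the parent is always a top-level comment
            top_level[c["parent_id"]]["replies"].append(c)
    return list(top_level.values())
```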
post_like table
Instagram shows you which posts you have liked. It also shows which users have liked a particular post. All this information will be stored in this table. This table will be sharded on post_id so that you are able to fetch all likes related to a post from a single shard.
Below will be the fields in the post_like table
- id
- user_id
- post_id
- created
- updated
How can we fetch the number of likes for a given post? We can simply query this table. This count can also be kept in a cache, updated whenever a like is made on a post. The other option is to break (invalidate) the cache whenever a like is made on a post.
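The two cache strategies above (update-on-like vs. invalidate-on-like) can be sketched with a plain dict standing in for Redis; the event shape is an assumption:

```python
# Worker that consumes "like" events and keeps the per-post like count fresh.
# The cache here is a plain dict standing in for Redis.

class LikeCountWorker:
    def __init__(self, cache, strategy="update"):
        self.cache = cache          # post_id -> like count
        self.strategy = strategy    # "update" or "invalidate"

    def handle_like_event(self, event):
        post_id = event["post_id"]
        if self.strategy == "update":
            # increment the cached count in place
            self.cache[post_id] = self.cache.get(post_id, 0) + 1
        else:
            # break the cache; the next read repopulates it from post_like
            self.cache.pop(post_id, None)
```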
You can have a separate service, a worker that listens to a topic on which an event is published whenever a like is made on a post. This worker will then either update the cache or invalidate it.

comment_like table
Below will be the fields in the Comment_Like Table. This table will be partitioned on post_id as well, so that all likes related to comments on a post are in a single shard.
- id
- user_id
- post_id
- comment_id
- created
- updated
This table is sharded on post_id; since every comment belongs to exactly one post, all likes belonging to a comment can still be fetched from a single shard.
How do we fetch the number of likes for a given comment? We can simply query this table. As with the post_like table, we can also keep this count in a cache.
How follower and following data will be stored
For that, there will be a Follow table. Below will be the fields in the Follow table
- user_id
- follower_user_id
- created
- updated
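Putting the tables above together, here is a minimal sketch of the storage model as Python dataclasses. Shard keys are noted in comments; field types are illustrative assumptions, and the created/updated timestamps are omitted for brevity:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Post:                 # sharded on user_id
    post_id: int
    user_id: int
    title: str
    description: str
    tags: dict = field(default_factory=dict)   # stored as a hash
    thumbnail: str = ""
    image_id: str = ""

@dataclass
class Comment:              # sharded on post_id
    comment_id: int
    post_id: int
    user_id: int
    comment: str
    parent_id: Optional[int] = None   # one level of nesting

@dataclass
class PostLike:             # sharded on post_id
    id: int
    user_id: int
    post_id: int

@dataclass
class CommentLike:          # sharded on post_id (co-located with the post's comments)
    id: int
    user_id: int
    post_id: int
    comment_id: int

@dataclass
class Follow:
    user_id: int            # the user being followed
    follower_user_id: int
```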
How news feed will be stored
We will discuss news feed storage when we talk about how the news feed is generated.
High-Level Design
At a high level, let’s discuss the overall flow and the services that will exist.
- There will be an API gateway on which every request from all the users will land.
- There will be a User service that will be storing the user profile information
- There will be a Token service that is going to generate and validate tokens. Basically, it is going to do everything related to token management
- There will be a Post Service on which all requests related to the post will be received.
- The Post Service first creates an entry for the post in the post table in the DB
- After the post is created in the database, the Post Service sends a message to a Kafka + SNS/SQS system.
- This message will be picked by a Timeline_Init Service which is a worker, this worker is going to make a call to the Follower Service to fetch all followers of the user who is the owner of the post. Then it is going to fan out the message for each of the followers to the Kafka + SNS/SQS system.
- Each fanout message will be picked by another worker which will be a Timeline_Create Worker. It will create a timeline for the user. Later in this tutorial, we will study different ways and scenarios in which a feed will be generated.
- There will be a Follower Service. As soon as any user follows any user, the call is going to come to this service. This service is going to create an entry in the database and push a message to the Kafka + SNS/SQS system.
- This message will be picked up by a Notification Service, which is a worker. This Notification Worker is going to send a notification to the user who was followed
- There will be a Feedback Service as well which is going to handle all API calls related to liking a post or comment, commenting on a post, or commenting on a comment itself. Again as soon as it receives any such activity it is going to publish a message to Kafka + SNS system.
- This message will be picked by the Notification Service which is a worker that is going to send the notification to the post owner or comment owner whatever is applicable.
- This message will also be picked by another worker whose name is Feedback_Counter worker. This worker is going to increment the count for the number of likes on a comment or a post whichever is applicable. The count will be increased in the cache
Let’s discuss each of the flows in detail and a diagram for each of them
Posting a Status Update
As mentioned earlier, when someone creates a post, the Post Service comes into the picture. A post might contain a photo or a video to be uploaded. Let’s see how this image and video upload would work. We can assume that the original full-size image or video will not be uploaded; a low-res version will be created at the client’s end and uploaded instead. Even the low-res version of an image or video would be only a few KBs, so it can be uploaded to a storage provider directly. For example, if the storage provider is AWS S3, the flow will be as below
- Let’s say User A on their Instagram client wants to post a status that contains an image or video. The client will send a request to the server asking for a presigned URL to which the client can upload the image or video
- The server will respond with a pre-signed URL whose validity can be a few hours. Refer to this doc to learn more about presigned URLs – https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html . Basically, it is a URL that is already signed with a token, so it can be used to upload directly to the storage provider (S3 here) without any further authentication. This is also called direct upload. The server will also return the image_id here
- The client will upload the image to that URL. It will directly be stored in S3
- Now the client will send the request to create the post that has the image or video uploaded in the previous steps. It will also send the id of the uploaded video or photo.
- This request will be received by the Post Service which is going to create an entry in the database for the post in the post table.
- Then it sends a message to a Kafka + SNS/SQS system.
- This message will be picked up by the Timeline Services, which include the Timeline_Init and Timeline_Create workers. They are going to create the timeline.
- The Post Service is also going to cache the newly created post in Distributed Redis
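The pre-signed-URL handshake above can be made concrete with a simplified HMAC-signed URL. This is only a stand-in for S3's real SigV4 signing (the secret, domain, and URL format are assumptions), shown to illustrate why the client can upload directly without further authentication:

```python
import hashlib
import hmac
import time
from typing import Optional

SECRET = b"server-side-secret"   # assumption: known only to server and storage

def presign(bucket, key, expires_in=3600, now: Optional[float] = None):
    """Return an upload URL carrying an expiry and an HMAC signature.

    Simplified stand-in for S3's SigV4 presigned URLs: the client can PUT to
    this URL without further authentication until the expiry passes.
    """
    expiry = int((now if now is not None else time.time()) + expires_in)
    payload = f"{bucket}/{key}:{expiry}".encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"https://{bucket}.example-storage.com/{key}?expires={expiry}&sig={sig}"

def verify(bucket, key, expiry, sig, now: Optional[float] = None):
    """Storage-side check: signature matches and the URL has not expired."""
    payload = f"{bucket}/{key}:{expiry}".encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    current = now if now is not None else time.time()
    return hmac.compare_digest(expected, sig) and current < expiry
```

The storage side can verify the signature without a round trip to the application server, which is what makes direct upload work.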
A post might also contain a video. We can do one optimization related to video streaming. For video streaming, we have two requirements
- The uploaded video should be suitable for viewing across multiple devices.
- Different people around the world will have different network speeds. Adaptive Bit Rate Streaming means choosing different resolutions based on network speed so that there is no buffering.
To fulfill the above two requirements we can do the below things
- Transcode the video into a different format
- Convert each transcoded video into different resolutions, e.g. 144p, 480p, 1080p
There could be separate services related to video management that do the above things.
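The adaptive-bit-rate idea above boils down to picking the highest rendition whose bit rate fits the client's measured bandwidth. A sketch, where the bit-rate numbers are rough assumptions:

```python
# Pick the best rendition for the client's measured bandwidth.
# Approximate bit rates per resolution are assumptions for illustration.

RENDITIONS = [           # (resolution, approx. required kbps), highest first
    ("1080p", 5000),
    ("480p", 1500),
    ("144p", 300),
]

def pick_rendition(bandwidth_kbps):
    """Return the highest resolution the connection can sustain without buffering."""
    for resolution, required in RENDITIONS:
        if bandwidth_kbps >= required:
            return resolution
    return RENDITIONS[-1][0]   # fall back to the lowest rendition
```

A real player re-measures bandwidth continuously and switches renditions mid-stream, but the selection rule is essentially this.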
Below is the high-level diagram for the same
How Timeline will be generated
Let’s see how the user timeline will be generated. A user’s timeline will be updated in advance in a cache. We have already seen what happens when somebody posts a status. Let’s see what happens after that so that a user timeline is generated.
First, let’s explore the different methods of timeline generation
Different Methods of Timeline Generation
There are four major approaches for generating the timeline
- Fetch updates at Run Time (Client to Server)
- Timeline Pre Generation Using the Pull Method (Fetch from DB and Client to Server)
- Timeline Pre Generation Using the Push Method (Event-Driven and Client to Server)
- Pushing updates as and when available ( Server to client)
We will use the combination of these methods to eventually generate the timeline of a user depending upon
- If the user is an active user or an inactive user
- The user follows other celebrities who could have millions of followers
All these methods will depend on a timestamp value kept per user for building the timeline. This timestamp tracks what updates the user has already seen. If the timestamp for the user is 23 Feb 2022 at 11:15 GMT, it means they have seen all updates before that time. When a call comes to the server, it will only return the updates created after that timestamp
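The last-seen-timestamp mechanism can be sketched as filtering the followed users' posts by creation time, with an in-memory dict standing in for the post store:

```python
# Return the updates a user has not seen yet, based on their last-seen
# timestamp. `posts_by_user` stands in for the post table; timestamps are
# plain integers for simplicity.

def fetch_updates(followed, posts_by_user, last_seen_ts):
    updates = [
        post
        for user in followed
        for post in posts_by_user.get(user, [])
        if post["created"] > last_seen_ts
    ]
    # newest first, as a timeline would show them
    return sorted(updates, key=lambda p: p["created"], reverse=True)
```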
Let’s discuss these four methods now. But before we discuss all the methods, let’s discuss the user scenario first
There is user X who follows the below users
- A
- B
- C
- D
- E
- F
Below is more data
- A, B, C, D are normal users who have followers in 100s
- E is a celebrity who has millions of followers
- F is a business account that also has millions of followers
First Method – Fetch updates at Run Time
In the first method, the client application will make a call to the server. The server will, at run time, fetch all updates for the user.
On receiving the pull call, the server will fetch the current timestamp of the user. Let’s say that timestamp is t1. Then it will make a call to the POST service to fetch the updates of users A, B, C, D, E, and F that were created after t1. Once it has fetched that data, it will send it to the client. Below is the diagram for the same
There are multiple problems with this approach-
- First of all, it is a time-consuming approach, because it fetches all the updates at run time on every request
- Second, it fetches updates from all the followed users, and we have already mentioned that the post table is sharded on user_id. Hence this query will hit multiple shards, which has a performance cost.
- This approach is not scalable
Second Method – Timeline Generation Using the Pull Method (Fetch from DB)
In the second method, there will be an additional timeline_create service that is going to create the timeline of the user beforehand. This worker will be invoked for each user periodically. It will fetch the current timestamp of the user. Let’s say that timestamp is t1. Then it will make a call to the database to fetch the updates of users A, B, C, D, E, and F that were created after t1. Once it fetches the updates, it is going to insert them into some kind of database or cache. We will call this the timeline database or timeline cache. Below is the schema for this database.
- user_id
- post_id
- created
- updated
Below is the diagram for the same
Some problems with this approach
- There might not be new updates for the user. Hence even though we try to find new posts for the user, many of these runs come back empty.
- We fetch and update the timeline of a user at regular intervals. It could very well be the case that the user is not active at all, yet we would still be generating their timeline
Third Method – Timeline Generation Using the Push Method
In this method
- When a user posts a status update, a message is published on SNS/Kafka
- This message is picked by the timeline_init service which is a worker.
- The timeline_init service fetches the followers in batches of 100. For every follower, it fans out the message again to the timeline_create service, which is also a worker. This process is repeated for all the followers
- timeline_create service on receiving the message will update the timeline of the follower with that post_id
Below is the diagram for the same
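The fan-out steps above can be sketched with an in-memory queue standing in for Kafka/SNS; the batch size and message shapes are assumptions:

```python
from collections import deque

# In-memory sketch of the push-method fan-out. The deque stands in for
# the Kafka + SNS/SQS system between the two workers.

BATCH_SIZE = 100

def timeline_init(post_event, followers_of, fanout_queue):
    """Fan a new-post event out to one message per follower, in batches."""
    followers = followers_of.get(post_event["user_id"], [])
    for i in range(0, len(followers), BATCH_SIZE):
        for follower in followers[i:i + BATCH_SIZE]:
            fanout_queue.append({"follower_id": follower,
                                 "post_id": post_event["post_id"]})

def timeline_create(fanout_queue, timelines):
    """Drain fan-out messages, prepending each post to the follower's timeline."""
    while fanout_queue:
        msg = fanout_queue.popleft()
        timelines.setdefault(msg["follower_id"], []).insert(0, msg["post_id"])
```

The celebrity problem is visible here: one post by a user with millions of followers produces millions of fan-out messages.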
Some problems with this approach
- A celebrity or a business account could have a large number of followers. And when such accounts post a status then all million followers’ timeline needs to be updated. This is not efficient.
- Again similar to approach 2 we might be updating the timeline of a user who is not active
Fourth Method – Pushing updates to client as and when available
This is server-to-client communication, and it is event-driven as well: as soon as an update is available, it is pushed from the server to the client. So there is no pre-generated timeline for the user, since updates go directly from the server to the client. Such server-to-client communication requires a different type of protocol, such as web sockets.
Below is the diagram for this approach.
Some problems with this approach
- Server-to-client communication requires sockets or any other similar technology. These technologies are expensive in terms of resource consumption and require a persistent connection to be maintained
- What if the client is offline? In that case, when the server realizes while pushing that the client is offline, it can save the updates to the DB and the cache.
- Web sockets are useful for very real-time applications such as WhatsApp and might not be that suitable for Instagram. Hence this method only makes sense if Instagram already uses web sockets
Recommended Approach
As we can see, each of the methods above has some disadvantages. Therefore the timeline generation for a user will depend on whether
- The user is an active user
- The user is an inactive user
The user is an active user
In this case, we can use a combination of method 1 and method 3.
- Using method 3 we can generate the timeline for that user only for those accounts which have followers in 100s
- Then at run time we can fetch the updates from all the celebrities and business accounts and merge them with the generated timeline in step 1.
Let’s understand the above with an example. As we already mentioned, there is a user X who follows the below users; against each user are the posts they have created
- A – A1->A2
- B – B1
- C – C1
- D – D1->D2
- E – E1
- F – F1->F2
- A, B, C, D are normal users who have followers in 100s
- E is a celebrity who has millions of followers
- F is a business account that also has millions of followers
So when the user is an active user below is how the timeline will be generated.
- A, B, C, D are normal users who have followers in the 100s. Hence the timeline for X containing the posts of these users will be generated using Method 3. So the timeline for user X will be generated as below and saved in a cache
A1->A2->B1->C1->D1->D2
- Timeline for user X will not be generated having posts from user E and user F
- When the user makes a call to fetch the timeline, at run time it will fetch the updates/posts from users E and F. It will then merge them with the already generated timeline from users A, B, C, D and return the result
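The merge step above (pre-generated timeline plus run-time celebrity posts) can be sketched as a merge by creation time, newest first:

```python
# Merge the pre-generated timeline (posts from normal accounts) with posts
# fetched at run time from celebrity/business accounts, newest first.

def merge_timeline(pregenerated, celebrity_posts):
    return sorted(pregenerated + celebrity_posts,
                  key=lambda p: p["created"], reverse=True)
```

For example, merging the cached posts A1, A2 with a newer celebrity post E1 returns E1 first, then A2, then A1.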
The user is not an active user
In this case, we can only use method 1. Since the user is not active and only opens the app, say, once a week, there is no point generating a timeline for the user beforehand, as it is a waste of storage. So in this case, when the user makes a call to fetch the timeline, at run time it will fetch the updates/posts from users A, B, C, D, E, F.
But this is not efficient if the user follows a large number of accounts. This is where method 2 comes into the picture. When a user comes online, we can use method 1 to fetch the updates from some of the accounts they follow. Then we can trigger a background job to generate the remaining timeline for the user using method 2. Below is an example to understand it
- When the user makes a call to fetch the timeline, then at run time it will fetch the updates/posts from A, B, C
- A background job is triggered to create the rest of the timeline of the user using method 2 for updates/posts from D, E, F. By the time user is watching the first set of updates the second set of updates can be fetched from the generated timeline
So overall, method 2 can be used at run time to generate the timeline for a user. Method 2 can also be preferred when Method 1 is very expensive.

Below is the high-level diagram for Timeline Generation. Assume there are two users X and Y, and Y is a follower of X. Y also follows other users who are celebrities. In the high-level diagram below, we use a combination of Method 1 and Method 3.
- Using Method 1 it fetches updates from celebrity users which User Y follows
- Using Method 3, it pre-creates the timeline for the User Y. This timeline is created with posts from users who are not a celebrity and which User Y follows.
Below is the high-level flow
For User A
- User A wants to create a post containing an image. It calls the POST service to fetch the presigned URL. Using the presigned URL, it uploads the image to S3 directly.
- Then it calls the Post service directly to create the post. The post is saved in the DB as well as in the cache.
- After the post is created in the database, the Post Service sends a message to a Kafka + SNS/SQS system.
- This message will be picked by a Timeline_Init Service which is a worker, this worker is going to make a call to the Follower Service to fetch all followers of the user who is the owner of the post. Then it is going to fan out the message for each of the followers to the Kafka + SNS/SQS system.
- Each fanout message will be picked by another worker which will be a Timeline_Create Worker. It will create the news feed for User B. This is where method 3 of Timeline Generation comes into the picture.
For User B
- User B wants to fetch his timeline
- Instagram makes a call to the Timeline App Service. The Timeline App Service does two things
- First, it makes a call to the Follower Service to fetch the celebrity users that User B follows. Then it fetches the updates of those celebrity users from the POST service. This is where method 1 comes into the picture
- Second, it fetches the pre-created timeline for User B from the cache and DB.
- It merges both results and returns them to the Instagram client for User B
- Instagram then uses URLs in the returned timeline to directly download the images/videos from S3. We can also cache the photo/video on the CDN. This will enable faster retrieval of the media.
This is all about timeline generation in Instagram.
Other common components
Other common components could be
- User Service – It holds the user profile information.
- Token/Auth Service – Management of User tokens
- SMS Service- It is used for sending any kind of message back to the user. For example – OTP
- Analytics Service – This could be used to track any kind of analytics
Non-Functional Requirements
Let’s discuss some non-functional requirements now
Scalability
The first thing to consider with the above design is the scalability factor. The scalability of each component in the system is very important. Here are some scalability challenges you may face and their possible resolutions
- Each of the machines in the Post Service, Timeline Service, etc. can only serve a limited number of requests. Hence each service should have proper autoscaling configured so that instances are added or removed based on the number of requests
- Your Kafka/SNS system might not be able to take that much load. We can scale it horizontally, but only up to a limit. If it becomes a bottleneck, then depending on geography or user_id we can have two or more such systems. Service discovery can be used to figure out which Kafka system a request should go to.
- Another important factor for scalability is that we have designed the system so that none of the services is bogged down with too many responsibilities. There is a separation of concerns, and wherever a service had too much responsibility, we have broken it down
- Another factor of scalability is sharding. There is a huge amount of data to store, and obviously it cannot fit on a single machine. It is good to partition the data across different machines so that the overall architecture is scalable. We need to choose the number of shards based on our storage estimates. We also need to choose the shard key or partition key smartly so that no query is multi-shard
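Shard-key routing as described above can be sketched with simple hash-based partitioning. The shard count here is an assumption; real deployments often use consistent hashing instead, which eases resharding when nodes are added:

```python
import hashlib

NUM_SHARDS = 16   # assumption; chosen from storage estimates in practice

def shard_for(key, num_shards=NUM_SHARDS):
    """Map a shard key (user_id for posts, post_id for comments/likes) to a shard.

    A stable hash (md5) is used rather than Python's built-in hash(),
    which is randomized per process.
    """
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards
```

All rows with the same key land on the same shard, which is exactly why e.g. all comments of a post can be fetched from a single shard.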
Low latency
- We can cache newly created posts, with some expiry of course. A post that has just been created is likely to appear in other users’ timelines soon. This will reduce latency for many read calls.
- There is one more optimization to improve latency: we cache the timeline of a user in addition to saving it to the DB. With the cache, the timeline can be returned faster
- We can cache the photos and videos to a CDN. It will enable faster retrieval of the media
- Another area to further improve latency is optimizing video streaming.
Availability
In order for the system to be highly available, it is very important to have redundancy/backups for almost all components in the system. Here are some of the things that need to be done.
- In the case of our DB, we need to enable replication. There should be multiple slaves for each of the master shard nodes.
- For Redis we also need replication.
- For data redundancy, we could go multi-region as well. This also helps if one of the regions goes down.
- Disaster Recovery could also be set up
Alerting and Monitoring
Alerting and monitoring is also a very important non-functional requirement. We should monitor each of our services and set up proper alerts. Some of the things that could be monitored are
- API Response Time
- Memory Consumption
- CPU Consumption
- Disk Space Consumption
- Queue Length
- ….
Moving closer to user location
There are a couple of architectures that could be followed here. One such architecture is the Cell Architecture. You can read more about it here – https://github.com/wso2/reference-architecture/blob/master/reference-architecture-cell-based.md
Avoiding Single Point of Failures
A single point of failure is a part of the system whose failure would cause the entire system to fail. We should try to prevent any single point of failure in our design. With redundancy and going multi-region, we can prevent this.
Conclusion
This is all about the system design of Instagram. Hope you liked this article. Please share feedback in the comments.