Sept. 9, 2021
Below is a simple framework for Mobile System Design interviews. As an example, we are going to use the “Design Twitter Feed” question. The proposed solution is far from perfect, but perfection is not the point of a system design round: no one expects you to build a robust system in just 30 minutes. The interviewer is mostly looking for specific “signals” in your thought process and communication. Everything you say should showcase your strengths and help the interviewer evaluate you as a candidate.
Disclaimer
The framework was heavily inspired by the similar “Scalable Backend Design” articles. Learning the framework does not guarantee success during the interview. The structure of the interview process depends on the personal style of the interviewer. The dialog is really important — make sure you understand what the interviewer is looking for. Make no assumptions and ask clarifying questions.
Server-side components:
- Backend
Represents the whole server-side infrastructure. Most likely, your interviewer won’t be interested in discussing it.
- Push Provider
Represents the Mobile Push Provider infrastructure. Receives push payloads from the Backend and delivers them to clients.
- CDN (Content Delivery Network)
Responsible for delivering static content to clients.
Client-side components:
- API Service
Abstracts client-server communications from the rest of the system.
- Persistence
A single source of truth. The data your system receives gets persisted on the disk first and then propagated to other components.
- Repository
A mediator component between API Service and Persistence.
- Tweet Feed Flow
Represents a set of components responsible for displaying an infinite scrollable list of tweets.
- Tweet Details Flow
Represents a set of components responsible for displaying a single tweet’s details.
- DI Graph
Dependency injection graph.
- Image Loader
Responsible for loading and caching static images. Usually represented by a 3rd-party library.
- Coordinator
Organizes flow logic between Tweet Feed and Tweet Details components. Helps decouple components of the system from each other.
- App Module
An executable part of the system which “glues” components together.
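To make the “single source of truth” idea concrete, here is a minimal, language-agnostic sketch written in Python. It assumes an in-memory stand-in for on-disk storage and a fake API; the names `Persistence`, `FeedRepository`, `FakeApi`, and `observe_feed` are hypothetical and only illustrate the data flow: the Repository writes fetched data into Persistence, and the UI only ever observes Persistence.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Protocol


@dataclass(frozen=True)
class Tweet:
    id: str
    text: str


class ApiService(Protocol):
    """Anything that can fetch the feed from the backend."""

    def fetch_feed(self) -> List[Tweet]: ...


class Persistence:
    """In-memory stand-in for on-disk storage (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._tweets: Dict[str, Tweet] = {}
        self._observers: List[Callable[[List[Tweet]], None]] = []

    def observe(self, callback: Callable[[List[Tweet]], None]) -> None:
        # Register the observer and immediately emit the current snapshot.
        self._observers.append(callback)
        callback(self.all())

    def save(self, tweets: List[Tweet]) -> None:
        # Upsert by id, then notify every observer with a fresh snapshot.
        for tweet in tweets:
            self._tweets[tweet.id] = tweet
        for callback in self._observers:
            callback(self.all())

    def all(self) -> List[Tweet]:
        return list(self._tweets.values())


class FeedRepository:
    """Mediates between the API Service and Persistence."""

    def __init__(self, api: ApiService, persistence: Persistence) -> None:
        self._api = api
        self._persistence = persistence

    def observe_feed(self, callback: Callable[[List[Tweet]], None]) -> None:
        # The UI reads from persistence only, never from the network directly.
        self._persistence.observe(callback)

    def refresh(self) -> None:
        # Network results are persisted first; observers are updated from storage.
        self._persistence.save(self._api.fetch_feed())


class FakeApi:
    def fetch_feed(self) -> List[Tweet]:
        return [Tweet("t1", "hello"), Tweet("t2", "world")]


snapshots: List[List[Tweet]] = []
repo = FeedRepository(FakeApi(), Persistence())
repo.observe_feed(snapshots.append)  # first snapshot is the (empty) cache
repo.refresh()                       # second snapshot arrives after the fetch
```

Because both dependencies are passed into the constructor, the same repository can be wired up by a DI graph in production and by fakes in tests.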
Providing the “signal”
The interviewer might be looking for the following signals:
- The candidate can present the “big picture” without overloading it with unnecessary implementation details.
- The candidate can identify the major building blocks of the system and how they communicate with each other.
- The candidate has app modularity in mind and is capable of thinking in the scope of the entire team and not limiting themselves as a single contributor (this might be more important for senior candidates).
Deep Dive: Tweet Feed Flow
After a high-level discussion, your interviewer might steer the conversation towards some specific component of the system. Let’s assume it was “Tweet Feed Flow”. Things you might want to talk about:
- Architecture patterns
MVP, MVVM, MVI, etc. MVC is considered a poor choice these days. It’s better to select a well-known pattern since it makes it easier to onboard new hires (compared to some less known home-grown approaches).
- Pagination
Essential for infinite scroll functionality. For more details see Pagination.
- Dependency injection
Helps to build an isolated and testable module.
- Image Loading
Low-res vs full-res image loading, scrolling performance, etc.
Components
- Feed API Service: abstracts the Twitter Feed API client and provides the functionality for requesting paginated data from the backend. Injected via DI-graph.
- Feed Persistence: abstracts cached paginated data storage. Injected via DI-graph.
- Remote Mediator: triggers fetching the next/prev page of data. Redirects the newly fetched paged response into a persistence layer.
- Feed Repository: consolidates remote and cached responses into a Pager object through Remote Mediator.
- Pager: triggers data fetching from the Remote Mediator and exposes an observable stream of paged data to UI.
- “Tweet Like” and “Tweet Details” use cases: provide delegated implementation for “Like” and “Show Details” operations. Injected via DI-graph.
- Image Loader: abstracts image loading from the image loading library. Injected via DI-graph.
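The interplay between the Remote Mediator, the persistence layer, and the Pager can be sketched roughly as follows. This is a simplified illustration in Python, not the API of any real paging library: `fake_fetch_page`, `load_next`, `load_more`, and the callback-based "observable" are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# A page is (tweet ids, cursor for the next page or None when the feed ends).
Page = Tuple[List[str], Optional[str]]


class FeedPersistence:
    """In-memory stand-in for the cached paginated storage."""

    def __init__(self) -> None:
        self.items: List[str] = []

    def append_page(self, items: List[str]) -> None:
        self.items.extend(items)


class RemoteMediator:
    """Fetches the next page and redirects it into the persistence layer."""

    def __init__(self, fetch_page: Callable[[Optional[str], int], Page],
                 persistence: FeedPersistence) -> None:
        self._fetch_page = fetch_page  # injected API call
        self._persistence = persistence
        self._next_cursor: Optional[str] = None
        self.end_reached = False

    def load_next(self, limit: int) -> None:
        if self.end_reached:
            return  # nothing left to fetch
        items, next_cursor = self._fetch_page(self._next_cursor, limit)
        self._persistence.append_page(items)
        self._next_cursor = next_cursor
        self.end_reached = next_cursor is None


class Pager:
    """Triggers loading and pushes the cached list to the UI callback."""

    def __init__(self, mediator: RemoteMediator, persistence: FeedPersistence,
                 on_update: Callable[[List[str]], None], page_size: int = 2) -> None:
        self._mediator = mediator
        self._persistence = persistence
        self._on_update = on_update
        self._page_size = page_size

    def load_more(self) -> None:
        self._mediator.load_next(self._page_size)
        # The UI always receives data read back from persistence.
        self._on_update(list(self._persistence.items))


def fake_fetch_page(cursor: Optional[str], limit: int) -> Page:
    """Hypothetical backend: three tweets, the cursor is just a list index."""
    tweets = ["t1", "t2", "t3"]
    start = int(cursor) if cursor else 0
    chunk = tweets[start:start + limit]
    end = start + len(chunk)
    return chunk, (str(end) if end < len(tweets) else None)


updates: List[List[str]] = []
persistence = FeedPersistence()
mediator = RemoteMediator(fake_fetch_page, persistence)
pager = Pager(mediator, persistence, updates.append)
pager.load_more()  # fetches t1, t2
pager.load_more()  # fetches t3 and reaches the end of the feed
```

In a real app the UI would call something like `load_more` when the user scrolls near the end of the list.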
Providing the “signal”
The interviewer might be looking for the following signals:
- The candidate is familiar with the most common MVx patterns.
- The candidate achieves a clear separation between business logic and UI.
- The candidate is familiar with dependency injection methods.
- The candidate is capable of designing self-contained isolated modules.
API Design
The goal is to cover as much ground as possible, but you won’t have enough time to cover every API call. Ask the interviewer if they are particularly interested in a specific part, or choose something you know best if they don’t have a strong preference.
Real-time notifications
We need to provide real-time notifications support as a part of the design. Below are some of the approaches you can mention during the discussion.
Push Notifications
Based on the operating system push notification service (OSPNS) and delivered from a 3rd-party push provider.
pros:
- easier to implement compared to a dedicated service.
- can wake the app in the background.
cons:
- not 100% reliable.
- may take up to a minute to arrive.
- relies on a 3rd-party service.
HTTP-polling
Polling requires the client to periodically ask the server for updates. The biggest concern is the amount of unnecessary network traffic and increased backend load.
Short HTTP-polling
The client sends requests to the server at a short, fixed interval, whether or not there is anything new.
pros:
- simple and not as expensive (if the time between requests is long).
- no need to keep a persistent connection.
cons:
- the notification can be delayed for as long as the polling time interval.
- additional overhead from repeated TLS handshakes and HTTP headers.
Long HTTP-polling
The client sends a single request and the server holds it open until it has data to return (or a timeout expires). The client then issues the next request.
pros:
- instant notification (no additional delay).
cons:
- more complex and requires more server-side resources.
- keeps a persistent connection until the server replies.
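The latency difference between the two polling styles can be illustrated with a small sketch. It is a simulation under stated assumptions rather than real networking code: `short_poll_delivery_time` assumes polls happen at exact multiples of the interval, and `long_poll` uses a local `queue.Queue` as a stand-in for a server that holds the request open.

```python
import math
import queue
from typing import Optional


def short_poll_delivery_time(event_time: float, interval: float) -> float:
    """Time of the first poll at or after the event, with polls at 0, interval, 2*interval, ..."""
    return math.ceil(event_time / interval) * interval


def long_poll(server_events: "queue.Queue[str]", timeout: float) -> Optional[str]:
    """Block until the 'server' has an event, or return None when the timeout expires."""
    try:
        return server_events.get(timeout=timeout)
    except queue.Empty:
        return None  # the client would simply issue the next long-poll request


# Short polling: an event produced at t=12s is only seen at the t=30s poll.
seen_at = short_poll_delivery_time(12.0, 30.0)

# Long polling: the waiting request returns as soon as an event is available.
events: "queue.Queue[str]" = queue.Queue()
events.put("like:t123")
first = long_poll(events, timeout=0.1)
second = long_poll(events, timeout=0.01)  # nothing pending, so the request times out
```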
Server-Sent Events
Allows the server to stream events to the client over a single long-lived HTTP connection, without polling.
pros:
- real-time traffic using a single connection.
cons:
- keeps a persistent connection.
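On the wire, an SSE stream is plain text: `event:` and `data:` fields, comment lines starting with `:`, and a blank line that terminates each event. The sketch below parses such a stream. It is a simplified reader that ignores the `id` and `retry` fields, and the `like` event name and payload are made up for the example.

```python
from typing import Dict, Iterable, Iterator, List


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per event from an iterable of text/event-stream lines."""
    event_type = "message"  # default event type when no `event:` field is sent
    data_lines: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far (if any).
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        # `id` and `retry` are ignored in this sketch.


stream = [
    ": keep-alive\n",
    "event: like\n",
    'data: {"tweet_id": "t123", "likes": 12346}\n',
    "\n",
    "data: first line\n",
    "data: second line\n",
    "\n",
]
events = list(parse_sse(stream))
```

Multiple `data:` lines within one event are joined with newlines, which is why the second event carries a two-line payload.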
Web-Sockets
Provide bi-directional communication between client and server.
pros:
- can transmit both binary and text data.
cons:
- more complex to set up compared to Polling/SSE.
- keeps a persistent connection.
The interviewer would expect you to pick a concrete approach most suitable for the design task at hand. One possible solution for the “Design Twitter Feed” question could be using a combination of SSE (a primary channel of receiving real-time updates on “likes”) with Push Notifications (sent if the client does not have an active connection to the backend).
Protocols
REST
REST is a stateless, text-based architectural style over HTTP and the most popular choice for CRUD (Create, Read, Update, and Delete) operations.
pros:
- easy to learn, understand, and implement.
- easy to cache using a built-in HTTP caching mechanism.
- loose coupling between client and server.
cons:
- can be less efficient on mobile platforms: without connection reuse (HTTP keep-alive or HTTP/2 multiplexing), each request pays for a new connection.
- schemaless — it’s hard to check data validity on the client.
- stateless — needs extra functionality to maintain a session.
- additional overhead — every request contains contextual metadata and headers.
GraphQL
A query language for working with APIs. It allows clients to request data from several resources using a single endpoint (instead of making multiple requests in traditional RESTful apps).
pros:
- schema-based typed queries — clients can verify data integrity and format.
- highly customizable — clients can request specific data and reduce the amount of HTTP traffic.
- bi-directional communication with GraphQL Subscriptions (WebSocket based).
cons:
- more complex backend implementation.
- “leaky-abstraction” — clients become tightly coupled to the backend.
- the performance of a query is bound to the performance of the slowest service on the backend (in case the response data is federated between multiple services).
WebSocket
Full-duplex communication over a single TCP connection.
pros:
- real-time bi-directional communication.
- provides both text-based and binary traffic.
cons:
- requires maintaining an active connection — might have poor performance on unstable cellular networks.
- schemaless — it’s hard to check data validity on the client.
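- every open connection consumes server resources (file descriptors, memory), which caps how many a single server can hold; the often-quoted 65k figure is a per-client port limit rather than a hard per-server cap.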
gRPC
Remote Procedure Call framework which runs on top of HTTP/2. Supports bi-directional streaming using a single physical connection.
pros:
- lightweight binary messages (much smaller compared to text-based protocols).
- schema-based — built-in code generation with Protobuf.
- supports event-driven architectures: server-side streaming, client-side streaming, and bi-directional streaming.
- multiple parallel requests.
cons:
- limited browser support.
- non-human-readable format.
- steeper learning curve.
The interviewer would expect you to pick a concrete approach most suitable for the design task at hand. Since the API layer for the “Design Twitter Feed” question is pretty simple and does not require much customization, we can select an approach based on REST.
Pagination
Endpoints that return a list of entities must support pagination. Without pagination, a single request could return a huge number of results, causing excessive network and memory usage.
Offset Pagination
Provides limit and offset query parameters. Example: GET /feed?offset=100&limit=20.
pros:
- easiest to implement — the request parameters can be passed directly to a SQL query.
- stateless on the server.
cons:
- bad performance on large offset values (the database needs to skip offset rows before returning the paginated result).
- inconsistent when adding new rows into the database (Page Drift).
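Page Drift is easy to reproduce with a plain list standing in for the database table. In this small sketch (the `offset_page` helper and tweet ids are made up), a new tweet arrives between two page requests and shifts every row down by one, so the second page repeats an item the client has already seen.

```python
from typing import List


def offset_page(rows: List[str], offset: int, limit: int) -> List[str]:
    """Equivalent of `... LIMIT limit OFFSET offset` on an already sorted table."""
    return rows[offset:offset + limit]


feed = ["t5", "t4", "t3", "t2", "t1"]          # newest first
page1 = offset_page(feed, offset=0, limit=2)   # ["t5", "t4"]

feed.insert(0, "t6")                           # a new tweet is posted in between

page2 = offset_page(feed, offset=2, limit=2)   # ["t4", "t3"]: "t4" shows up again
duplicates = set(page1) & set(page2)
```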
Keyset Pagination
Uses the values from the last page to fetch the next set of items. Example: GET /feed?after=2021-05-25T00:00:00&limit=20.
pros:
- translates easily into a SQL query.
- good performance with large datasets.
- stateless on the server.
cons:
- “leaky abstraction” — the pagination mechanism becomes aware of the underlying database storage.
- only works on fields with a natural ordering (timestamps, etc.).
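As a rough illustration of how the `after` parameter maps onto a query, here is a sketch using an in-memory SQLite database. The `tweets` table, its columns, and the `feed_page` helper are assumptions for the example, and it further assumes that timestamps are unique and stored as ISO 8601 strings in one format, so that string comparison matches chronological order.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id TEXT PRIMARY KEY, created_at TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?)",
    [
        ("t1", "2021-05-25T10:00:00"),
        ("t2", "2021-05-25T11:00:00"),
        ("t3", "2021-05-25T12:00:00"),
        ("t4", "2021-05-25T13:00:00"),
    ],
)


def feed_page(db: sqlite3.Connection, after: str, limit: int) -> List[Tuple[str, str]]:
    """Return up to `limit` (id, created_at) rows created strictly after `after`, oldest first."""
    return db.execute(
        "SELECT id, created_at FROM tweets "
        "WHERE created_at > ? ORDER BY created_at ASC LIMIT ?",
        (after, limit),
    ).fetchall()


# First page, then use the last row's timestamp as the key for the next page.
page1 = feed_page(conn, after="2021-05-25T00:00:00", limit=2)
page2 = feed_page(conn, after=page1[-1][1], limit=2)
```

With an index on `created_at` the database can seek straight to the key instead of skipping rows, which is where the performance benefit over offset pagination comes from.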
Cursor/Seek Pagination
Operates with stable ids which are decoupled from the database SQL queries (usually, a selected field is encoded using base64 and encrypted on the backend side). Example: GET /feed?after_id=t1234xzy&limit=20.
pros:
- decouples pagination from SQL database.
- consistent ordering when new items are inserted.
cons:
- more complex backend implementation.
- does not work well if items get deleted (ids might become invalid).
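One way a backend might build such an opaque cursor is to serialize the sort key and base64-encode it. The sketch below shows only the encoding step; the payload fields are hypothetical, and the encryption (or signing) mentioned above is deliberately left out to keep the example short.

```python
import base64
import json
from typing import Dict


def encode_cursor(last_id: str, created_at: str) -> str:
    """Pack the position of the last returned item into a URL-safe opaque string."""
    payload = json.dumps({"id": last_id, "created_at": created_at}, separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_cursor(cursor: str) -> Dict[str, str]:
    """Inverse of encode_cursor; a real backend would also verify integrity here."""
    return json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))


cursor = encode_cursor("t123", "2021-05-25T17:59:59.000Z")
position = decode_cursor(cursor)
```

The client treats the cursor as an opaque token and sends it back unchanged, so the backend is free to change what it encodes without breaking clients.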
You need to select a single approach after listing the possible options and discussing their pros and cons. We’ll pick Cursor Pagination in the scope of the “Design Twitter Feed” question. A sample API request might look like this:
GET /v1/feed?after_id=p1234xzy&limit=20
Authorization: Bearer <token>
{
  "data": {
    "items": [
      {
        "id": "t123",
        "author_id": "a123",
        "title": "Title",
        "description": "Description",
        "likes": 12345,
        "comments": 10,
        "attachments": {
          "media": [
            {
              "image_url": "https://static.cdn.com/image1234.webp",
              "thumb_url": "https://static.cdn.com/thumb1234.webp"
            },
            ...
          ]
        },
        "created_at": "2021-05-25T17:59:59.000Z"
      },
      ...
    ]
  },
  "cursor": {
    "count": 20,
    "next_id": "p1235xzy",
    "prev_id": null
  }
}
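On the client, a response like this would typically be mapped into typed models before it reaches the rest of the app. Here is a minimal sketch of that mapping; the `Tweet` and `FeedPage` classes and the choice of fields are assumptions, and the sample body is a trimmed, complete version of the response above.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Tweet:
    id: str
    author_id: str
    likes: int
    thumb_urls: List[str]


@dataclass(frozen=True)
class FeedPage:
    tweets: List[Tweet]
    next_id: Optional[str]  # None means there are no more pages


def parse_feed_page(body: str) -> FeedPage:
    """Map the JSON response into models; attachments are treated as optional."""
    doc = json.loads(body)
    tweets = [
        Tweet(
            id=item["id"],
            author_id=item["author_id"],
            likes=item["likes"],
            thumb_urls=[m["thumb_url"] for m in item.get("attachments", {}).get("media", [])],
        )
        for item in doc["data"]["items"]
    ]
    return FeedPage(tweets=tweets, next_id=doc["cursor"]["next_id"])


sample_body = """
{
  "data": {"items": [{
    "id": "t123", "author_id": "a123", "likes": 12345,
    "attachments": {"media": [{
      "image_url": "https://static.cdn.com/image1234.webp",
      "thumb_url": "https://static.cdn.com/thumb1234.webp"}]}
  }]},
  "cursor": {"count": 20, "next_id": "p1235xzy", "prev_id": null}
}
"""
page = parse_feed_page(sample_body)
# The next request would then be: GET /v1/feed?after_id=<page.next_id>&limit=20
```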
Authentication
Although we left it out of scope, it’s still beneficial to mention HTTP authentication. You can include the Authorization header and discuss how to properly handle 401 Unauthorized response scenarios. Also, don't forget to talk about Rate-Limiting strategies (429 Too Many Requests).
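If the interviewer wants to go one level deeper, the two status codes can be handled by a small retry wrapper. The following is a sketch under several assumptions: `send`, `refresh_token`, and `sleep` are injected hypothetical callables (so the logic can be exercised without a network), header names are already normalized, and `Retry-After` is given in seconds rather than as an HTTP date.

```python
from typing import Callable, Dict, List, Tuple

Response = Tuple[int, Dict[str, str]]  # (status code, headers)


def send_with_recovery(
    send: Callable[[str], Response],
    token: str,
    refresh_token: Callable[[], str],
    sleep: Callable[[float], None],
    max_attempts: int = 3,
) -> Response:
    """Send a request, refreshing the token on 401 and backing off on 429."""
    response = send(token)
    for _ in range(max_attempts - 1):
        status, headers = response
        if status == 401:
            token = refresh_token()  # e.g. exchange a refresh token for a new access token
        elif status == 429:
            sleep(float(headers.get("Retry-After", "1")))  # wait as instructed by the server
        else:
            break  # success or an error this wrapper does not handle
        response = send(token)
    return response


# Fake transport: first 401, then 429 with Retry-After, then 200.
scripted: List[Response] = [(401, {}), (429, {"Retry-After": "2"}), (200, {})]
used_tokens: List[str] = []
sleeps: List[float] = []


def fake_send(token: str) -> Response:
    used_tokens.append(token)
    return scripted[len(used_tokens) - 1]


result = send_with_recovery(fake_send, "old", lambda: "new", sleeps.append)
```

Capping the number of attempts matters on mobile: unbounded retries drain the battery and can make a rate-limiting situation worse.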
Make sure to keep it brief and simple (without unnecessary details): your primary goal during a system design interview is to provide a "signal" and not to build a production-ready solution.
Providing the “signal”
The interviewer might be looking for the following signals:
- The candidate is aware of the challenges related to poor network conditions and expensive traffic.
- The candidate is familiar with the most common protocols for unidirectional and bi-directional communication.
- The candidate is familiar with RESTful API design.
- The candidate is familiar with authentication and security best practices.
- The candidate is familiar with network error handling and rate-limiting.
Conclusion
There’s a significant amount of randomness during a system design interview. The process and the structure can vary depending on the company and the interviewer.
Things you can control
- Your attitude — always be friendly no matter how the interview goes. Don’t be pushy and don’t argue with the interviewer — this might provide a bad “signal”.
- Your preparation — the better your preparation is, the bigger the chance of a positive outcome. Practice mock design interviews with your peers (you can find people on Teamblind).
- Your knowledge — the more knowledge you have, the better your chances are.
  - Gain more experience.
  - Study popular open-source projects: iOS, Android
  - Read development blogs from tech companies
- Your resume — make sure to list all your accomplishments with measurable impact.
Things you cannot control
- Your interviewer’s attitude — they might have a bad day or simply dislike you.
- Your competition — sometimes there’s simply a better candidate.
- The hiring committee: it makes its decision based on the interviewers’ reports and your resume.
Judging the outcome
You can influence the outcome but you can’t control it. Don’t let minor setbacks determine your self-worth.