Usually, when interviewing for an engineering or software position, you end up doing some sort of “coding challenge”. A friend of mine recently left his company and is going through one of those at the moment. He asked me for help solving a particular challenge, which I felt was a good exercise for me.
This blog post talks a bit about the challenge and my approach to solving it. Also, this post works as a personal reminder that sometimes it is not about picking the best tool, but the right data structure.
“90% of the time, you can solve your problem with just the right data structures” - someone, once, on Reddit probably.
## The Challenge
The problem is, and I’m paraphrasing:
“Build a phone information aggregator API. This system takes a list of phone numbers and returns the count of valid phones broken down per prefix and business sector."
For example, given 5 phone numbers where 4 of them are valid, the system should return:

- a count of 1 phone for Technology and 1 phone for Banking associated with the `+1` prefix,
- and a count of 2 phones for Clothing associated with the `+3519173` prefix.
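Concretely, the aggregated response for that example could be represented like this (the exact payload shape is my assumption, based on the examples later in the post):

```python
# Counts per prefix, then per sector (shape assumed, not from the challenge text).
expected = {
    "1": {"Technology": 1, "Banking": 1},
    "3519173": {"Clothing": 2},
}

# Four of the five submitted phones were valid:
total = sum(sum(sectors.values()) for sectors in expected.values())
```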
## How to approach the problem
The problem revolves around building an API that exposes the functionality of aggregating phone numbers by industry sector. We can divide it into 4 parts:
- Expose an API through an HTTP Service
- Check if the given phone numbers have valid prefixes
- Interact with an external service to get the corresponding sector
- Serialize the response in the expected format
The interesting part of this problem is the 2nd step, “validate phone number prefix”. The original post has a sequence diagram here of the full request flow: the client POSTs the phone numbers, the API validates each prefix, asks the external service for the sector (e.g. `{ "number": "+1", "sector": "Technology" }`), and answers with an HTTP 200 OK and the aggregated counts (e.g. `{ "1": { "Technology": 1 } }`).
## Valid phone numbers

Validating a phone number in this context means:

- we have an extensive list of “phone prefixes” (around 900k prefixes) that should be considered the “ground truth”,
- and for every given phone number, we want to cross-check it against that list and, if valid, make an API call to an external service.

So for example: if I have the phone number “+1983236248” and a list of prefixes of `[1, 2, 44]`, the prefix is “+1”.
## Getting the prefix sector

To group by “sector” we need to use an external API. The OG challenge gives us a URL to use, but since that would be too much on the nose for what company this challenge is for, I’ll just explain what it does 👀:

- The API has a GET endpoint at “https://a-challenge.company.com/sector/:number” which returns the corresponding industry sector. For example: given the number `+98 72 349`, which belongs to a company in the Banking sector, the GET request to `https://a-challenge.company.com/sector/+98%2072%20349` would return a payload like `{ "number": "+98 72 349", "sector": "Banking" }`.
## Developing the API

To tackle this challenge, I’m experimenting with Starlette, a simple async Python framework. The project is built by following the tutorial in the Starlette documentation.

You can find this project’s codebase at https://github.com/andreffs18/phone-validator

First, we create a single `server.py` file that opens two endpoints and starts the server.
To run the service, we just start the app with an ASGI server such as uvicorn (e.g. `uvicorn server:app`).
And to test that everything is working, we just need to hit the `POST /aggregate` endpoint with a list of phone numbers.
## Validating inputs

We need to check for valid phone numbers. The rules of the challenge are:

“A number is considered valid if it contains only digits, an optional leading `+`, and whitespace anywhere except immediately after the `+`. A valid number has exactly 3 digits or more than 6 and less than 13. `00` is acceptable as a replacement for the leading `+`. All dashes and parentheses are ignored.”

This can easily be mapped to a small validation function.
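A minimal sketch of how I read those rules (this is my interpretation, not necessarily the repo’s exact implementation):

```python
import re


def is_valid_phone_number(number: str) -> bool:
    """Validate a phone number against the challenge rules quoted above."""
    # All dashes and parentheses are ignored.
    number = re.sub(r"[-()]", "", number)
    # Only digits, an optional leading "+", and whitespace are allowed,
    # but no whitespace immediately after the "+".
    if not re.fullmatch(r"(\+(?!\s))?[\d\s]*", number):
        return False
    digits = re.sub(r"\s", "", number).lstrip("+")
    # "00" is an acceptable replacement for the leading "+".
    if digits.startswith("00"):
        digits = digits[2:]
    # Exactly 3 digits, or more than 6 and fewer than 13.
    return len(digits) == 3 or 6 < len(digits) < 13
```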
## Checking prefixes

The core of the problem.

In the spirit of “let’s just make this work and then improve”, we can check if a given number contains a valid prefix just by iterating through the list of prefixes and asserting that the phone number starts with that string.
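Sketched out (with a toy prefix list standing in for the real one), that first-pass approach looks something like:

```python
# A toy stand-in for the real ~900k-entry list.
PREFIXES = ["1", "2", "44", "3449998"]


def get_prefix(number: str):
    """Naive linear scan: try every known prefix and keep the longest match."""
    digits = number.lstrip("+")
    match = None
    for prefix in PREFIXES:
        if digits.startswith(prefix) and (match is None or len(prefix) > len(match)):
            match = prefix
    return match
```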
This does what we want, but it is far from being an optimized solution. Let’s put a pin in this and revisit the topic later.
## Fetching sectors

Instead of making actual requests to an external provider, we will be mocking the service using an OpenAPI Spec (the spec file lives in the repo).
I’m creating a mock service with prism. Without going into much detail, prism allows you to generate a mock/proxy server from a given OpenAPI Spec or Postman collection. It has a cool feature that makes all responses “dynamic”, which provides some variability to the tests we are doing.
Requesting the mock server, we get back a randomized payload like `{ "number": "+1", "sector": "Technology" }`.
Adding this external call to our project is just a matter of mapping that curl request into a Python request.
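Something along these lines (the mock’s port and the `_get` test seam are my additions, not necessarily what the repo does):

```python
import requests

# Base URL of the prism mock started above (the port is an assumption).
SECTOR_URL = "http://localhost:4010/sector/{number}"


def get_sector(number: str, _get=requests.get) -> str:
    """Fetch the sector for a phone prefix from the (mocked) external service.

    The `_get` parameter only exists so the function is easy to test without
    a running mock server.
    """
    response = _get(SECTOR_URL.format(number=number))
    response.raise_for_status()
    return response.json()["sector"]
```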
## All together now

We have all the pieces that we need:

- The `is_valid_phone_number` method checks if the given phone number is valid
- `get_prefix` (the part of this challenge I’m interested in exploring) finds the prefix in a given phone number and returns it
- Finally, `get_sector`, which talks to our mock server, returns a “fake” sector given a phone prefix
We are now in a condition where we can refactor our “/aggregate” route with the behavior we need. Spinning up our service and making the original request from the challenge instructions gives us the expected aggregated response.
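Stripped of the HTTP layer, the refactored aggregation logic might look like this sketch (the function names follow the pieces above; the exact wiring in the repo may differ):

```python
from collections import defaultdict


def aggregate_numbers(numbers, is_valid, get_prefix, get_sector):
    """Count valid phone numbers per prefix and per sector.

    The three callables are the building blocks described above, passed in
    explicitly so the glue logic is easy to test in isolation.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for number in numbers:
        if not is_valid(number):
            continue
        prefix = get_prefix(number)
        if prefix is not None:
            counts[prefix][get_sector(prefix)] += 1
    # Convert nested defaultdicts into plain dicts for serialization.
    return {prefix: dict(sectors) for prefix, sectors in counts.items()}
```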
## Benchmarking this solution

Thinking back on the solution we have, the `get_prefix` that we’ve implemented is arguably not the best approach. That is the point of this blog post: what is the most performant solution for this problem? But first, how bad is this solution?

- Is this slow, and how slow? In the worst-case scenario (ie: validating the last prefix), how much time does it take?
- Does the `PREFIX` list have that big of a memory footprint? Or is it just a couple of kilobytes?
- How many parallel requests am I able to serve with this solution?
To do this assessment I restructured the project and added some extra functionality:

- Created a `routes.py` file where the `aggregate()` and `health()` routes live
- Created a new `backend/` folder where I abstracted the function `get_prefix` into the file `backend/in_memory.py`
- Configured the `startup()` event hook of the Starlette server to choose its “backend” using an environment variable
- Added docker-compose to the project so I can start the whole project with just `docker-compose up`
- Initialized `prometheus-middleware` on the Starlette app, so we have a `/metrics` endpoint for some observability using Grafana dashboards
To see what the project looks like now, check the Grafana dashboard and the project’s code at https://github.com/andreffs18/phone-validator
## Qualifying the solution

Let’s take a look question by question:

**Is this slow, and how slow? In the worst-case scenario (ie: last prefix), how much time does it take?**

We can see this by measuring the time it takes to get a response, which covers every step of the request flow: the client POSTs the numbers, the API validates the prefixes, fetches the sectors from the external service (e.g. `{ "number": "+1", "sector": "Technology" }`), and returns the aggregated counts (e.g. `{ "1": { "Technology": 1 } }`) with an HTTP 200 OK.
To test this we will monitor the “Requests Time Taken” panel on our Grafana dashboard, which tells us how much time each request takes, and we will be using “hey” to benchmark the endpoint.

A quick side note: with this blog post I ended up spending most of the time playing around with benchmarking tools, and afaik hey doesn’t handle high concurrency levels well. So if you’re planning on doing some heavy loads, the recommended alternative is Apache Benchmark.
The question we want to answer is: what’s the P95 of the `/aggregate` endpoint with a phone number that has the first, middle, or last prefix of the list?

For 60 seconds, we ran the benchmark simulating one HTTP client with the number `+1983248`, then `+344999813123`, and lastly `+6983248`.
Keep in mind that this was done on my Laptop, a 2015 MacBook Pro with 3.1 GHz Dual Core Intel i7, with 16GB of memory.
| Prefix | Req/s | P95 | Total Reqs |
| --- | --- | --- | --- |
| +1983248 (first) | 27.7820 | 0.0731 secs | 1668 responses |
| +344999813123 (middle) | 6.8536 | 0.2894 secs | 412 responses |
| +6983248 (last) | 4.0407 | 0.4103 secs | 243 responses |
Answering our question: with the current implementation, the further away the phone’s prefix is from the beginning of the `PREFIX` list, the longer the API takes to handle the request. Concretely, we get a 7x performance hit on the P95 of the service, depending on the input.

**Is the `PREFIX` list that big of a memory footprint?**

Looking at our “Services Memory” panel, we can see how much of our available memory loading all prefixes into memory uses.
**How many parallel requests am I able to serve with this solution?**

Similar to the question above: how many requests can this solution handle during 60 seconds with 2, 4, or 8 concurrent clients making those requests?
Baseline: 1 client
| Solution | Clients | Prefix | Req/s | P95 | Total Reqs |
| --- | --- | --- | --- | --- | --- |
| in_memory | c1 | +1983248 (first) | 27.7820 | 0.0731 secs | 1668 responses |
| in_memory | c1 | +344999813123 (middle) | 6.8536 | 0.2894 secs | 412 responses |
| in_memory | c1 | +6983248 (last) | 4.0407 | 0.4103 secs | 243 responses |
| in_memory | c2 | +1983248 | 48.3398 | 0.0836 secs | 2902 responses |
| in_memory | c2 | +344999813123 | 9.4420 | 0.4440 secs | 568 responses |
| in_memory | c2 | +6983248 | 4.3614 | 0.8041 secs | 263 responses |
| in_memory | c4 | +1983248 | 59.2556 | 0.1272 secs | 3558 responses |
| in_memory | c4 | +344999813123 | 9.5636 | 0.6002 secs | 577 responses |
| in_memory | c4 | +6983248 | 3.8022 | 1.8013 secs | 232 responses |
| in_memory | c8 | +1983248 | 46.7732 | 0.3957 secs | 2814 responses |
| in_memory | c8 | +344999813123 | 7.9378 | 1.4435 secs | 481 responses |
| in_memory | c8 | +6983248 | 2.3447 | 9.0418 secs | 144 responses |
Similar to our first question: the further away the prefix is from the beginning of the `PREFIX` list, the fewer requests we can handle in parallel.
## Alternatives

Of course, if I were to submit this challenge, this would not be the solution that I would be most proud of… in a nutshell, it doesn’t scale (it is slow and would need many instances to support increased load).

Although very simple, it’s naive to assume that we would want to do string comparisons against a set of 900k elements for every request we get, on a possibly very busy service.
So how can I solve this? Well:
- My first thought is to use a database and leverage its searching + indexing capabilities.
- My second thought (and, to be honest, going back to my “Algorithms and Data Structures” book) is to find a better data structure that solves this problem cleanly.

To explore the first idea, I’ll be using Mongo, Postgres, and Redis to check the differences in RPS, P95, and Total Requests. For the second one, I’ll be comparing the original “In Memory” approach with a “Trie” data structure.
## Setting up backends

I won’t spend as much time showing the setup of these solutions as before (the code can be found in the repo), but for all of them I:

- Installed the dependencies needed to play with the different solutions
- Added new files to the `backend/` folder, then initialized each solution using a .env variable
- Added both the database and an open-source dashboard to my docker-compose so I could see what was going on

The original post shows a sequence diagram here: the API validates the phone number against the chosen backend (Mongo, Postgres, Redis, InMemory, or Trie) before getting the sector from the external service.
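The backend-selection part of the startup hook can be sketched like this (module, env-var, and function names are assumptions):

```python
import importlib
import os


def load_backend(name=None, _import=importlib.import_module):
    """Pick a get_prefix implementation based on an environment variable.

    The repo wires something like this into Starlette's startup() hook;
    `_import` is injectable only to make the function easy to test.
    """
    # e.g. BACKEND=in_memory | mongo | postgres | redis | trie
    name = name or os.environ.get("BACKEND", "in_memory")
    return _import(f"backend.{name}").get_prefix
```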
## Using databases

### Mongo

For Mongo, I just added all prefixes to a collection under the key “prefix”, and make an `$in` query with the “exploded” number.
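A sketch of that approach (the helper names are mine; any object with a pymongo-compatible `.find` works):

```python
def explode(number: str) -> list:
    """All candidate prefixes of a number: "+134" -> ["1", "13", "134"]."""
    digits = number.lstrip("+")
    return [digits[: i + 1] for i in range(len(digits))]


def get_prefix_mongo(collection, number: str):
    """One $in query with every candidate prefix; keep the longest hit."""
    candidates = explode(number)
    docs = collection.find({"prefix": {"$in": candidates}})
    matches = [doc["prefix"] for doc in docs]
    return max(matches, key=len) if matches else None
```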
### Postgres

For Postgres, I’ve also initialized the DB with a “prefix” column and added all prefixes, but the approach to querying is a bit different:

- Find in the “prefix” table every prefix that the given “phone number” `LIKE`s.
- Because there might be more than one match (ie: “+344999813123” could match both “3” and “3449998”), we want the longest common string, which would be the last one of the ordered list.
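The query itself can be sketched as follows (table and column names assumed, with a psycopg-style parameter placeholder):

```python
# Longest-matching-prefix query, sketched. "%(number)s LIKE prefix || '%%'"
# checks that the phone number starts with the stored prefix; ordering by
# length means the last row of the result is the longest match.
QUERY = """
SELECT prefix
FROM prefixes
WHERE %(number)s LIKE prefix || '%%'
ORDER BY length(prefix) ASC;
"""
```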
### Redis (is it a database?)

For Redis, the simplest way I could find was to mimic the solution I did for Mongo:

- add all prefixes as members of a single set,
- then use the `smismember` command to check which of a given list of values belong to the set.

That returns a list of 0’s and 1’s matching the list of “exploded” prefixes, so pairing those flags back with the “exploded number” gives us our prefix.
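Sketched with redis-py’s `smismember` (the set name and helper shape are my assumptions; the fake client below just mimics the command):

```python
def get_prefix_redis(client, number: str):
    """SMISMEMBER returns one 0/1 flag per candidate prefix; zipping the
    flags back with the candidates recovers the matching prefixes.

    `client` is assumed to follow redis-py's API.
    """
    digits = number.lstrip("+")
    candidates = [digits[: i + 1] for i in range(len(digits))]
    flags = client.smismember("prefixes", candidates)
    matches = [c for c, hit in zip(candidates, flags) if hit]
    return max(matches, key=len) if matches else None
```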
## Using data structures

### Trie structure

A trie is a tree with one branch per character (not a binary tree), and the thing to notice is that with this data structure (compared with the original “in_memory” solution) our memory footprint increases a lot:

“A naive implementation of a trie consumes immense storage due to larger number of leaf-nodes caused by sparse distribution of keys” - Wikipedia

The best part of this solution is that finding a valid prefix takes time proportional only to the length of the number, regardless of how many prefixes are stored.
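A minimal trie for this problem might look like this (a sketch, not the repo’s implementation):

```python
class Trie:
    """Minimal digit trie: insert all prefixes once, then find the longest
    matching prefix of any number in O(len(number)) time."""

    def __init__(self):
        self.root = {}

    def insert(self, prefix: str):
        node = self.root
        for ch in prefix:
            node = node.setdefault(ch, {})
        node["$"] = True  # marks the end of a known prefix

    def longest_prefix(self, number: str):
        digits = number.lstrip("+")
        node, match = self.root, None
        for i, ch in enumerate(digits):
            if ch not in node:
                break
            node = node[ch]
            if "$" in node:
                match = digits[: i + 1]  # remember the longest match so far
        return match
```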
## Comparing results

The same tests were made: for each solution, during 60 seconds, simulate 1, 2, 4, and 8 concurrent clients asking for the first, middle, and last prefix of the list. A bash script in the repo handles this nicely.
Some highlights:

- The most consistent solution across every dimension (prefix used and number of clients) is the Trie
- Strangely, Mongo uses almost 2GB of memory, and I haven’t figured out why yet 🤔
- It came as a surprise to me that with Postgres we get very few RPS
- “hey” can’t handle high load very well; Apache Benchmark is a good alternative
- Redis is amazing 🤩 it’s comparable with the Trie solution.
## Conclusion
Choosing the right tool for the job is important, but it’s not always the most crucial factor. In many cases, the right data structure is more essential when solving complex problems.
For example, in a coding challenge, you may be tempted to use a built-in function like `reverse()` to check for a palindrome, but using a stack could be a faster and more efficient solution.
When working on software projects, it’s not always necessary to choose the latest tool or framework; sometimes it’s better to stick with what you know, or maybe even re-learn old solutions by going back to basics.
By the end of this post, I figure that I could use this project to test new versions of each tool, or new data structures that I might learn more about in the future.
For example, if Postgres gets bumped, or if I get the time to play with a “Patricia Tree” (funny name), I can just implement a backend, `docker-compose up`, and compare with the other solutions.
The project is self-contained enough that it allows me to expand with new solutions “as plugins” and keep the core functionality being tested.
Which I think is pretty sweet 🤓
## Resources
- Setting up grafana dashboards: https://medium.com/javarevisited/monitoring-setup-with-docker-compose-part-2-grafana-2cd2d9ff017b
- Trie data structure: https://en.wikipedia.org/wiki/Trie
👋