The rise of COVID-19 has prompted an unprecedented selection of changes in the digital marketplace almost overnight, Google Explains
The world transformed into a more physically distant one, with offices and buildings shutting down rapidly to limit the spread of the virus.
This shift drove millions of new users to collaborative tools like Microsoft Teams, Slack, Zoom Video Communications, and Google Meet. Many companies struggled to handle the rapid influx of demand. Zoom faced significant issues with security and privacy as it attempted to scale up too fast. Elsewhere, other brands fought hard to reduce potential outages.
Even Google, one of the biggest tech companies in the world, had it’s problems. However, as the Google team revealed in a recent blog post – they figured out how to thrive in this new world.
Responding to Sudden Demand
According to Google, the SRE team for Google Meet began getting requests for additional capacity mid-way through February. The company immediately began searching for additional resources, but they knew they were going to have to think further ahead too.
The SRE team began formulating a response to a need for video and meeting technology greater than anyone had ever seen. At the same time, the Google team was also in the process of transitioning into a new period of from-home working due to Covid-19. Google quickly realised that the scope of the mission was huge, and the response they issued needed to be long-term.
Eventually, the company decided to set up a series of workstreams, focusing on:
- Capacity: Finding and issuing resources
- Dependancies: Handling Meet’s infrastructure and scale
- Bottlenecks: Identifying and removing scaling limits
- Control knobs: Building new mitigations into the system
- Production changes: Delivering new capacity and servers with optimised tuning
As incident responders, Google constantly evaluated the operational structure it was using and whether it made sense. According to the team – it was a marathon, not a sprint.
Google rapidly jumped into action, responding to the need for new regional availability with more than 20 data centers around the world. The company quickly made use of the raw resources already available to them, which was enough to double the capacity of Meet.
At the same time, the business worked on identifying and removing any inefficiencies that were limiting the serving stack. They began tuning binary flags and managing resource allocations, while rewriting code to make it more affordable. Making server instances more efficient was a large and multi-dimensional effort. However, the company managed to work together to come up with new solutions and increased throughput.
Using canary environments, Google could test in advance that each new strategy worked properly, while simultaneously making functional improvements to the codebase. Another strategy Google decided on, was crafting fire escapes. A group began identifying and building more production controls and safety nets that the team hoped it would never need.
All of these processes were part of a comprehensive plan to achieve more operational stability and sustainability. With daily handover meetings with people around the world, Google kept working to find the right solution for its customers, and the new landscape.
By the time Google’s efforts were over, Meet had more than 100 million meeting participants every day. You can learn more about Google’s process here.
This news was originally published at uctoday.com