The goal of this blog post is to showcase, in detail, the work that Vaibhav Upreti did on CircuitVerse during Google Summer of Code 2023, which took place from May 29, 2023, to August 28, 2023.
CircuitVerse is a cool open-source platform which allows users to construct digital logic circuits online.
The primary objective of my GSoC project was to upgrade CircuitVerse’s deployment infrastructure to meet the 12-factor standards, which would pave the way for a more efficient, scalable, maintainable, and robust platform. The project involved several important tasks, each contributing to the overall enhancement of the platform.
For a detailed description of the project, refer to the project page.
Here’s a concise summary of my achievements:
I prioritized the implementation of 12 Factor principles throughout the development process.
A key achievement was customizing CircuitVerse’s Docker image for wider usability, reducing memory consumption (by using jemalloc), and reducing Docker image build time.
Initialized CircuitVerse runbooks, as suggested by my mentor, which provide comprehensive documentation for production deployment, including all necessary background information.
Large-Scale Migration: I led the migration of nearly a million assets, including user profile pictures and circuit images, from the old, deprecated configurations (CarrierWave, Paperclip) to ActiveStorage, the Rails solution for handling file uploads, on AWS S3. This transition not only improved storage efficiency but also set the stage for seamless expansion.
My approach: Ensure zero downtime for users by mirroring uploads to both the new (ActiveStorage) and old (Paperclip, CarrierWave) configurations, followed by data migrations and background jobs to backfill existing data.
Initially, we employed the data_migrations approach, maintaining a Redis counter for tracking progress and enhancing logging for insights. However, with growing server traffic, memory issues arose, leading us to transition to background jobs via Sidekiq. For this we utilized Shopify’s maintenance_tasks gem, employing a single job to migrate 1000 records.
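The mirror-then-backfill flow described above can be illustrated with a small, self-contained sketch. It is written in Python purely for illustration and does not use the real ActiveStorage, Sidekiq, Redis, or maintenance_tasks APIs; `Record`, `save_upload`, `backfill_batch`, and `run_backfill` are hypothetical names, and a plain integer stands in for the Redis progress counter. Only the batch size of 1000 comes from the text above.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional

BATCH_SIZE = 1000  # the post mentions 1000 records per job


@dataclass
class Record:
    """Hypothetical stand-in for a model with an attached image."""
    id: int
    legacy_path: Optional[str] = None  # stands in for Paperclip/CarrierWave storage
    new_key: Optional[str] = None      # stands in for ActiveStorage on S3


def save_upload(record: Record, filename: str) -> None:
    """Mirror a new upload to both the old and the new storage (dual write)."""
    record.legacy_path = f"legacy/{record.id}/{filename}"
    record.new_key = f"s3/{record.id}/{filename}"


def batches(records: List[Record], size: int) -> Iterator[List[Record]]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def backfill_batch(batch: List[Record]) -> int:
    """Copy legacy-only records to the new storage; return how many were migrated."""
    migrated = 0
    for record in batch:
        # Skip records that were already mirrored or backfilled, so reruns are safe.
        if record.legacy_path is not None and record.new_key is None:
            record.new_key = "s3/" + record.legacy_path.split("/", 1)[1]
            migrated += 1
    return migrated


def run_backfill(records: List[Record], size: int = BATCH_SIZE) -> int:
    """Process all records batch by batch, tracking progress in a counter."""
    progress = 0  # stand-in for the Redis counter used to track progress
    for batch in batches(records, size):
        progress += backfill_batch(batch)
    return progress
```

Because new uploads are already written to both stores, the backfill only has to touch records that exist solely in the old storage, and skipping already-migrated records makes it safe to rerun a batch after a failure.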
Scalability & Cost Reduction: Migrating to object storage, specifically S3, reduced infrastructure costs compared to EBS and also ensured scalability, making it a preferred choice for storing large volumes of data and accommodating future growth.
I configured distributed tracing with OpenTelemetry for CircuitVerse and exported the telemetry data to the Jaeger and New Relic backends. This tracing system provides invaluable insights into our platform’s performance, enabling us to identify bottlenecks and enhance user experiences.
OpenTelemetry’s architecture and its use in our service:
Jaeger Dashboard
New Relic Dashboard
Inspecting a trace
Successfully set up a Continuous Deployment pipeline that deploys CircuitVerse Docker images to production using GitHub Actions and Kamal with zero downtime.
Kamal uses the dynamic reverse-proxy Traefik to hold requests while the new app container is started and the old one is stopped, working seamlessly across multiple hosts and using SSHKit to execute commands. Originally built for Rails apps, Kamal will work with any type of web app that can be containerized with Docker.
The workflow consists of two jobs:
- build-production: This job builds the Docker image and pushes it to the registry for the linux/amd64 and linux/arm64 architectures. The build process is optimized using docker buildx caching, which significantly reduces build times.
- deploy: After the build job completes, the deploy workflow requires a review by a repository committer. Once approved, it sets up Kamal and deploys the latest Docker image, tagged with the GitHub SHA hash from the repository’s current origin.
As we can see in the image above, the deploy job has protection rules for the “production” environment in GitHub Actions. When a newer deploy job is enqueued, it cancels the previous workflow to ensure the latest image is deployed.
In the deploy action, Kamal performs several key tasks:
http://localhost:3999/up route.
Hence, in CircuitVerse CI workflows, we build Docker images for each pull request to the master branch, helping developers validate their code for production readiness.
Memory Optimisation: Configured jemalloc for the Docker image, reducing memory fragmentation.
Successfully deployed CircuitVerse to the staging environment.
Feedback
Introduced Monit, an open-source server monitoring tool that conducts automatic maintenance and repair and can execute meaningful actions in error situations.
I added Monit configuration for the following services:
Monit promptly restarts services and sends SMTP alerts when a service goes down or reaches its alert limit.
Monit Alerts
HyperLogLog is a probabilistic data structure that estimates the cardinality of a set. As a probabilistic data structure, HyperLogLog trades perfect accuracy for efficient space utilization. Thus this algorithm can estimate the number of unique values within a very large dataset using little memory and time.
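To make the trade-off concrete, here is a minimal HyperLogLog sketch in Python. It is an illustration of the general algorithm, not the implementation of any library evaluated for CircuitVerse; the class name, the choice of SHA-256 as the hash, and the default precision `p=12` (4096 registers) are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog with 2**p registers (illustrative, not production-ready)."""

    def __init__(self, p: int = 12) -> None:
        self.p = p
        self.m = 1 << p                      # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant; this formula is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Take 64 bits of a hash of the item.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # The first p bits select a register.
        index = h >> (64 - self.p)
        # The remaining bits determine the rank: position of the leftmost 1 bit.
        rest = h & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[index] = max(self.registers[index], rank)

    def count(self) -> int:
        # Harmonic mean of 2**register across all registers, scaled by alpha * m^2.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        # Small-range correction: fall back to linear counting when many registers are empty.
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)
```

With 4096 registers the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%, regardless of how many distinct values are added, and adding the same value repeatedly does not change the estimate.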
Transition Strategy: I evaluated multiple HLL (HyperLogLog) libraries, prioritizing solutions aligned with ease of setup, precision, and strong community support.
We had three options:
Most of the libraries for evaluating HLLs were outdated, so the idea of storing HLLs as text in the database was temporarily shelved. Others had external dependencies that could complicate setup for new contributors. Using Redis HyperLogLog counters appeared viable (GitLab uses HLL counters in the same way) but would entail higher infrastructure costs. After discussions with my mentor, we decided to exclude this from the program’s scope due to the need for further research and potential complexities.
Pull Request | Description |
---|---|
fix: erb tags | Fix for erb tags in the codebase |
feat: mirror pfp & projects, backfill profile_pictures | Added a feature to mirror pfp & projects while simultaneously backfilling profile_pictures |
feat: migrate image_preview to AWS S3 | Migration of image_preview to AWS S3 storage |
chore: update rails to 7.0.5.1 | Updated Rails version to 7.0.5.1 |
fix: use env[] instead of fetch | Code fix to use env[] instead of .fetch |
feat: make member since field more readable | Added a feature to make the ‘member since’ field more readable |
feat: distributed tracing using OpenTelemetry | Implemented distributed tracing using OpenTelemetry |
feat: continuous deployment workflow using GitHub Actions and Kamal | Added a continuous deployment workflow using GitHub Actions and Kamal |
feat: serve profile_pictures with ActiveStorage | Implemented serving profile pictures with ActiveStorage |
chore: disable generating spans for default settings | Disabled generating spans for default settings |
fix: commentator profile_picture error | Fixed commentator profile_picture error |
chore: rerun image preview migration | Reran the image preview migration |
feat: migrate image_preview using Sidekiq | Migrated image_preview using Sidekiq |
chore: make maintenance tasks migrations safe | Made maintenance tasks migrations safe |
chore: mark maintenance tasks migrations safe | Marked maintenance tasks migrations as safe |
feat: deploy CircuitVerse to staging using Kamal | Deployed CircuitVerse to staging using Kamal |
feat: Serve assets using active storage | Serve Image Preview using ActiveStorage |
feat: production deployment using kamal | Deploy CircuitVerse to production using kamal |
Pull Request | Description |
---|---|
feat: monit config files #1 | Added Monit configuration files |
feat: Intialise runbook #3 | Initialized CircuitVerse runbooks |
docs: distributed tracing using OpenTelemetry #5 | Documented distributed tracing using OpenTelemetry |
docs: Kamal documentation #6 | Added Kamal documentation |
I published weekly blog posts throughout this period, which you can read at https://vaibhavupreti.github.io/hugo-blog/tags/gsoc
Featured posts:
I’m excited to continue as a Core Team member, maintaining this incredible open-source project.
Additionally, we plan to use a blue-green deployment approach to roll out the CD pipeline after rigorous testing in the staging environment.
- Blue: older server
- Green: current staging environment
This involves copying the latest production data to staging (the latest pg_dump and Redis data). Production traffic will continue on ‘blue’ until we replicate and scale ‘green’ to match or exceed its capacity. Once performance and stability are confirmed, we’ll transition production traffic to ‘green’, the staging server, and phase out the older ‘blue’ instance, ensuring a risk-minimized transition.
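The cutover rule in this plan can be written down as a small decision function. This is only a sketch of the logic described above, not part of the actual deployment tooling; `Environment`, its `capacity` and `healthy` fields, and `choose_live` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    """Hypothetical description of one side of a blue-green pair."""
    name: str
    capacity: int   # assumed unit: how much traffic the environment can serve
    healthy: bool   # assumed result of performance and stability checks


def choose_live(blue: Environment, green: Environment, current: str = "blue") -> str:
    """Return the environment that should receive production traffic."""
    # Switch to green only once it is confirmed healthy and can match or
    # exceed blue's capacity; otherwise keep traffic where it is.
    if current == "blue" and green.healthy and green.capacity >= blue.capacity:
        return "green"
    return current
```

Keeping the old environment running until the check passes is what makes the switch low-risk: if ‘green’ is undersized or unstable, traffic simply stays on ‘blue’.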
I’m grateful to my mentor, Aboobacker M.K, who helped me whenever I faced challenges and never overlooked any part of their mentoring. They taught me a lot about Ruby, Rails, and software development in general. The weekly meetings were exceptionally informative, and I cannot overstate how much I learned through my interactions with my mentor. I doubt I will ever encounter a similar experience. Their dedication motivates me to aspire to become a software engineer like them and to share my learnings with others.