The Dictionary Hub
An online dictionary that compiles results from several popular dictionaries in one place, designed for the convenience of students.
Project Overview
The Dictionary Hub is one of my earliest "useful" projects, completed in June 2022. It was built with a Python FastAPI backend and a React frontend written in TypeScript.
Motivation
This project aims to give students a place to look up definitions of words quickly. When I was learning English, I often found that I couldn't understand a word's definition in one dictionary and had to search for the word in a different one. That gave me the idea for this website, which compiles definitions from different dictionaries. I hope it helps my fellow students.
Current Features (Implemented)
• Word Search Functionality
Users can enter a word via a simple search bar, and the application queries definitions from multiple sources (e.g., Cambridge, Oxford, and Merriam-Webster) using web scraping techniques.
• Multi-Source Definitions Compilation
The app aggregates and displays definitions, pronunciations, and examples from various dictionaries in a single, unified interface, making it easier for users to compare results without switching sites.
• Synonyms and Antonyms Display
In addition to definitions, the application provides synonyms and antonyms for searched words, to enhance vocabulary building and offer a more complete linguistic resource.
• Word of the Day Feature
When a user opens the site, it automatically displays the definitions, pronunciation, and related information for Merriam-Webster's "Word of the Day," encouraging daily vocabulary building and engagement.
• Chrome Extension Integration
A dedicated Chrome extension allows users to select text on any webpage and search for it directly through The Dictionary Hub, providing on-the-fly lookups for added convenience.
• Responsive Design for Mobile and Desktop
The application is optimized for various screen sizes, offering a fluid and adaptable layout that works intuitively on both mobile devices and desktops.
• Light and Dark Themes
Users can switch between light and dark modes to personalize their experience.
Technical Architecture
Backend (Python)
- FastAPI: to provide the API endpoints
- BeautifulSoup: to scrape the information out of different online dictionaries
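As a rough illustration of how a backend like this might combine several sources, here is a minimal sketch. It is not the project's actual code: the scraper functions (`scrape_cambridge`, `scrape_oxford`), the `SOURCES` table, and the `lookup` helper are hypothetical names, and the scrapers are placeholders. Real ones would download a page and parse it with BeautifulSoup, and a FastAPI route could simply return the result of `lookup`.

```python
import asyncio
from typing import Awaitable, Callable

# A scraper takes a word and returns a list of definitions (assumed shape).
Scraper = Callable[[str], Awaitable[list[str]]]


async def scrape_cambridge(word: str) -> list[str]:
    # Placeholder: a real version would download and parse the page.
    return [f"(cambridge) definition of {word}"]


async def scrape_oxford(word: str) -> list[str]:
    # Placeholder that simulates a source whose layout has changed.
    raise RuntimeError("layout changed")


async def lookup(word: str, scrapers: dict[str, Scraper]) -> dict[str, list[str]]:
    """Query every source concurrently; a failing source yields an empty list."""
    names = list(scrapers)
    results = await asyncio.gather(
        *(scrapers[name](word) for name in names),
        return_exceptions=True,  # one broken scraper must not fail the request
    )
    return {
        name: [] if isinstance(result, BaseException) else result
        for name, result in zip(names, results)
    }


# Hypothetical registry of sources, keyed by the name shown to the user.
SOURCES: dict[str, Scraper] = {
    "cambridge": scrape_cambridge,
    "oxford": scrape_oxford,
}

if __name__ == "__main__":
    print(asyncio.run(lookup("hub", SOURCES)))
```

Running the sources concurrently keeps the response time close to that of the slowest site, and treating a failed scraper as an empty result means one changed page layout does not take down the whole search.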
Frontend (TypeScript + React)
- TypeScript: for static type checking and code organization
- React: for building the user interface
- Chakra UI: for providing a modern, responsive design system
Challenges and Solutions
Complex dictionary landscape
When I was building this, web scraping was a big headache because these dictionary sites have very complex layouts, and scraping them is vastly different from scraping other data. By the nature of a dictionary, a word can have multiple senses, each with its own meaning, part of speech, and examples. It's very hard to scrape them in an orderly way.
In the end, I successfully wrote scripts to scrape them, but it came at the cost of messy code that is nearly impossible to maintain.
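One way to tame that variety is to have every scraper emit the same small data model, whatever the site's markup looks like. The sketch below is an assumption about how this could be organised, not the code the project uses: the `Sense` and `Entry` classes and the `group_senses` helper are hypothetical names, and the sample rows are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One meaning of a word, with its example sentences."""
    definition: str
    examples: list[str] = field(default_factory=list)


@dataclass
class Entry:
    """All senses of a word that share one part of speech."""
    word: str
    part_of_speech: str
    senses: list[Sense] = field(default_factory=list)


def group_senses(word: str, rows: list[tuple[str, str, list[str]]]) -> list[Entry]:
    """Group flat (part_of_speech, definition, examples) rows into entries.

    Entries keep the order in which each part of speech first appears.
    """
    entries: dict[str, Entry] = {}
    for part_of_speech, definition, examples in rows:
        entry = entries.setdefault(part_of_speech, Entry(word, part_of_speech))
        entry.senses.append(Sense(definition, list(examples)))
    return list(entries.values())


# Illustrative rows, as a site-specific scraper might produce them.
rows = [
    ("noun", "the brightness that lets you see things", ["Turn on the light."]),
    ("adjective", "not heavy", ["The bag is light."]),
    ("adjective", "pale in colour", []),
]
entries = group_senses("light", rows)
```

With a shared model like this, only the site-specific part of each scraper has to change when a dictionary redesigns its pages; everything downstream keeps working with `Entry` and `Sense`.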
Strict dictionary security
Initially, for the "Synonyms and Antonyms" feature, I tried scraping the thesaurus.com website. After some exploring, I found that the frontend of the thesaurus.com website exposes its API endpoint for fetching the synonyms and antonyms from its database. I took advantage of that and called its API directly in my backend, to avoid the hassle of some complex scraping script.
About two years later, the endpoint stopped working; I assume they noticed and removed it. To keep the feature working, I switched to the Datamuse API for synonyms and antonyms. This is more reliable and consistent than writing another scraping script, which would break again whenever the site changed its layout.
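For reference, a Datamuse lookup can be sketched with nothing but the standard library. This is a minimal sketch rather than the project's actual backend code: the helper names (`build_url`, `parse_words`, `related_words`) are hypothetical, while `rel_syn`, `rel_ant`, and `max` are query parameters of Datamuse's `/words` endpoint, which returns a JSON array of objects with a `word` field.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

DATAMUSE_URL = "https://api.datamuse.com/words"


def build_url(word: str, relation: str, limit: int = 10) -> str:
    """Build a Datamuse query; relation is "syn" (synonyms) or "ant" (antonyms)."""
    if relation not in ("syn", "ant"):
        raise ValueError("relation must be 'syn' or 'ant'")
    query = urlencode({f"rel_{relation}": word, "max": limit})
    return f"{DATAMUSE_URL}?{query}"


def parse_words(payload: str) -> list[str]:
    """Extract the "word" field from Datamuse's JSON array response."""
    return [item["word"] for item in json.loads(payload)]


def related_words(word: str, relation: str, limit: int = 10) -> list[str]:
    """Fetch synonyms or antonyms over the network (needs internet access)."""
    with urlopen(build_url(word, relation, limit), timeout=5) as response:
        return parse_words(response.read().decode("utf-8"))
```

Calling `related_words("happy", "syn")` would return a list of synonyms, and passing `"ant"` would return antonyms. Keeping URL building and response parsing in separate functions makes both easy to test without touching the network.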
CI/CD pipeline
Although this is a small project that probably doesn't need a CI/CD pipeline, I still implemented one just for the fun of it. I used Cloud Build to automatically build the Docker image of the backend whenever the master branch of the GitHub repository receives a new commit. After the image builds successfully, it is deployed on Cloud Run. Setting up Google Cloud Platform was a pain. I had to read A LOT of documentation and set a lot of permissions. It took me one whole day to get it working.
Lessons Learned and Future Enhancements
As this was one of the first projects I wrote, I learned a lot.
Why clean code is important
Three years later, I reread the web scraping code and found it nearly unreadable. It is messy, and the few comments it has don't explain what the code does. Modifying it is nearly impossible, even for me. The website still works, but if it breaks, fixing it will take a massive amount of work and time.
I now understand that maintaining a project is as important as building it. Writing clean code and documentation is crucial, both for myself and for any other developer who might maintain it.
You should read the docs first
As a beginner in web development, web scraping, and cloud services, I had to learn and figure out a lot of new stuff. I spent most of my time searching for ways to accomplish my development goals.
It turns out that a lot of the things I wanted to do were already shown in the docs as examples. If only I had read the docs earlier, I wouldn't have wasted so much time trying to figure out the methods and settings.
Conclusion
Wrapping this up, The Dictionary Hub was one of my first real projects that actually felt useful, and I'm pretty proud of how it turned out back in 2022. It's basically a one-stop spot for looking up words from places like Cambridge, Oxford, and Merriam-Webster, all without ads or jumping between sites, which is super handy for students struggling with English vocab. Sure, dealing with web scraping got messy, and I learned the hard way that clean code and docs are a must if you want to fix stuff later without pulling your hair out. But hey, it taught me tons about FastAPI, React, BeautifulSoup, and even setting up a CI/CD pipeline on Google Cloud (which was a total pain but fun to figure out). For the future, I'd love to build a prettier interface or maybe add some AI-generated examples. If you're curious, go check out the live demo or the GitHub repo, and let me know what you think!