Self-Hosted Video Library Management and Automation Platform for an E-Learning Organisation
About This Project
Client Profile: Online Learning Platform — thousands of active students and staff
Executive Summary
A growing e-learning platform was spending 20+ hours per week on manual video management — renaming files, adding tags one by one, dragging content into Telegram, and typing captions by hand. The process was slow, error-prone, and completely unscalable as their library grew into the thousands.
We designed and built a self-hosted, full-stack web application that replaced their entire manual workflow with a single, automated pipeline. The system scans their video library automatically, indexes all metadata, enables intelligent search and tagging, and uploads content to the correct Telegram channels with auto-generated captions — all without human intervention.
Since launch, the platform has processed thousands of videos. Content management time dropped from 20+ hours per week to under 2. There have been zero wrong-channel distribution incidents. The client's team now focuses entirely on creating content, not managing it.
Client Background
Our client operates a growing EdTech platform serving thousands of students and staff across multiple disciplines and course categories. Their content library — lectures, tutorials, demonstrations, and supplementary materials — is distributed to learners through private, category-specific Telegram channels.
The platform had grown faster than its internal processes. Systems that worked at a few hundred videos were breaking down at thousands. The content team was being stretched thin not by their core responsibilities, but by the operational overhead of managing and distributing files.
The Challenge
A Workflow That Couldn't Scale
When we first engaged with the client, their content distribution process involved seven discrete manual steps for every single video:
- Locate the file in Windows Explorer
- Rename it to include relevant keywords
- Open a separate, outdated content dashboard and search for the file
- Add tags one by one through a slow, unresponsive interface
- Drag the file into Telegram Desktop and wait for the upload
- Type a caption manually, attempting to apply consistent hashtag formatting
- Confirm the file reached the correct channel
With a library growing by dozens of files per week — and every video requiring all seven steps — the team was losing more than half their working week to a process that added no creative value.
The True Cost Was More Than Time
Beyond the hours lost, the manual workflow introduced compounding operational risk:
Wrong audience distribution. Staff-only materials occasionally reached student channels due to channel confusion during the drag-and-drop upload step. For an educational platform with compliance obligations, this was a serious concern.
Inconsistent metadata. The same subject might be tagged differently by different team members on different days — making the content library progressively harder to search and audit.
Broken discoverability. Telegram messages without properly formatted hashtag captions couldn't be found through in-channel search. Content was being created and effectively buried.
No audit trail. There was no reliable record of what had been uploaded, when, to which channel, or by whom. For a team distributing protected educational content, the absence of accountability was a growing liability.
Conversion bottlenecks. Raw .ts video files from the client's recording software had to be manually converted to .mp4 before any upload could happen — adding a blocking, manual step to an already laborious process.
Recovery failures. When uploads failed — due to connection drops, API rate limits, or file size issues — there was no retry mechanism. Team members had to monitor uploads and restart failures manually.
The client needed a solution that was fast to use, reliable at scale, and simple enough for non-technical staff to operate without training.
Our Approach
Discovery and Workflow Mapping
Before any technical decisions were made, we spent time inside the client's workflow. We observed the content team, mapped every step of their process, documented all edge cases, and identified the highest-friction points and most common failure modes.
This produced a clear, prioritised requirements list:
- Automatic video library scanning and incremental indexing
- Fast, intelligent search across filenames, tags, and categories
- Inline tag and category editing with autocomplete
- Automated Telegram upload queue with caption generation and channel routing
- Automatic retry logic for failed uploads
- Large file support beyond Telegram's standard 50 MB limit
- Background video conversion and compression
- Automated duplicate file detection and cleanup
- Complete audit trail of all system actions
We also established a non-negotiable constraint: the system had to run entirely on-premises, with no cloud dependency. All content and metadata would stay on the client's hardware. No data would leave the building.
Technology Selection
Given the on-premises constraint, Windows Server environment, and need for a lightweight, single-machine deployment, we selected:
- Python 3.12 + Flask 3.0 for the backend — mature, well-documented, and deployable as a standalone process with no external server dependency
- SQLAlchemy ORM + Alembic migrations for database management — providing a clean data model with version-controlled schema changes
- SQLite in WAL (Write-Ahead Logging) mode for the database — enabling concurrent reads from the web thread while background workers write, without requiring a separate database server
- Python threading with daemon threads for background workers — keeping the architecture simple and avoiding the overhead of a task queue like Celery or Redis
- FFprobe for automatic video metadata extraction
- FileConverter CLI + FFmpeg for video conversion and compression
- python-telegram-bot 21.4 with Telegram Local Bot API Server integration for large file uploads
- Tailwind CSS + Alpine.js + Jinja2 for a responsive, dark-themed frontend that works entirely in the browser
The Solution
Automatic Library Management
The platform's scanner runs recursively through the client's video directory tree, detecting new files, changed files, moved files, and deleted entries. Every video is indexed with full technical metadata — duration, resolution, codec, bitrate, file size, modification timestamp — extracted automatically via FFprobe. The team opens the dashboard to a live, accurate view of their entire library without touching a file manually.
Incremental updates mean re-scans stay fast regardless of library size. Only changed or newly detected files are processed; the rest are confirmed in place.
Intelligent Search and Tagging
The search engine normalises strings before matching, collapsing spaces, hyphens, and underscores, so that queries like JEE-Adv_Phy-ElectroStatics_L04, electrostatics revision, and jee advanced physics electrostatics lecture 4 all return the same results. Searches run simultaneously across filenames, tags, and categories.
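A minimal sketch of this kind of normalisation (simplified token-containment matching; the names normalise and matches are illustrative, not the actual implementation):

```python
import re

def normalise(text: str) -> str:
    """Lower-case and collapse runs of spaces, hyphens and underscores into single spaces."""
    return re.sub(r"[\s\-_]+", " ", text).strip().lower()

def matches(query: str, candidate: str) -> bool:
    """Every normalised query token must appear in the normalised candidate string."""
    hay = normalise(candidate)
    return all(tok in hay for tok in normalise(query).split())

# A keyword query finds the file regardless of filename punctuation style:
matches("electrostatics", "JEE-Adv_Phy-ElectroStatics_L04")  # True
```

The real engine would also apply this to tags and categories, and could layer synonym handling on top (e.g. mapping "jee advanced" onto the "JEE-Adv" filename convention).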
Tag editing is inline — directly on the video card, with comma-separated input and a keyboard-navigable autocomplete dropdown populated from the existing tag library and the video's own filename tokens. Category assignment is a single dropdown selection. Both save instantly, without a separate save step or page reload.
Bulk operations allow tag and category assignment across entire filtered result sets — essential for the initial migration of the client's existing library.
Automated Telegram Upload Queue
The upload queue is the operational heart of the platform.
Videos are added to a persistent FIFO queue backed by SQLite. From that point, the system handles everything:
Automatic caption generation. Tags and category are pulled from the video's database record, formatted as hashtags, and assembled into a Telegram-ready message. Every caption is consistent, complete, and requires zero manual input.
Channel routing. The correct Telegram channel is determined by the video's category assignment. Routing errors — previously a regular occurrence — are structurally impossible.
Retry with exponential backoff. Failed uploads are retried automatically: first at 5 seconds, then 15, then 45, before being flagged as failed and surfaced to the team. The queue keeps processing other videos while a failed entry waits for its next retry.
Large file support. Standard Telegram Bot API uploads are capped at 50 MB — far below the size of most educational video files. Our integration with the Telegram Local Bot API Server lifts this to 2 GB, handling full-quality video without compression or file splitting.
Queue persistence. The SQLite-backed queue survives machine restarts. If the system shuts down during an upload session, processing resumes automatically from the correct position when restarted.
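The caption and retry behaviour described above can be sketched as follows. This is a simplified, in-memory illustration of the 5s/15s/45s backoff schedule and hashtag assembly; the names build_caption, process_entry, and upload_fn are hypothetical, and the real queue is persisted in SQLite rather than held in memory.

```python
import time

RETRY_DELAYS = [5, 15, 45]  # seconds, per the backoff schedule described above

def build_caption(title: str, category: str, tags: list[str]) -> str:
    """Assemble a Telegram-ready caption: title plus category and tags as hashtags."""
    hashtags = " ".join(f"#{t.replace(' ', '_')}" for t in [category, *tags])
    return f"{title}\n\n{hashtags}"

def process_entry(entry, upload_fn, sleep=time.sleep) -> str:
    """Attempt an upload, backing off 5s/15s/45s before flagging the entry as failed."""
    for delay in [0, *RETRY_DELAYS]:
        if delay:
            sleep(delay)
        try:
            upload_fn(entry)
            return "done"
        except Exception:
            continue  # a real worker would log the error before retrying
    return "failed"
```

Because each entry's state lives in the database, a "failed" result is surfaced to the team while the worker simply moves on to the next queued video.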
Background Video Conversion
Raw .ts files from the client's recording software are processed by a dedicated conversion worker running in a daemon thread. The worker uses FileConverter CLI as the primary engine with FFmpeg as a fallback, converting files to .mp4 and compressing large outputs to meet upload requirements.
Conversion runs entirely in the background — the web interface stays fully responsive while files are being processed. Live status updates appear directly on the video card, so the team always knows exactly where each file is in the pipeline.
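The daemon-thread pattern here can be sketched in a few lines. This is a minimal illustration, not the actual implementation: conversion_worker is a hypothetical name, and the convert callable stands in for the real subprocess invocation of FileConverter CLI with an FFmpeg fallback.

```python
import queue
import threading

def conversion_worker(jobs: queue.Queue, convert) -> None:
    """Drain the job queue forever, converting one file at a time.

    `convert` stands in for whichever engine is used (FileConverter CLI
    first, ffmpeg as a fallback, both driven via subprocess).
    """
    while True:
        src = jobs.get()
        try:
            convert(src)
        except Exception:
            pass  # a real worker would record the failure for the dashboard
        finally:
            jobs.task_done()

# daemon=True means the thread never blocks the Flask web thread and
# dies with the process on shutdown; the persistent queue state lives
# in SQLite, so work resumes on restart.
jobs: queue.Queue = queue.Queue()
threading.Thread(target=conversion_worker, args=(jobs, print), daemon=True).start()

jobs.put("lecture_04.ts")  # picked up and processed in the background
jobs.join()                # block only if you need the backlog drained
```

The web request handlers only ever enqueue work and read status rows, which is why the interface stays responsive while conversions run.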
Duplicate Detection and Bulk Cleanup
The platform automatically identifies .ts / .mp4 duplicate pairs — original recordings sitting alongside their converted counterparts — and surfaces them in a dedicated cleanup view. Each pair is shown side by side with hover-triggered video previews. The team can visually verify the pair, then delete, move, or batch-convert originals in one operation.
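The pairing logic itself is simple: a .ts recording and its converted .mp4 share a filename stem and directory. A minimal sketch (find_duplicate_pairs is an illustrative name):

```python
from pathlib import Path

def find_duplicate_pairs(root: Path) -> list[tuple[Path, Path]]:
    """Pair each .ts recording with a converted .mp4 sharing its stem and folder."""
    pairs = []
    for ts in root.rglob("*.ts"):
        mp4 = ts.with_suffix(".mp4")
        if mp4.exists():
            pairs.append((ts, mp4))
    return sorted(pairs)
```

The cleanup view then only has to render these pairs side by side; the visual verification step stays with the team, the tedious matching does not.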
Security and Access Control
The system binds exclusively to 127.0.0.1 and is never exposed to the network. All routes are protected by a session-based PIN gate using constant-time comparison to prevent timing attacks. The Telegram bot token is stored in the database and never written to logs or config files.
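The constant-time check is one line with Python's standard library. A minimal sketch (the PIN value is illustrative; the real system loads it server-side, ideally as a hash rather than plaintext):

```python
import hmac

STORED_PIN = "483920"  # illustrative placeholder only

def pin_ok(submitted: str) -> bool:
    # hmac.compare_digest takes the same time whether the first or the
    # last character differs, so response timing leaks nothing about
    # how much of a guess was correct.
    return hmac.compare_digest(submitted.encode(), STORED_PIN.encode())
```

A naive `submitted == STORED_PIN` comparison can short-circuit on the first wrong character, which is exactly the timing signal this closes off.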
System Architecture
Single-process Flask with threaded=True. SQLite WAL mode allows the web thread and both daemon threads to operate concurrently without locks or a separate database server.
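A minimal sketch of the SQLite side of that setup (connection options here are illustrative, not the actual configuration):

```python
import sqlite3

def open_db(path: str = "library.db") -> sqlite3.Connection:
    """Open the shared SQLite database in WAL mode."""
    conn = sqlite3.connect(path, check_same_thread=False)
    # WAL lets the web thread keep reading while a background worker
    # writes, with no separate database server process.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait briefly for a write lock instead of failing immediately.
    conn.execute("PRAGMA busy_timeout=5000")
    return conn

# The Flask app itself then runs single-process, bound to localhost only:
# app.run(host="127.0.0.1", port=5000, threaded=True)
```

This is the whole deployment story: one process, one database file, no broker, no container runtime.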
Results
| Metric | Before | After | Change |
|---|---|---|---|
| Weekly content management time | 20+ hours | Under 2 hours | −90% |
| Wrong-channel distribution incidents | Multiple per month | Zero since launch | −100% |
| Caption consistency rate | Inconsistent, manual | 100% automated | Fully standardised |
| Upload failure recovery | Manual restart | Automatic retry | Fully automated |
| Video conversion workflow | Manual, blocking | Background, non-blocking | Eliminated as bottleneck |
| Duplicate file cleanup | Hours per session | Minutes per session | −95% |
| Content library auditability | None | Complete log | New capability |
The platform has processed thousands of videos since launch with zero critical failures.
Client Impact
The measurable numbers tell part of the story. The operational shift tells the rest.
Before this system, the client's content team spent the majority of their working week on file management. After it, they spend almost none of it. The work that used to require constant attention — locating files, adding tags, uploading to Telegram, checking captions, cleaning up duplicates — now happens automatically or takes seconds of interaction.
More importantly, the team has something they didn't have before: confidence in their own content library. They know exactly what's been published, where it went, when it was sent, and what it was tagged with. For a platform distributing educational content to thousands of learners, that confidence isn't a convenience — it's a requirement.
Key Takeaways for Technology Decision-Makers
1. The cost of manual workflows compounds over time. What starts as a manageable process becomes an operational liability as volume grows. The question isn't whether automation pays for itself — it's how much you've already spent by not having it.
2. Generic tools solve generic problems. SaaS platforms, off-the-shelf CMSes, and no-code solutions are excellent for standard use cases. But when your workflow has specific file types, specific distribution logic, and specific compliance requirements, adapting a generic tool to fit is often more expensive than building one that fits precisely.
3. On-premises doesn't mean outdated. This entire system runs on a standard Windows machine with no cloud dependency. Local-first architecture is increasingly the right choice for organisations with data sensitivity requirements, inconsistent connectivity, or a preference for operational simplicity.
4. Simplicity in architecture enables reliability. No Redis. No Celery. No Docker. No microservices. A single Flask process with two daemon threads and SQLite handles thousands of files reliably because the architecture is matched to the actual requirements — not to what looks impressive on a diagram.
About This Engagement
This project was scoped, designed, and delivered by our development team over approximately 8 weeks, from workflow discovery to production deployment.
We work with clients across education technology, media, logistics, finance, and professional services to design and build custom internal tools, automation systems, Python backends, REST APIs, AI-powered workflows, and full-stack web and mobile applications.
If your organisation is managing content, data, or operations through a combination of manual steps and disconnected tools, we'd be glad to map what an automated alternative could look like.
Custom Software Development · Python · Flask · FastAPI · Next.js · Node.js · React · React Native · SQLite · PostgreSQL · REST API · Telegram Bot Integration · FFmpeg Automation · Workflow Automation · Internal Tools · AI-Powered Applications · EdTech Solutions · On-Premises Deployment
Project Details
- Sector
- EdTech
- Timeline
- ~8 weeks
- Engagement
- Custom Internal Tool Development · Workflow Automation
Want Results Like This?
Tell us what you're building. We'll scope it, price it, and ship it — faster than you expect.
We respond within 24 hours. No sales pitch — just a straight conversation about your project.